forked from probablycorey/baudy
# ggwave Gotchas
## WASM
- **WASM heap copies**: `encode()` returns an `Int8Array` backed by WASM heap memory. Copy the data out (`new Uint8Array(rawBytes.length); copy.set(...)`) before using it, or it gets corrupted.
- **Same instance required**: Using separate ggwave instances for encode/decode causes WASM heap pointer corruption. Use one instance for both. Max 4 instances allowed.
- **Encode output format**: `encode()` returns an `Int8Array` of raw F32 *bytes*, not a `Float32Array`. Reinterpret the copied bytes with `new Float32Array(bytesCopy.buffer)` before filling an `AudioBuffer`.
- **API naming**: The README says `TxProtocolId`, but the actual API uses `ProtocolId`. Check `Object.keys(ggwave)` when in doubt.
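
The copy-then-reinterpret steps above can be sketched as a small helper. This is a minimal sketch, not the project's actual code: `heapBytesToFloat32` is a hypothetical name, and `rawBytes` stands for whatever `encode()` returned.

```typescript
// Hypothetical helper: detach encode() output from the WASM heap, then view it
// as F32 samples.
function heapBytesToFloat32(rawBytes: Int8Array): Float32Array {
  // A fresh Uint8Array owns its own ArrayBuffer starting at offset 0, so later
  // writes to the WASM heap cannot change it.
  const copy = new Uint8Array(rawBytes.length);
  // View the source range as unsigned bytes over the same memory, then copy.
  copy.set(new Uint8Array(rawBytes.buffer, rawBytes.byteOffset, rawBytes.byteLength));
  // Reinterpret (not convert) the bytes: every 4 bytes become one float sample.
  // Any trailing bytes that do not form a whole sample are ignored.
  return new Float32Array(copy.buffer, 0, Math.floor(copy.byteLength / 4));
}
```

The returned samples live in their own buffer, so they stay valid across later ggwave calls and can be copied into an `AudioBuffer` channel.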
## iOS Safari Audio
- **AudioContext must be created synchronously** inside a user gesture handler (click/tap). Any `await` before `new AudioContext()` breaks the gesture chain and Safari blocks audio permanently for that context.
- **Silent mode bypass**: `navigator.audioSession.type = 'play-and-record'` (iOS 17+) bypasses the hardware mute switch. Do NOT use `'playback'` if you also need `getUserMedia` — the `'playback'` category tells iOS "speaker only" and blocks mic access with "AudioSession category is not compatible with audio capture".
- **Unlock pattern**: Create AudioContext → play a silent buffer → then await async init. Never reverse this order.
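
The unlock order above can be sketched as one synchronous function. This is a sketch under assumptions: `unlockAudio` and the `Minimal*` interfaces are hypothetical names that list only the Web Audio members the function touches, which also keeps the snippet free of DOM typings.

```typescript
// Only the Web Audio members used below; a real AudioContext provides them.
interface MinimalSource {
  buffer: unknown;
  connect(destination: unknown): void;
  start(when?: number): void;
}
interface MinimalAudioContext {
  readonly sampleRate: number;
  readonly destination: unknown;
  createBuffer(channels: number, length: number, sampleRate: number): unknown;
  createBufferSource(): MinimalSource;
}

// Must be called synchronously from the click/tap handler, with no await before it.
function unlockAudio<T extends MinimalAudioContext>(create: () => T): T {
  const ctx = create(); // step 1: create the context inside the gesture
  const source = ctx.createBufferSource();
  source.buffer = ctx.createBuffer(1, 1, ctx.sampleRate); // one silent sample
  source.connect(ctx.destination);
  source.start(0); // step 2: play the silent buffer
  return ctx; // step 3: the caller may await async init from here on
}
```

In a page this would be called as `unlockAudio(() => new AudioContext())` on the first line of the gesture handler, with any asynchronous ggwave setup awaited only afterwards.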
## macOS Microphone (sox + CoreAudio)
- **Explicit device names produce all-zero data**: `sox -t coreaudio "MacBook Air Microphone"` gets zeros due to macOS TCC permission scoping. Only `sox -d` (default device) gets actual mic access granted to the terminal app.
- **Workaround**: Change the default input device in System Settings > Sound > Input, then use `sox -d`. Or install `switchaudio-osx` (`brew install switchaudio-osx`) to change it programmatically.
- **MacBook Air mic disabled when lid is closed**: If the laptop is closed and driving an external monitor, the built-in mic won't work. Use the monitor's mic (e.g., Studio Display) instead.
- **Sample rate must match**: ggwave needs the same sample rate for encode and decode. The browser uses 48000 Hz, so the server must also use 48000 Hz; sox resamples from the device's native rate automatically.
- **macOS keeps the original self-signed local HTTPS flow**: On macOS the app serves HTTPS directly, using its original self-signed `localhost` certificate.
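
As an illustration of the default-device and sample-rate points above, the sox capture command can be built as an argument list. This is a sketch, not the app's actual invocation: the function names are hypothetical, and the output format (mono, 32-bit float, raw to stdout) is an assumption chosen to match ggwave's F32 samples.

```typescript
// Builds: sox -q -d -t raw -r 48000 -c 1 -e floating-point -b 32 -
function soxDefaultCaptureArgs(sampleRate: number = 48000): string[] {
  return [
    "-q",                      // no progress output on stderr
    "-d",                      // input: the default device, which has mic access
    "-t", "raw",               // output: headerless samples
    "-r", String(sampleRate),  // sox resamples from the device's native rate
    "-c", "1",                 // mono (assumption)
    "-e", "floating-point",
    "-b", "32",
    "-",                       // write to stdout
  ];
}

// Builds: SwitchAudioSource -t input -s "<deviceName>"
function switchInputArgs(deviceName: string): string[] {
  return ["-t", "input", "-s", deviceName];
}
```

Both lists are meant to be passed to a process spawner such as `spawn("sox", soxDefaultCaptureArgs())`, which avoids shell quoting problems with device names that contain spaces.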
## Linux Audio (sox + ALSA)
- **Mac-only helpers don't exist**: `SwitchAudioSource` and `scutil` are macOS-only.
- **`sox -d` may be wrong on Linux**: ALSA's `default` PCM can be playback-only. In that case, `sox -d` can play audio but fail to open the mic. Prefer explicit devices such as `plughw:0,1` for playback and `plughw:0,0` for capture.
- **Use env overrides when needed**: `BAUDY_CAPTURE_DEVICE=plughw:0,0 BAUDY_PLAYBACK_DEVICE=plughw:0,1 bun run index.tsx`
- **Low mixer gain can look like a decode bug**: If loopback hears only a tiny signal, check `amixer` and raise the speaker path (for example `amixer -q sset 'Speaker Analog' 100%`).
- **Prefer Tailscale Serve for phone access**: Run the app on local HTTP and put it behind `tailscale serve <port>` so the phone gets a real HTTPS `*.ts.net` origin for `getUserMedia()`.
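
One way the env overrides above could map onto sox device arguments is sketched below. How the app really reads `BAUDY_CAPTURE_DEVICE` and `BAUDY_PLAYBACK_DEVICE` is not shown here, so `resolveSoxDevices` and its fallback to `-d` are assumptions.

```typescript
interface SoxDevices {
  capture: string[];  // arguments naming the input device
  playback: string[]; // arguments naming the output device
}

// Explicit ALSA devices when the variables are set, sox's default device otherwise.
function resolveSoxDevices(env: Record<string, string | undefined>): SoxDevices {
  const capture = env.BAUDY_CAPTURE_DEVICE;
  const playback = env.BAUDY_PLAYBACK_DEVICE;
  return {
    capture: capture ? ["-t", "alsa", capture] : ["-d"],
    playback: playback ? ["-t", "alsa", playback] : ["-d"],
  };
}
```

Calling `resolveSoxDevices(process.env)` at startup keeps the device choice in one place, so a machine with a playback-only `default` PCM only needs the two variables set.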
## Half-Duplex Audio
- Both sides (server and browser) use the same audible protocol (`GGWAVE_PROTOCOL_AUDIBLE_FAST`). A `playing` flag stops mic processing during playback so that a side does not decode its own transmission.
- The browser uses `ScriptProcessorNode` to feed mic audio frames to ggwave for decoding. Frames are accumulated to `samplesPerFrame` size before decoding.
- Browser mic requires `getUserMedia` with `echoCancellation: false`, `noiseSuppression: false`, `autoGainControl: false` to preserve signal integrity.
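
The frame accumulation and the `playing` gate described above can be sketched as follows. `FrameAccumulator` and `micConstraints` are hypothetical names, and `onFrame` stands in for whatever hands a full frame to ggwave for decoding.

```typescript
// Constraints for getUserMedia(): all browser audio processing is switched off.
const micConstraints = {
  audio: { echoCancellation: false, noiseSuppression: false, autoGainControl: false },
};

class FrameAccumulator {
  playing = false; // set to true while this side is transmitting
  private readonly frame: Float32Array;
  private readonly onFrame: (frame: Float32Array) => void;
  private filled = 0;

  constructor(samplesPerFrame: number, onFrame: (frame: Float32Array) => void) {
    if (!Number.isInteger(samplesPerFrame) || samplesPerFrame <= 0) {
      throw new RangeError("samplesPerFrame must be a positive integer");
    }
    this.frame = new Float32Array(samplesPerFrame);
    this.onFrame = onFrame;
  }

  // Feed one mic chunk of any length; emits zero or more full frames.
  push(chunk: Float32Array): void {
    if (this.playing) {
      this.filled = 0; // drop our own transmission, including partial frames
      return;
    }
    let offset = 0;
    while (offset < chunk.length) {
      const n = Math.min(chunk.length - offset, this.frame.length - this.filled);
      this.frame.set(chunk.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.frame.length) {
        this.onFrame(this.frame.slice()); // pass a copy so the buffer can be reused
        this.filled = 0;
      }
    }
  }
}
```

In the browser, the `ScriptProcessorNode` audio callback would call `push(event.inputBuffer.getChannelData(0))`, and `micConstraints` would be the argument to `getUserMedia`.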
## Response Timing
- After sending a chirp, the sender must wait ~500ms before switching to listening mode. Without this gap, the receiver's response arrives while the sender is still in "playing" state and gets ignored.
- The receiver should also wait ~500ms after hearing a message before playing its response. This gives the sender time to finish playback, clear its buffer, and switch to listening.
- Both delays are needed: sender waits 500ms after its own playback, receiver waits 500ms before replying. Without both, the response window is missed.
- Use a `sendAndWait(text, timeout)` pattern on the client: set `playing=true`, play the chirp, sleep 500ms, set `playing=false`, then await a promise that resolves when the mic decoder receives a message (or times out).
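
The `sendAndWait(text, timeout)` pattern above can be sketched like this. The class name, the injected `playChirp` callback, and the `onDecoded` hook are hypothetical; only the ordering (set `playing`, play, sleep, clear `playing`, await a reply or time out) comes from the description.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

class HalfDuplexClient {
  playing = false;
  private waiter: ((message: string) => void) | null = null;
  private readonly playChirp: (text: string) => Promise<void>;
  private readonly gapMs: number;

  constructor(playChirp: (text: string) => Promise<void>, gapMs: number = 500) {
    this.playChirp = playChirp;
    this.gapMs = gapMs;
  }

  // Call this from the mic decoder whenever a message is decoded.
  onDecoded(message: string): void {
    if (this.playing) return; // still transmitting: ignore
    const waiter = this.waiter;
    this.waiter = null;
    if (waiter) waiter(message);
  }

  // Resolves with the reply, or null if nothing arrives within timeoutMs.
  async sendAndWait(text: string, timeoutMs: number): Promise<string | null> {
    this.playing = true;
    try {
      await this.playChirp(text);
      await sleep(this.gapMs); // gap after our own playback
    } finally {
      this.playing = false;
    }
    return new Promise<string | null>((resolve) => {
      const timer = setTimeout(() => {
        this.waiter = null;
        resolve(null);
      }, timeoutMs);
      this.waiter = (message) => {
        clearTimeout(timer);
        resolve(message);
      };
    });
  }
}
```

The receiving side would apply its own delay before replying, for example `await sleep(500)` between decoding a message and playing the response.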
## Sample Rate
- ggwave must be initialized with explicit sample rates matching the AudioContext: set `sampleRateInp`, `sampleRateOut`, and `sampleRate` on the params object. Using `getDefaultParameters()` without setting these may silently fail to decode if the AudioContext's actual rate differs from the default.
- iOS may not honor the requested `sampleRate` in the AudioContext constructor. Always read `audioContext.sampleRate` after creation and pass that value to ggwave; don't assume 48000.
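
A minimal sketch of the rule above: take the parameters object and overwrite all three rate fields with the rate the context actually reports. `applyContextRate` is a hypothetical helper; the field names are the ones listed above.

```typescript
interface RateParams {
  sampleRateInp: number;
  sampleRateOut: number;
  sampleRate: number;
}

// Mutates and returns the params object so it can be passed straight to init.
function applyContextRate<P extends RateParams>(
  params: P,
  audioContext: { readonly sampleRate: number },
): P {
  const rate = audioContext.sampleRate; // read back; never assume 48000
  params.sampleRateInp = rate;
  params.sampleRateOut = rate;
  params.sampleRate = rate;
  return params;
}
```

Typical use would be `applyContextRate(ggwave.getDefaultParameters(), audioContext)` right after the context is created and before the ggwave instance is initialized.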
## getUserMedia Secure Context
- `navigator.mediaDevices` is `undefined` on non-secure origins. `*.local` mDNS addresses over HTTP are NOT treated as secure — only `localhost` and `127.0.0.1` are exempt.
- On Linux, the default fix is to run the app on local HTTP and expose it with `tailscale serve <port>` so the phone gets a valid HTTPS `*.ts.net` origin.
- On macOS, the app keeps its original self-signed local HTTPS flow. This works, but phones may show certificate warnings unless you trust the cert.
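
The origin rule above can be approximated with a small check, for example to warn on the server before handing a URL to a phone. This is a simplified sketch with a hypothetical name; inside a page, the browser's own `window.isSecureContext` is the authoritative answer.

```typescript
// True when the URL is HTTPS or one of the local exemptions named above.
function originAllowsMic(url: string): boolean {
  const { protocol, hostname } = new URL(url);
  if (protocol === "https:") return true;
  return hostname === "localhost" || hostname === "127.0.0.1";
}
```

For instance, `originAllowsMic("http://baudy.local:3000")` is `false` (the hostname is made up for illustration), while any `https://` address on `*.ts.net` passes.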