ggwave Gotchas

WASM

  • WASM heap copies: encode() returns an Int8Array backed by WASM heap memory. Copy the data out (new Uint8Array(rawBytes.length); copy.set(...)) before using it, or it can be corrupted when the heap is reused.
  • Same instance required: Using separate ggwave instances for encode/decode causes WASM heap pointer corruption. Use one instance for both. Max 4 instances allowed.
  • Encode output format: encode() returns Int8Array of raw F32 bytes, not Float32Array. Reinterpret with new Float32Array(bytesCopy.buffer) for AudioBuffer.
  • API naming: README says TxProtocolId but actual API uses ProtocolId. Check Object.keys(ggwave) when in doubt.
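
The copy and reinterpret steps above can be combined into one helper. This is a minimal sketch, not the project's actual code: copyEncodedSamples is a hypothetical name, and it assumes the byte length is a whole number of 4-byte float samples.

```typescript
// Hypothetical helper: detach encode() output from the WASM heap and view it
// as 32-bit float samples.
function copyEncodedSamples(rawBytes: Int8Array): Float32Array {
  if (rawBytes.length % 4 !== 0) {
    throw new RangeError("expected a whole number of 4-byte float samples");
  }
  // Copy first: rawBytes is only a view onto heap memory that may be reused.
  const bytesCopy = new Uint8Array(rawBytes.length);
  bytesCopy.set(new Uint8Array(rawBytes.buffer, rawBytes.byteOffset, rawBytes.length));
  // The copy owns a fresh buffer starting at offset 0, so the float view is aligned.
  return new Float32Array(bytesCopy.buffer);
}
```

The returned Float32Array no longer shares memory with the heap, so it can be handed to an AudioBuffer after further ggwave calls have run.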

iOS Safari Audio

  • AudioContext must be created synchronously inside a user gesture handler (click/tap). Any await before new AudioContext() breaks the gesture chain and Safari blocks audio permanently for that context.
  • Silent mode bypass: navigator.audioSession.type = 'play-and-record' (iOS 17+) bypasses the hardware mute switch. Do NOT use 'playback' if you also need getUserMedia — the 'playback' category tells iOS "speaker only" and blocks mic access with "AudioSession category is not compatible with audio capture".
  • Unlock pattern: Create AudioContext → play a silent buffer → then await async init. Never reverse this order.
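
The unlock order above can be sketched as a synchronous helper. This is an illustration under assumptions: unlockAudio and the two small interfaces are hypothetical stand-ins that cover only the AudioContext members used here, so the sketch type-checks without DOM typings.

```typescript
// Stand-ins for the AudioContext members this sketch touches.
interface SilentSource {
  buffer: unknown;
  connect(destination: unknown): unknown;
  start(): void;
}
interface UnlockableContext {
  readonly sampleRate: number;
  readonly destination: unknown;
  createBuffer(channels: number, length: number, sampleRate: number): unknown;
  createBufferSource(): SilentSource;
}

// Hypothetical helper: call it synchronously from the click/tap handler,
// before any await.
function unlockAudio<C extends UnlockableContext>(createContext: () => C): C {
  const ctx = createContext(); // 1. create inside the gesture
  const source = ctx.createBufferSource();
  source.buffer = ctx.createBuffer(1, 1, ctx.sampleRate); // one silent sample
  source.connect(ctx.destination);
  source.start(); // 2. play the silent buffer
  return ctx; // 3. only now may the caller await async init
}

// Intended browser usage (initGgwave is a hypothetical async init function):
// button.addEventListener("click", () => {
//   const ctx = unlockAudio(() => new AudioContext());
//   void initGgwave(ctx);
// });
```

Keeping the helper free of async/await makes it hard to break the gesture chain by accident.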

macOS Microphone (sox + CoreAudio)

  • Explicit device names produce all-zero data: sox -t coreaudio "MacBook Air Microphone" gets zeros due to macOS TCC permission scoping. Only sox -d (default device) gets actual mic access granted to the terminal app.
  • Workaround: Change the default input device in System Settings > Sound > Input, then use sox -d. Or install switchaudio-osx (brew install switchaudio-osx) to change it programmatically.
  • MacBook Air mic is disabled when the lid is closed: If the laptop is closed and driving an external monitor, its built-in mic won't work. Use the monitor's mic (e.g., Studio Display) instead.
  • Sample rate must match: ggwave needs matching sample rates for encode/decode. The browser typically runs at 48000 Hz, so the server must use 48000 Hz too. sox resamples from the device's native rate automatically.
  • macOS keeps the original self-signed local HTTPS flow: On macOS the app serves HTTPS directly, using its original self-signed localhost certificate.

Linux Audio (sox + ALSA)

  • Mac-only helpers don't exist: SwitchAudioSource and scutil are macOS-only, so neither is available on Linux.
  • sox -d may be wrong on Linux: ALSA's default PCM can be playback-only. In that case, sox -d can play audio but fail to open the mic. Prefer explicit devices such as plughw:0,1 for playback and plughw:0,0 for capture.
  • Use env overrides when needed: BAUDY_CAPTURE_DEVICE=plughw:0,0 BAUDY_PLAYBACK_DEVICE=plughw:0,1 bun run index.tsx
  • Low mixer gain can look like a decode bug: If loopback hears only a tiny signal, check amixer and raise the speaker path (for example amixer -q sset 'Speaker Analog' 100%).
  • Prefer Tailscale Serve for phone access: Run the app on local HTTP and put it behind tailscale serve <port> so the phone gets a real HTTPS *.ts.net origin for getUserMedia().
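
The macOS and Linux device rules above could be captured in one argument builder. This is a sketch, not the app's actual implementation: soxCaptureArgs is a hypothetical name, and the raw mono 32-bit float output format is an assumption chosen to match what ggwave consumes.

```typescript
type EnvMap = Record<string, string | undefined>;

// Assumed output format: raw mono 32-bit float at 48000 Hz, written to stdout.
const RAW_F32_48K = ["-t", "raw", "-r", "48000", "-e", "floating-point", "-b", "32", "-c", "1", "-"];

// Hypothetical helper: choose sox capture arguments per platform.
function soxCaptureArgs(platform: string, env: EnvMap): string[] {
  const device = env["BAUDY_CAPTURE_DEVICE"];
  if (platform === "linux" && device) {
    // Explicit ALSA device, because the default PCM may be playback-only.
    return ["-t", "alsa", device, ...RAW_F32_48K];
  }
  // macOS (and Linux without an override): only the default device is used,
  // since explicit CoreAudio device names yield all-zero data.
  return ["-d", ...RAW_F32_48K];
}
```

A caller would pass the result to a process spawner, for example spawn("sox", soxCaptureArgs(process.platform, process.env)) with Node's child_process module.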

Half-Duplex Audio

  • Both sides (server and browser) use the same audible protocol (GGWAVE_PROTOCOL_AUDIBLE_FAST). A playing flag stops mic processing during playback to prevent self-hearing/feedback.
  • The browser uses ScriptProcessorNode to feed mic audio frames to ggwave for decoding. Frames are accumulated to samplesPerFrame size before decoding.
  • Browser mic requires getUserMedia with echoCancellation: false, noiseSuppression: false, autoGainControl: false to preserve signal integrity.
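
The frame accumulation and the playing flag described above can be sketched as a small class. FrameAccumulator is a hypothetical name; the ScriptProcessorNode wiring and the ggwave decode call are left out, and discarding partial data during playback is an assumption of this sketch.

```typescript
// Hypothetical helper: collects mic callback chunks into fixed-size frames.
class FrameAccumulator {
  playing = false; // set to true while our own chirp is playing
  private readonly frame: Float32Array;
  private filled = 0;

  constructor(samplesPerFrame: number) {
    if (!Number.isInteger(samplesPerFrame) || samplesPerFrame <= 0) {
      throw new RangeError("samplesPerFrame must be a positive integer");
    }
    this.frame = new Float32Array(samplesPerFrame);
  }

  // Returns every complete frame produced by this chunk (possibly none).
  push(chunk: Float32Array): Float32Array[] {
    if (this.playing) {
      this.filled = 0; // assumption: drop partial data so we never decode our own output
      return [];
    }
    const frames: Float32Array[] = [];
    let offset = 0;
    while (offset < chunk.length) {
      const take = Math.min(this.frame.length - this.filled, chunk.length - offset);
      this.frame.set(chunk.subarray(offset, offset + take), this.filled);
      this.filled += take;
      offset += take;
      if (this.filled === this.frame.length) {
        frames.push(this.frame.slice()); // copy, because the scratch frame is reused
        this.filled = 0;
      }
    }
    return frames;
  }
}
```

Each returned frame would then be handed to the decoder; the mic callback itself stays trivial.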

Response Timing

  • After sending a chirp, the sender must wait ~500ms before switching to listening mode. Without this gap, the receiver's response arrives while the sender is still in "playing" state and gets ignored.
  • The receiver should also wait ~500ms after hearing a message before playing its response. This gives the sender time to finish playback, clear its buffer, and switch to listening.
  • Both delays are needed: sender waits 500ms after its own playback, receiver waits 500ms before replying. Without both, the response window is missed.
  • Use a sendAndWait(text, timeout) pattern on the client: set playing=true, play the chirp, sleep 500ms, set playing=false, then await a promise that resolves when the mic decoder receives a message (or times out).
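
The sendAndWait pattern above might look like the following. This is a sketch under assumptions: HalfDuplexClient, handleDecoded, and the injected playChirp and sleep functions are hypothetical names, with playChirp standing in for encode plus playback.

```typescript
const GAP_MS = 500; // turnaround gap from the notes above

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical client-side wrapper around the half-duplex exchange.
class HalfDuplexClient {
  playing = false;
  private pending: ((message: string | null) => void) | null = null;
  private readonly playChirp: (text: string) => Promise<void>;
  private readonly sleep: Sleep;

  constructor(playChirp: (text: string) => Promise<void>, sleep: Sleep = realSleep) {
    this.playChirp = playChirp;
    this.sleep = sleep;
  }

  // Call this from the mic decoder whenever a message is decoded.
  handleDecoded(message: string): void {
    // Ignore anything heard while playing, or when nobody is waiting.
    if (this.playing || this.pending === null) return;
    const resolve = this.pending;
    this.pending = null;
    resolve(message);
  }

  // Resolves with the reply, or null if nothing arrives within timeoutMs.
  async sendAndWait(text: string, timeoutMs: number): Promise<string | null> {
    this.playing = true;
    try {
      await this.playChirp(text);
      await this.sleep(GAP_MS);
    } finally {
      this.playing = false; // always return to listening, even if playback fails
    }
    return new Promise<string | null>((resolve) => {
      const timer = setTimeout(() => {
        this.pending = null;
        resolve(null);
      }, timeoutMs);
      this.pending = (message) => {
        clearTimeout(timer);
        resolve(message);
      };
    });
  }
}
```

Injecting sleep keeps the 500 ms gap in one place and lets the sequence be exercised without real delays.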

Sample Rate

  • ggwave must be initialized with explicit sample rates matching the AudioContext: set sampleRateInp, sampleRateOut, and sampleRate on the params object. Using getDefaultParameters() without setting these may silently fail to decode if the AudioContext's actual rate differs from the default.
  • iOS may not honor the requested sampleRate in the AudioContext constructor. Always read audioContext.sampleRate after creation and pass that value to ggwave; don't assume 48000.
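
Setting all three rate fields from the context's actual rate can be sketched as below. applySampleRate and the RateParams interface are hypothetical; the interface lists only the fields named above, while the real parameters object carries more.

```typescript
// Only the fields named in the notes above.
interface RateParams {
  sampleRateInp: number;
  sampleRateOut: number;
  sampleRate: number;
}

// Hypothetical helper: force every rate field to the rate the context actually uses.
// Mutates in place so the object returned by the library keeps its identity.
function applySampleRate(params: RateParams, actualRate: number): void {
  if (!Number.isFinite(actualRate) || actualRate <= 0) {
    throw new RangeError(`invalid sample rate: ${actualRate}`);
  }
  params.sampleRateInp = actualRate;
  params.sampleRateOut = actualRate;
  params.sampleRate = actualRate;
}

// Intended usage, assuming the ggwave JS bindings expose these calls:
// const params = ggwave.getDefaultParameters();
// applySampleRate(params, audioContext.sampleRate); // read after creation
// const instance = ggwave.init(params);
```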

getUserMedia Secure Context

  • navigator.mediaDevices is undefined on non-secure origins. *.local mDNS addresses over HTTP are NOT treated as secure; only loopback origins such as localhost and 127.0.0.1 are exempt.
  • On Linux, the default fix is to run the app on local HTTP and expose it with tailscale serve <port> so the phone gets a valid HTTPS *.ts.net origin.
  • On macOS, the app keeps its original self-signed local HTTPS flow. This works, but phones may show certificate warnings unless you trust the cert.
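
A pre-flight check can turn the undefined navigator.mediaDevices case into a readable message. This is a sketch: micBlockedReason and MicEnvironment are hypothetical names, and the interface is a stand-in for the parts of window it reads.

```typescript
// Stand-in for the parts of `window` this check reads.
interface MicEnvironment {
  isSecureContext: boolean;
  navigator: { mediaDevices?: unknown };
}

// Hypothetical pre-flight check: returns null when getUserMedia should be
// reachable, otherwise a short explanation.
function micBlockedReason(env: MicEnvironment): string | null {
  if (!env.isSecureContext) {
    return "insecure origin: serve over HTTPS or use localhost";
  }
  if (env.navigator.mediaDevices === undefined) {
    return "navigator.mediaDevices is unavailable in this browser";
  }
  return null;
}
```

In the browser the intent is to call micBlockedReason(window) before requesting the microphone and show the message instead of failing on an undefined property.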