Document response timing, sample rate, and secure context gotchas

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Corey Johnson 2026-03-12 13:06:53 -07:00
parent 37dd74c30d
commit cfdffa685f


@ -10,7 +10,7 @@
## iOS Safari Audio
- **AudioContext must be created synchronously** inside a user gesture handler (click/tap). Any `await` before `new AudioContext()` breaks the gesture chain and Safari blocks audio permanently for that context.
-- **Silent mode bypass**: `navigator.audioSession.type = 'playback'` (iOS 17+) switches WebAudio from ringer channel to media channel, bypassing the hardware mute switch. Without this, the silent switch kills all WebAudio output.
+- **Silent mode bypass**: `navigator.audioSession.type = 'play-and-record'` (iOS 17+) bypasses the hardware mute switch. Do NOT use `'playback'` if you also need `getUserMedia`: the `'playback'` category tells iOS "speaker only" and blocks mic access with "AudioSession category is not compatible with audio capture".
- **Unlock pattern**: Create AudioContext → play a silent buffer → then await async init. Never reverse this order.
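
The unlock order described above could be sketched roughly as follows. This is a minimal sketch and not the project's actual code: `unlockAudio` and `initAsync` are hypothetical names, and the constructor is passed in as an argument so the function stays fully synchronous and can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the project): creates and unlocks an
// AudioContext synchronously. Call it directly inside a click/tap handler,
// before any `await`.
function unlockAudio(AudioContextCtor) {
  const ctx = new AudioContextCtor();

  // Play a one-sample silent buffer so the context is started while the
  // user gesture is still active.
  const silent = ctx.createBuffer(1, 1, ctx.sampleRate);
  const source = ctx.createBufferSource();
  source.buffer = silent;
  source.connect(ctx.destination);
  source.start(0);

  return ctx;
}

// Usage sketch (browser only; `initAsync` is a placeholder for app setup):
// button.addEventListener("click", () => {
//   const ctx = unlockAudio(window.AudioContext || window.webkitAudioContext);
//   initAsync(ctx); // async work starts only after the context exists
// });
```

Keeping the function free of `async`/`await` makes it hard to accidentally introduce an `await` ahead of the constructor call.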
## macOS Microphone (sox + CoreAudio)
@ -22,6 +22,23 @@
## Half-Duplex Audio
-- Both sides (server and browser) use the same audible protocol (GGWAVE_PROTOCOL_AUDIBLE_FAST). A `playing` flag stops mic processing during playback to prevent self-hearing/feedback. 300ms gap after playback before resuming listening.
+- Both sides (server and browser) use the same audible protocol (GGWAVE_PROTOCOL_AUDIBLE_FAST). A `playing` flag stops mic processing during playback to prevent self-hearing/feedback.
- The browser uses `ScriptProcessorNode` to feed mic audio frames to ggwave for decoding. Frames are accumulated to `samplesPerFrame` size before decoding.
- Browser mic requires `getUserMedia` with `echoCancellation: false`, `noiseSuppression: false`, `autoGainControl: false` to preserve signal integrity.
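
The frame accumulation step could look something like the sketch below. It is illustrative only: `createFrameAccumulator`, `onFrame`, and `RAW_MIC_CONSTRAINTS` are hypothetical names, and the decode call itself is left to the caller because its exact form depends on how ggwave is wired into the page.

```javascript
// Hypothetical helper (not from the project): buffers variable-size mic
// chunks (Float32Array) into fixed-size frames of `samplesPerFrame` samples
// and hands each complete frame to `onFrame`.
function createFrameAccumulator(samplesPerFrame, onFrame) {
  const frame = new Float32Array(samplesPerFrame);
  let filled = 0;

  return function push(chunk) {
    let offset = 0;
    while (offset < chunk.length) {
      // Copy as much as fits into the current frame.
      const n = Math.min(samplesPerFrame - filled, chunk.length - offset);
      frame.set(chunk.subarray(offset, offset + n), filled);
      filled += n;
      offset += n;
      if (filled === samplesPerFrame) {
        onFrame(frame.slice()); // pass a copy; `frame` is reused
        filled = 0;
      }
    }
  };
}

// Constraints from the notes above: disable all browser-side processing.
const RAW_MIC_CONSTRAINTS = {
  audio: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  },
};

// Usage sketch (browser only):
// const stream = await navigator.mediaDevices.getUserMedia(RAW_MIC_CONSTRAINTS);
// processor.onaudioprocess = (e) => push(e.inputBuffer.getChannelData(0));
```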
## Response Timing
- After sending a chirp, the sender must wait ~500ms before switching to listening mode. Without this gap, the receiver's response arrives while the sender is still in "playing" state and gets ignored.
- The receiver should also wait ~500ms after hearing a message before playing its response. This gives the sender time to finish playback, clear its buffer, and switch to listening.
- Both delays are needed: sender waits 500ms after its own playback, receiver waits 500ms before replying. Without both, the response window is missed.
- Use a `sendAndWait(text, timeout)` pattern on the client: set `playing=true`, play the chirp, sleep 500ms, set `playing=false`, then await a promise that resolves when the mic decoder receives a message (or times out).
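
A minimal sketch of the sender side of this pattern follows. Everything except the `playing` flag and the `sendAndWait(text, timeout)` shape is an assumption: `createHalfDuplexClient`, `playChirp`, `onDecoded`, and `gapMs` are hypothetical names, and `playChirp` stands in for whatever encodes and plays the message. The receiver-side delay before replying is not shown.

```javascript
// Hypothetical sketch (not the project's code) of the sendAndWait pattern.
// `playChirp(text)` must return a promise that resolves when playback ends.
function createHalfDuplexClient({ playChirp, gapMs = 500 }) {
  let playing = false;
  let pending = null; // resolver for the in-flight sendAndWait, if any

  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

  // Call this from the mic decoder for every decoded message.
  // Returns false when the message is dropped because we are transmitting.
  function onDecoded(message) {
    if (playing) return false;
    if (pending) {
      const resolve = pending;
      pending = null;
      resolve(message);
    }
    return true;
  }

  async function sendAndWait(text, timeoutMs) {
    playing = true;
    try {
      await playChirp(text);
      await sleep(gapMs); // gap before switching to listening
    } finally {
      playing = false;
    }
    // Resolve on the next decoded message, or reject on timeout.
    return new Promise((resolve, reject) => {
      const timer = setTimeout(() => {
        pending = null;
        reject(new Error(`no response within ${timeoutMs}ms`));
      }, timeoutMs);
      pending = (message) => {
        clearTimeout(timer);
        resolve(message);
      };
    });
  }

  return { sendAndWait, onDecoded, isPlaying: () => playing };
}
```

The timeout here starts counting only after the gap has elapsed, so it measures the listening window rather than the whole round trip.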
## Sample Rate
- ggwave must be initialized with explicit sample rates matching the AudioContext: set `sampleRateInp`, `sampleRateOut`, and `sampleRate` on the params object. Using `getDefaultParameters()` without setting these may silently fail to decode if the AudioContext's actual rate differs from the default.
- iOS may not honor the requested `sampleRate` in the AudioContext constructor. Always read `audioContext.sampleRate` after creation and pass that value to ggwave; do not assume 48000.
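
One way to wire this up is sketched below. It assumes `ggwave` is the module object produced by the ggwave JavaScript factory and that it exposes `getDefaultParameters()` and `init(params)`; `initGgwaveForContext` is a hypothetical name.

```javascript
// Hypothetical helper (not from the project): initializes ggwave with the
// rate the AudioContext actually runs at, not the rate that was requested.
function initGgwaveForContext(ggwave, audioContext) {
  const rate = audioContext.sampleRate; // read back after creation

  const params = ggwave.getDefaultParameters();
  params.sampleRateInp = rate;
  params.sampleRateOut = rate;
  params.sampleRate = rate;

  return ggwave.init(params);
}
```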
## getUserMedia Secure Context
- `navigator.mediaDevices` is `undefined` on non-secure origins. `*.local` mDNS addresses over HTTP are NOT treated as secure; only loopback origins such as `localhost` and `127.0.0.1` are exempt.
- Self-signed certs work but cause "connection is not private" browser warnings. For local dev, hosting the phone page on a separate HTTPS server or using a tunnel is cleaner.
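
A small guard can turn the `undefined` case into a readable error before `getUserMedia` is called. The sketch below is illustrative; `micAvailability` and its return strings are hypothetical, and `env` is a parameter only so the check can be run outside a browser.

```javascript
// Hypothetical helper (not from the project): reports why the mic API is
// unavailable. `env` defaults to the global object (`window` in a browser).
function micAvailability(env = globalThis) {
  if (env.isSecureContext === false) {
    return "insecure-context"; // e.g. a *.local address served over HTTP
  }
  const mediaDevices = env.navigator && env.navigator.mediaDevices;
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== "function") {
    return "no-media-devices";
  }
  return "ok";
}

// Usage sketch (browser only):
// const status = micAvailability();
// if (status !== "ok") showError(`Microphone unavailable: ${status}`);
```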