Add ggwave audio calculator POC and documentation

Includes a working Bun server and iOS Safari calculator UI that uses data-over-sound encoding/decoding via the ggwave library. Phone encodes expressions as audible chirps, Mac server decodes via microphone, evaluates, and sends result back via SSE. Tested reliably with ambient noise using Studio Display microphone. Includes gotchas documentation covering iOS audio, macOS mic permissions, WASM heap, and sample rate requirements.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
parent 18b08aeba2
commit 8e3179f7d3

docs/ggwave-gotchas.md

# ggwave Gotchas

## WASM

- **WASM heap copies**: `encode()` returns an Int8Array backed by WASM heap memory. Copy the data out (`const copy = new Uint8Array(rawBytes.length); copy.set(...)`) before using it, or it gets corrupted.
- **Same instance required**: Using separate ggwave instances for encode and decode causes WASM heap pointer corruption. Use one instance for both. Max 4 instances allowed.
- **Encode output format**: `encode()` returns an Int8Array of raw F32 *bytes*, not a Float32Array. Reinterpret the copy with `new Float32Array(bytesCopy.buffer)` before filling an AudioBuffer.
- **API naming**: The README says `TxProtocolId`, but the actual API uses `ProtocolId`. Check `Object.keys(ggwave)` when in doubt.
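
The copy-and-reinterpret steps above can be sketched as a small helper. `heapBytesToFloat32` is a hypothetical name, and the plain `ArrayBuffer` below only stands in for the WASM heap; no ggwave call is made, so treat this as an illustration under those assumptions rather than the library's own API.

```typescript
// Hypothetical helper (not part of ggwave): copy bytes out of a heap-backed
// view, then reinterpret the private copy as 32-bit float samples.
function heapBytesToFloat32(rawBytes: Int8Array): Float32Array {
  const copy = new Uint8Array(rawBytes.length)
  copy.set(new Uint8Array(rawBytes.buffer, rawBytes.byteOffset, rawBytes.length))
  // Four bytes per sample; any trailing partial sample is ignored.
  return new Float32Array(copy.buffer, 0, Math.floor(copy.length / 4))
}

// Stand-in for the WASM heap: three samples stored at byte offset 16.
const fakeHeap = new ArrayBuffer(64)
new Float32Array(fakeHeap, 16, 3).set([0.5, -0.25, 1])
const heapView = new Int8Array(fakeHeap, 16, 12)

const samples = heapBytesToFloat32(heapView)
new Uint8Array(fakeHeap).fill(0) // simulate the heap being overwritten later
console.log(Array.from(samples)) // the copy is unaffected by the overwrite
```

Passing `byteOffset` and `length` explicitly matters because the view returned by `encode()` covers only part of the heap buffer.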

## iOS Safari Audio

- **AudioContext must be created synchronously** inside a user gesture handler (click/tap). Any `await` before `new AudioContext()` breaks the gesture chain, and Safari blocks audio permanently for that context.
- **Silent mode bypass**: `navigator.audioSession.type = 'playback'` (iOS 17+) switches WebAudio from the ringer channel to the media channel, bypassing the hardware mute switch. Without this, the silent switch kills all WebAudio output.
- **Unlock pattern**: Create the AudioContext → play a silent buffer → then await async init. Never reverse this order.
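
A minimal sketch of the unlock order described above. `unlockAudio` and the `*Like` interfaces are hypothetical names introduced here so the same function can run against a real `AudioContext` or a test double; only the call order comes from the notes above.

```typescript
// Minimal shapes of the Web Audio objects this sketch touches (assumed,
// trimmed-down versions of the real browser interfaces).
interface SourceLike {
  buffer: unknown
  connect(destination: unknown): void
  start(): void
}
interface ContextLike {
  destination: unknown
  createBuffer(channels: number, length: number, sampleRate: number): unknown
  createBufferSource(): SourceLike
}
interface NavigatorLike {
  audioSession?: { type: string }
}

// Hypothetical helper. Call it synchronously from a click/tap handler, before
// any await, and run async initialisation only after it returns.
function unlockAudio<T extends ContextLike>(createContext: () => T, nav: NavigatorLike): T {
  const ctx = createContext() // 1. create the context inside the gesture
  const silent = ctx.createBufferSource()
  silent.buffer = ctx.createBuffer(1, 1, 48000) // 2. one silent sample
  silent.connect(ctx.destination)
  silent.start()
  // 3. Route output to the media channel where the Audio Session API exists.
  if (nav.audioSession) nav.audioSession.type = 'playback'
  return ctx
}
```

In a page this would be called as the first statement of the handler, for example with `() => new AudioContext({ sampleRate: 48000 })` and `navigator` (cast as needed for TypeScript), followed by the awaited ggwave initialisation.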

## macOS Microphone (sox + CoreAudio)

- **Explicit device names produce all-zero data**: `sox -t coreaudio "MacBook Air Microphone"` gets zeros due to macOS TCC permission scoping. Only `sox -d` (default device) gets the mic access granted to the terminal app.
- **Workaround**: Change the default input device in System Settings > Sound > Input, then use `sox -d`. Or install `switchaudio-osx` (`brew install switchaudio-osx`) to change it programmatically.
- **MacBook Air mic disabled when lid is closed**: If using a monitor, the laptop mic won't work. Use the monitor's mic (e.g., Studio Display) instead.
- **Sample rate must match**: ggwave needs matching sample rates for encode and decode. The browser uses 48000 Hz, so the server must also use 48000 Hz; sox resamples from the device's native rate automatically.
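
The capture settings above can be collected in one place, along with a check for the all-zero symptom. Both function names are hypothetical; the sox flags are the ones this POC uses for default-device capture, and the zero check is an assumption about how one might detect the permission problem, not something sox provides.

```typescript
// Argument list for capturing from the default input device (-d) as raw mono
// 32-bit float samples on stdout, resampled by sox to the requested rate.
function soxCaptureArgs(sampleRate: number): string[] {
  return ['sox', '-d', '-t', 'raw', '-r', String(sampleRate), '-c', '1', '-b', '32', '-e', 'floating-point', '-']
}

// Hypothetical diagnostic: a real microphone usually produces at least some
// non-zero samples, so a long run of exact zeros suggests the capture process
// was not granted mic access. An empty chunk also counts as all-zero.
function isAllZero(chunk: Float32Array): boolean {
  return chunk.every(sample => sample === 0)
}

console.log(soxCaptureArgs(48000).join(' '))
```

Checking a second or so of captured audio with `isAllZero` at startup is one way to fail fast instead of waiting for a decode that never arrives.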

## SSE with Bun

- Bun's default idle timeout is 10 seconds, which kills SSE connections. Set `idleTimeout: 255` (max value) on `Bun.serve()`.
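
As a sketch of the settings involved, the fragment below keeps the server options and the SSE framing together. `sseServeOptions`, `SSE_HEADERS`, and `sseFrame` are hypothetical names, and the port is the one this POC happens to use; the options object is meant to be spread into `Bun.serve()` alongside a `fetch` handler.

```typescript
// Options intended for Bun.serve(); 255 is the maximum idleTimeout noted above.
const sseServeOptions = { port: 8888, idleTimeout: 255 }

// Response headers for an event stream.
const SSE_HEADERS = { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache' }

// One SSE message: a single "data:" line terminated by a blank line.
function sseFrame(payload: object): string {
  return `data: ${JSON.stringify(payload)}\n\n`
}

console.log(JSON.stringify(sseFrame({ type: 'ready' })))
```

Each frame would be encoded with `TextEncoder` and enqueued on the stream controller of the `/events` response.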

## Half-Duplex Audio

- Both sides use the same audible protocol (`GGWAVE_PROTOCOL_AUDIBLE_FAST`). A `playing` flag stops mic processing during playback to prevent self-hearing/feedback, and listening resumes only after a 300 ms gap following playback.
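
The gating rule above can be sketched as a small state holder. `HalfDuplexGate` is a hypothetical class; it takes timestamps as arguments instead of using a timer, which is a variation on the flag-plus-delay approach described above and keeps the logic easy to test.

```typescript
// Hypothetical gate: microphone frames are processed only when nothing is
// playing and the post-playback gap has elapsed.
class HalfDuplexGate {
  private playing = false
  private resumeAtMs = 0
  private gapMs: number

  constructor(gapMs = 300) {
    this.gapMs = gapMs
  }

  beginPlayback(): void {
    this.playing = true
  }

  // nowMs is the time playback finished, e.g. from Date.now().
  endPlayback(nowMs: number): void {
    this.playing = false
    this.resumeAtMs = nowMs + this.gapMs
  }

  shouldListen(nowMs: number): boolean {
    return !this.playing && nowMs >= this.resumeAtMs
  }
}

const gate = new HalfDuplexGate()
gate.beginPlayback()
console.log(gate.shouldListen(10)) // false while playing
```

A mic loop would call `shouldListen(Date.now())` for each chunk and drop the chunk when it returns `false`.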

tmp/bun.lock

{
  "lockfileVersion": 1,
  "configVersion": 1,
  "workspaces": {
    "": {
      "name": "ggwave-poc",
      "dependencies": {
        "ggwave": "0.4.0",
      },
    },
  },
  "packages": {
    "ggwave": ["ggwave@0.4.0", "", {}, "sha512-+sKq0aIEVJ7zHj4Vw+Sj/RPa91xp76ihaG5gsOKZ8ojM5+uUu3NFzAspozwBx/zeaThxP5VeIkA2bbsfWpUd2g=="],
  }
}

tmp/index.html

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no">
  <title>Audio Calculator</title>
  <script src="/ggwave.js"></script>
  <style>
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body { font-family: -apple-system, system-ui, sans-serif; background: #111; color: #fff; height: 100dvh; display: flex; flex-direction: column; }
    #display { padding: 20px; text-align: right; font-size: 48px; font-family: monospace; min-height: 100px; display: flex; align-items: flex-end; justify-content: flex-end; word-break: break-all; }
    #status { padding: 8px 20px; font-size: 14px; color: #888; text-align: center; }
    .grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 1px; flex: 1; padding: 1px; }
    .grid button {
      font-size: 28px; border: none; background: #333; color: #fff;
      cursor: pointer; min-height: 64px;
      -webkit-tap-highlight-color: transparent;
    }
    .grid button:active { background: #555; }
    .grid button.op { background: #f90; }
    .grid button.op:active { background: #fc3; }
    .grid button.eq { background: #2a2; }
    .grid button.eq:active { background: #3c3; }
    .grid button.clear { background: #c33; }
    .grid button.clear:active { background: #e55; }
    .result { color: #0f0; }
    .sending { color: #f90; }
  </style>
</head>
<body>
  <div id="display">0</div>
  <div id="status">Tap any button to start</div>
  <div class="grid">
    <button class="clear" data-action="clear">C</button>
    <button data-action="input" data-value="(">(</button>
    <button data-action="input" data-value=")">)</button>
    <button class="op" data-action="input" data-value="/">÷</button>

    <button data-action="input" data-value="7">7</button>
    <button data-action="input" data-value="8">8</button>
    <button data-action="input" data-value="9">9</button>
    <button class="op" data-action="input" data-value="*">×</button>

    <button data-action="input" data-value="4">4</button>
    <button data-action="input" data-value="5">5</button>
    <button data-action="input" data-value="6">6</button>
    <button class="op" data-action="input" data-value="-">−</button>

    <button data-action="input" data-value="1">1</button>
    <button data-action="input" data-value="2">2</button>
    <button data-action="input" data-value="3">3</button>
    <button class="op" data-action="input" data-value="+">+</button>

    <button data-action="input" data-value="0" style="grid-column: span 2">0</button>
    <button data-action="input" data-value=".">.</button>
    <button class="eq" data-action="send">=</button>
  </div>

  <script>
    const display = document.getElementById('display')
    const statusEl = document.getElementById('status')
    let expression = ''
    let gg, ggInstance, audioContext

    // iOS 17+: bypass hardware silent switch by routing WebAudio
    // through the media/playback channel instead of the ringer channel.
    function bypassSilentMode() {
      if ('audioSession' in navigator) navigator.audioSession.type = 'playback'
    }

    async function sendExpression(expr) {
      if (!gg || !audioContext) return
      if (audioContext.state === 'suspended') await audioContext.resume()

      display.textContent = expr
      display.className = 'sending'
      statusEl.textContent = 'Sending...'

      // encode() returns Int8Array of raw F32 bytes on the WASM heap; must copy out
      const rawBytes = gg.encode(ggInstance, expr, gg.ProtocolId.GGWAVE_PROTOCOL_AUDIBLE_FAST, 50)
      const bytesCopy = new Uint8Array(rawBytes.length)
      bytesCopy.set(new Uint8Array(rawBytes.buffer, rawBytes.byteOffset, rawBytes.length))
      const floats = new Float32Array(bytesCopy.buffer)

      const buf = audioContext.createBuffer(1, floats.length, 48000)
      buf.getChannelData(0).set(floats)
      const source = audioContext.createBufferSource()
      source.buffer = buf
      source.connect(audioContext.destination)

      await new Promise(resolve => { source.onended = resolve; source.start() })
      statusEl.textContent = 'Waiting for result...'
    }

    // SSE for results from server
    const events = new EventSource('/events')
    events.onmessage = (e) => {
      const data = JSON.parse(e.data)
      if (data.type === 'received') {
        display.textContent = data.result
        display.className = 'result'
        statusEl.textContent = data.expression + ' = ' + data.result
        expression = data.result
      }
    }

    // iOS Safari requires AudioContext to be created synchronously inside a
    // user gesture handler. Any await before creation breaks the gesture chain.
    document.querySelector('.grid').addEventListener('click', async (e) => {
      const btn = e.target.closest('button')
      if (!btn) return

      if (!audioContext) {
        audioContext = new (window.AudioContext || window.webkitAudioContext)({ sampleRate: 48000 })
        const src = audioContext.createBufferSource()
        src.buffer = audioContext.createBuffer(1, 1, 48000)
        src.connect(audioContext.destination)
        src.start()
        bypassSilentMode()
      }

      if (!gg) {
        statusEl.textContent = 'Initializing...'
        gg = await ggwave_factory()
        gg.disableLog()
        ggInstance = gg.init(gg.getDefaultParameters())
        statusEl.textContent = 'Ready'
      }

      const action = btn.dataset.action
      if (action === 'clear') {
        expression = ''
        display.textContent = '0'
        display.className = ''
        statusEl.textContent = 'Ready'
      } else if (action === 'input') {
        expression += btn.dataset.value
        display.textContent = expression
        display.className = ''
      } else if (action === 'send' && expression) {
        await sendExpression(expression)
      }
    })
  </script>
</body>
</html>

tmp/package.json

{
  "name": "ggwave-poc",
  "type": "module",
  "dependencies": {
    "ggwave": "0.4.0"
  }
}

tmp/server.ts

import factory from 'ggwave'

const PORT = 8888
const SAMPLE_RATE = 48000

const ggwave = await factory()
const params = ggwave.getDefaultParameters()
params.sampleRateInp = SAMPLE_RATE
params.sampleRateOut = SAMPLE_RATE
params.sampleRate = SAMPLE_RATE
const instance = ggwave.init(params)

// SSE clients waiting for results
const clients = new Set<ReadableStreamDefaultController>()

// Half-duplex: don't process mic while playing
let playing = false

function broadcast(data: object) {
  const msg = `data: ${JSON.stringify(data)}\n\n`
  for (const controller of clients) {
    try { controller.enqueue(new TextEncoder().encode(msg)) }
    catch { clients.delete(controller) }
  }
}

function evaluate(expr: string): string {
  if (!/^[\d+\-*/.() ]+$/.test(expr)) return 'ERR'
  try { return String(new Function(`return (${expr})`)()) }
  catch { return 'ERR' }
}

function decodeBytes(data: Int8Array): string {
  return Array.from(data).map(b => String.fromCharCode(b & 0xff)).join('')
}

async function playResult(text: string) {
  playing = true
  broadcast({ type: 'playing', result: text })

  const waveform = ggwave.encode(
    instance, text,
    ggwave.ProtocolId.GGWAVE_PROTOCOL_AUDIBLE_FAST,
    50
  )
  const rawBytes = new Uint8Array(waveform.length)
  rawBytes.set(new Uint8Array(waveform.buffer, waveform.byteOffset, waveform.length))

  const play = Bun.spawn(
    ['sox', '-t', 'raw', '-r', String(SAMPLE_RATE), '-c', '1', '-b', '32', '-e', 'floating-point', '-', '-d'],
    { stdin: 'pipe', stdout: 'ignore', stderr: 'ignore' }
  )
  play.stdin.write(rawBytes)
  play.stdin.end()
  await play.exited

  await new Promise(r => setTimeout(r, 300))
  playing = false
  broadcast({ type: 'ready' })
  console.log('Played result:', text)
}

function startMicListener() {
  // Uses default device (-d). Explicit CoreAudio device names produce all-zero
  // data due to macOS TCC permission scoping.
  const sox = Bun.spawn(
    ['sox', '-d', '-t', 'raw', '-r', String(SAMPLE_RATE), '-c', '1', '-b', '32', '-e', 'floating-point', '-'],
    { stdout: 'pipe', stderr: 'pipe' }
  )

  new Response(sox.stderr).text().then(err => {
    const deviceLine = err.split('\n').find(l => l.includes('Input File'))
    if (deviceLine) console.log('Mic:', deviceLine.trim())
  })

  const reader = sox.stdout.getReader()
  const bytesPerFrame = params.samplesPerFrame * 4

  let buffer = new Uint8Array(0)

  async function processAudio() {
    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      if (playing) continue

      const newBuf = new Uint8Array(buffer.length + value.length)
      newBuf.set(buffer)
      newBuf.set(value, buffer.length)
      buffer = newBuf

      while (buffer.length >= bytesPerFrame) {
        const frame = buffer.slice(0, bytesPerFrame)
        buffer = buffer.slice(bytesPerFrame)

        const decoded = ggwave.decode(instance, frame)
        if (decoded && decoded.length > 0) {
          const text = decodeBytes(decoded)
          const result = evaluate(text)
          console.log(`${text} = ${result}`)
          broadcast({ type: 'received', expression: text, result })
          await playResult(result)
        }
      }
    }
  }

  processAudio().catch(err => console.error('Mic error:', err))
}

Bun.serve({
  port: PORT,
  idleTimeout: 255,
  async fetch(req) {
    const url = new URL(req.url)

    if (url.pathname === '/ok') return new Response('ok')

    if (url.pathname === '/') {
      return new Response(Bun.file(import.meta.dir + '/index.html'), {
        headers: { 'Content-Type': 'text/html' },
      })
    }

    if (url.pathname === '/ggwave.js') {
      return new Response(Bun.file(import.meta.dir + '/node_modules/ggwave/ggwave.js'), {
        headers: { 'Content-Type': 'application/javascript' },
      })
    }

    if (url.pathname === '/ggwave.wasm') {
      return new Response(Bun.file(import.meta.dir + '/node_modules/ggwave/ggwave.wasm'), {
        headers: { 'Content-Type': 'application/wasm' },
      })
    }

    if (url.pathname === '/events') {
      // cancel() receives the cancellation reason, not the controller, so keep
      // a reference from start() to remove the right client on disconnect.
      let client: ReadableStreamDefaultController
      const stream = new ReadableStream({
        start(controller) {
          client = controller
          clients.add(controller)
          controller.enqueue(new TextEncoder().encode(`data: ${JSON.stringify({ type: 'ready' })}\n\n`))
        },
        cancel() { clients.delete(client) },
      })
      return new Response(stream, {
        headers: { 'Content-Type': 'text/event-stream', 'Cache-Control': 'no-cache', Connection: 'keep-alive' },
      })
    }

    return new Response('not found', { status: 404 })
  },
})

console.log(`Listening on http://localhost:${PORT}`)
startMicListener()