CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
ReefVM is a stack-based bytecode virtual machine for the Shrimp programming language. It implements a complete VM with closures, tail call optimization, exception handling, variadic functions, named parameters, and Ruby-style iterators with break/continue.
Essential reading: Before making changes, read README.md, SPEC.md, and GUIDE.md to understand the VM architecture, instruction set, and compiler patterns.
Development Commands
Running Files
bun <file.ts> # Run TypeScript files directly
bun examples/native.ts # Run example
Testing
bun test # Run all tests
bun test <file> # Run specific test file
bun test --watch # Watch mode
Tools
./bin/reef <file.reef> # Execute bytecode file
./bin/validate <file.reef> # Validate bytecode
./bin/debug <file.reef> # Step-by-step debugger
./bin/repl # Interactive REPL
Building
No build step required - Bun runs TypeScript directly.
Architecture
Core Components
VM Execution Model (src/vm.ts):
- Stack-based execution with program counter (PC)
- Call stack for function frames
- Exception handler stack for try/catch/finally
- Lexical scope chain with parent references (includes native functions)
Key subsystems:
- bytecode.ts: Compiler that converts both string and array formats to executable bytecode. Handles label resolution, constant pool management, and function definition parsing. The toBytecode() function accepts either a string (human-readable) or typed array format (programmatic).
- value.ts: Tagged union Value type system with type coercion functions (toNumber, toString, isTrue, isEqual)
- scope.ts: Linked scope chain for variable resolution with lexical scoping
- frame.ts: Call frame tracking for function calls and break targets
- exception.ts: Exception handler records for try/catch/finally blocks
- validator.ts: Bytecode validation to catch common errors before execution
- opcode.ts: OpCode enum defining all VM instructions
Critical Design Decisions
Label-based jumps: All JUMP instructions (JUMP, JUMP_IF_FALSE, JUMP_IF_TRUE) require label operands (.label), not numeric offsets. Labels are resolved to PC-relative offsets during compilation, making bytecode position-independent. PUSH_TRY/PUSH_FINALLY use absolute addresses and can accept either labels or numeric offsets.
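The two-pass idea behind label resolution can be sketched as follows. This is only an illustration: the `Instr` shape and the `resolveLabels` name are hypothetical, and the offset convention (relative to the instruction after the jump) is an assumption; the real behavior is defined in src/bytecode.ts and SPEC.md.

```typescript
// Hypothetical instruction shape for illustration; not ReefVM's internal format.
type Instr = [op: string, operand?: string | number]

const JUMP_OPS = new Set(["JUMP", "JUMP_IF_FALSE", "JUMP_IF_TRUE"])

// Pass 1 records each label's instruction index; pass 2 rewrites jump operands.
function resolveLabels(lines: Instr[]): Instr[] {
  const labels = new Map<string, number>()
  const code: Instr[] = []

  for (const line of lines) {
    const op = line[0]
    if (op.startsWith(".") && op.endsWith(":")) {
      // A label points at the next real instruction and emits nothing itself.
      labels.set(op.slice(0, -1), code.length)
    } else {
      code.push(line)
    }
  }

  return code.map(([op, operand], pc): Instr => {
    if (!JUMP_OPS.has(op)) return operand === undefined ? [op] : [op, operand]
    const target = labels.get(String(operand))
    if (target === undefined) throw new Error(`Undefined label: ${operand}`)
    // ASSUMPTION: the offset is measured from the instruction after the jump.
    return [op, target - (pc + 1)]
  })
}

const resolved = resolveLabels([
  ["JUMP", ".skip"],
  ["PUSH", 42],
  ["HALT"],
  [".skip:"],
  ["PUSH", 99],
  ["HALT"],
])
console.log(resolved[0]) // JUMP with offset 2 (skips PUSH 42 and HALT)
```

Because the operand is a distance rather than an address, the same instruction sequence stays valid if it is moved as a whole.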
Truthiness semantics: Only null and false are falsy. Unlike JavaScript, 0, "", empty arrays, and empty dicts are truthy.
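As a rough sketch of this rule, assuming a simplified tagged-union `Value` (the real type lives in src/value.ts and has more variants; `isTruthy` is a hypothetical stand-in for the real isTrue helper):

```typescript
// Hypothetical, simplified Value union for illustration only.
type Value =
  | { type: "null"; value: null }
  | { type: "boolean"; value: boolean }
  | { type: "number"; value: number }
  | { type: "string"; value: string }
  | { type: "array"; value: Value[] }

// Only null and false are falsy; every other value is truthy.
function isTruthy(v: Value): boolean {
  if (v.type === "null") return false
  if (v.type === "boolean") return v.value
  return true
}

console.log(isTruthy({ type: "number", value: 0 }))      // true (unlike JavaScript)
console.log(isTruthy({ type: "string", value: "" }))     // true (unlike JavaScript)
console.log(isTruthy({ type: "array", value: [] }))      // true
console.log(isTruthy({ type: "boolean", value: false })) // false
console.log(isTruthy({ type: "null", value: null }))     // false
```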
No AND/OR opcodes: Short-circuit logical operations are implemented at the compiler level using JUMP patterns with DUP.
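A compiler might emit the short-circuit pattern along these lines. This sketch produces array-format tuples and rests on two assumptions that should be checked against SPEC.md and GUIDE.md: that a POP opcode exists, and that JUMP_IF_FALSE / JUMP_IF_TRUE consume the value they test. The `emitAnd`, `emitOr`, and `freshLabel` helpers are hypothetical.

```typescript
// Array-format tuples, typed loosely here for brevity.
type Instr = [string, ...(string | number)[]]

let labelCounter = 0
const freshLabel = (prefix: string): string => `.${prefix}_${labelCounter++}`

// `left` and `right` are instruction sequences that each leave one value on the stack.
function emitAnd(left: Instr[], right: Instr[]): Instr[] {
  const end = freshLabel("and_end")
  return [
    ...left,
    ["DUP"],                // keep a copy of left as the potential result
    ["JUMP_IF_FALSE", end], // ASSUMPTION: pops the copy; a falsy left short-circuits
    ["POP"],                // left was truthy, so discard it
    ...right,               // the result is whatever right evaluates to
    [`${end}:`],
  ]
}

// OR is the mirror image: a truthy left short-circuits.
function emitOr(left: Instr[], right: Instr[]): Instr[] {
  const end = freshLabel("or_end")
  return [...left, ["DUP"], ["JUMP_IF_TRUE", end], ["POP"], ...right, [`${end}:`]]
}

const andCode = emitAnd([["LOAD", "a"]], [["LOAD", "b"]])
console.log(andCode.map(i => i.join(" ")).join("\n"))
```

In both cases exactly one value remains on the stack at the end label: the left operand when it short-circuits, otherwise the right operand.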
Tail call optimization: TAIL_CALL reuses the current call frame instead of pushing a new one, so tail-recursive functions can recurse without growing the call stack.
Break semantics: CALL marks frames as break targets. BREAK unwinds the call stack to the most recent break target, enabling Ruby-style iterator patterns.
Exception handling: THROW jumps to finally (if present) or catch. The VM does NOT auto-jump to finally on successful try completion - compilers must explicitly generate JUMPs to finally blocks.
Parameter binding priority: Named args bind to fixed params first. Unmatched named args go to @named dict parameter. Fixed params bind in order: named arg > positional arg > default > null.
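The priority order can be sketched like this. The `Param` shape and `bindParams` name are hypothetical, and the sketch assumes that a fixed param bound by name does not consume a positional argument; variadic parameters are left out.

```typescript
// Hypothetical shapes for illustration only.
type Param = { name: string; default?: unknown }
type Args = Record<string, unknown>

const has = (obj: Args, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key)

function bindParams(
  params: Param[],
  positional: unknown[],
  named: Args,
): { bound: Args; extraNamed: Args } {
  const bound: Args = {}
  const extraNamed: Args = { ...named }
  let nextPositional = 0

  for (const param of params) {
    if (has(extraNamed, param.name)) {
      bound[param.name] = extraNamed[param.name]       // 1. named arg
      delete extraNamed[param.name]
    } else if (nextPositional < positional.length) {
      bound[param.name] = positional[nextPositional++] // 2. positional arg
    } else if (param.default !== undefined) {
      bound[param.name] = param.default                // 3. default
    } else {
      bound[param.name] = null                         // 4. null
    }
  }
  // Whatever is left over would populate an @named dict parameter.
  return { bound, extraNamed }
}

const params: Param[] = [{ name: "x" }, { name: "y", default: 10 }]
console.log(bindParams(params, [5], {}).bound)        // x = 5, y = 10
console.log(bindParams(params, [5], { y: 20 }).bound) // x = 5, y = 20
console.log(bindParams(params, [], { x: 10, z: 1 }))  // x = 10, y = 10, extraNamed z = 1
```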
Native function calling: Native functions are stored in scope and called via LOAD + CALL, using the same calling convention as Reef functions. Named arguments are supported by extracting parameter names from the function signature at call time.
Testing Strategy
Tests are organized by feature area:
- opcodes.test.ts: Stack ops, arithmetic, comparisons, variables, control flow
- functions.test.ts: Function creation, calls, closures, defaults, variadic, named args
- tail-call.test.ts: Tail call optimization and unbounded recursion
- exceptions.test.ts: Try/catch/finally, exception unwinding, nested handlers
- native.test.ts: Native function interop (sync and async)
- functions-parameter.test.ts: Convenience parameter for passing functions to run() and VM
- bytecode.test.ts: Bytecode string parser, label resolution, constants
- programmatic.test.ts: Array format API, typed tuples, labels, functions
- validator.test.ts: Bytecode validation rules
- unicode.test.ts: Unicode and emoji identifiers
- regex.test.ts: RegExp support
- examples.test.ts: Integration tests for example programs
When adding features:
- Add unit tests for the specific opcode/feature
- Add integration tests showing real-world usage
- Update SPEC.md with formal specification
- Update GUIDE.md with compiler patterns
- Consider adding an example to examples/
Common Patterns
Writing Bytecode Tests
ReefVM supports two bytecode formats: string and array.
String format (human-readable):
import { toBytecode, run } from "#reef"
const bytecode = toBytecode(`
PUSH 42
STORE x
LOAD x
HALT
`)
const result = await run(bytecode)
// result is { type: 'number', value: 42 }
Array format (programmatic, type-safe):
import { toBytecode, run } from "#reef"
const bytecode = toBytecode([
  ["PUSH", 42],
  ["STORE", "x"],
  ["LOAD", "x"],
  ["HALT"]
])
const result = await run(bytecode)
// result is { type: 'number', value: 42 }
Array format features:
- Typed tuples for compile-time type checking
- Labels defined as [".label:"] (single-element arrays with colon suffix)
- Label references as strings: ["JUMP", ".label"] (no colon in references)
- Function params as string arrays: ["MAKE_FUNCTION", ["x", "y=10"], ".body"]
- See tests/programmatic.test.ts and examples/programmatic.ts for examples
Native Function Registration and Global Values
Option 1: Pass to run() or VM constructor (convenience)
const result = await run(bytecode, {
  add: (a: number, b: number) => a + b,
  greet: (name: string) => `Hello, ${name}!`,
  pi: 3.14159,
  config: { debug: true, port: 8080 }
})
// Or with VM constructor
const vm = new VM(bytecode, { add, greet, pi, config })
Option 2: Set values with vm.set() (manual)
const vm = new VM(bytecode)
// Set functions (auto-wrapped to native functions)
vm.set('add', (a: number, b: number) => a + b)
// Set any other values (auto-converted to ReefVM Values)
vm.set('pi', 3.14159)
vm.set('config', { debug: true, port: 8080 })
await vm.run()
Option 3: Set Value-based functions with vm.setValueFunction() (advanced)
For functions that work directly with ReefVM Value types:
const vm = new VM(bytecode)
// Set Value-based function (no wrapping, works directly with Values)
vm.setValueFunction('customOp', (a: Value, b: Value): Value => {
  return toValue(toNumber(a) + toNumber(b))
})
await vm.run()
Auto-wrapping handles:
- Functions: wrapped as native functions with Value ↔ native type conversion
- Sync and async functions
- Arrays, objects, primitives, null, RegExp
- All values converted via toValue()
Calling Functions from TypeScript
Use vm.call() to invoke Reef or native functions from TypeScript:
const bytecode = toBytecode(`
MAKE_FUNCTION (x y=10) .add
STORE add
HALT
.add:
LOAD x
LOAD y
ADD
RETURN
`)
const vm = new VM(bytecode, {
  log: (msg: string) => console.log(msg) // Native function
})
await vm.run()
// Call Reef function with positional arguments
const result1 = await vm.call('add', 5, 3) // → 8
// Call Reef function with named arguments (pass final object)
const result2 = await vm.call('add', 5, { y: 20 }) // → 25
// Call Reef function with all named arguments
const result3 = await vm.call('add', { x: 10, y: 15 }) // → 25
// Call native function
await vm.call('log', 'Hello!')
How it works:
- Looks up function (Reef or native) in VM scope
- For Reef functions: converts to callable JavaScript function using fnFromValue
- For native functions: calls directly
- Automatically converts arguments to ReefVM Values
- Converts result back to JavaScript types
Label Usage (Required for JUMP instructions)
All JUMP instructions must use labels:
JUMP .skip
PUSH 42
HALT
.skip:
PUSH 99
HALT
Function Definition Patterns
When defining functions, you MUST prevent the PC from falling through into function bodies. Two patterns:
Pattern 1: JUMP over function bodies (Recommended)
MAKE_FUNCTION (params) .body
STORE function_name
JUMP .end ; Skip over function body
.body:
<function code>
RETURN
.end:
<continue with program>
Pattern 2: Function bodies after HALT
MAKE_FUNCTION (params) .body
STORE function_name
<use the function>
HALT ; Stop before function bodies
.body:
<function code>
RETURN
Pattern 1 is required for:
- Defining multiple functions before using them
- REPL mode
- Any case where execution continues after defining a function
Pattern 2 only works if you HALT before reaching function bodies.
REPL Mode (Incremental Execution)
For building REPLs (like the Shrimp REPL), use vm.continue() and vm.appendBytecode():
const vm = new VM(toBytecode([]), natives)
await vm.run() // Initialize (empty bytecode)
// User enters: x = 42
const line1 = compileLine("x = 42") // No HALT!
vm.appendBytecode(line1)
await vm.continue() // Execute only line 1
// User enters: x + 10
const line2 = compileLine("x + 10") // No HALT!
vm.appendBytecode(line2)
await vm.continue() // Execute only line 2, result is 52
Key points:
- vm.run() resets PC to 0 (re-executes everything) - use for initial setup only
- vm.continue() resumes from current PC (executes only new bytecode)
- vm.appendBytecode(bytecode) properly handles constant index remapping
- Don't use HALT in REPL lines - let VM stop naturally
- Scope and variables persist across all lines
- Side effects only run once
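To illustrate why constant index remapping matters when appending, here is a minimal sketch. It does not reflect ReefVM's actual internals: the `Bytecode` shape, the `appendChunk` name, and the idea that PUSH operands are constant-pool indices are all assumptions made for the example.

```typescript
// Hypothetical bytecode shape; the real layout is defined in src/bytecode.ts.
type Instr = { op: string; operand?: number }
type Bytecode = { instructions: Instr[]; constants: unknown[] }

// ASSUMPTION: these opcodes carry a constant-pool index as their operand.
const CONSTANT_OPS = new Set(["PUSH"])

// Append `chunk` to `base`, shifting constant indices past base's existing pool.
function appendChunk(base: Bytecode, chunk: Bytecode): void {
  const shift = base.constants.length
  base.constants.push(...chunk.constants)
  for (const instr of chunk.instructions) {
    const needsShift = CONSTANT_OPS.has(instr.op) && instr.operand !== undefined
    base.instructions.push(
      needsShift ? { ...instr, operand: (instr.operand as number) + shift } : { ...instr },
    )
  }
}

const program: Bytecode = {
  instructions: [{ op: "PUSH", operand: 0 }],
  constants: [42],
}
const line2: Bytecode = {
  instructions: [{ op: "PUSH", operand: 0 }, { op: "ADD" }],
  constants: [10],
}
appendChunk(program, line2)
console.log(program.constants)       // 42 and 10 now share one pool
console.log(program.instructions[1]) // PUSH now refers to index 1, i.e. 10
```

Without the shift, the appended PUSH would still point at index 0 and load 42 instead of 10.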
TypeScript Configuration
- Import alias: #reef maps to ./src/index.ts
- Module system: ES modules ("type": "module" in package.json)
- Bun automatically handles TypeScript compilation
Bun-Specific Notes
- Use bun instead of node, npm, pnpm, or vite
- No need for dotenv - Bun loads .env automatically
- Prefer Bun APIs over Node.js equivalents when available
- See .cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc for detailed Bun usage
Adding a New OpCode
When adding a new instruction to ReefVM, you must update multiple files in a specific order. Follow this checklist:
1. Define the OpCode (src/opcode.ts)
Add the new opcode to the OpCode enum with comprehensive documentation:
export enum OpCode {
  // ... existing opcodes
  MY_NEW_OP, // operand: <type> | stack: [inputs] → [outputs]
             // Description of what it does
             // Any important behavioral notes
}
2. Implement VM Execution (src/vm.ts)
Add a case to the execute() method's switch statement:
async execute(instruction: Instruction) {
  switch (instruction.op) {
    // ... existing cases
    case OpCode.MY_NEW_OP:
      // Implementation
      // - Pop values from this.stack as needed
      // - Perform the operation
      // - Push results to this.stack
      // - Throw errors for invalid operations
      // - Use await for async operations
      break
  }
}
Common helper methods:
- this.binaryOp((a, b) => ...) - For binary arithmetic/comparison
- toNumber(value), toString(value), isTrue(value), isEqual(a, b) - Type coercion
- this.scope.get(name), this.scope.set(name, value) - Variable access
3. Update Validator (src/validator.ts)
Add the opcode to the appropriate set:
// If your opcode requires an operand:
const OPCODES_WITH_OPERANDS = new Set([
  // ... existing
  OpCode.MY_NEW_OP,
])
// If your opcode takes no operand:
const OPCODES_WITHOUT_OPERANDS = new Set([
  // ... existing
  OpCode.MY_NEW_OP,
])
If your opcode has complex operand validation, add a specific check in the validation loop around line 154.
4. Update Array API (src/bytecode.ts)
Add your instruction to the InstructionTuple type:
type InstructionTuple =
  // ... existing types
  | ["MY_NEW_OP"] // No operand
  | ["MY_NEW_OP", string] // String operand
  | ["MY_NEW_OP", number] // Number operand
  | ["MY_NEW_OP", string, number] // Multiple operands
If your opcode has special operand handling, add a case in toBytecodeFromArray() around line 241.
5. Write Tests (REQUIRED)
Create tests in the appropriate test file:
// tests/basic.test.ts, tests/functions.test.ts, etc.
test("MY_NEW_OP description", async () => {
  const bytecode = toBytecode([
    // Setup
    ["PUSH", 42],
    ["MY_NEW_OP"],
    ["HALT"]
  ])
  const result = await run(bytecode)
  expect(result).toEqual({ type: "number", value: 42 })
})
// Test edge cases
test("MY_NEW_OP with invalid input", async () => {
  // Placeholder: build bytecode that should trigger the error condition
  const bytecode = toBytecode([["PUSH", "bad input"], ["MY_NEW_OP"], ["HALT"]])
  await expect(run(bytecode)).rejects.toThrow()
})
ALWAYS write tests. Test both success cases and error conditions. Add integration tests showing real-world usage.
6. Document Specification (SPEC.md)
Add a formal specification entry:
#### MY_NEW_OP
**Operand**: `<type>`
**Stack**: `[input] → [output]`
Description of what the instruction does.
**Behavior**:
- Specific behavior point 1
- Specific behavior point 2
**Errors**:
- Error condition 1
- Error condition 2
7. Update Compiler Guide (GUIDE.md)
If your opcode introduces new patterns, add examples to GUIDE.md:
### New Pattern Name
\```
PUSH value
MY_NEW_OP
STORE result
\```
Description of the pattern and when to use it.
8. Add Examples (Optional)
If your opcode enables new functionality, add an example to examples/:
// examples/my_feature.reef or examples/my_feature.ts
const example = toBytecode([
  // Demonstrate the new opcode
])
Checklist Summary
When adding an opcode, update in this order:
1. src/opcode.ts - Add enum value with docs
2. src/vm.ts - Implement execution logic
3. src/validator.ts - Add to operand requirement set
4. src/bytecode.ts - Add to InstructionTuple type
5. tests/*.test.ts - Write comprehensive tests (REQUIRED)
6. SPEC.md - Document formal specification
7. GUIDE.md - Add compiler patterns (if applicable)
8. examples/ - Add example code (if applicable)
Run bun test to verify all tests pass before committing.
Common Gotchas
Label requirements: JUMP/JUMP_IF_FALSE/JUMP_IF_TRUE require label operands (.label), not numeric offsets. The bytecode compiler resolves labels to PC-relative offsets internally. PUSH_TRY/PUSH_FINALLY can use either labels or absolute instruction indices (#N).
Stack operations: Most binary operations pop in reverse order (second operand is popped first, then first operand).
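A small sketch of why the pop order matters for non-commutative operations. It uses a plain number stack rather than ReefVM Values, and this standalone `binaryOp` is a simplified stand-in for the VM helper of the same name, not its real signature.

```typescript
// Simplified stand-in for the VM's binary operation helper.
function binaryOp(stack: number[], fn: (a: number, b: number) => number): void {
  const b = stack.pop() // second operand was pushed last, so it comes off first
  const a = stack.pop() // first operand comes off second
  if (a === undefined || b === undefined) throw new Error("Stack underflow")
  stack.push(fn(a, b))
}

const stack: number[] = [10, 3]  // equivalent to PUSH 10, PUSH 3
binaryOp(stack, (a, b) => a - b) // computes 10 - 3
console.log(stack[0])            // 7, not -7
```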
MAKE_ARRAY operand: Specifies count, not a stack index. MAKE_ARRAY #3 pops 3 items.
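The count semantics can be sketched as below, assuming the resulting array keeps items in the order they were pushed (check SPEC.md for the actual ordering). The `makeArray` helper is hypothetical and returns the items instead of pushing a Value back onto the stack.

```typescript
// Remove the top `count` items from the stack and return them in push order.
function makeArray<T>(stack: T[], count: number): T[] {
  if (count < 0 || count > stack.length) throw new Error("Stack underflow")
  // splice mutates the stack and returns the removed items in their original order.
  return stack.splice(stack.length - count, count)
}

const stack: number[] = [1, 2, 3, 4] // four values pushed in this order
const items = makeArray(stack, 3)    // like MAKE_ARRAY #3
console.log(items)                   // the top three values: 2, 3, 4
console.log(stack)                   // only 1 remains
```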
Finally blocks: The compiler must generate explicit JUMPs to finally blocks for successful try/catch completion. The VM only auto-jumps to finally on THROW.
Variable scoping: STORE updates existing variables in parent scopes or creates in current scope. It does NOT shadow by default.
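A minimal linked-scope sketch of this update-or-create rule. The class below is hypothetical and works on plain JavaScript values; the real implementation is in src/scope.ts.

```typescript
// Hypothetical scope chain for illustration only.
class Scope {
  private vars = new Map<string, unknown>()
  private parent: Scope | undefined

  constructor(parent?: Scope) {
    this.parent = parent
  }

  get(name: string): unknown {
    if (this.vars.has(name)) return this.vars.get(name)
    return this.parent ? this.parent.get(name) : undefined
  }

  // Update the nearest scope that already defines `name`;
  // otherwise create the variable in this scope.
  set(name: string, value: unknown): void {
    let scope: Scope | undefined = this
    while (scope) {
      if (scope.vars.has(name)) {
        scope.vars.set(name, value)
        return
      }
      scope = scope.parent
    }
    this.vars.set(name, value)
  }

  hasOwn(name: string): boolean {
    return this.vars.has(name)
  }
}

const outer = new Scope()
outer.set("x", 1)
const inner = new Scope(outer)
inner.set("x", 2) // updates outer's x instead of shadowing it
inner.set("y", 3) // y does not exist anywhere, so it is created in inner
console.log(outer.get("x"), inner.hasOwn("x"), outer.get("y")) // 2 false undefined
```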
Identifiers: Variable and parameter names support Unicode and emoji! Valid: 💎, 🌟, 変数, counter. Invalid: cannot start with digits or special prefixes (., #, @, ...), cannot contain whitespace or syntax characters.
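These rules could be approximated with a check like the one below. The exact set of "syntax characters" is not listed here, so the characters in `SYNTAX_CHARS` are assumptions, and `isValidIdentifier` is a hypothetical helper rather than part of ReefVM.

```typescript
// Prefixes reserved for labels (.), immediates (#), and named params (@).
// A leading "..." is already covered by the "." prefix.
const RESERVED_PREFIXES = [".", "#", "@"]

// ASSUMPTION: example syntax characters only; the real list may differ.
const SYNTAX_CHARS = /[;()=:]/

function isValidIdentifier(name: string): boolean {
  if (name.length === 0) return false
  if (/^[0-9]/.test(name)) return false                                // no leading digit
  if (RESERVED_PREFIXES.some(p => name.startsWith(p))) return false    // no special prefix
  if (/\s/.test(name)) return false                                    // no whitespace
  if (SYNTAX_CHARS.test(name)) return false                            // no syntax characters
  return true
}

for (const name of ["💎", "🌟", "変数", "counter", "1up", ".label", "@named", "two words"]) {
  console.log(JSON.stringify(name), isValidIdentifier(name))
}
```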