# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

ReefVM is a stack-based bytecode virtual machine for the Shrimp programming language. It implements a complete VM with closures, tail call optimization, exception handling, variadic functions, named parameters, and Ruby-style iterators with break/continue.

**Essential reading**: Before making changes, read README.md, SPEC.md, and GUIDE.md to understand the VM architecture, instruction set, and compiler patterns.

## Development Commands

### Running Files
```bash
bun <file.ts>                 # Run TypeScript files directly
bun examples/native.ts        # Run example
```

### Testing
```bash
bun test                      # Run all tests
bun test <file>               # Run specific test file
bun test --watch              # Watch mode
```

### Building
No build step required - Bun runs TypeScript directly.

## Architecture

### Core Components

**VM Execution Model** (src/vm.ts):
- Stack-based execution with program counter (PC)
- Call stack for function frames
- Exception handler stack for try/catch/finally
- Lexical scope chain with parent references
- Native function registry for TypeScript interop

**Key subsystems**:
- **bytecode.ts**: Compiler that converts both string and array formats to executable bytecode. Handles label resolution, constant pool management, and function definition parsing. The `toBytecode()` function accepts either a string (human-readable) or typed array format (programmatic).
- **value.ts**: Tagged union Value type system with type coercion functions (toNumber, toString, isTrue, isEqual)
- **scope.ts**: Linked scope chain for variable resolution with lexical scoping
- **frame.ts**: Call frame tracking for function calls and break targets
- **exception.ts**: Exception handler records for try/catch/finally blocks
- **validator.ts**: Bytecode validation to catch common errors before execution
- **opcode.ts**: OpCode enum defining all VM instructions

### Critical Design Decisions

**Relative jumps**: All JUMP instructions use PC-relative offsets (not absolute addresses), making bytecode position-independent. PUSH_TRY/PUSH_FINALLY use absolute addresses.

**Truthiness semantics**: Only `null` and `false` are falsy. Unlike JavaScript, `0`, `""`, empty arrays, and empty dicts are truthy.

**No AND/OR opcodes**: Short-circuit logical operations are implemented at the compiler level using JUMP patterns with DUP.

**Tail call optimization**: TAIL_CALL reuses the current call frame instead of pushing a new one, enabling unbounded recursion.

**Break semantics**: CALL marks frames as break targets. BREAK unwinds the call stack to the most recent break target, enabling Ruby-style iterator patterns.

**Exception handling**: THROW jumps to finally (if present) or catch. The VM does NOT auto-jump to finally on successful try completion - compilers must explicitly generate JUMPs to finally blocks.

**Parameter binding priority**: Named args bind to fixed params first. Unmatched named args go to `@named` dict parameter. Fixed params bind in order: named arg > positional arg > default > null.

**Native function calling**: CALL_NATIVE consumes the entire stack as arguments (different from CALL which pops specific argument counts).

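
The truthiness rule above is easy to get wrong when coming from JavaScript, so here is a minimal sketch of it. The `SketchValue` union is a simplified stand-in and not the actual definition in src/value.ts, and `isTruthy` is a hypothetical name (the real helper is `isTrue`); only the `{ type, value }` shape is taken from the examples in this file.

```typescript
// Simplified, hypothetical model of a tagged-union value.
// The real Value type lives in src/value.ts and may differ.
type SketchValue =
  | { type: "null"; value: null }
  | { type: "boolean"; value: boolean }
  | { type: "number"; value: number }
  | { type: "string"; value: string }
  | { type: "array"; value: SketchValue[] }

// Only null and false are falsy; every other value is truthy,
// including 0, "" and empty arrays.
function isTruthy(v: SketchValue): boolean {
  if (v.type === "null") return false
  if (v.type === "boolean") return v.value
  return true
}

const zero: SketchValue = { type: "number", value: 0 }
const emptyString: SketchValue = { type: "string", value: "" }
const no: SketchValue = { type: "boolean", value: false }

console.log(isTruthy(zero), isTruthy(emptyString), isTruthy(no)) // true true false
```

A compiler targeting the VM should therefore not rely on `0` or `""` to terminate a loop or fail a condition.
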
## Testing Strategy

Tests are organized by feature area:
- **basic.test.ts**: Stack ops, arithmetic, comparisons, variables, control flow
- **functions.test.ts**: Function creation, calls, closures, defaults, variadic, named args
- **tail-call.test.ts**: Tail call optimization and unbounded recursion
- **exceptions.test.ts**: Try/catch/finally, exception unwinding, nested handlers
- **native.test.ts**: Native function interop (sync and async)
- **bytecode.test.ts**: Bytecode string parser, label resolution, constants
- **programmatic.test.ts**: Array format API, typed tuples, labels, functions
- **validator.test.ts**: Bytecode validation rules
- **examples.test.ts**: Integration tests for example programs

When adding features:
1. Add unit tests for the specific opcode/feature
2. Add integration tests showing real-world usage
3. Update SPEC.md with formal specification
4. Update GUIDE.md with compiler patterns
5. Consider adding an example to examples/

## Common Patterns

### Writing Bytecode Tests

ReefVM supports two bytecode formats: string and array.

**String format** (human-readable):
```typescript
import { toBytecode, run } from "#reef"

const bytecode = toBytecode(`
  PUSH 42
  STORE x
  LOAD x
  HALT
`)

const result = await run(bytecode)
// result is { type: 'number', value: 42 }
```

**Array format** (programmatic, type-safe):
```typescript
import { toBytecode, run } from "#reef"

const bytecode = toBytecode([
  ["PUSH", 42],
  ["STORE", "x"],
  ["LOAD", "x"],
  ["HALT"]
])

const result = await run(bytecode)
// result is { type: 'number', value: 42 }
```

Array format features:
- Typed tuples for compile-time type checking
- Labels defined as `[".label:"]` (single-element arrays with colon suffix)
- Label references as strings: `["JUMP", ".label"]` (no colon in references)
- Function params as string arrays: `["MAKE_FUNCTION", ["x", "y=10"], ".body"]`
- See `tests/programmatic.test.ts` and `examples/programmatic.ts` for examples

### Native Function Registration

```typescript
const vm = new VM(bytecode)

vm.registerFunction('functionName', (...args: Value[]): Value => {
  // Implementation
  return toValue(result)
})

await vm.run()
```

### Label Usage (Preferred)

Use labels instead of numeric offsets for readability:
```
JUMP .skip
PUSH 42
HALT
.skip:
PUSH 99
HALT
```

## TypeScript Configuration

- Import alias: `#reef` maps to `./src/index.ts`
- Module system: ES modules (`"type": "module"` in package.json)
- Bun automatically handles TypeScript compilation

## Bun-Specific Notes

- Use `bun` instead of `node`, `npm`, `pnpm`, or `vite`
- No need for dotenv - Bun loads .env automatically
- Prefer Bun APIs over Node.js equivalents when available
- See .cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc for detailed Bun usage

## Adding a New OpCode

When adding a new instruction to ReefVM, you must update multiple files in a specific order. Follow this checklist:

### 1. Define the OpCode (src/opcode.ts)

Add the new opcode to the `OpCode` enum with comprehensive documentation:

```typescript
export enum OpCode {
  // ... existing opcodes

  MY_NEW_OP,  // operand: <type> | stack: [inputs] → [outputs]
              // Description of what it does
              // Any important behavioral notes
}
```

### 2. Implement VM Execution (src/vm.ts)

Add a case to the `execute()` method's switch statement:

```typescript
async execute(instruction: Instruction) {
  switch (instruction.op) {
    // ... existing cases

    case OpCode.MY_NEW_OP:
      // Implementation
      // - Pop values from this.stack as needed
      // - Perform the operation
      // - Push results to this.stack
      // - Throw errors for invalid operations
      // - Use await for async operations
      break
  }
}
```

Common helper methods:
- `this.binaryOp((a, b) => ...)` - For binary arithmetic/comparison
- `toNumber(value)`, `toString(value)`, `isTrue(value)`, `isEqual(a, b)` - Type coercion
- `this.scope.get(name)`, `this.scope.set(name, value)` - Variable access

### 3. Update Validator (src/validator.ts)

Add the opcode to the appropriate set:

```typescript
// If your opcode requires an operand:
const OPCODES_WITH_OPERANDS = new Set([
  // ... existing
  OpCode.MY_NEW_OP,
])

// If your opcode takes no operand:
const OPCODES_WITHOUT_OPERANDS = new Set([
  // ... existing
  OpCode.MY_NEW_OP,
])
```

If your opcode has complex operand validation, add a specific check in the validation loop around line 154.

### 4. Update Array API (src/bytecode.ts)

Add your instruction to the `InstructionTuple` type:

```typescript
type InstructionTuple =
  // ... existing types
  | ["MY_NEW_OP"]                    // No operand
  | ["MY_NEW_OP", string]            // String operand
  | ["MY_NEW_OP", number]            // Number operand
  | ["MY_NEW_OP", string, number]    // Multiple operands
```

If your opcode has special operand handling, add a case in `toBytecodeFromArray()` around line 241.

### 5. Write Tests (REQUIRED)

Create tests in the appropriate test file:

```typescript
// tests/basic.test.ts, tests/functions.test.ts, etc.
import { test, expect } from "bun:test"
import { toBytecode, run } from "#reef"

test("MY_NEW_OP description", async () => {
  const bytecode = toBytecode([
    // Setup
    ["PUSH", 42],
    ["MY_NEW_OP"],
    ["HALT"]
  ])

  const result = await run(bytecode)
  expect(result).toEqual({ type: "number", value: 42 })
})

// Test edge cases
test("MY_NEW_OP with invalid input", async () => {
  // Test error conditions: build bytecode that violates the opcode's
  // preconditions (placeholder: MY_NEW_OP run on an empty stack)
  const bytecode = toBytecode([
    ["MY_NEW_OP"],
    ["HALT"]
  ])

  await expect(run(bytecode)).rejects.toThrow()
})
```

**ALWAYS write tests.** Test both success cases and error conditions. Add integration tests showing real-world usage.

### 6. Document Specification (SPEC.md)

Add a formal specification entry:

```markdown
#### MY_NEW_OP
**Operand**: `<type>`
**Stack**: `[input] → [output]`

Description of what the instruction does.

**Behavior**:
- Specific behavior point 1
- Specific behavior point 2

**Errors**:
- Error condition 1
- Error condition 2
```

### 7. Update Compiler Guide (GUIDE.md)

If your opcode introduces new patterns, add examples to GUIDE.md:

```markdown
### New Pattern Name

\```
PUSH value
MY_NEW_OP
STORE result
\```

Description of the pattern and when to use it.
```

### 8. Add Examples (Optional)

If your opcode enables new functionality, add an example to `examples/`:

```typescript
// examples/my_feature.reef or examples/my_feature.ts
const example = toBytecode([
  // Demonstrate the new opcode
])
```

### Checklist Summary

When adding an opcode, update in this order:

- [ ] `src/opcode.ts` - Add enum value with docs
- [ ] `src/vm.ts` - Implement execution logic
- [ ] `src/validator.ts` - Add to operand requirement set
- [ ] `src/bytecode.ts` - Add to InstructionTuple type
- [ ] `tests/*.test.ts` - Write comprehensive tests (**REQUIRED**)
- [ ] `SPEC.md` - Document formal specification
- [ ] `GUIDE.md` - Add compiler patterns (if applicable)
- [ ] `examples/` - Add example code (if applicable)

Run `bun test` to verify all tests pass before committing.

## Common Gotchas

**Jump offsets**: JUMP/JUMP_IF_FALSE/JUMP_IF_TRUE use relative offsets from the next instruction (PC + 1).
PUSH_TRY/PUSH_FINALLY use absolute instruction indices.

**Stack operations**: Most binary operations pop in reverse order (second operand is popped first, then first operand).

**MAKE_ARRAY operand**: Specifies count, not a stack index. `MAKE_ARRAY #3` pops 3 items.

**CALL_NATIVE stack behavior**: Unlike CALL, it consumes all stack values as arguments and clears the stack.

**Finally blocks**: The compiler must generate explicit JUMPs to finally blocks for successful try/catch completion. The VM only auto-jumps to finally on THROW.

**Variable scoping**: STORE updates existing variables in parent scopes or creates in current scope. It does NOT shadow by default.

**Identifiers**: Variable and parameter names support Unicode and emoji! Valid: `💎`, `🌟`, `変数`, `counter`. Invalid: cannot start with digits or special prefixes (`.`, `#`, `@`, `...`), cannot contain whitespace or syntax characters.
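
The jump-offset gotcha at the top of this section can be checked with a small calculation. This is a sketch only: `relativeOffset` and `resolveLabels` are hypothetical helpers, not functions from src/bytecode.ts, and the program is a plain string array rather than real bytecode. It reuses the `.skip` program from the label example earlier in this file.

```typescript
// Hypothetical helper: the offset a JUMP at index `pc` must carry to land
// on instruction index `target`. Offsets are relative to PC + 1.
function relativeOffset(pc: number, target: number): number {
  return target - (pc + 1)
}

// Hypothetical label pass: drop ".name:" lines and record, for each label,
// the index of the instruction that follows it.
function resolveLabels(lines: string[]): { code: string[]; labels: Map<string, number> } {
  const code: string[] = []
  const labels = new Map<string, number>()
  for (const line of lines) {
    if (line.startsWith(".") && line.endsWith(":")) {
      labels.set(line.slice(0, -1), code.length)
    } else {
      code.push(line)
    }
  }
  return { code, labels }
}

const { code, labels } = resolveLabels([
  "JUMP .skip",
  "PUSH 42",
  "HALT",
  ".skip:",
  "PUSH 99",
  "HALT",
])

const skipTarget = labels.get(".skip")! // index 3, the PUSH 99 instruction
console.log(code.length, relativeOffset(0, skipTarget)) // 5 2
```

The JUMP at index 0 carries offset 2, not 3, because the offset is counted from the instruction after the jump. A backward jump computed the same way comes out negative.
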