Compare commits

...

10 Commits

Author SHA1 Message Date
Chris Wanstrath
2855b4fbe3 docs 2025-10-06 10:29:24 -07:00
Chris Wanstrath
eaebe10c42 update claude.md 2025-10-06 10:02:08 -07:00
c848ee0216 validator! 2025-10-06 09:55:30 -07:00
e4443c65df add operands to guide 2025-10-06 09:11:30 -07:00
d057bf4b10 lib exports 2025-10-06 09:11:22 -07:00
000eb7ad92 native example 2025-10-06 09:07:58 -07:00
7d2047f3a6 bit more 2025-10-05 22:40:22 -07:00
c94cc59dea compiler writing guide 2025-10-05 22:34:11 -07:00
078fc37a02 no kwargs 2025-10-05 22:34:07 -07:00
d8e97c0f20 test examples 2025-10-05 22:24:46 -07:00
14 changed files with 1153 additions and 164 deletions

216
CLAUDE.md

@ -1,119 +1,149 @@
---
description: Use Bun instead of Node.js, npm, pnpm, or vite.
globs: "*.ts, *.tsx, *.html, *.css, *.js, *.jsx, package.json"
alwaysApply: false
---
# CLAUDE.md
## Overview
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
This is a stack-based VM for a simple dynamic language called Shrimp.
## Project Overview
Please read README.md, SPEC.md, src/vm.ts, and the examples/ to understand the VM.
ReefVM is a stack-based bytecode virtual machine for the Shrimp programming language. It implements a complete VM with closures, tail call optimization, exception handling, variadic functions, named parameters, and Ruby-style iterators with break/continue.
## Bun
**Essential reading**: Before making changes, read README.md, SPEC.md, and GUIDE.md to understand the VM architecture, instruction set, and compiler patterns.
Default to using Bun instead of Node.js.
## Development Commands
- Use `bun <file>` instead of `node <file>` or `ts-node <file>`
- Use `bun test` instead of `jest` or `vitest`
- Use `bun build <file.html|file.ts|file.css>` instead of `webpack` or `esbuild`
- Use `bun install` instead of `npm install` or `yarn install` or `pnpm install`
- Use `bun run <script>` instead of `npm run <script>` or `yarn run <script>` or `pnpm run <script>`
- Bun automatically loads .env, so don't use dotenv.
## APIs
- `Bun.serve()` supports WebSockets, HTTPS, and routes. Don't use `express`.
- `bun:sqlite` for SQLite. Don't use `better-sqlite3`.
- `Bun.redis` for Redis. Don't use `ioredis`.
- `Bun.sql` for Postgres. Don't use `pg` or `postgres.js`.
- `WebSocket` is built-in. Don't use `ws`.
- Prefer `Bun.file` over `node:fs`'s readFile/writeFile
- Bun.$`ls` instead of execa.
## Testing
Use `bun test` to run tests.
```ts#index.test.ts
import { test, expect } from "bun:test";
test("hello world", () => {
expect(1).toBe(1);
});
### Running Files
```bash
bun <file.ts> # Run TypeScript files directly
bun examples/native.ts # Run example
```
## Frontend
### Testing
```bash
bun test # Run all tests
bun test <file> # Run specific test file
bun test --watch # Watch mode
```
Use HTML imports with `Bun.serve()`. Don't use `vite`. HTML imports fully support React, CSS, Tailwind.
### Building
No build step required - Bun runs TypeScript directly.
Server:
## Architecture
```ts#index.ts
import index from "./index.html"
### Core Components
Bun.serve({
routes: {
"/": index,
"/api/users/:id": {
GET: (req) => {
return new Response(JSON.stringify({ id: req.params.id }));
},
},
},
// optional websocket support
websocket: {
open: (ws) => {
ws.send("Hello, world!");
},
message: (ws, message) => {
ws.send(message);
},
close: (ws) => {
// handle close
}
},
development: {
hmr: true,
console: true,
}
**VM Execution Model** (src/vm.ts):
- Stack-based execution with program counter (PC)
- Call stack for function frames
- Exception handler stack for try/catch/finally
- Lexical scope chain with parent references
- Native function registry for TypeScript interop
**Key subsystems**:
- **bytecode.ts**: Parser that converts human-readable bytecode strings to executable bytecode. Handles label resolution, constant pool management, and function definition parsing.
- **value.ts**: Tagged union Value type system with type coercion functions (toNumber, toString, isTrue, isEqual)
- **scope.ts**: Linked scope chain for variable resolution with lexical scoping
- **frame.ts**: Call frame tracking for function calls and break targets
- **exception.ts**: Exception handler records for try/catch/finally blocks
- **validator.ts**: Bytecode validation to catch common errors before execution
- **opcode.ts**: OpCode enum defining all VM instructions
### Critical Design Decisions
**Relative jumps**: All JUMP instructions use PC-relative offsets (not absolute addresses), making bytecode position-independent. PUSH_TRY/PUSH_FINALLY use absolute addresses.
**Truthiness semantics**: Only `null` and `false` are falsy. Unlike JavaScript, `0`, `""`, empty arrays, and empty dicts are truthy.
**No AND/OR opcodes**: Short-circuit logical operations are implemented at the compiler level using JUMP patterns with DUP.
**Tail call optimization**: TAIL_CALL reuses the current call frame instead of pushing a new one, enabling unbounded recursion.
**Break semantics**: CALL marks frames as break targets. BREAK unwinds the call stack to the most recent break target, enabling Ruby-style iterator patterns.
**Exception handling**: THROW jumps to finally (if present) or catch. The VM does NOT auto-jump to finally on successful try completion - compilers must explicitly generate JUMPs to finally blocks.
**Parameter binding priority**: Named args bind to fixed params first. Unmatched named args go to `@named` dict parameter. Fixed params bind in order: named arg > positional arg > default > null.
**Native function calling**: CALL_NATIVE consumes the entire stack as arguments (different from CALL which pops specific argument counts).
## Testing Strategy
Tests are organized by feature area:
- **basic.test.ts**: Stack ops, arithmetic, comparisons, variables, control flow
- **functions.test.ts**: Function creation, calls, closures, defaults, variadic, named args
- **tail-call.test.ts**: Tail call optimization and unbounded recursion
- **exceptions.test.ts**: Try/catch/finally, exception unwinding, nested handlers
- **native.test.ts**: Native function interop (sync and async)
- **bytecode.test.ts**: Bytecode parser, label resolution, constants
- **validator.test.ts**: Bytecode validation rules
- **examples.test.ts**: Integration tests for example programs
When adding features:
1. Add unit tests for the specific opcode/feature
2. Add integration tests showing real-world usage
3. Update SPEC.md with formal specification
4. Update GUIDE.md with compiler patterns
5. Consider adding an example to examples/
## Common Patterns
### Writing Bytecode Tests
```typescript
import { toBytecode, run } from "#reef"
const bytecode = toBytecode(`
PUSH 42
STORE x
LOAD x
HALT
`)
const result = await run(bytecode)
// result is { type: 'number', value: 42 }
```
### Native Function Registration
```typescript
const vm = new VM(bytecode)
vm.registerFunction('functionName', (...args: Value[]): Value => {
// Implementation
return toValue(result)
})
await vm.run()
```
HTML files can import .tsx, .jsx or .js files directly and Bun's bundler will transpile & bundle automatically. `<link>` tags can point to stylesheets and Bun's CSS bundler will bundle.
```html#index.html
<html>
<body>
<h1>Hello, world!</h1>
<script type="module" src="./frontend.tsx"></script>
</body>
</html>
### Label Usage (Preferred)
Use labels instead of numeric offsets for readability:
```
JUMP .skip
PUSH 42
HALT
.skip:
PUSH 99
HALT
```
With the following `frontend.tsx`:
## TypeScript Configuration
```tsx#frontend.tsx
import React from "react";
- Import alias: `#reef` maps to `./src/index.ts`
- Module system: ES modules (`"type": "module"` in package.json)
- Bun automatically handles TypeScript compilation
// import .css files directly and it works
import './index.css';
## Bun-Specific Notes
import { createRoot } from "react-dom/client";
- Use `bun` instead of `node`, `npm`, `pnpm`, or `vite`
- No need for dotenv - Bun loads .env automatically
- Prefer Bun APIs over Node.js equivalents when available
- See .cursor/rules/use-bun-instead-of-node-vite-npm-pnpm.mdc for detailed Bun usage
const root = createRoot(document.body);
## Common Gotchas
export default function Frontend() {
return <h1>Hello, world!</h1>;
}
**Jump offsets**: JUMP/JUMP_IF_FALSE/JUMP_IF_TRUE use relative offsets from the next instruction (PC + 1). PUSH_TRY/PUSH_FINALLY use absolute instruction indices.
root.render(<Frontend />);
```
**Stack operations**: Most binary operations pop in reverse order (second operand is popped first, then first operand).
Then, run index.ts
**MAKE_ARRAY operand**: Specifies count, not a stack index. `MAKE_ARRAY #3` pops 3 items.
```sh
bun --hot ./index.ts
```
**CALL_NATIVE stack behavior**: Unlike CALL, it consumes all stack values as arguments and clears the stack.
For more information, read the Bun API docs in `node_modules/bun-types/docs/**.md`.
**Finally blocks**: The compiler must generate explicit JUMPs to finally blocks for successful try/catch completion. The VM only auto-jumps to finally on THROW.
**Variable scoping**: STORE updates existing variables in parent scopes or creates in current scope. It does NOT shadow by default.

344
GUIDE.md Normal file

@ -0,0 +1,344 @@
# Reef Compiler Guide
Quick reference for compiling to Reef bytecode.
## Bytecode Syntax
### Instructions
```
OPCODE operand ; comment
```
### Operand Types
**Immediate numbers** (`#N`): Counts or relative offsets
- `MAKE_ARRAY #3` - count of 3 items
- `JUMP #5` - relative offset of 5 instructions (prefer labels)
- `PUSH_TRY #10` - absolute instruction index (prefer labels)
**Labels** (`.name`): Symbolic addresses resolved at parse time
- `.label:` - define label at current position
- `JUMP .loop` - jump to label
- `MAKE_FUNCTION (x) .body` - function body at label
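As a rough illustration of what "resolved at parse time" can mean, here is a minimal two-pass sketch. It is not ReefVM's actual parser: `resolveLabels` and `Instr` are hypothetical names, and the sketch assumes that jump operands are relative to the next instruction (PC + 1) while `PUSH_TRY`/`PUSH_FINALLY` take absolute instruction indices, as described elsewhere in this change.

```typescript
// Hypothetical helper for illustration only; not the real bytecode.ts parser.
type Instr = { op: string; operand: string }

// Opcodes whose label operand becomes an absolute instruction index.
const ABSOLUTE_OPS = new Set(["PUSH_TRY", "PUSH_FINALLY"])

function resolveLabels(source: string): Instr[] {
  const labels = new Map<string, number>()
  const instrs: Instr[] = []

  // Pass 1: strip comments, record the instruction index each label points at.
  // (Simplification: a ';' inside a string constant is not handled.)
  for (const raw of source.split("\n")) {
    const line = (raw.split(";")[0] ?? "").trim()
    if (!line) continue
    if (line.startsWith(".") && line.endsWith(":")) {
      labels.set(line.slice(1, -1), instrs.length)
      continue
    }
    const parts = line.split(/\s+/)
    instrs.push({ op: parts[0] ?? "", operand: parts.slice(1).join(" ") })
  }

  // Pass 2: rewrite plain label operands as immediate numbers.
  // MAKE_FUNCTION operands start with '(' and are left untouched here.
  return instrs.map((instr, pc) => {
    if (!instr.operand.startsWith(".")) return instr
    const target = labels.get(instr.operand.slice(1))
    if (target === undefined) throw new Error(`Undefined label: ${instr.operand}`)
    const value = ABSOLUTE_OPS.has(instr.op) ? target : target - (pc + 1)
    return { op: instr.op, operand: `#${value}` }
  })
}
```

Under these assumptions, a `JUMP .skip` at index 0 whose label sits at index 3 resolves to `JUMP #2`, while a `PUSH_TRY .catch` pointing at index 3 resolves to `PUSH_TRY #3`.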
**Variable names**: Plain identifiers
- `LOAD counter` - load variable
- `STORE result` - store variable
**Constants**: Literals added to constants pool
- Numbers: `PUSH 42`, `PUSH 3.14`
- Strings: `PUSH "hello"` or `PUSH 'world'`
- Booleans: `PUSH true`, `PUSH false`
- Null: `PUSH null`
**Native function names**: Registered TypeScript functions
- `CALL_NATIVE print`
### Functions
```
MAKE_FUNCTION (x y) .body ; Basic
MAKE_FUNCTION (x=10 y=20) .body ; Defaults
MAKE_FUNCTION (x ...rest) .body ; Variadic
MAKE_FUNCTION (x @named) .body ; Named args
MAKE_FUNCTION (x ...rest @named) .body ; Both
```
### Function Calls
Stack order (bottom to top):
```
LOAD fn
PUSH arg1 ; Positional args
PUSH arg2
PUSH "name" ; Named arg key
PUSH "value" ; Named arg value
PUSH 2 ; Positional count
PUSH 1 ; Named count
CALL
```
## Opcodes
### Stack
- `PUSH <const>` - Push constant
- `POP` - Remove top
- `DUP` - Duplicate top
### Variables
- `LOAD <name>` - Push variable value
- `STORE <name>` - Pop and store in variable
### Arithmetic
- `ADD`, `SUB`, `MUL`, `DIV`, `MOD` - Binary ops (pop 2, push result)
### Comparison
- `EQ`, `NEQ`, `LT`, `GT`, `LTE`, `GTE` - Pop 2, push boolean
### Logic
- `NOT` - Pop 1, push !value
### Control Flow
- `JUMP .label` - Unconditional jump
- `JUMP_IF_FALSE .label` - Jump if top is false or null (pops value)
- `JUMP_IF_TRUE .label` - Jump if top is truthy (pops value)
- `HALT` - Stop execution of the program
### Functions
- `MAKE_FUNCTION (params) .body` - Create function, push to stack
- `CALL` - Call function (see calling convention above)
- `TAIL_CALL` - Tail-recursive call (no stack growth)
- `RETURN` - Return from function (pops return value)
- `BREAK` - Exit iterator/loop (unwinds to break target)
### Arrays
- `MAKE_ARRAY #N` - Pop N items, push array
- `ARRAY_GET` - Pop index and array, push element
- `ARRAY_SET` - Pop value, index, array; mutate array
- `ARRAY_PUSH` - Pop value and array, append to array
- `ARRAY_LEN` - Pop array, push length
### Dicts
- `MAKE_DICT #N` - Pop N key-value pairs, push dict
- `DICT_GET` - Pop key and dict, push value (or null)
- `DICT_SET` - Pop value, key, dict; mutate dict
- `DICT_HAS` - Pop key and dict, push boolean
### Exceptions
- `PUSH_TRY .catch` - Register exception handler
- `PUSH_FINALLY .finally` - Add finally to current handler
- `POP_TRY` - Remove handler (try succeeded)
- `THROW` - Throw exception (pops error value)
### Native
- `CALL_NATIVE <name>` - Call registered TypeScript function (consumes entire stack as args)
## Compiler Patterns
### If-Else
```
<condition>
JUMP_IF_FALSE .else
<then-block>
JUMP .end
.else:
<else-block>
.end:
```
### While Loop
```
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
JUMP .loop
.end:
```
### For Loop
```
<init>
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
<increment>
JUMP .loop
.end:
```
### Continue
No CONTINUE opcode. Use backward jump to loop start:
```
.loop:
<condition>
JUMP_IF_FALSE .end
<early-check>
JUMP_IF_TRUE .loop ; continue
<body>
JUMP .loop
.end:
```
### Break in Loop
Mark iterator function as break target, use BREAK opcode:
```
MAKE_FUNCTION () .each_body
STORE each
LOAD collection
LOAD each
<call-iterator-with-break-semantics>
HALT
.each_body:
<condition>
JUMP_IF_TRUE .done
<body>
BREAK ; exits to caller
.done:
RETURN
```
### Short-Circuit AND
```
<left>
DUP
JUMP_IF_FALSE .end ; Short-circuit if false
POP
<right>
.end: ; Result on stack
```
### Short-Circuit OR
```
<left>
DUP
JUMP_IF_TRUE .end ; Short-circuit if true
POP
<right>
.end: ; Result on stack
```
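A compiler might emit both short-circuit patterns from one helper. The sketch below is hypothetical (`compileLogical` and the label naming scheme are assumptions, not part of Reef); it takes the already-compiled instruction lists for each side and returns the combined sequence.

```typescript
// Hypothetical emitter; `left` and `right` are already-compiled instruction lines.
let logicalLabelCounter = 0

function compileLogical(kind: "and" | "or", left: string[], right: string[]): string[] {
  // A fresh label per expression keeps nested and/or expressions from colliding.
  const end = `.${kind}_end_${logicalLabelCounter++}`
  const jump = kind === "and" ? "JUMP_IF_FALSE" : "JUMP_IF_TRUE"
  return [
    ...left,
    "DUP",            // keep a copy, because the conditional jump pops its condition
    `${jump} ${end}`, // short-circuit: the left value stays on the stack as the result
    "POP",            // otherwise discard the left value...
    ...right,         // ...and let the right value become the result
    `${end}:`,
  ]
}
```

Because only `null` and `false` are falsy, the value left on the stack is the operand itself rather than a coerced boolean.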
### Try-Catch
```
PUSH_TRY .catch
<try-block>
POP_TRY
JUMP .end
.catch:
STORE err
<catch-block>
.end:
```
### Try-Catch-Finally
```
PUSH_TRY .catch
PUSH_FINALLY .finally
<try-block>
POP_TRY
JUMP .finally ; Compiler must generate this
.catch:
STORE err
<catch-block>
JUMP .finally ; And this
.finally:
<finally-block> ; Executes in both paths
.end:
```
**Important**: VM only auto-jumps to finally on THROW. For successful try/catch, compiler must explicitly JUMP to finally.
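To make the compiler's responsibility concrete, here is a small sketch of an emitter for the try-catch-finally layout. The function name `compileTryCatchFinally` and the `id` suffix used to keep labels unique are assumptions for illustration; only the instruction order comes from the pattern in this guide.

```typescript
// Hypothetical emitter mirroring the try-catch-finally pattern.
function compileTryCatchFinally(
  id: number,
  tryBlock: string[],
  catchBlock: string[],
  finallyBlock: string[],
): string[] {
  const catchLabel = `.catch_${id}`
  const finallyLabel = `.finally_${id}`
  return [
    `PUSH_TRY ${catchLabel}`,
    `PUSH_FINALLY ${finallyLabel}`,
    ...tryBlock,
    "POP_TRY",
    `JUMP ${finallyLabel}`, // success path: the VM will not jump here on its own
    `${catchLabel}:`,
    "STORE err",
    ...catchBlock,
    `JUMP ${finallyLabel}`, // catch path: also explicit
    `${finallyLabel}:`,
    ...finallyBlock,
  ]
}
```

Both explicit `JUMP` lines are emitted even though the second one lands on the very next instruction, which keeps the output identical in shape to the documented pattern.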
### Closures
Functions automatically capture current scope:
```
PUSH 0
STORE counter
MAKE_FUNCTION () .increment
RETURN
.increment:
LOAD counter ; Captured variable
PUSH 1
ADD
STORE counter
LOAD counter
RETURN
```
### Tail Recursion
Use TAIL_CALL instead of CALL for last call:
```
MAKE_FUNCTION (n acc) .factorial
STORE factorial
<...>
.factorial:
LOAD n
PUSH 0
LTE
JUMP_IF_FALSE .recurse
LOAD acc
RETURN
.recurse:
LOAD factorial
LOAD n
PUSH 1
SUB
LOAD n
LOAD acc
MUL
PUSH 2
PUSH 0
TAIL_CALL ; Reuses stack frame
```
## Key Concepts
### Truthiness
Only `null` and `false` are falsy. Everything else (including `0`, `""`, empty arrays/dicts) is truthy.
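The rule is small enough to sketch directly. The `Value` shape below is an assumption modeled on the `{ type, value }` objects shown in the tests (the real definition lives in src/value.ts and may differ), and `isTrue` here is a stand-in for the function of the same name.

```typescript
// Assumed tagged-union shape; the real Value type in src/value.ts may differ.
type Value =
  | { type: "null"; value: null }
  | { type: "boolean"; value: boolean }
  | { type: "number"; value: number }
  | { type: "string"; value: string }
  | { type: "array"; value: Value[] }
  | { type: "dict"; value: Map<string, Value> }

// Only null and false are falsy; every other value is truthy.
function isTrue(v: Value): boolean {
  if (v.type === "null") return false
  if (v.type === "boolean") return v.value
  return true
}
```

Compilers targeting Reef should not rely on JavaScript habits such as `while (count)` terminating at zero, since `0` passes `JUMP_IF_TRUE`.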
### Type Coercion
**toNumber**:
- `number` → identity
- `string` → parseFloat (or 0 if invalid)
- `boolean` → 1 (true) or 0 (false)
- `null` → 0
- Others → 0
**toString**:
- `string` → identity
- `number` → string representation
- `boolean` → "true" or "false"
- `null` → "null"
- `function` → "<function>"
- `array` → "[item, item]"
- `dict` → "{key: value, ...}"
**Arithmetic ops** (ADD, SUB, MUL, DIV, MOD) coerce both operands to numbers.
**Comparison ops** (LT, GT, LTE, GTE) coerce both operands to numbers.
**Equality ops** (EQ, NEQ) use type-aware comparison with deep equality for arrays/dicts.
**Note**: There is no string concatenation operator. ADD only works with numbers.
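The `toNumber` rules above can be read as the following sketch. It is written from the table in this guide, not copied from src/value.ts, and the `Value` shape and the `add` helper are assumptions for illustration.

```typescript
// Assumed tagged-union shape; the real Value type in src/value.ts may differ.
type Value =
  | { type: "null"; value: null }
  | { type: "boolean"; value: boolean }
  | { type: "number"; value: number }
  | { type: "string"; value: string }
  | { type: "array"; value: Value[] }

function toNumber(v: Value): number {
  switch (v.type) {
    case "number":
      return v.value
    case "string": {
      const n = parseFloat(v.value)
      return Number.isNaN(n) ? 0 : n // invalid strings coerce to 0
    }
    case "boolean":
      return v.value ? 1 : 0
    default:
      return 0 // null and all other types
  }
}

// What ADD does under these rules: coerce both operands, then add numerically.
function add(a: Value, b: Value): { type: "number"; value: number } {
  return { type: "number", value: toNumber(a) + toNumber(b) }
}
```

So adding the string `"2"` to the number `3` yields the number `5`, never the string `"23"`.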
### Scope
- Variables resolved through parent scope chain
- STORE updates existing variable or creates in current scope
- Functions capture scope at definition time
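A linked scope chain with these STORE semantics could look roughly like this. The class is a simplified stand-in, not the code in src/scope.ts: it stores plain numbers, and the method names are assumptions.

```typescript
// Simplified stand-in for a lexical scope chain; not the real src/scope.ts.
class Scope {
  vars = new Map<string, number>()
  parent: Scope | undefined

  constructor(parent?: Scope) {
    this.parent = parent
  }

  // LOAD: walk outward until some scope defines the name.
  get(name: string): number | undefined {
    if (this.vars.has(name)) return this.vars.get(name)
    return this.parent ? this.parent.get(name) : undefined
  }

  // STORE: update the nearest scope that already defines the name,
  // otherwise create the variable in the current scope (no shadowing).
  set(name: string, value: number): void {
    let owner: Scope | undefined = this
    while (owner && !owner.vars.has(name)) owner = owner.parent
    ;(owner ?? this).vars.set(name, value)
  }
}
```

This is why the closure example can increment `counter` from inside the function body: the STORE finds the existing variable in the captured parent scope instead of creating a local copy.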
### Break Semantics
- CALL marks current frame as break target
- BREAK unwinds call stack to that target
- Used for Ruby-style iterator pattern
### Parameter Binding Priority
For function calls, each fixed parameter is bound from the first source that applies:
1. Named argument (if provided and matches param name)
2. Positional argument (if provided)
3. Default value (if defined)
4. Null
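The lookup can be sketched as a single function. This follows the order in which the VM's binding code in src/vm.ts performs its checks (named argument first, then positional, then default); the function name and signature are hypothetical.

```typescript
// Hypothetical helper mirroring the order of checks in the VM's binding code.
function bindFixedParam<T>(
  name: string,
  index: number,
  positional: T[],
  named: Map<string, T>,
  defaults: Record<string, T>,
): T | null {
  if (named.has(name)) {
    const value = named.get(name)!
    named.delete(name) // consumed, so it will not also land in the @named dict
    return value
  }
  const positionalValue = positional[index]
  if (positionalValue !== undefined) return positionalValue
  const defaultValue = defaults[name]
  if (defaultValue !== undefined) return defaultValue
  return null
}
```

Whatever remains in `named` after all fixed parameters are bound is what an `@named` parameter would collect.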
### Exception Handlers
- PUSH_TRY uses absolute addresses for catch blocks
- Nested try blocks form a stack
- THROW unwinds to most recent handler and jumps to finally (if present) or catch
- VM does NOT automatically jump to finally on success - compiler must generate JUMPs
- Finally execution in all cases is compiler's responsibility, not VM's
### Calling Convention
All calls push arguments in order:
1. Function
2. Positional args (in order)
3. Named args (key1, val1, key2, val2, ...)
4. Positional count (as number)
5. Named count (as number)
6. CALL or TAIL_CALL
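A compiler back end might wrap this convention in one helper so the counts can never drift from the arguments actually pushed. The sketch below is hypothetical (`emitCall` is not part of Reef) and assumes every argument is a constant literal that can be pushed with `PUSH`.

```typescript
// Hypothetical emitter for the calling convention; arguments are constant literals.
function emitCall(
  fn: string,
  positional: string[],
  named: Array<[key: string, value: string]>,
  tail = false,
): string[] {
  return [
    `LOAD ${fn}`,
    ...positional.map(arg => `PUSH ${arg}`),
    ...named.flatMap(([key, value]) => [`PUSH "${key}"`, `PUSH ${value}`]),
    `PUSH ${positional.length}`, // positional count
    `PUSH ${named.length}`,      // named count (pairs, not individual values)
    tail ? "TAIL_CALL" : "CALL",
  ]
}
```

For example, `emitCall("greet", ["1", "2"], [["name", '"Bob"']])` produces eight lines: the function load, two positional pushes, one key/value pair, the counts `2` and `1`, and `CALL`.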
### CALL_NATIVE Behavior
Unlike CALL, CALL_NATIVE consumes the **entire stack** as arguments and clears the stack. The native function receives all values that were on the stack at the time of the call.
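The effect on the stack can be modeled in a few lines. This is a toy model, not the VM's implementation: `callNative` is a hypothetical name, and passing arguments in bottom-to-top order is an assumption.

```typescript
// Toy model of CALL_NATIVE: every stack value becomes an argument,
// and the stack afterwards holds only the native function's result.
function callNative<T>(stack: T[], fn: (...args: T[]) => T): void {
  const args = stack.splice(0, stack.length) // removes everything, leaving the stack empty
  stack.push(fn(...args))
}
```

The practical consequence for compilers is that any temporaries still on the stack are passed to the native function and then lost, so intermediate values should be stored in variables before a `CALL_NATIVE`.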
### Empty Stack
- RETURN with empty stack returns null
- HALT with empty stack returns null


@ -19,7 +19,7 @@ It's where Shrimp live.
- Dictionary operations (MAKE_DICT, DICT_GET, DICT_SET, DICT_HAS)
- Function operations (MAKE_FUNCTION, CALL, TAIL_CALL, RETURN) with parameter binding
- Variadic functions with positional rest parameters (`...rest`)
- Named arguments (kwargs) that collect unmatched named args into a dict (`@named`)
- Named arguments (named) that collect unmatched named args into a dict (`@named`)
- Mixed positional and named arguments with proper priority binding
- Tail call optimization with unbounded recursion (10,000+ iterations without stack overflow)
- Exception handling (PUSH_TRY, PUSH_FINALLY, POP_TRY, THROW) with nested try/finally blocks and call stack unwinding

19
bin/validate Executable file

@ -0,0 +1,19 @@
#!/usr/bin/env bun
import { validateBytecode, formatValidationErrors } from "../src/validator"
const args = process.argv.slice(2)
if (args.length === 0) {
console.error("Usage: validate <file.reef>")
process.exit(1)
}
const filePath = args[0]!
const source = await Bun.file(filePath).text()
const result = validateBytecode(source)
console.log(formatValidationErrors(result))
if (!result.valid) {
process.exit(1)
}

19
examples/native.ts Normal file

@ -0,0 +1,19 @@
import { VM, toBytecode, type Value, toString, toNull } from "#reef"
const bytecode = toBytecode(`
PUSH 5
PUSH 10
ADD
CALL_NATIVE print
`)
const vm = new VM(bytecode)
vm.registerFunction('print', (...args: Value[]): Value => {
console.log(...args.map(toString))
return toNull()
})
console.write('5 + 10 = ')
await vm.run()


@ -1,8 +1,12 @@
import type { Bytecode } from "./bytecode"
import type { Value } from "./value"
import { type Value } from "./value"
import { VM } from "./vm"
export async function run(bytecode: Bytecode): Promise<Value> {
const vm = new VM(bytecode)
return await vm.run()
}
export { type Bytecode, toBytecode } from "./bytecode"
export { type Value, toValue, toString, toNumber, toJs, toNull } from "./value"
export { VM } from "./vm"


@ -1,65 +1,65 @@
export enum OpCode {
// stack
PUSH, // operand: constant index (number)
POP, // operand: none
DUP, // operand: none
PUSH, // operand: constant index (number) | stack: [] → [value]
POP, // operand: none | stack: [value] → []
DUP, // operand: none | stack: [value] → [value, value]
// variables
LOAD, // operand: variable name (string)
STORE, // operand: variable name (string)
LOAD, // operand: variable name (identifier) | stack: [] → [value]
STORE, // operand: variable name (identifier) | stack: [value] → []
// math
ADD,
SUB,
MUL,
DIV,
MOD,
// math (coerce to number, pop 2, push result)
ADD, // operand: none | stack: [a, b] → [a + b]
SUB, // operand: none | stack: [a, b] → [a - b]
MUL, // operand: none | stack: [a, b] → [a * b]
DIV, // operand: none | stack: [a, b] → [a / b]
MOD, // operand: none | stack: [a, b] → [a % b]
// comparison
EQ,
NEQ,
LT,
GT,
LTE,
GTE,
// comparison (pop 2, push boolean)
EQ, // operand: none | stack: [a, b] → [a == b] (deep equality)
NEQ, // operand: none | stack: [a, b] → [a != b]
LT, // operand: none | stack: [a, b] → [a < b] (numeric)
GT, // operand: none | stack: [a, b] → [a > b] (numeric)
LTE, // operand: none | stack: [a, b] → [a <= b] (numeric)
GTE, // operand: none | stack: [a, b] → [a >= b] (numeric)
// logical
NOT,
NOT, // operand: none | stack: [a] → [!isTrue(a)]
// control flow
JUMP,
JUMP_IF_FALSE,
JUMP_IF_TRUE,
BREAK,
JUMP, // operand: relative offset (number) | PC-relative jump
JUMP_IF_FALSE, // operand: relative offset (number) | stack: [condition] → [] | jump if falsy
JUMP_IF_TRUE, // operand: relative offset (number) | stack: [condition] → [] | jump if truthy
BREAK, // operand: none | unwind call stack to break target
// exception handling
PUSH_TRY,
PUSH_FINALLY,
POP_TRY,
THROW,
PUSH_TRY, // operand: absolute catch address (number) | register exception handler
PUSH_FINALLY, // operand: absolute finally address (number) | add finally to current handler
POP_TRY, // operand: none | remove exception handler (try completed successfully)
THROW, // operand: none | stack: [error] → (unwound) | throw exception
// functions
MAKE_FUNCTION,
CALL,
TAIL_CALL,
RETURN,
MAKE_FUNCTION, // operand: function def index (number) | stack: [] → [function] | captures scope
CALL, // operand: none | stack: [fn, ...args, posCount, namedCount] → [result] | marks break target
TAIL_CALL, // operand: none | stack: [fn, ...args, posCount, namedCount] → [result] | reuses frame
RETURN, // operand: none | stack: [value] → (restored with value) | return from function
// arrays
MAKE_ARRAY,
ARRAY_GET,
ARRAY_SET,
ARRAY_PUSH,
ARRAY_LEN,
MAKE_ARRAY, // operand: item count (number) | stack: [item1, ..., itemN] → [array]
ARRAY_GET, // operand: none | stack: [array, index] → [value]
ARRAY_SET, // operand: none | stack: [array, index, value] → [] | mutates array
ARRAY_PUSH, // operand: none | stack: [array, value] → [] | mutates array
ARRAY_LEN, // operand: none | stack: [array] → [length]
// dicts
MAKE_DICT,
DICT_GET,
DICT_SET,
DICT_HAS,
MAKE_DICT, // operand: pair count (number) | stack: [key1, val1, ..., keyN, valN] → [dict]
DICT_GET, // operand: none | stack: [dict, key] → [value] | returns null if missing
DICT_SET, // operand: none | stack: [dict, key, value] → [] | mutates dict
DICT_HAS, // operand: none | stack: [dict, key] → [boolean]
// typescript interop
CALL_NATIVE,
CALL_NATIVE, // operand: function name (identifier) | stack: [...args] → [result] | consumes entire stack
// special
HALT
HALT // operand: none | stop execution
}

333
src/validator.ts Normal file

@ -0,0 +1,333 @@
import { OpCode } from "./opcode"
export type ValidationError = {
line: number
message: string
}
export type ValidationResult = {
valid: boolean
errors: ValidationError[]
}
// Opcodes that require operands
const OPCODES_WITH_OPERANDS = new Set([
OpCode.PUSH,
OpCode.LOAD,
OpCode.STORE,
OpCode.JUMP,
OpCode.JUMP_IF_FALSE,
OpCode.JUMP_IF_TRUE,
OpCode.PUSH_TRY,
OpCode.PUSH_FINALLY,
OpCode.MAKE_ARRAY,
OpCode.MAKE_DICT,
OpCode.MAKE_FUNCTION,
OpCode.CALL_NATIVE,
])
// Opcodes that should NOT have operands
const OPCODES_WITHOUT_OPERANDS = new Set([
OpCode.POP,
OpCode.DUP,
OpCode.ADD,
OpCode.SUB,
OpCode.MUL,
OpCode.DIV,
OpCode.MOD,
OpCode.EQ,
OpCode.NEQ,
OpCode.LT,
OpCode.GT,
OpCode.LTE,
OpCode.GTE,
OpCode.NOT,
OpCode.HALT,
OpCode.BREAK,
OpCode.POP_TRY,
OpCode.THROW,
OpCode.CALL,
OpCode.TAIL_CALL,
OpCode.RETURN,
OpCode.ARRAY_GET,
OpCode.ARRAY_SET,
OpCode.ARRAY_PUSH,
OpCode.ARRAY_LEN,
OpCode.DICT_GET,
OpCode.DICT_SET,
OpCode.DICT_HAS,
])
export function validateBytecode(source: string): ValidationResult {
const errors: ValidationError[] = []
const lines = source.split("\n")
const labels = new Map<string, number>()
const labelReferences = new Map<string, number[]>()
let instructionCount = 0
// First pass: collect labels and check for duplicates
for (let i = 0; i < lines.length; i++) {
const lineNum = i + 1
let line = lines[i]!
// Strip comments
const commentIndex = line.indexOf(';')
if (commentIndex !== -1) {
line = line.slice(0, commentIndex)
}
const trimmed = line.trim()
if (!trimmed) continue
// Check for label definition
if (/^\.[a-zA-Z_][a-zA-Z0-9_]*:$/.test(trimmed)) {
const labelName = trimmed.slice(1, -1)
if (labels.has(labelName)) {
errors.push({
line: lineNum,
message: `Duplicate label: .${labelName} (first defined at line ${labels.get(labelName)})`,
})
} else {
labels.set(labelName, lineNum)
}
continue
}
instructionCount++
}
// Second pass: validate instructions
instructionCount = 0
for (let i = 0; i < lines.length; i++) {
const lineNum = i + 1
let line = lines[i]!
// Strip comments
const commentIndex = line.indexOf(';')
if (commentIndex !== -1) {
line = line.slice(0, commentIndex)
}
const trimmed = line.trim()
if (!trimmed) continue
// Skip label definitions
if (/^\.[a-zA-Z_][a-zA-Z0-9_]*:$/.test(trimmed)) {
continue
}
instructionCount++
const parts = trimmed.split(/\s+/)
const opName = parts[0]!
const operand = parts.slice(1).join(' ')
// Check if opcode exists
const opCode = OpCode[opName as keyof typeof OpCode]
if (opCode === undefined) {
errors.push({
line: lineNum,
message: `Unknown opcode: ${opName}`,
})
continue
}
// Check operand requirements
if (OPCODES_WITH_OPERANDS.has(opCode) && !operand) {
errors.push({
line: lineNum,
message: `${opName} requires an operand`,
})
continue
}
if (OPCODES_WITHOUT_OPERANDS.has(opCode) && operand) {
errors.push({
line: lineNum,
message: `${opName} does not take an operand`,
})
continue
}
// Validate specific operand formats
if (operand) {
// Check for label references
if (operand.startsWith('.') && !operand.includes('(')) {
const labelName = operand.slice(1)
if (!labelReferences.has(labelName)) {
labelReferences.set(labelName, [])
}
labelReferences.get(labelName)!.push(lineNum)
}
// Validate MAKE_FUNCTION syntax
if (opCode === OpCode.MAKE_FUNCTION) {
if (!operand.startsWith('(')) {
errors.push({
line: lineNum,
message: `MAKE_FUNCTION requires parameter list: MAKE_FUNCTION (params) address`,
})
continue
}
const match = operand.match(/^(\(.*?\))\s+(.+)$/)
if (!match) {
errors.push({
line: lineNum,
message: `Invalid MAKE_FUNCTION syntax: expected (params) address`,
})
continue
}
const [, paramStr, bodyAddr] = match
// Validate parameter syntax
const paramList = paramStr!.slice(1, -1).trim()
if (paramList) {
const params = paramList.split(/\s+/)
let seenVariadic = false
let seenNamed = false
for (const param of params) {
// Check for invalid order
if (seenVariadic && !param.startsWith('@')) {
errors.push({
line: lineNum,
message: `Invalid parameter order: variadic parameter (...) must come before named parameter (@)`,
})
}
if (seenNamed) {
errors.push({
line: lineNum,
message: `Invalid parameter order: named parameter (@) must be last`,
})
}
// Check parameter format
if (param.startsWith('...')) {
seenVariadic = true
const name = param.slice(3)
if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(name)) {
errors.push({
line: lineNum,
message: `Invalid variadic parameter name: ${param}`,
})
}
} else if (param.startsWith('@')) {
seenNamed = true
const name = param.slice(1)
if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(name)) {
errors.push({
line: lineNum,
message: `Invalid named parameter name: ${param}`,
})
}
} else if (param.includes('=')) {
// Default parameter
const [name, defaultValue] = param.split('=')
if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(name!.trim())) {
errors.push({
line: lineNum,
message: `Invalid parameter name: ${name}`,
})
}
} else {
// Regular parameter
if (!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(param)) {
errors.push({
line: lineNum,
message: `Invalid parameter name: ${param}`,
})
}
}
}
}
// Validate body address
if (!bodyAddr!.startsWith('.') && !bodyAddr!.startsWith('#')) {
errors.push({
line: lineNum,
message: `Invalid body address: expected .label or #offset`,
})
}
// If it's a label, track it
if (bodyAddr!.startsWith('.')) {
const labelName = bodyAddr!.slice(1)
if (!labelReferences.has(labelName)) {
labelReferences.set(labelName, [])
}
labelReferences.get(labelName)!.push(lineNum)
}
}
// Validate immediate numbers
if (operand.startsWith('#')) {
const numStr = operand.slice(1)
if (!/^-?\d+$/.test(numStr)) {
errors.push({
line: lineNum,
message: `Invalid immediate number: ${operand}`,
})
}
}
// Validate variable names for LOAD/STORE
if ((opCode === OpCode.LOAD || opCode === OpCode.STORE) &&
!/^[a-zA-Z_][a-zA-Z0-9_]*$/.test(operand)) {
errors.push({
line: lineNum,
message: `Invalid variable name: ${operand}`,
})
}
// Validate string constants: the closing quote must match the opening quote
const quote = operand[0]
if ((quote === '"' || quote === "'") &&
(operand.length < 2 || !operand.endsWith(quote))) {
errors.push({
line: lineNum,
message: `Unterminated string: ${operand}`,
})
}
}
}
// Check for undefined label references
for (const [labelName, refLines] of labelReferences) {
if (!labels.has(labelName)) {
for (const refLine of refLines) {
errors.push({
line: refLine,
message: `Undefined label: .${labelName}`,
})
}
}
}
// Sort errors by line number
errors.sort((a, b) => a.line - b.line)
return {
valid: errors.length === 0,
errors,
}
}
export function formatValidationErrors(result: ValidationResult): string {
if (result.valid) {
return "✓ Bytecode is valid"
}
const lines: string[] = [
`✗ Found ${result.errors.length} error${result.errors.length === 1 ? '' : 's'}:`,
'',
]
for (const error of result.errors) {
lines.push(` Line ${error.line}: ${error.message}`)
}
return lines.join('\n')
}


@ -132,3 +132,7 @@ export function toJs(v: Value): any {
case 'function': return '<function>'
}
}
export function toNull(): Value {
return toValue(null)
}


@ -386,7 +386,7 @@ export class VM {
// Check if named argument was provided for this param
if (namedArgs.has(paramName)) {
this.scope.set(paramName, namedArgs.get(paramName)!)
namedArgs.delete(paramName) // Remove from named args so it won't go to kwargs
namedArgs.delete(paramName) // Remove from named args so it won't go to named
} else if (positionalArgs[i] !== undefined) {
this.scope.set(paramName, positionalArgs[i]!)
} else if (fn.defaults[paramName] !== undefined) {
@ -410,11 +410,11 @@ export class VM {
// Handle named parameter (collect remaining named args that didn't match params)
if (fn.named) {
const namedParamName = fn.params[fn.params.length - 1]!
const kwargsDict = new Map<string, Value>()
const namedDict = new Map<string, Value>()
for (const [key, value] of namedArgs) {
kwargsDict.set(key, value)
namedDict.set(key, value)
}
this.scope.set(namedParamName, { type: 'dict', value: kwargsDict })
this.scope.set(namedParamName, { type: 'dict', value: namedDict })
}
// subtract 1 because pc was incremented
@ -488,11 +488,11 @@ export class VM {
// Handle named parameter
if (tailFn.named) {
const namedParamName = tailFn.params[tailFn.params.length - 1]!
const kwargsDict = new Map<string, Value>()
const namedDict = new Map<string, Value>()
for (const [key, value] of tailNamedArgs) {
kwargsDict.set(key, value)
namedDict.set(key, value)
}
this.scope.set(namedParamName, { type: 'dict', value: kwargsDict })
this.scope.set(namedParamName, { type: 'dict', value: namedDict })
}
// subtract 1 because PC was incremented

31
tests/examples.test.ts Normal file

@ -0,0 +1,31 @@
import { test, expect } from "bun:test"
import { readdirSync } from "fs"
import { join } from "path"
import { toBytecode } from "#bytecode"
import { VM } from "#vm"
// Get all .reef files from examples directory
const examplesDir = join(import.meta.dir, "..", "examples")
const exampleFiles = readdirSync(examplesDir)
.filter(file => file.endsWith(".reef"))
.sort()
// Create a test for each example file
for (const file of exampleFiles) {
test(`examples/${file} runs without errors`, async () => {
const filePath = join(examplesDir, file)
const content = await Bun.file(filePath).text()
// Parse and run the bytecode
const bytecode = toBytecode(content)
const vm = new VM(bytecode)
// Should not throw
const result = await vm.run()
// Result should be a valid Value
expect(result).toBeDefined()
expect(result).toHaveProperty("type")
expect(["null", "boolean", "number", "string", "array", "dict", "function"]).toContain(result.type)
})
}


@ -201,7 +201,7 @@ test("TAIL_CALL - variadic function", async () => {
test("CALL - named args function with no fixed params", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (@kwargs) #9
MAKE_FUNCTION (@named) #9
PUSH "name"
PUSH "Bob"
PUSH "age"
@ -210,7 +210,7 @@ test("CALL - named args function with no fixed params", async () => {
PUSH 2
CALL
HALT
LOAD kwargs
LOAD named
RETURN
`)
@ -224,7 +224,7 @@ test("CALL - named args function with no fixed params", async () => {
test("CALL - named args function with one fixed param", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (x @kwargs) #8
MAKE_FUNCTION (x @named) #8
PUSH 10
PUSH "name"
PUSH "Alice"
@ -232,7 +232,7 @@ test("CALL - named args function with one fixed param", async () => {
PUSH 1
CALL
HALT
LOAD kwargs
LOAD named
RETURN
`)
@ -244,9 +244,9 @@ test("CALL - named args function with one fixed param", async () => {
}
})
test("CALL - named args with matching param name should bind to param not kwargs", async () => {
test("CALL - named args with matching param name should bind to param not named", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (name @kwargs) #8
MAKE_FUNCTION (name @named) #8
PUSH "Bob"
PUSH "age"
PUSH 50
@ -259,13 +259,13 @@ test("CALL - named args with matching param name should bind to param not kwargs
`)
const result = await new VM(bytecode).run()
// name should be bound as regular param, not collected in kwargs
// name should be bound as regular param, not collected in named
expect(result).toEqual({ type: 'string', value: 'Bob' })
})
test("CALL - named args that match param names should not be in kwargs", async () => {
test("CALL - named args that match param names should not be in named", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (name age @kwargs) #9
MAKE_FUNCTION (name age @named) #9
PUSH "name"
PUSH "Bob"
PUSH "city"
@ -274,14 +274,14 @@ test("CALL - named args that match param names should not be in kwargs", async (
PUSH 2
CALL
HALT
LOAD kwargs
LOAD named
RETURN
`)
const result = await new VM(bytecode).run()
expect(result.type).toBe('dict')
if (result.type === 'dict') {
// Only city should be in kwargs, name should be bound to param
// Only city should be in named, name should be bound to param
expect(result.value.get('city')).toEqual({ type: 'string', value: 'NYC' })
expect(result.value.has('name')).toBe(false)
expect(result.value.size).toBe(1)
@ -290,7 +290,7 @@ test("CALL - named args that match param names should not be in kwargs", async (
test("CALL - mixed variadic and named args", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (x ...rest @kwargs) #10
MAKE_FUNCTION (x ...rest @named) #10
PUSH 1
PUSH 2
PUSH 3
@ -315,9 +315,9 @@ test("CALL - mixed variadic and named args", async () => {
})
})
test("CALL - mixed variadic and named args, check kwargs", async () => {
test("CALL - mixed variadic and named args, check named", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (x ...rest @kwargs) #10
MAKE_FUNCTION (x ...rest @named) #10
PUSH 1
PUSH 2
PUSH 3
@ -327,7 +327,7 @@ test("CALL - mixed variadic and named args, check kwargs", async () => {
PUSH 1
CALL
HALT
LOAD kwargs
LOAD named
RETURN
`)
@ -340,18 +340,18 @@ test("CALL - mixed variadic and named args, check kwargs", async () => {
test("CALL - named args with no extra named args", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (x @kwargs) #6
MAKE_FUNCTION (x @named) #6
PUSH 10
PUSH 1
PUSH 0
CALL
HALT
LOAD kwargs
LOAD named
RETURN
`)
const result = await new VM(bytecode).run()
// kwargs should be empty dict
// named should be empty dict
expect(result.type).toBe('dict')
if (result.type === 'dict') {
expect(result.value.size).toBe(0)
@ -360,7 +360,7 @@ test("CALL - named args with no extra named args", async () => {
test("CALL - named args with defaults on fixed params", async () => {
const bytecode = toBytecode(`
MAKE_FUNCTION (x=5 @kwargs) #7
MAKE_FUNCTION (x=5 @named) #7
PUSH "name"
PUSH "Alice"
PUSH 0

tests/validator.test.ts Normal file

@ -0,0 +1,202 @@
import { test, expect } from "bun:test"
import { validateBytecode, formatValidationErrors } from "#validator"
test("valid bytecode passes validation", () => {
const source = `
PUSH 1
PUSH 2
ADD
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(true)
expect(result.errors).toHaveLength(0)
})
test("valid bytecode with labels passes validation", () => {
const source = `
JUMP .end
PUSH 999
.end:
PUSH 42
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(true)
expect(result.errors).toHaveLength(0)
})
test("detects unknown opcode", () => {
const source = `
PUSH 1
INVALID_OP
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors).toHaveLength(1)
expect(result.errors[0]!.message).toContain("Unknown opcode: INVALID_OP")
})
test("detects undefined label", () => {
const source = `
JUMP .nowhere
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors).toHaveLength(1)
expect(result.errors[0]!.message).toContain("Undefined label: .nowhere")
})
test("detects duplicate labels", () => {
const source = `
.loop:
PUSH 1
.loop:
PUSH 2
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors).toHaveLength(1)
expect(result.errors[0]!.message).toContain("Duplicate label: .loop")
})
test("detects missing operand", () => {
const source = `
PUSH
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors).toHaveLength(1)
expect(result.errors[0]!.message).toContain("PUSH requires an operand")
})
test("detects unexpected operand", () => {
const source = `
ADD 42
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors).toHaveLength(1)
expect(result.errors[0]!.message).toContain("ADD does not take an operand")
})
test("detects invalid MAKE_FUNCTION syntax", () => {
const source = `
MAKE_FUNCTION x y .body
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("MAKE_FUNCTION requires parameter list")
})
test("detects invalid parameter order", () => {
const source = `
MAKE_FUNCTION (x ...rest y) .body
HALT
.body:
RETURN
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("variadic parameter")
})
test("detects invalid parameter name", () => {
const source = `
MAKE_FUNCTION (123invalid) .body
HALT
.body:
RETURN
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("Invalid parameter name")
})
test("detects invalid variable name", () => {
const source = `
LOAD 123invalid
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("Invalid variable name")
})
test("detects unterminated string", () => {
const source = `
PUSH "unterminated
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("Unterminated string")
})
test("detects invalid immediate number", () => {
const source = `
MAKE_ARRAY #abc
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors[0]!.message).toContain("Invalid immediate number")
})
test("handles comments correctly", () => {
const source = `
PUSH 1 ; this is a comment
; this entire line is a comment
PUSH 2
ADD ; another comment
`
const result = validateBytecode(source)
expect(result.valid).toBe(true)
})
test("validates function with label reference", () => {
const source = `
MAKE_FUNCTION (x y) .body
JUMP .skip
.body:
LOAD x
LOAD y
ADD
RETURN
.skip:
HALT
`
const result = validateBytecode(source)
expect(result.valid).toBe(true)
})
test("detects multiple errors and sorts by line", () => {
const source = `
UNKNOWN_OP
PUSH
JUMP .undefined
`
const result = validateBytecode(source)
expect(result.valid).toBe(false)
expect(result.errors.length).toBeGreaterThanOrEqual(2)
// Check that errors are sorted by line number
for (let i = 1; i < result.errors.length; i++) {
expect(result.errors[i]!.line).toBeGreaterThanOrEqual(result.errors[i-1]!.line)
}
})
test("formatValidationErrors produces readable output", () => {
const source = `
PUSH 1
UNKNOWN
`
const result = validateBytecode(source)
const formatted = formatValidationErrors(result)
expect(formatted).toContain("error")
expect(formatted).toContain("Line")
expect(formatted).toContain("UNKNOWN")
})

@ -28,6 +28,9 @@
"paths": {
"#*": [
"./src/*"
],
"#reef": [
"./src/index.ts"
]
},
}