# Reef Compiler Guide

Quick reference for compiling to Reef bytecode.

## Bytecode Formats

ReefVM supports two bytecode formats:

1. **String format**: Human-readable text with opcodes and operands
2. **Array format**: TypeScript arrays with typed tuples for programmatic generation

Both formats are compiled using the same `toBytecode()` function.

## Bytecode Syntax

### Instructions
```
OPCODE operand ; comment
```

### Operand Types

**Immediate numbers** (`#N`): Counts, relative offsets, or absolute instruction indices
- `MAKE_ARRAY #3` - count of 3 items
- `JUMP #5` - relative offset of 5 instructions (prefer labels)
- `PUSH_TRY #10` - absolute instruction index (prefer labels)

**Labels** (`.name`): Symbolic addresses resolved at parse time
- `.label:` - define label at current position
- `JUMP .loop` - jump to label
- `MAKE_FUNCTION (x) .body` - function body at label

**Variable names**: Plain identifiers (supports Unicode and emoji!)
- `LOAD counter` - load variable
- `STORE result` - store variable
- `LOAD 💎` - load emoji variable
- `STORE 変数` - store Unicode variable

**Constants**: Literals added to constants pool
- Numbers: `PUSH 42`, `PUSH 3.14`
- Strings: `PUSH "hello"` or `PUSH 'world'`
- Booleans: `PUSH true`, `PUSH false`
- Null: `PUSH null`
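
The label operands described above can be illustrated with a small two-pass resolver. This is only a sketch of the general technique, not ReefVM's parser: the `Instr` type and `resolveLabels` function are hypothetical, every reference is resolved to an absolute instruction index for simplicity (the VM itself may encode some jumps as relative offsets), and any string operand starting with `.` is treated as a label, which a real parser would have to distinguish from string constants.

```typescript
// Hypothetical instruction shape for this sketch; not ReefVM's real types.
type Instr = [string, ...(string | number)[]]

// Two-pass label resolution: record where each label points,
// then replace ".name" operands with that instruction index.
function resolveLabels(program: Instr[]): Instr[] {
  const addresses = new Map<string, number>()
  const code: Instr[] = []

  // Pass 1: label definitions (".name:") take no slot in the output.
  for (const instr of program) {
    const head = instr[0]
    if (instr.length === 1 && head.startsWith(".") && head.endsWith(":")) {
      addresses.set(head.slice(0, -1), code.length)
    } else {
      code.push(instr)
    }
  }

  // Pass 2: swap label references for the recorded indices.
  return code.map((instr) =>
    instr.map((part, i) => {
      if (i === 0 || typeof part !== "string" || !part.startsWith(".")) return part
      const target = addresses.get(part)
      if (target === undefined) throw new Error(`Unknown label ${part}`)
      return target
    }) as Instr
  )
}
```

Because definitions are collected before references are rewritten, forward references such as `JUMP .end` resolve the same way as backward ones.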

## Array Format

The programmatic array format uses TypeScript tuples for type safety:

```typescript
import { toBytecode, run } from "#reef"

const bytecode = toBytecode([
  ["PUSH", 42],    // Atom values: number | string | boolean | null
  ["STORE", "x"],  // Variable names as strings
  ["LOAD", "x"],
  ["HALT"]
])

const result = await run(bytecode)
```

### Operand Types in Array Format

**Atoms** (`number | string | boolean | null`): Constants for PUSH
```typescript
["PUSH", 42]
["PUSH", "hello"]
["PUSH", true]
["PUSH", null]
```

**Variable names**: String identifiers
```typescript
["LOAD", "counter"]
["STORE", "result"]
```

**Label definitions**: Single-element arrays starting with `.` and ending with `:`
```typescript
[".loop:"]
[".end:"]
[".function_body:"]
```

**Label references**: Strings in jump/function instructions
```typescript
["JUMP", ".loop"]
["JUMP_IF_FALSE", ".end"]
["MAKE_FUNCTION", ["x", "y"], ".body"]
["PUSH_TRY", ".catch"]
```

**Counts**: Numbers for array/dict construction
```typescript
["MAKE_ARRAY", 3]  // Pop 3 items
["MAKE_DICT", 2]   // Pop 2 key-value pairs
```

### Functions in Array Format

```typescript
// Basic function
["MAKE_FUNCTION", ["x", "y"], ".body"]

// With defaults
["MAKE_FUNCTION", ["x", "y=10"], ".body"]

// Variadic
["MAKE_FUNCTION", ["...args"], ".body"]

// Named args
["MAKE_FUNCTION", ["@opts"], ".body"]

// Mixed
["MAKE_FUNCTION", ["x", "y=5", "...rest", "@opts"], ".body"]
```

### Complete Example

```typescript
const factorial = toBytecode([
  ["MAKE_FUNCTION", ["n", "acc=1"], ".fact"],
  ["STORE", "factorial"],
  ["JUMP", ".main"],

  [".fact:"],
  ["LOAD", "n"],
  ["PUSH", 0],
  ["LTE"],
  ["JUMP_IF_FALSE", ".recurse"],
  ["LOAD", "acc"],
  ["RETURN"],

  [".recurse:"],
  ["LOAD", "factorial"],
  ["LOAD", "n"],
  ["PUSH", 1],
  ["SUB"],
  ["LOAD", "n"],
  ["LOAD", "acc"],
  ["MUL"],
  ["PUSH", 2],
  ["PUSH", 0],
  ["TAIL_CALL"],

  [".main:"],
  ["LOAD", "factorial"],
  ["PUSH", 5],
  ["PUSH", 1],
  ["PUSH", 0],
  ["CALL"],
  ["HALT"]
])

const result = await run(factorial) // { type: "number", value: 120 }
```

## String Format

### Functions
```
MAKE_FUNCTION (x y) .body              ; Basic
MAKE_FUNCTION (x=10 y=20) .body        ; Defaults
MAKE_FUNCTION (x ...rest) .body        ; Variadic
MAKE_FUNCTION (x @named) .body         ; Named args
MAKE_FUNCTION (x ...rest @named) .body ; Both
```

### Function Calls
Stack order (bottom to top):
```
LOAD fn
PUSH arg1     ; Positional args
PUSH arg2
PUSH "name"   ; Named arg key
PUSH "value"  ; Named arg value
PUSH 2        ; Positional count
PUSH 1        ; Named count
CALL
```

**Null triggers defaults**: Pass `null` to use default values:
```
; Function: greet(name='Guest', msg='Hello')
LOAD greet
PUSH null  ; Use default for 'name'
PUSH "Hi"  ; Provide 'msg'
PUSH 2
PUSH 0
CALL       ; → "Hi, Guest"
```

## Opcodes

### Stack
- `PUSH <const>` - Push constant
- `POP` - Remove top
- `DUP` - Duplicate top
- `SWAP` - Swap top two values
- `TYPE` - Pop value, push its type as string

### Variables
- `LOAD <name>` - Push variable value (throws if not found)
- `TRY_LOAD <name>` - Push variable value if found, otherwise push name as string (never throws)
- `STORE <name>` - Pop and store in variable

### Arithmetic
- `ADD`, `SUB`, `MUL`, `DIV`, `MOD` - Binary ops (pop 2, push result)

### Bitwise
- `BIT_AND`, `BIT_OR`, `BIT_XOR` - Bitwise logical ops (pop 2, push result)
- `BIT_SHL`, `BIT_SHR`, `BIT_USHR` - Bitwise shift ops (pop 2, push result)

### Comparison
- `EQ`, `NEQ`, `LT`, `GT`, `LTE`, `GTE` - Pop 2, push boolean

### Logic
- `NOT` - Pop 1, push !value

### Control Flow
- `JUMP .label` - Unconditional jump
- `JUMP_IF_FALSE .label` - Jump if top is false or null (pops value)
- `JUMP_IF_TRUE .label` - Jump if top is truthy (pops value)
- `HALT` - Stop execution of the program

### Functions
- `MAKE_FUNCTION (params) .body` - Create function, push to stack
- `CALL` - Call function (see calling convention above)
- `TAIL_CALL` - Tail-recursive call (no stack growth)
- `RETURN` - Return from function (pops return value)
- `TRY_CALL <name>` - Call function (if found), push value (if exists), or push name as string (if not found)
- `BREAK` - Exit iterator/loop (unwinds to break target)

### Arrays
- `MAKE_ARRAY #N` - Pop N items, push array
- `ARRAY_GET` - Pop index and array, push element
- `ARRAY_SET` - Pop value, index, array; mutate array
- `ARRAY_PUSH` - Pop value and array, append to array
- `ARRAY_LEN` - Pop array, push length

### Dicts
- `MAKE_DICT #N` - Pop N key-value pairs, push dict
- `DICT_GET` - Pop key and dict, push value (or null)
- `DICT_SET` - Pop value, key, dict; mutate dict
- `DICT_HAS` - Pop key and dict, push boolean

### Unified Access
- `DOT_GET` - Pop index/key and array/dict, push value (null if missing)

### Strings
- `STR_CONCAT #N` - Pop N values, convert to strings, concatenate, push result

### Exceptions
- `PUSH_TRY .catch` - Register exception handler
- `PUSH_FINALLY .finally` - Add finally to current handler
- `POP_TRY` - Remove handler (try succeeded)
- `THROW` - Throw exception (pops error value)

## Compiler Patterns

### Function Definitions

When defining functions, you must prevent the PC from "falling through" into the function body during sequential execution. There are two standard patterns:

**Pattern 1: JUMP over function bodies (Recommended)**
```
MAKE_FUNCTION (params) .body
STORE function_name
JUMP .end  ; Skip over function body
.body:
<function code>
RETURN
.end:
<continue with program>
```

**Pattern 2: Function bodies after HALT**
```
MAKE_FUNCTION (params) .body
STORE function_name
<use the function>
HALT  ; Stop execution before function bodies
.body:
<function code>
RETURN
```

**Important**: Pattern 2 only works if you HALT before reaching function bodies. Pattern 1 is more flexible and required for:
- Defining multiple functions before using them
- REPL mode (incremental execution)
- Any case where execution continues after defining a function

**Why?** `MAKE_FUNCTION` creates a function value but doesn't jump to the body; it only stores the body's address. Without JUMP or HALT, the PC increments into the function body and executes it as top-level code.

### If-Else
```
<condition>
JUMP_IF_FALSE .else
<then-block>
JUMP .end
.else:
<else-block>
.end:
```

### While Loop
```
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
JUMP .loop
.end:
```
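
A compiler that emits the if-else and while templates above more than once needs a distinct label name for each use, otherwise nested or repeated constructs would define `.end` twice. The sketch below shows one way to do that in the array format. The helper names (`freshLabel`, `emitIfElse`, `emitWhile`) and the `Instr` type are hypothetical, and it assumes label names may contain digits and underscores.

```typescript
// Hypothetical instruction tuple for this sketch (mirrors the array format).
type Instr = [string, ...(string | number | boolean | null)[]]

let labelCounter = 0

// Returns a label reference such as ".else_0"; append ":" to define it.
function freshLabel(prefix: string): string {
  return `.${prefix}_${labelCounter++}`
}

function emitIfElse(cond: Instr[], thenBlock: Instr[], elseBlock: Instr[]): Instr[] {
  const elseLabel = freshLabel("else")
  const endLabel = freshLabel("end")
  return [
    ...cond,
    ["JUMP_IF_FALSE", elseLabel],
    ...thenBlock,
    ["JUMP", endLabel],
    [`${elseLabel}:`],
    ...elseBlock,
    [`${endLabel}:`],
  ]
}

function emitWhile(cond: Instr[], body: Instr[]): Instr[] {
  const loopLabel = freshLabel("loop")
  const endLabel = freshLabel("end")
  return [
    [`${loopLabel}:`],
    ...cond,
    ["JUMP_IF_FALSE", endLabel],
    ...body,
    ["JUMP", loopLabel],
    [`${endLabel}:`],
  ]
}
```

The returned arrays can be concatenated with the rest of a program before it is passed to `toBytecode()`.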

### For Loop
```
<init>
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
<increment>
JUMP .loop
.end:
```

### Continue
There is no CONTINUE opcode. Jump backward to the loop start instead:
```
.loop:
<condition>
JUMP_IF_FALSE .end
<early-check>
JUMP_IF_TRUE .loop  ; continue
<body>
JUMP .loop
.end:
```

### Break in Loop
Mark the iterator function as the break target, then use the BREAK opcode:
```
MAKE_FUNCTION () .each_body
STORE each
LOAD collection
LOAD each
<call-iterator-with-break-semantics>
HALT

.each_body:
<condition>
JUMP_IF_TRUE .done
<body>
BREAK  ; exits to caller
.done:
RETURN
```

### Short-Circuit AND
```
<left>
DUP
JUMP_IF_FALSE .end  ; Short-circuit if false
POP
<right>
.end:               ; Result on stack
```

### Short-Circuit OR
```
<left>
DUP
JUMP_IF_TRUE .end  ; Short-circuit if true
POP
<right>
.end:              ; Result on stack
```

### Reversing Operand Order
Use SWAP to reverse operand order for non-commutative operations:

```
; Compute 10 / 2 when values are in reverse order
PUSH 2
PUSH 10
SWAP  ; Now: [10, 2]
DIV   ; 10 / 2 = 5
```

```
; Compute "hello" - "world" (subtraction with strings coerced to numbers)
PUSH "world"
PUSH "hello"
SWAP  ; Now: ["hello", "world"]
SUB   ; "hello" - "world"; non-numeric strings coerce to 0, so this yields 0
```

**Common Use Cases**:
- Division and subtraction when operands are in wrong order
- String concatenation with specific order
- Preparing arguments for functions that care about position

### Bitwise Operations
All bitwise operations work with 32-bit signed integers:

```
; Bitwise AND (masking)
PUSH 5
PUSH 3
BIT_AND   ; → 1 (0101 & 0011 = 0001)

; Bitwise OR (combining flags)
PUSH 5
PUSH 3
BIT_OR    ; → 7 (0101 | 0011 = 0111)

; Bitwise XOR (toggling bits)
PUSH 5
PUSH 3
BIT_XOR   ; → 6 (0101 ^ 0011 = 0110)

; Left shift (multiply by power of 2)
PUSH 5
PUSH 2
BIT_SHL   ; → 20 (5 << 2 = 5 * 4)

; Arithmetic right shift (divide by power of 2, preserves sign)
PUSH 20
PUSH 2
BIT_SHR   ; → 5 (20 >> 2 = 20 / 4)

PUSH -20
PUSH 2
BIT_SHR   ; → -5 (sign preserved)

; Logical right shift (zero-fill)
PUSH -1
PUSH 1
BIT_USHR  ; → 2147483647 (unsigned shift)
```

**Common Use Cases**:
- Flags and bit masks: `BIT_AND` with a mask to test, `BIT_OR` with a flag to set
- Fast multiplication/division by powers of 2
- Color manipulation: extract RGB components
- Low-level bit manipulation for protocols or file formats
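
The flag and color use cases above can be tried directly in TypeScript, whose bitwise operators also work on 32-bit signed integers. This sketch assumes the Reef opcodes behave like the corresponding JavaScript operators (`&`, `|`, `>>`, `>>>`), which matches the results listed in the examples; the constant names are made up for illustration.

```typescript
// Hypothetical permission flags, one bit each.
const READ = 1 << 0   // 0b01
const WRITE = 1 << 1  // 0b10

let flags = 0
flags = flags | WRITE                   // set a flag (BIT_OR)
const canWrite = (flags & WRITE) !== 0  // test a flag (BIT_AND) → true
const canRead = (flags & READ) !== 0    // → false

// Extract 8-bit RGB components from a packed 0xRRGGBB color.
const color = 0x336699
const red = (color >> 16) & 0xff   // 0x33 = 51
const green = (color >> 8) & 0xff  // 0x66 = 102
const blue = color & 0xff          // 0x99 = 153

// Sign-preserving vs zero-fill right shift, as in the table above.
const arithmetic = -20 >> 2  // -5
const logical = -1 >>> 1     // 2147483647
```

In bytecode, each TypeScript operator corresponds to pushing the two operands in the same left-to-right order and then emitting the matching `BIT_*` opcode.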

### Runtime Type Checking (TYPE)
Get the type of a value as a string for runtime introspection:

```
; Basic type check
PUSH 42
TYPE  ; → "number"

PUSH "hello"
TYPE  ; → "string"

MAKE_ARRAY #3
TYPE  ; → "array"
```

**Type Guard Pattern** (check type before operation):
```
; Safe addition - only add if both are numbers
LOAD x
DUP
TYPE
PUSH "number"
EQ
JUMP_IF_FALSE .not_number

LOAD y
DUP
TYPE
PUSH "number"
EQ
JUMP_IF_FALSE .cleanup_not_number

ADD  ; Safe to add
JUMP .end

.cleanup_not_number:
POP  ; Remove y
.not_number:
POP  ; Remove x
PUSH null
.end:
```

**Common Use Cases**:
- Type validation before operations
- Polymorphic functions that handle multiple types
- Debugging and introspection
- Dynamic dispatch in DSLs
- Safe coercion with fallbacks

### Try-Catch
```
PUSH_TRY .catch
<try-block>
POP_TRY
JUMP .end
.catch:
STORE err
<catch-block>
.end:
```

### Try-Catch-Finally
```
PUSH_TRY .catch
PUSH_FINALLY .finally
<try-block>
POP_TRY
JUMP .finally    ; Compiler must generate this
.catch:
STORE err
<catch-block>
JUMP .finally    ; And this
.finally:
<finally-block>  ; Executes in both paths
.end:
```

**Important**: The VM only jumps to finally automatically on THROW. On the successful try path and at the end of the catch block, the compiler must emit an explicit JUMP to finally.

### Closures
Functions automatically capture current scope:
```
PUSH 0
STORE counter
MAKE_FUNCTION () .increment
STORE increment_fn
JUMP .main

.increment:
LOAD counter  ; Captured variable
PUSH 1
ADD
STORE counter
LOAD counter
RETURN

.main:
LOAD increment_fn
PUSH 0
PUSH 0
CALL  ; Returns 1
POP
LOAD increment_fn
PUSH 0
PUSH 0
CALL  ; Returns 2 (counter persists!)
HALT
```

### Tail Recursion
Use TAIL_CALL instead of CALL for last call:
```
MAKE_FUNCTION (n acc) .factorial
STORE factorial
JUMP .main

.factorial:
LOAD n
PUSH 0
LTE
JUMP_IF_FALSE .recurse
LOAD acc
RETURN
.recurse:
LOAD factorial
LOAD n
PUSH 1
SUB
LOAD n
LOAD acc
MUL
PUSH 2
PUSH 0
TAIL_CALL  ; Reuses stack frame

.main:
LOAD factorial
PUSH 5
PUSH 1
PUSH 2
PUSH 0
CALL  ; factorial(5, 1) = 120
HALT
```

### Optional Function Calls (TRY_CALL)
Call function if defined, otherwise use value or name as string:
```
; Define optional hook
MAKE_FUNCTION () .onInit
STORE onInit

; Later: call if defined, skip if not
TRY_CALL onInit   ; Calls onInit() if it's a function
                  ; Pushes value if it exists but isn't a function
                  ; Pushes "onInit" as string if undefined

; Use with values
PUSH 42
STORE answer
TRY_CALL answer   ; Pushes 42 (not a function)

; Use with undefined
TRY_CALL unknown  ; Pushes "unknown" as string
```

**Use Cases**:
- Optional hooks/callbacks in DSLs
- Shell-like languages where unknown identifiers become strings
- Templating systems with optional transformers

### String Concatenation
Build strings from multiple values:
```
; Simple concatenation
PUSH "Hello"
PUSH " "
PUSH "World"
STR_CONCAT #3  ; → "Hello World"

; With variables
PUSH "Name: "
LOAD userName
STR_CONCAT #2  ; → "Name: Alice"

; With expressions and type coercion
PUSH "Result: "
PUSH 10
PUSH 5
ADD
STR_CONCAT #2  ; → "Result: 15"

; Template-like interpolation
PUSH "User "
LOAD userId
PUSH " has "
LOAD count
PUSH " items"
STR_CONCAT #5  ; → "User 42 has 3 items"
```

**Composability**: Results can be concatenated again
```
PUSH "Hello"
PUSH " "
PUSH "World"
STR_CONCAT #3
PUSH "!"
STR_CONCAT #2  ; → "Hello World!"
```

### Unified Access (DOT_GET)
DOT_GET provides a single opcode for accessing both arrays and dicts:

```
; Array access
PUSH 10
PUSH 20
PUSH 30
MAKE_ARRAY #3
PUSH 1
DOT_GET  ; → 20

; Dict access
PUSH 'name'
PUSH 'Alice'
MAKE_DICT #1
PUSH 'name'
DOT_GET  ; → 'Alice'
```

**Chained access**:
```
; Access dict['users'][0]['name']
LOAD dict
PUSH 'users'
DOT_GET  ; Get users array
PUSH 0
DOT_GET  ; Get first user
PUSH 'name'
DOT_GET  ; Get name field
```

**With variables**:
```
LOAD data
LOAD key  ; Key can be string or number
DOT_GET   ; Works for both array and dict
```

**Null safety**: Returns null for missing keys or out-of-bounds indices
```
MAKE_ARRAY #0
PUSH 0
DOT_GET  ; → null (empty array)

MAKE_DICT #0
PUSH 'key'
DOT_GET  ; → null (missing key)
```
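
The lookup rules above can be modeled in a few lines of TypeScript. This is a behavioral sketch only: `dotGet` is a hypothetical helper that uses plain JavaScript arrays and objects in place of ReefVM values, and how the VM treats cases not covered above (for example a string key on an array) is an assumption here.

```typescript
// Model of DOT_GET on plain JS data: arrays are indexed by number,
// dicts by key, and anything missing yields null instead of throwing.
function dotGet(
  target: unknown[] | Record<string, unknown>,
  key: string | number
): unknown {
  if (Array.isArray(target)) {
    // Assumption: non-numeric keys on arrays are treated as missing.
    const item = typeof key === "number" ? target[key] : undefined
    return item === undefined ? null : item
  }
  const name = String(key)
  return Object.prototype.hasOwnProperty.call(target, name) ? target[name] : null
}

const second = dotGet([10, 20, 30], 1)          // 20
const name = dotGet({ name: "Alice" }, "name")  // "Alice"
const outOfBounds = dotGet([], 0)               // null
const missingKey = dotGet({}, "key")            // null
```

Returning `null` rather than throwing is what lets chained access be compiled as a plain sequence of `PUSH` and `DOT_GET` instructions without a guard between steps, provided every intermediate value is present.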

## Key Concepts

### Truthiness
Only `null` and `false` are falsy. Everything else (including `0`, `""`, empty arrays/dicts) is truthy.

### Type Coercion

**toNumber**:
- `number` → identity
- `string` → parseFloat (or 0 if invalid)
- `boolean` → 1 (true) or 0 (false)
- `null` → 0
- Others → 0

**toString**:
- `string` → identity
- `number` → string representation
- `boolean` → "true" or "false"
- `null` → "null"
- `function` → "<function>"
- `array` → "[item, item]"
- `dict` → "{key: value, ...}"

**Arithmetic ops** (ADD, SUB, MUL, DIV, MOD) coerce both operands to numbers.

**Bitwise ops** (BIT_AND, BIT_OR, BIT_XOR, BIT_SHL, BIT_SHR, BIT_USHR) coerce both operands to 32-bit signed integers.

**Comparison ops** (LT, GT, LTE, GTE) coerce both operands to numbers.

**Equality ops** (EQ, NEQ) use type-aware comparison with deep equality for arrays/dicts.

**Note**: ADD never concatenates strings; it coerces both operands to numbers. Use `STR_CONCAT` to build strings.
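
The truthiness and `toNumber` rules above can be written out as a short sketch. The `Value` union below is a simplified stand-in for ReefVM's value type (it omits dicts and functions), and `isTruthy` and `toNumber` here are illustrative reimplementations of the documented rules, not the library's own code.

```typescript
// Simplified, hypothetical value type for this sketch.
type Value =
  | { type: "number"; value: number }
  | { type: "string"; value: string }
  | { type: "boolean"; value: boolean }
  | { type: "null"; value: null }
  | { type: "array"; value: Value[] }

// Only null and false are falsy; 0, "" and empty arrays are truthy.
function isTruthy(v: Value): boolean {
  return !(v.type === "null" || (v.type === "boolean" && v.value === false))
}

function toNumber(v: Value): number {
  switch (v.type) {
    case "number":
      return v.value
    case "string": {
      const parsed = parseFloat(v.value)
      return Number.isNaN(parsed) ? 0 : parsed  // invalid strings become 0
    }
    case "boolean":
      return v.value ? 1 : 0
    default:
      return 0  // null and every other type
  }
}
```

Under these rules `ADD` applied to `"2"` and `"3"` produces the number 5, not the string `"23"`.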

### Scope
- Variables resolved through parent scope chain
- STORE updates existing variable or creates in current scope
- Functions capture scope at definition time

### Identifiers
Variable and function parameter names support Unicode and emoji:
- Valid: `💎`, `🌟`, `変数`, `counter`, `_private`
- Invalid: Cannot start with digits, `.`, `#`, `@`, or `...`
- Invalid: Cannot contain whitespace or special chars: `;`, `()`, `[]`, `{}`, `=`, `'`, `"`
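
A compiler front end may want to reject bad names before generating bytecode. The check below is a minimal sketch that encodes only the rules listed above; `isValidIdentifier` is a hypothetical helper, and the actual parser may apply further restrictions.

```typescript
// Encodes the two "Invalid" rules: forbidden leading characters and
// forbidden characters anywhere in the name.
function isValidIdentifier(name: string): boolean {
  if (name.length === 0) return false
  // Leading digit, ".", "#" or "@" (a leading "..." is covered by ".").
  if (/^[0-9.#@]/.test(name)) return false
  // Whitespace or any of ; ( ) [ ] { } = ' "
  return !/[\s;()\[\]{}='"]/.test(name)
}
```

For example, `isValidIdentifier("変数")` passes while `isValidIdentifier("x=1")` and `isValidIdentifier("...rest")` are rejected.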

### Break Semantics
- CALL marks current frame as break target
- BREAK unwinds call stack to that target
- Used for Ruby-style iterator pattern

### Parameter Binding Priority
For function calls, each parameter is bound from the first available source, in this order:
1. Named argument (if provided and matches param name)
2. Positional argument (if provided)
3. Default value (if defined)
4. Null

**Null Triggering Defaults**: Passing `null` as an argument (positional or named) triggers the default value if one exists. This allows callers to explicitly "opt-in" to defaults:
```
; Function with defaults: greet(name='Guest', greeting='Hello')
LOAD greet
PUSH null  ; Triggers default: name='Guest'
PUSH 'Hi'  ; Provided: greeting='Hi'
PUSH 2
PUSH 0
CALL       ; Returns "Hi, Guest"
```

This works for both ReefVM functions and native TypeScript functions. If no default exists, `null` is bound as-is.
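
The binding priority and the null-triggers-default rule can be sketched as one small function. Everything here is hypothetical (`Param`, `bindParams`), arguments are plain JavaScript values rather than ReefVM values, and positional arguments are matched simply by parameter index; how the VM aligns positionals when some parameters are bound by name is not covered above, so treat that part as an assumption.

```typescript
// Hypothetical parameter description: a name and an optional default.
type Param = { name: string; defaultValue?: unknown }

function bindParams(
  params: Param[],
  positional: unknown[],
  named: Record<string, unknown>
): Record<string, unknown> {
  const bound: Record<string, unknown> = {}
  params.forEach((param, index) => {
    // 1. Named argument, else 2. positional argument at the same index.
    const hasNamed = Object.prototype.hasOwnProperty.call(named, param.name)
    let value: unknown = hasNamed ? named[param.name] : positional[index]
    // 3. A missing or null argument falls back to the default, else 4. null.
    if (value === null || value === undefined) {
      value = param.defaultValue !== undefined ? param.defaultValue : null
    }
    bound[param.name] = value
  })
  return bound
}

const greetParams: Param[] = [
  { name: "name", defaultValue: "Guest" },
  { name: "greeting", defaultValue: "Hello" },
]
const viaNull = bindParams(greetParams, [null, "Hi"], {})
// viaNull is { name: "Guest", greeting: "Hi" }
```

The `viaNull` call mirrors the bytecode example above: the explicit `null` in the first position selects the default for `name`.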

### Exception Handlers
- PUSH_TRY uses absolute addresses for catch blocks
- Nested try blocks form a stack
- THROW unwinds to most recent handler and jumps to finally (if present) or catch
- VM does NOT automatically jump to finally on success - compiler must generate JUMPs
- Finally execution in all cases is compiler's responsibility, not VM's

### Calling Convention
All calls (including native functions) push arguments in order:
1. Function
2. Positional args (in order)
3. Named args (key1, val1, key2, val2, ...)
4. Positional count (as number)
5. Named count (as number)
6. CALL or TAIL_CALL

Native functions use the same calling convention as Reef functions. They are registered into scope and called via LOAD + CALL.
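
The six steps above map directly onto a small code generator for the array format. This is a sketch under simplifying assumptions: `emitCall` and its types are hypothetical, and every argument is a constant pushed with `PUSH`, whereas a real compiler would emit arbitrary expression code for each argument.

```typescript
type Atom = number | string | boolean | null
// Hypothetical instruction tuple mirroring the array format.
type Instr = [string, ...Atom[]]

function emitCall(
  fn: string,
  positional: Atom[],
  named: Record<string, Atom> = {},
  tail = false
): Instr[] {
  const code: Instr[] = [["LOAD", fn]]                     // 1. function
  for (const arg of positional) code.push(["PUSH", arg])   // 2. positional args
  const keys = Object.keys(named)
  for (const key of keys) {
    code.push(["PUSH", key])                               // 3. named key...
    code.push(["PUSH", named[key] ?? null])                //    ...then its value
  }
  code.push(["PUSH", positional.length])                   // 4. positional count
  code.push(["PUSH", keys.length])                         // 5. named count
  code.push([tail ? "TAIL_CALL" : "CALL"])                 // 6. call opcode
  return code
}

const callCode = emitCall("configure", ["myApp"], { debug: true, port: 8080 })
```

`callCode` holds nine instructions: one `LOAD`, one positional `PUSH`, four `PUSH`es for the two key-value pairs, the two counts (1 and 2), and `CALL`.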

### Registering Native Functions

Native TypeScript functions are registered into the VM's scope and accessed like regular variables.

**Method 1**: Pass to `run()` or `VM` constructor
```typescript
const result = await run(bytecode, {
  add: (a: number, b: number) => a + b,
  greet: (name: string) => `Hello, ${name}!`
})

// Or with VM
const vm = new VM(bytecode, { add, greet })
```

**Method 2**: Register after construction
```typescript
const vm = new VM(bytecode)
vm.set('add', (a: number, b: number) => a + b)
await vm.run()
```

**Method 3**: Value-based functions (for full control)
```typescript
vm.setValueFunction('customOp', (a: Value, b: Value): Value => {
  return { type: 'number', value: toNumber(a) + toNumber(b) }
})
```

**Auto-wrapping**: `vm.set()` automatically converts between native TypeScript types and ReefVM Value types. Both sync and async functions work.

**Usage in bytecode**:
```
; Positional arguments
LOAD add  ; Load native function from scope
PUSH 5
PUSH 10
PUSH 2    ; positionalCount
PUSH 0    ; namedCount
CALL      ; Call like any other function

; Named arguments (assumes a native greet(name, greeting) is registered)
LOAD greet
PUSH "name"
PUSH "Alice"
PUSH "greeting"
PUSH "Hi"
PUSH 0    ; positionalCount
PUSH 2    ; namedCount
CALL      ; → "Hi, Alice!"
```

**Named Arguments**: Native functions support named arguments. Parameter names are extracted from the function signature at call time, and arguments are bound using the same priority as Reef functions (named arg > positional arg > default > null).

**@named Pattern**: Parameters starting with `at` followed by an uppercase letter (e.g., `atOptions`, `atNamed`) collect unmatched named arguments:

```typescript
// Basic @named - collects all named args
vm.set('greet', (atNamed: any = {}) => {
  return `Hello, ${atNamed.name || 'World'}!`
})

// Mixed positional and @named
vm.set('configure', (name: string, atOptions: any = {}) => {
  return {
    name,
    debug: atOptions.debug || false,
    port: atOptions.port || 3000
  }
})
```

Bytecode example:
```
; Call with mixed positional and named args
LOAD configure
PUSH "myApp"  ; positional arg → name
PUSH "debug"
PUSH true
PUSH "port"
PUSH 8080
PUSH 1        ; 1 positional arg
PUSH 2        ; 2 named args (debug, port)
CALL          ; atOptions receives {debug: true, port: 8080}
```

Named arguments that match fixed parameter names are bound to those parameters. Remaining unmatched named arguments are collected into the `atXxx` parameter as a plain JavaScript object.
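
The split between fixed parameters and the `atXxx` collector can be sketched as follows. Both helpers are hypothetical and operate on a list of parameter names that is assumed to have been extracted already; how ReefVM reads names from a function signature is not shown here.

```typescript
// "at" followed by an uppercase letter marks a collector parameter.
function isNamedCollector(param: string): boolean {
  return /^at[A-Z]/.test(param)
}

// Route each named argument either to the fixed parameter of the same
// name or into the object handed to the collector parameter.
function splitNamedArgs(params: string[], named: Record<string, unknown>) {
  const fixed: Record<string, unknown> = {}
  const collected: Record<string, unknown> = {}
  for (const key of Object.keys(named)) {
    const matchesFixed = params.indexOf(key) !== -1 && !isNamedCollector(key)
    if (matchesFixed) fixed[key] = named[key]
    else collected[key] = named[key]
  }
  return { fixed, collected }
}

const split = splitNamedArgs(["name", "atOptions"], {
  name: "myApp",
  debug: true,
  port: 8080,
})
// split.fixed is { name: "myApp" }; split.collected is { debug: true, port: 8080 }
```

Note that the prefix test is case-sensitive on the third character, so a parameter called `attach` is an ordinary parameter while `atTach` would be treated as a collector.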

### Calling Functions from TypeScript

You can call both Reef and native functions from TypeScript using `vm.call()`:

```typescript
const bytecode = toBytecode(`
  MAKE_FUNCTION (name greeting="Hello") .greet
  STORE greet
  HALT

  .greet:
  LOAD greeting
  PUSH " "
  LOAD name
  PUSH "!"
  STR_CONCAT #4
  RETURN
`)

const vm = new VM(bytecode, {
  log: (msg: string) => console.log(msg) // Native function
})
await vm.run()

// Call Reef function with positional arguments
const result1 = await vm.call('greet', 'Alice')
// Returns: "Hello Alice!"

// Call Reef function with named arguments (pass as final object)
const result2 = await vm.call('greet', 'Bob', { greeting: 'Hi' })
// Returns: "Hi Bob!"

// Call Reef function with only named arguments
const result3 = await vm.call('greet', { name: 'Carol', greeting: 'Hey' })
// Returns: "Hey Carol!"

// Call native function
await vm.call('log', 'Hello from TypeScript!')
```

**How it works**:
- `vm.call(functionName, ...args)` looks up the function (Reef or native) in the VM's scope
- For Reef functions: converts to callable JavaScript function
- For native functions: calls directly
- Arguments are automatically converted to ReefVM Values
- Returns the result (automatically converted back to JavaScript types)

**Named arguments**: Pass a plain object as the final argument to provide named arguments. If the last argument is a non-array object, it's treated as named arguments. All preceding arguments are treated as positional.
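
The last-argument rule described above amounts to a small check on the argument list. The helper below is a hypothetical sketch of that rule, not the code `vm.call()` actually runs.

```typescript
type NamedArgs = Record<string, unknown>

// A trailing non-null, non-array object is taken as the named arguments;
// everything before it is positional.
function splitCallArgs(args: unknown[]): { positional: unknown[]; named: NamedArgs } {
  const last = args[args.length - 1]
  const isNamedObject =
    typeof last === "object" && last !== null && !Array.isArray(last)
  return isNamedObject
    ? { positional: args.slice(0, -1), named: last as NamedArgs }
    : { positional: args, named: {} }
}

const withNamed = splitCallArgs(["Bob", { greeting: "Hi" }])
// withNamed.positional is ["Bob"], withNamed.named is { greeting: "Hi" }
const arrayLast = splitCallArgs([[1, 2, 3]])
// arrayLast.positional is [[1, 2, 3]], arrayLast.named is {}
```

One consequence of this rule is that a plain object in the final position is never treated as a positional value, which is worth keeping in mind when the last positional argument is itself meant to be a dict.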

**Type conversion**: Arguments and return values are automatically converted between JavaScript types and ReefVM Values:
- Primitives: `number`, `string`, `boolean`, `null`
- Arrays: converted recursively
- Objects: converted to ReefVM dicts
- Functions: Reef functions are converted to callable JavaScript functions

### REPL Mode (Incremental Compilation)

ReefVM supports incremental bytecode execution for building REPLs. This allows you to execute code line-by-line while preserving scope and avoiding re-execution of side effects.

**The Problem**: By default, `vm.run()` resets the program counter (PC) to 0, re-executing all previous bytecode. This makes it impossible to implement a REPL where each line executes only once.

**The Solution**: Use `vm.continue()` to resume execution from where you left off:

```typescript
// Line 1: Define variable
const line1 = toBytecode([
  ["PUSH", 42],
  ["STORE", "x"]
])

const vm = new VM(line1)
await vm.run() // Execute first line

// Line 2: Use the variable
const line2 = toBytecode([
  ["LOAD", "x"],
  ["PUSH", 10],
  ["ADD"]
])

vm.appendBytecode(line2) // Append new bytecode with proper constant remapping
await vm.continue()      // Execute ONLY the new bytecode

// Result: 52 (42 + 10)
// The first line never re-executed!
```

**Key methods**:
- `vm.run()`: Resets PC to 0 and runs from the beginning (normal execution)
- `vm.continue()`: Continues from current PC (REPL mode)
- `vm.appendBytecode(bytecode)`: Helper that properly appends bytecode with constant index remapping

**Important**: Don't use `HALT` in REPL mode! The VM naturally stops when it runs out of instructions. Using `HALT` sets `vm.stopped = true`, which prevents `continue()` from resuming.

**Example REPL pattern**:
```typescript
const vm = new VM(toBytecode([]), { /* native functions */ })

while (true) {
  const input = await getUserInput()  // Get next line from user
  const bytecode = compileLine(input) // Compile to bytecode (no HALT!)

  vm.appendBytecode(bytecode)         // Append to VM
  const result = await vm.continue()  // Execute only the new code

  console.log(fromValue(result))      // Show result to user
}
```

This pattern ensures:
- Variables persist between lines
- Side effects (like `echo` or function calls) only run once
- Previous bytecode never re-executes
- Scope accumulates across all lines

### Empty Stack
- RETURN with empty stack returns null
- HALT with empty stack returns null