Reef Compiler Guide
Quick reference for compiling to Reef bytecode.
Bytecode Formats
ReefVM supports two bytecode formats:
- String format: Human-readable text with opcodes and operands
- Array format: TypeScript arrays with typed tuples for programmatic generation
Both formats are compiled using the same toBytecode() function.
Bytecode Syntax
Instructions
OPCODE operand ; comment
Operand Types
Immediate numbers (#N): Counts, relative offsets, or absolute instruction indices
- MAKE_ARRAY #3 - count of 3 items
- JUMP #5 - relative offset of 5 instructions (prefer labels)
- PUSH_TRY #10 - absolute instruction index (prefer labels)
Labels (.name): Symbolic addresses resolved at parse time
- .label: - define label at current position
- JUMP .loop - jump to label
- MAKE_FUNCTION (x) .body - function body at label
Variable names: Plain identifiers (Unicode and emoji are supported)
- LOAD counter - load variable
- STORE result - store variable
- LOAD 💎 - load emoji variable
- STORE 変数 - store Unicode variable
Constants: Literals added to constants pool
- Numbers: PUSH 42, PUSH 3.14
- Strings: PUSH "hello" or PUSH 'world'
- Booleans: PUSH true, PUSH false
- Null: PUSH null
Native function names: Registered TypeScript functions
CALL_NATIVE print
Array Format
The programmatic array format uses TypeScript tuples for type safety:
import { toBytecode, run } from "#reef"
const bytecode = toBytecode([
["PUSH", 42], // Atom values: number | string | boolean | null
["STORE", "x"], // Variable names as strings
["LOAD", "x"],
["HALT"]
])
const result = await run(bytecode)
Operand Types in Array Format
Atoms (number | string | boolean | null): Constants for PUSH
["PUSH", 42]
["PUSH", "hello"]
["PUSH", true]
["PUSH", null]
Variable names: String identifiers
["LOAD", "counter"]
["STORE", "result"]
Label definitions: Single-element arrays whose string starts with . and ends with :
[".loop:"]
[".end:"]
[".function_body:"]
Label references: Strings in jump/function instructions
["JUMP", ".loop"]
["JUMP_IF_FALSE", ".end"]
["MAKE_FUNCTION", ["x", "y"], ".body"]
["PUSH_TRY", ".catch"]
Counts: Numbers for array/dict construction
["MAKE_ARRAY", 3] // Pop 3 items
["MAKE_DICT", 2] // Pop 2 key-value pairs
Native function names: Strings for registered functions
["CALL_NATIVE", "print"]
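The label definitions and references above can be illustrated with a small sketch that maps each label to the instruction that follows it. This is a hypothetical helper, not ReefVM's resolver: `Instruction` and `collectLabels` are names invented here, and the sketch assumes label definitions do not occupy an instruction slot.

```typescript
// Hypothetical types and helper for illustration only, not a ReefVM API.
type Instruction = [string, ...unknown[]]

// Map each label (".loop") to the index of the next real instruction.
// Assumption: a label definition does not count as an instruction.
function collectLabels(program: Instruction[]): Map<string, number> {
  const labels = new Map<string, number>()
  let index = 0 // index of the next real instruction
  for (const instr of program) {
    const head = instr[0]
    const isLabelDef =
      instr.length === 1 && head.startsWith(".") && head.endsWith(":")
    if (isLabelDef) {
      labels.set(head.slice(0, -1), index) // ".loop:" -> ".loop"
    } else {
      index++
    }
  }
  return labels
}

const labels = collectLabels([
  [".loop:"],
  ["LOAD", "x"],
  ["JUMP_IF_FALSE", ".end"],
  ["JUMP", ".loop"],
  [".end:"],
  ["HALT"],
])
// labels: ".loop" -> 0, ".end" -> 3
```

A compiler normally does not need this, since toBytecode() resolves labels itself; the sketch only shows what "symbolic address" means.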
Functions in Array Format
// Basic function
["MAKE_FUNCTION", ["x", "y"], ".body"]
// With defaults
["MAKE_FUNCTION", ["x", "y=10"], ".body"]
// Variadic
["MAKE_FUNCTION", ["...args"], ".body"]
// Named args
["MAKE_FUNCTION", ["@opts"], ".body"]
// Mixed
["MAKE_FUNCTION", ["x", "y=5", "...rest", "@opts"], ".body"]
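A compiler front end may want to classify the parameter strings shown above before emitting MAKE_FUNCTION. The following is a minimal sketch: `ParamSpec` and `parseParam` are hypothetical names, and the default is kept as raw source text because this guide does not say how ReefVM parses it.

```typescript
// Hypothetical classification of MAKE_FUNCTION parameter strings.
type ParamSpec =
  | { kind: "positional"; name: string; defaultSource?: string }
  | { kind: "variadic"; name: string }
  | { kind: "named"; name: string }

function parseParam(spec: string): ParamSpec {
  if (spec.startsWith("...")) return { kind: "variadic", name: spec.slice(3) }
  if (spec.startsWith("@")) return { kind: "named", name: spec.slice(1) }
  const eq = spec.indexOf("=")
  if (eq === -1) return { kind: "positional", name: spec }
  // Keep the default as unparsed text; interpreting it is left to the VM.
  return {
    kind: "positional",
    name: spec.slice(0, eq),
    defaultSource: spec.slice(eq + 1),
  }
}

const parsed = ["x", "y=5", "...rest", "@opts"].map(parseParam)
// parsed[1] is { kind: "positional", name: "y", defaultSource: "5" }
```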
Complete Example
const factorial = toBytecode([
["MAKE_FUNCTION", ["n", "acc=1"], ".fact"],
["STORE", "factorial"],
["JUMP", ".main"],
[".fact:"],
["LOAD", "n"],
["PUSH", 0],
["LTE"],
["JUMP_IF_FALSE", ".recurse"],
["LOAD", "acc"],
["RETURN"],
[".recurse:"],
["LOAD", "factorial"],
["LOAD", "n"],
["PUSH", 1],
["SUB"],
["LOAD", "n"],
["LOAD", "acc"],
["MUL"],
["PUSH", 2],
["PUSH", 0],
["TAIL_CALL"],
[".main:"],
["LOAD", "factorial"],
["PUSH", 5],
["PUSH", 1],
["PUSH", 0],
["CALL"],
["HALT"]
])
const result = await run(factorial) // { type: "number", value: 120 }
String Format
Functions
MAKE_FUNCTION (x y) .body ; Basic
MAKE_FUNCTION (x=10 y=20) .body ; Defaults
MAKE_FUNCTION (x ...rest) .body ; Variadic
MAKE_FUNCTION (x @named) .body ; Named args
MAKE_FUNCTION (x ...rest @named) .body ; Both
Function Calls
Stack order (bottom to top):
LOAD fn
PUSH arg1 ; Positional args
PUSH arg2
PUSH "name" ; Named arg key
PUSH "value" ; Named arg value
PUSH 2 ; Positional count
PUSH 1 ; Named count
CALL
Opcodes
Stack
- PUSH <const> - Push constant
- POP - Remove top
- DUP - Duplicate top
Variables
- LOAD <name> - Push variable value
- STORE <name> - Pop and store in variable
Arithmetic
- ADD, SUB, MUL, DIV, MOD - Binary ops (pop 2, push result)
Comparison
- EQ, NEQ, LT, GT, LTE, GTE - Pop 2, push boolean
Logic
- NOT - Pop 1, push !value
Control Flow
- JUMP .label - Unconditional jump
- JUMP_IF_FALSE .label - Jump if top is false or null (pops value)
- JUMP_IF_TRUE .label - Jump if top is truthy (pops value)
- HALT - Stop execution of the program
Functions
- MAKE_FUNCTION (params) .body - Create function, push to stack
- CALL - Call function (see calling convention above)
- TAIL_CALL - Tail-recursive call (no stack growth)
- RETURN - Return from function (pops return value)
- BREAK - Exit iterator/loop (unwinds to break target)
Arrays
- MAKE_ARRAY #N - Pop N items, push array
- ARRAY_GET - Pop index and array, push element
- ARRAY_SET - Pop value, index, array; mutate array
- ARRAY_PUSH - Pop value and array, append to array
- ARRAY_LEN - Pop array, push length
Dicts
- MAKE_DICT #N - Pop N key-value pairs, push dict
- DICT_GET - Pop key and dict, push value (or null)
- DICT_SET - Pop value, key, dict; mutate dict
- DICT_HAS - Pop key and dict, push boolean
Exceptions
- PUSH_TRY .catch - Register exception handler
- PUSH_FINALLY .finally - Add finally to current handler
- POP_TRY - Remove handler (try succeeded)
- THROW - Throw exception (pops error value)
Native
- CALL_NATIVE <name> - Call registered TypeScript function (consumes entire stack as args)
Compiler Patterns
If-Else
<condition>
JUMP_IF_FALSE .else
<then-block>
JUMP .end
.else:
<else-block>
.end:
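The if-else pattern above can be generated in array format with a small emitter. This is a sketch under assumptions: `freshLabel` and `emitIfElse` are hypothetical helpers, and the numbered label names are just one way to keep nested conditionals from colliding.

```typescript
// Hypothetical compiler helpers, not part of ReefVM.
type Instruction = [string, ...unknown[]]

let labelCounter = 0
// Returns a label name that is unique within this compilation.
function freshLabel(prefix: string): string {
  return `.${prefix}_${labelCounter++}`
}

function emitIfElse(
  condition: Instruction[],
  thenBlock: Instruction[],
  elseBlock: Instruction[]
): Instruction[] {
  const elseLabel = freshLabel("else")
  const endLabel = freshLabel("end")
  return [
    ...condition,
    ["JUMP_IF_FALSE", elseLabel], // pops the condition value
    ...thenBlock,
    ["JUMP", endLabel], // skip the else block
    [`${elseLabel}:`],
    ...elseBlock,
    [`${endLabel}:`],
  ]
}

// if (x > 0) "positive" else "non-positive"
const code = emitIfElse(
  [["LOAD", "x"], ["PUSH", 0], ["GT"]],
  [["PUSH", "positive"]],
  [["PUSH", "non-positive"]]
)
```

The resulting array can be passed to toBytecode() once the surrounding program is complete.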
While Loop
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
JUMP .loop
.end:
For Loop
<init>
.loop:
<condition>
JUMP_IF_FALSE .end
<body>
<increment>
JUMP .loop
.end:
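The for-loop layout above translates directly into an array-format emitter. A minimal sketch, assuming a hypothetical `emitFor` helper that receives a caller-supplied `id` to keep its labels unique:

```typescript
// Hypothetical compiler helper, not part of ReefVM.
type Instruction = [string, ...unknown[]]

function emitFor(
  id: number,
  init: Instruction[],
  condition: Instruction[],
  body: Instruction[],
  increment: Instruction[]
): Instruction[] {
  const loop = `.loop_${id}`
  const end = `.end_${id}`
  return [
    ...init, // runs once, before the loop label
    [`${loop}:`],
    ...condition,
    ["JUMP_IF_FALSE", end],
    ...body,
    ...increment,
    ["JUMP", loop], // back to the condition
    [`${end}:`],
  ]
}

// sum = 0; for (i = 0; i < 3; i = i + 1) { sum = sum + i }
const loopCode = emitFor(
  0,
  [["PUSH", 0], ["STORE", "sum"], ["PUSH", 0], ["STORE", "i"]],
  [["LOAD", "i"], ["PUSH", 3], ["LT"]],
  [["LOAD", "sum"], ["LOAD", "i"], ["ADD"], ["STORE", "sum"]],
  [["LOAD", "i"], ["PUSH", 1], ["ADD"], ["STORE", "i"]]
)
```

Dropping `init` and `increment` gives the while-loop form.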
Continue
No CONTINUE opcode. Use backward jump to loop start:
.loop:
<condition>
JUMP_IF_FALSE .end
<early-check>
JUMP_IF_TRUE .loop ; continue
<body>
JUMP .loop
.end:
Break in Loop
Mark iterator function as break target, use BREAK opcode:
MAKE_FUNCTION () .each_body
STORE each
LOAD collection
LOAD each
<call-iterator-with-break-semantics>
HALT
.each_body:
<condition>
JUMP_IF_TRUE .done
<body>
BREAK ; exits to caller
.done:
RETURN
Short-Circuit AND
<left>
DUP
JUMP_IF_FALSE .end ; Short-circuit if false
POP
<right>
.end: ; Result on stack
Short-Circuit OR
<left>
DUP
JUMP_IF_TRUE .end ; Short-circuit if true
POP
<right>
.end: ; Result on stack
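The AND and OR patterns differ only in the jump opcode, so one emitter can cover both. A sketch with a hypothetical `emitShortCircuit` helper; the caller supplies a unique end label:

```typescript
// Hypothetical compiler helper, not part of ReefVM.
type Instruction = [string, ...unknown[]]

function emitShortCircuit(
  op: "and" | "or",
  left: Instruction[],
  right: Instruction[],
  endLabel: string
): Instruction[] {
  const jump = op === "and" ? "JUMP_IF_FALSE" : "JUMP_IF_TRUE"
  return [
    ...left,
    ["DUP"], // keep one copy as the result; the jump pops the other
    [jump, endLabel],
    ["POP"], // left did not decide the result, discard it
    ...right,
    [`${endLabel}:`],
  ]
}

// a && b
const andCode = emitShortCircuit(
  "and",
  [["LOAD", "a"]],
  [["LOAD", "b"]],
  ".and_end"
)
```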
Try-Catch
PUSH_TRY .catch
<try-block>
POP_TRY
JUMP .end
.catch:
STORE err
<catch-block>
.end:
Try-Catch-Finally
PUSH_TRY .catch
PUSH_FINALLY .finally
<try-block>
POP_TRY
JUMP .finally ; Compiler must generate this
.catch:
STORE err
<catch-block>
JUMP .finally ; And this
.finally:
<finally-block> ; Executes in both paths
.end:
Important: VM only auto-jumps to finally on THROW. For successful try/catch, compiler must explicitly JUMP to finally.
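Because both explicit jumps are easy to forget, it can help to emit the whole try-catch-finally shape from one place. A minimal sketch, assuming a hypothetical `emitTryCatchFinally` helper and caller-supplied `id` for label uniqueness:

```typescript
// Hypothetical compiler helper, not part of ReefVM.
type Instruction = [string, ...unknown[]]

function emitTryCatchFinally(
  id: number,
  tryBlock: Instruction[],
  errorName: string,
  catchBlock: Instruction[],
  finallyBlock: Instruction[]
): Instruction[] {
  const catchLabel = `.catch_${id}`
  const finallyLabel = `.finally_${id}`
  return [
    ["PUSH_TRY", catchLabel],
    ["PUSH_FINALLY", finallyLabel],
    ...tryBlock,
    ["POP_TRY"],
    ["JUMP", finallyLabel], // success path: the VM does not jump here itself
    [`${catchLabel}:`],
    ["STORE", errorName], // the thrown value is stored under errorName
    ...catchBlock,
    ["JUMP", finallyLabel], // catch path
    [`${finallyLabel}:`],
    ...finallyBlock,
  ]
}

const guarded = emitTryCatchFinally(
  0,
  [["LOAD", "risky"], ["PUSH", 0], ["PUSH", 0], ["CALL"], ["STORE", "result"]],
  "err",
  [["LOAD", "err"], ["STORE", "lastError"]],
  [["PUSH", true], ["STORE", "cleanedUp"]]
)
```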
Closures
Functions automatically capture current scope:
PUSH 0
STORE counter
MAKE_FUNCTION () .increment
RETURN
.increment:
LOAD counter ; Captured variable
PUSH 1
ADD
STORE counter
LOAD counter
RETURN
Tail Recursion
Use TAIL_CALL instead of CALL for last call:
MAKE_FUNCTION (n acc) .factorial
STORE factorial
<...>
.factorial:
LOAD n
PUSH 0
LTE
JUMP_IF_FALSE .recurse
LOAD acc
RETURN
.recurse:
LOAD factorial
LOAD n
PUSH 1
SUB
LOAD n
LOAD acc
MUL
PUSH 2
PUSH 0
TAIL_CALL ; Reuses stack frame
Key Concepts
Truthiness
Only null and false are falsy. Everything else (including 0, "", empty arrays/dicts) is truthy.
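This rule differs from JavaScript, where 0 and "" are falsy. The sketch below restates it as a predicate; it models Reef values as plain TypeScript values, which is an assumption made for illustration (`Value` and `isTruthy` are hypothetical names, not ReefVM exports).

```typescript
// Hypothetical model of Reef values as plain TypeScript values.
type Value = null | boolean | number | string | Value[] | { [key: string]: Value }

// Only null and false are falsy.
function isTruthy(value: Value): boolean {
  return value !== null && value !== false
}

isTruthy(0)     // true, unlike JavaScript's Boolean(0)
isTruthy("")    // true
isTruthy([])    // true
isTruthy(null)  // false
isTruthy(false) // false
```

A compiler that constant-folds conditions should use this rule rather than the host language's truthiness.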
Type Coercion
toNumber:
- number → identity
- string → parseFloat (or 0 if invalid)
- boolean → 1 (true) or 0 (false)
- null → 0
- Others → 0
toString:
- string → identity
- number → string representation
- boolean → "true" or "false"
- null → "null"
- function → ""
- array → "[item, item]"
- dict → "{key: value, ...}"
Arithmetic ops (ADD, SUB, MUL, DIV, MOD) coerce both operands to numbers.
Comparison ops (LT, GT, LTE, GTE) coerce both operands to numbers.
Equality ops (EQ, NEQ) use type-aware comparison with deep equality for arrays/dicts.
Note: There is no string concatenation operator. ADD only works with numbers.
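The toNumber rules above can be sketched as a plain function. This is not ReefVM's source: it operates on plain TypeScript values and assumes "invalid" means that parseFloat returns NaN.

```typescript
// Sketch of the toNumber rules, using plain TypeScript values.
// Assumption: a string is "invalid" when parseFloat yields NaN.
function toNumber(value: unknown): number {
  if (typeof value === "number") return value
  if (typeof value === "string") {
    const parsed = parseFloat(value)
    return Number.isNaN(parsed) ? 0 : parsed
  }
  if (typeof value === "boolean") return value ? 1 : 0
  return 0 // null and every other type
}

// ADD on two strings is numeric addition, not concatenation:
const sum = toNumber("2") + toNumber("3") // 5
const fallback = toNumber("abc")          // 0
```

Note that JavaScript's parseFloat accepts a numeric prefix, so "12abc" yields 12 under this sketch.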
Scope
- Variables resolved through parent scope chain
- STORE updates existing variable or creates in current scope
- Functions capture scope at definition time
Identifiers
Variable and function parameter names support Unicode and emoji:
- Valid: 💎, 🌟, 変数, counter, _private
- Invalid: Cannot start with digits, ., #, @, or ...
- Invalid: Cannot contain whitespace or special chars: ;, (), [], {}, =, ', "
Break Semantics
- CALL marks current frame as break target
- BREAK unwinds call stack to that target
- Used for Ruby-style iterator pattern
Parameter Binding Priority
For function calls, each parameter is bound from the first source that applies, in this order:
- Positional argument (if provided)
- Named argument (if provided and matches param name)
- Default value (if defined)
- Null
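The priority list can be read as a fall-through chain per parameter. The sketch below is a simplified model, not ReefVM's binder: `Param` and `bindParams` are hypothetical, values are plain TypeScript values, and variadic and @named parameters are left out.

```typescript
// Hypothetical model of parameter binding; ignores ...rest and @named params.
type Value = number | string | boolean | null

interface Param {
  name: string
  defaultValue?: Value
}

function bindParams(
  params: Param[],
  positional: Value[],
  named: Map<string, Value>
): Record<string, Value> {
  const bound: Record<string, Value> = {}
  params.forEach((param, i) => {
    if (i < positional.length) {
      bound[param.name] = positional[i] ?? null // 1. positional argument
    } else if (named.has(param.name)) {
      bound[param.name] = named.get(param.name) ?? null // 2. named argument
    } else if (param.defaultValue !== undefined) {
      bound[param.name] = param.defaultValue // 3. default value
    } else {
      bound[param.name] = null // 4. null
    }
  })
  return bound
}

const bound = bindParams(
  [{ name: "x" }, { name: "y", defaultValue: 10 }, { name: "z" }, { name: "w" }],
  [1],
  new Map<string, Value>([["z", 3]])
)
// bound: { x: 1, y: 10, z: 3, w: null }
```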
Exception Handlers
- PUSH_TRY uses absolute addresses for catch blocks
- Nested try blocks form a stack
- THROW unwinds to most recent handler and jumps to finally (if present) or catch
- VM does NOT automatically jump to finally on success - compiler must generate JUMPs
- Finally execution in all cases is compiler's responsibility, not VM's
Calling Convention
All calls push arguments in order:
- Function
- Positional args (in order)
- Named args (key1, val1, key2, val2, ...)
- Positional count (as number)
- Named count (as number)
- CALL or TAIL_CALL
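The push order above can be captured in one emitter so that every call site gets the counts right. A sketch, assuming a hypothetical `emitCall` helper that receives already-compiled argument code in array format:

```typescript
// Hypothetical compiler helper, not part of ReefVM.
type Instruction = [string, ...unknown[]]

function emitCall(
  callee: Instruction[],
  positional: Instruction[][],
  named: Array<{ key: string; value: Instruction[] }>,
  tail = false
): Instruction[] {
  const out: Instruction[] = [...callee] // 1. function
  for (const arg of positional) out.push(...arg) // 2. positional args
  for (const { key, value } of named) {
    out.push(["PUSH", key]) // 3. named args as key, value pairs
    out.push(...value)
  }
  out.push(["PUSH", positional.length]) // 4. positional count
  out.push(["PUSH", named.length]) // 5. named count
  out.push([tail ? "TAIL_CALL" : "CALL"]) // 6. CALL or TAIL_CALL
  return out
}

// greet("hello", name: "Reef")
const call = emitCall(
  [["LOAD", "greet"]],
  [[["PUSH", "hello"]]],
  [{ key: "name", value: [["PUSH", "Reef"]] }]
)
```

Passing `true` as the last argument emits TAIL_CALL for calls in tail position.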
CALL_NATIVE Behavior
Unlike CALL, CALL_NATIVE consumes the entire stack as arguments and clears the stack. The native function receives all values that were on the stack at the time of the call.
Empty Stack
- RETURN with empty stack returns null
- HALT with empty stack returns null