ReefVM/GUIDE.md
Chris Wanstrath 43842adc87 TRY_CALL
2025-10-10 14:21:43 -07:00

11 KiB

Reef Compiler Guide

Quick reference for compiling to Reef bytecode.

Bytecode Formats

ReefVM supports two bytecode formats:

  1. String format: Human-readable text with opcodes and operands
  2. Array format: TypeScript arrays with typed tuples for programmatic generation

Both formats are compiled using the same toBytecode() function.

Bytecode Syntax

Instructions

OPCODE operand     ; comment

Operand Types

Immediate numbers (#N): Counts or relative offsets

  • MAKE_ARRAY #3 - count of 3 items
  • JUMP #5 - relative offset of 5 instructions (prefer labels)
  • PUSH_TRY #10 - absolute instruction index (prefer labels)

Labels (.name): Symbolic addresses resolved at parse time

  • .label: - define label at current position
  • JUMP .loop - jump to label
  • MAKE_FUNCTION (x) .body - function body at label

Variable names: Plain identifiers (supports Unicode and emoji!)

  • LOAD counter - load variable
  • STORE result - store variable
  • LOAD 💎 - load emoji variable
  • STORE 変数 - store Unicode variable

Constants: Literals added to constants pool

  • Numbers: PUSH 42, PUSH 3.14
  • Strings: PUSH "hello" or PUSH 'world'
  • Booleans: PUSH true, PUSH false
  • Null: PUSH null

Native function names: Registered TypeScript functions

  • CALL_NATIVE print

Array Format

The programmatic array format uses TypeScript tuples for type safety:

import { toBytecode, run } from "#reef"

const bytecode = toBytecode([
  ["PUSH", 42],        // Atom values: number | string | boolean | null
  ["STORE", "x"],      // Variable names as strings
  ["LOAD", "x"],
  ["HALT"]
])

const result = await run(bytecode)

Operand Types in Array Format

Atoms (number | string | boolean | null): Constants for PUSH

["PUSH", 42]
["PUSH", "hello"]
["PUSH", true]
["PUSH", null]

Variable names: String identifiers

["LOAD", "counter"]
["STORE", "result"]

Label definitions: Single-element arrays starting with . and ending with :

[".loop:"]
[".end:"]
[".function_body:"]

Label references: Strings in jump/function instructions

["JUMP", ".loop"]
["JUMP_IF_FALSE", ".end"]
["MAKE_FUNCTION", ["x", "y"], ".body"]
["PUSH_TRY", ".catch"]

Counts: Numbers for array/dict construction

["MAKE_ARRAY", 3]    // Pop 3 items
["MAKE_DICT", 2]     // Pop 2 key-value pairs

Native function names: Strings for registered functions

["CALL_NATIVE", "print"]

Functions in Array Format

// Basic function
["MAKE_FUNCTION", ["x", "y"], ".body"]

// With defaults
["MAKE_FUNCTION", ["x", "y=10"], ".body"]

// Variadic
["MAKE_FUNCTION", ["...args"], ".body"]

// Named args
["MAKE_FUNCTION", ["@opts"], ".body"]

// Mixed
["MAKE_FUNCTION", ["x", "y=5", "...rest", "@opts"], ".body"]

Complete Example

const factorial = toBytecode([
  ["MAKE_FUNCTION", ["n", "acc=1"], ".fact"],
  ["STORE", "factorial"],
  ["JUMP", ".main"],

  [".fact:"],
  ["LOAD", "n"],
  ["PUSH", 0],
  ["LTE"],
  ["JUMP_IF_FALSE", ".recurse"],
  ["LOAD", "acc"],
  ["RETURN"],

  [".recurse:"],
  ["LOAD", "factorial"],
  ["LOAD", "n"],
  ["PUSH", 1],
  ["SUB"],
  ["LOAD", "n"],
  ["LOAD", "acc"],
  ["MUL"],
  ["PUSH", 2],
  ["PUSH", 0],
  ["TAIL_CALL"],

  [".main:"],
  ["LOAD", "factorial"],
  ["PUSH", 5],
  ["PUSH", 1],
  ["PUSH", 0],
  ["CALL"],
  ["HALT"]
])

const result = await run(factorial)  // { type: "number", value: 120 }

String Format

Functions

MAKE_FUNCTION (x y) .body       ; Basic
MAKE_FUNCTION (x=10 y=20) .body ; Defaults
MAKE_FUNCTION (x ...rest) .body ; Variadic
MAKE_FUNCTION (x @named) .body  ; Named args
MAKE_FUNCTION (x ...rest @named) .body ; Both

Function Calls

Stack order (bottom to top):

LOAD fn
PUSH arg1           ; Positional args
PUSH arg2
PUSH "name"         ; Named arg key
PUSH "value"        ; Named arg value
PUSH 2              ; Positional count
PUSH 1              ; Named count
CALL

Opcodes

Stack

  • PUSH <const> - Push constant
  • POP - Remove top
  • DUP - Duplicate top

Variables

  • LOAD <name> - Push variable value (throws if not found)
  • TRY_LOAD <name> - Push variable value if found, otherwise push name as string (never throws)
  • STORE <name> - Pop and store in variable

Arithmetic

  • ADD, SUB, MUL, DIV, MOD - Binary ops (pop 2, push result)

Comparison

  • EQ, NEQ, LT, GT, LTE, GTE - Pop 2, push boolean

Logic

  • NOT - Pop 1, push !value

Control Flow

  • JUMP .label - Unconditional jump
  • JUMP_IF_FALSE .label - Jump if top is false or null (pops value)
  • JUMP_IF_TRUE .label - Jump if top is truthy (pops value)
  • HALT - Stop execution of the program

Functions

  • MAKE_FUNCTION (params) .body - Create function, push to stack
  • CALL - Call function (see calling convention above)
  • TAIL_CALL - Tail-recursive call (no stack growth)
  • RETURN - Return from function (pops return value)
  • TRY_CALL <name> - Call function (if found), push value (if exists), or push name as string (if not found)
  • BREAK - Exit iterator/loop (unwinds to break target)

Arrays

  • MAKE_ARRAY #N - Pop N items, push array
  • ARRAY_GET - Pop index and array, push element
  • ARRAY_SET - Pop value, index, array; mutate array
  • ARRAY_PUSH - Pop value and array, append to array
  • ARRAY_LEN - Pop array, push length

Dicts

  • MAKE_DICT #N - Pop N key-value pairs, push dict
  • DICT_GET - Pop key and dict, push value (or null)
  • DICT_SET - Pop value, key, dict; mutate dict
  • DICT_HAS - Pop key and dict, push boolean

Exceptions

  • PUSH_TRY .catch - Register exception handler
  • PUSH_FINALLY .finally - Add finally to current handler
  • POP_TRY - Remove handler (try succeeded)
  • THROW - Throw exception (pops error value)

Native

  • CALL_NATIVE <name> - Call registered TypeScript function (consumes entire stack as args)

Compiler Patterns

If-Else

<condition>
JUMP_IF_FALSE .else
  <then-block>
  JUMP .end
.else:
  <else-block>
.end:

While Loop

.loop:
  <condition>
  JUMP_IF_FALSE .end
  <body>
  JUMP .loop
.end:

For Loop

<init>
.loop:
  <condition>
  JUMP_IF_FALSE .end
  <body>
  <increment>
  JUMP .loop
.end:

Continue

No CONTINUE opcode. Use backward jump to loop start:

.loop:
  <condition>
  JUMP_IF_FALSE .end
  <early-check>
  JUMP_IF_TRUE .loop    ; continue
  <body>
  JUMP .loop
.end:

Break in Loop

Mark iterator function as break target, use BREAK opcode:

MAKE_FUNCTION () .each_body
STORE each
LOAD collection
LOAD each
<call-iterator-with-break-semantics>
HALT

.each_body:
  <condition>
  JUMP_IF_TRUE .done
  <body>
  BREAK                  ; exits to caller
.done:
  RETURN

Short-Circuit AND

<left>
DUP
JUMP_IF_FALSE .end      ; Short-circuit if false
POP
<right>
.end:                    ; Result on stack

Short-Circuit OR

<left>
DUP
JUMP_IF_TRUE .end       ; Short-circuit if true
POP
<right>
.end:                    ; Result on stack

Try-Catch

PUSH_TRY .catch
  <try-block>
  POP_TRY
  JUMP .end
.catch:
  STORE err
  <catch-block>
.end:

Try-Catch-Finally

PUSH_TRY .catch
PUSH_FINALLY .finally
  <try-block>
  POP_TRY
  JUMP .finally         ; Compiler must generate this
.catch:
  STORE err
  <catch-block>
  JUMP .finally         ; And this
.finally:
  <finally-block>       ; Executes in both paths
.end:

Important: VM only auto-jumps to finally on THROW. For successful try/catch, compiler must explicitly JUMP to finally.

Closures

Functions automatically capture current scope:

PUSH 0
STORE counter
MAKE_FUNCTION () .increment
RETURN

.increment:
  LOAD counter          ; Captured variable
  PUSH 1
  ADD
  STORE counter
  LOAD counter
  RETURN

Tail Recursion

Use TAIL_CALL instead of CALL for last call:

MAKE_FUNCTION (n acc) .factorial
STORE factorial
<...>

.factorial:
  LOAD n
  PUSH 0
  LTE
  JUMP_IF_FALSE .recurse
  LOAD acc
  RETURN
.recurse:
  LOAD factorial
  LOAD n
  PUSH 1
  SUB
  LOAD n
  LOAD acc
  MUL
  PUSH 2
  PUSH 0
  TAIL_CALL             ; Reuses stack frame

Optional Function Calls (TRY_CALL)

Call function if defined, otherwise use value or name as string:

; Define optional hook
MAKE_FUNCTION () .onInit
STORE onInit

; Later: call if defined, skip if not
TRY_CALL onInit       ; Calls onInit() if it's a function
                       ; Pushes value if it exists but isn't a function
                       ; Pushes "onInit" as string if undefined

; Use with values
PUSH 42
STORE answer
TRY_CALL answer        ; Pushes 42 (not a function)

; Use with undefined
TRY_CALL unknown       ; Pushes "unknown" as string

Use Cases:

  • Optional hooks/callbacks in DSLs
  • Shell-like languages where unknown identifiers become strings
  • Templating systems with optional transformers

Key Concepts

Truthiness

Only null and false are falsy. Everything else (including 0, "", empty arrays/dicts) is truthy.

Type Coercion

toNumber:

  • number → identity
  • string → parseFloat (or 0 if invalid)
  • boolean → 1 (true) or 0 (false)
  • null → 0
  • Others → 0

toString:

  • string → identity
  • number → string representation
  • boolean → "true" or "false"
  • null → "null"
  • function → ""
  • array → "[item, item]"
  • dict → "{key: value, ...}"

Arithmetic ops (ADD, SUB, MUL, DIV, MOD) coerce both operands to numbers.

Comparison ops (LT, GT, LTE, GTE) coerce both operands to numbers.

Equality ops (EQ, NEQ) use type-aware comparison with deep equality for arrays/dicts.

Note: There is no string concatenation operator. ADD only works with numbers.

Scope

  • Variables resolved through parent scope chain
  • STORE updates existing variable or creates in current scope
  • Functions capture scope at definition time

Identifiers

Variable and function parameter names support Unicode and emoji:

  • Valid: 💎, 🌟, 変数, counter, _private
  • Invalid: Cannot start with digits, ., #, @, or ...
  • Invalid: Cannot contain whitespace or special chars: ;, (), [], {}, =, ', "

Break Semantics

  • CALL marks current frame as break target
  • BREAK unwinds call stack to that target
  • Used for Ruby-style iterator pattern

Parameter Binding Priority

For function calls, parameters bound in order:

  1. Positional argument (if provided)
  2. Named argument (if provided and matches param name)
  3. Default value (if defined)
  4. Null

Exception Handlers

  • PUSH_TRY uses absolute addresses for catch blocks
  • Nested try blocks form a stack
  • THROW unwinds to most recent handler and jumps to finally (if present) or catch
  • VM does NOT automatically jump to finally on success - compiler must generate JUMPs
  • Finally execution in all cases is compiler's responsibility, not VM's

Calling Convention

All calls push arguments in order:

  1. Function
  2. Positional args (in order)
  3. Named args (key1, val1, key2, val2, ...)
  4. Positional count (as number)
  5. Named count (as number)
  6. CALL or TAIL_CALL

CALL_NATIVE Behavior

Unlike CALL, CALL_NATIVE consumes the entire stack as arguments and clears the stack. The native function receives all values that were on the stack at the time of the call.

Empty Stack

  • RETURN with empty stack returns null
  • HALT with empty stack returns null