From c94cc59dea847583a49247e46f602ff3649da6e9 Mon Sep 17 00:00:00 2001
From: Chris Wanstrath
Date: Sun, 5 Oct 2025 22:34:11 -0700
Subject: [PATCH] compiler writing guide

---
 GUIDE.md | 302 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 302 insertions(+)
 create mode 100644 GUIDE.md

diff --git a/GUIDE.md b/GUIDE.md
new file mode 100644
index 0000000..e1c7726
--- /dev/null
+++ b/GUIDE.md
@@ -0,0 +1,302 @@
+# Reef Compiler Guide
+
+Quick reference for compiling to Reef bytecode.
+
+## Bytecode Syntax
+
+### Instructions
+```
+OPCODE operand ; comment
+```
+
+### Labels
+```
+.label_name:      ; Define label
+JUMP .label_name  ; Reference label
+```
+
+### Constants
+- Numbers: `PUSH 42`, `PUSH 3.14`
+- Strings: `PUSH "hello"` or `PUSH 'world'`
+- Booleans: `PUSH true`, `PUSH false`
+- Null: `PUSH null`
+
+### Variables
+- Load: `LOAD varname`
+- Store: `STORE varname`
+
+### Functions
+```
+MAKE_FUNCTION (x y) .body               ; Basic
+MAKE_FUNCTION (x=10 y=20) .body         ; Defaults
+MAKE_FUNCTION (x ...rest) .body         ; Variadic
+MAKE_FUNCTION (x @named) .body          ; Named args
+MAKE_FUNCTION (x ...rest @named) .body  ; Both
+```
+
+### Function Calls
+Stack order (bottom to top):
+```
+LOAD fn
+PUSH arg1     ; Positional args
+PUSH arg2
+PUSH "name"   ; Named arg key
+PUSH "value"  ; Named arg value
+PUSH 2        ; Positional count
+PUSH 1        ; Named count
+CALL
+```
+
+## Opcodes
+
+### Stack
+- `PUSH <value>` - Push constant
+- `POP` - Remove top
+- `DUP` - Duplicate top
+
+### Variables
+- `LOAD <name>` - Push variable value
+- `STORE <name>` - Pop and store in variable
+
+### Arithmetic
+- `ADD`, `SUB`, `MUL`, `DIV`, `MOD` - Binary ops (pop 2, push result)
+
+### Comparison
+- `EQ`, `NEQ`, `LT`, `GT`, `LTE`, `GTE` - Pop 2, push boolean
+
+### Logic
+- `NOT` - Pop 1, push !value
+
+### Control Flow
+- `JUMP .label` - Unconditional jump
+- `JUMP_IF_FALSE .label` - Jump if top is false or null (pops value)
+- `JUMP_IF_TRUE .label` - Jump if top is truthy (pops value)
+- `HALT` - Stop execution of the program
+
+### Functions
+- `MAKE_FUNCTION (params) .body` - Create function, push to stack
+- `CALL` - Call function (see calling convention above)
+- `TAIL_CALL` - Tail-recursive call (no stack growth)
+- `RETURN` - Return from function (pops return value)
+- `BREAK` - Exit iterator/loop (unwinds to break target)
+
+### Arrays
+- `MAKE_ARRAY #N` - Pop N items, push array
+- `ARRAY_GET` - Pop index and array, push element
+- `ARRAY_SET` - Pop value, index, array; mutate array
+- `ARRAY_PUSH` - Pop value and array, append to array
+- `ARRAY_LEN` - Pop array, push length
+
+### Dicts
+- `MAKE_DICT #N` - Pop N key-value pairs, push dict
+- `DICT_GET` - Pop key and dict, push value (or null)
+- `DICT_SET` - Pop value, key, dict; mutate dict
+- `DICT_HAS` - Pop key and dict, push boolean
+
+### Exceptions
+- `PUSH_TRY .catch` - Register exception handler
+- `PUSH_FINALLY .finally` - Add finally to current handler
+- `POP_TRY` - Remove handler (try succeeded)
+- `THROW` - Throw exception (pops error value)
+
+### Native
+- `CALL_NATIVE <name>` - Call registered TypeScript function
+
+## Compiler Patterns
+
+### If-Else
+```
+<condition>
+JUMP_IF_FALSE .else
+  <then branch>
+  JUMP .end
+.else:
+  <else branch>
+.end:
+```
+
+### While Loop
+```
+.loop:
+  <condition>
+  JUMP_IF_FALSE .end
+  <body>
+  JUMP .loop
+.end:
+```
+
+### For Loop
+```
+<init>
+.loop:
+  <condition>
+  JUMP_IF_FALSE .end
+  <body>
+  <increment>
+  JUMP .loop
+.end:
+```
+
+### Continue
+No CONTINUE opcode. Use backward jump to loop start:
+```
+.loop:
+  <condition>
+  JUMP_IF_FALSE .end
+  <...>
+  JUMP_IF_TRUE .loop ; continue
+  <...>
+  JUMP .loop
+.end:
+```
+
+### Break in Loop
+Mark iterator function as break target, use BREAK opcode:
+```
+MAKE_FUNCTION () .each_body
+STORE each
+LOAD collection
+LOAD each
+<...>
+HALT
+
+.each_body:
+  <...>
+  JUMP_IF_TRUE .done
+  <...>
+  BREAK ; exits to caller
+.done:
+  RETURN
+```
+
+### Short-Circuit AND
+```
+<left operand>
+DUP
+JUMP_IF_FALSE .end ; Short-circuit if false
+POP
+<right operand>
+.end: ; Result on stack
+```
+
+### Short-Circuit OR
+```
+<left operand>
+DUP
+JUMP_IF_TRUE .end ; Short-circuit if true
+POP
+<right operand>
+.end: ; Result on stack
+```
+
+### Try-Catch
+```
+PUSH_TRY .catch
+  <try body>
+  POP_TRY
+  JUMP .end
+.catch:
+  STORE err
+  <catch body>
+.end:
+```
+
+### Try-Catch-Finally
+```
+PUSH_TRY .catch
+PUSH_FINALLY .finally
+  <try body>
+  POP_TRY
+  JUMP .finally ; Compiler must generate this
+.catch:
+  STORE err
+  <catch body>
+  JUMP .finally ; And this
+.finally:
+  <finally body> ; Executes in both paths
+.end:
+```
+
+**Important**: VM only auto-jumps to finally on THROW. For successful try/catch, compiler must explicitly JUMP to finally.
+
+### Closures
+Functions automatically capture current scope:
+```
+PUSH 0
+STORE counter
+MAKE_FUNCTION () .increment
+RETURN
+
+.increment:
+  LOAD counter ; Captured variable
+  PUSH 1
+  ADD
+  STORE counter
+  LOAD counter
+  RETURN
+```
+
+### Tail Recursion
+Use TAIL_CALL instead of CALL for last call:
+```
+MAKE_FUNCTION (n acc) .factorial
+STORE factorial
+<...>
+
+.factorial:
+  LOAD n
+  PUSH 0
+  LTE
+  JUMP_IF_FALSE .recurse
+  LOAD acc
+  RETURN
+.recurse:
+  LOAD factorial
+  LOAD n
+  PUSH 1
+  SUB
+  LOAD n
+  LOAD acc
+  MUL
+  PUSH 2
+  PUSH 0
+  TAIL_CALL ; Reuses stack frame
+```
+
+## Key Concepts
+
+### Truthiness
+Only `null` and `false` are falsy. Everything else (including `0`, `""`, empty arrays/dicts) is truthy.
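
The truthiness rule above can be sketched as a TypeScript predicate, which is roughly what a compiler needs to keep in mind when folding constant conditions. This is an illustrative sketch only: `ReefValue`, `isTruthy`, and `jumpIfFalseTaken` are hypothetical names, not part of the Reef VM's API, and the VM's real value representation may differ.

```typescript
// Hypothetical value type for illustration; the real VM's representation may differ.
type ReefValue =
  | null
  | boolean
  | number
  | string
  | ReefValue[]
  | { [key: string]: ReefValue };

// Only null and false are falsy; everything else is truthy.
function isTruthy(value: ReefValue): boolean {
  return value !== null && value !== false;
}

// JUMP_IF_FALSE takes the jump exactly when the popped value is falsy.
function jumpIfFalseTaken(top: ReefValue): boolean {
  return !isTruthy(top);
}

// Print the classification of a few representative values.
const samples: ReefValue[] = [null, false, true, 0, "", [], {}];
for (const v of samples) {
  console.log(JSON.stringify(v), isTruthy(v) ? "truthy" : "falsy");
}
```

Note that this differs from JavaScript's own truthiness, where `0` and `""` are falsy, so a host-language `if (value)` check would not match the rule stated in the guide.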
+
+### Scope
+- Variables resolved through parent scope chain
+- STORE updates existing variable or creates in current scope
+- Functions capture scope at definition time
+
+### Break Semantics
+- CALL marks current frame as break target
+- BREAK unwinds call stack to that target
+- Used for Ruby-style iterator pattern
+
+### Parameter Binding Priority
+For function calls, parameters bound in order:
+1. Positional argument (if provided)
+2. Named argument (if provided and matches param name)
+3. Default value (if defined)
+4. Null
+
+### Exception Handlers
+- PUSH_TRY uses absolute addresses for catch blocks
+- Nested try blocks form a stack
+- THROW unwinds to most recent handler and jumps to finally (if present) or catch
+- VM does NOT automatically jump to finally on success - compiler must generate JUMPs
+- Finally execution in all cases is compiler's responsibility, not VM's
+
+### Calling Convention
+All calls push arguments in order:
+1. Function
+2. Positional args (in order)
+3. Named args (key1, val1, key2, val2, ...)
+4. Positional count (as number)
+5. Named count (as number)
+6. CALL or TAIL_CALL
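
The six steps of the calling convention map directly onto a small code-generation helper. Below is a minimal TypeScript sketch, assuming the compiler represents each already-compiled subexpression as an array of instruction lines; `emitCall` and `NamedArg` are hypothetical names introduced for illustration and do not come from the Reef codebase.

```typescript
// Hypothetical shape for a named argument: its key plus already-compiled value code.
interface NamedArg {
  name: string;
  code: string[];
}

// Emit the instruction sequence for one call, following the documented stack order.
function emitCall(
  fn: string[],
  positional: string[][],
  named: NamedArg[],
  tail: boolean = false
): string[] {
  const out: string[] = [...fn];                 // 1. function
  for (const arg of positional) {
    out.push(...arg);                            // 2. positional args, in order
  }
  for (const { name, code } of named) {
    out.push(`PUSH ${JSON.stringify(name)}`);    // 3. named arg key...
    out.push(...code);                           //    ...followed by its value
  }
  out.push(`PUSH ${positional.length}`);         // 4. positional count
  out.push(`PUSH ${named.length}`);              // 5. named count
  out.push(tail ? "TAIL_CALL" : "CALL");         // 6. CALL or TAIL_CALL
  return out;
}

// Example: a call to a function stored in `greet` with one positional
// argument and one named argument.
const callCode = emitCall(
  ["LOAD greet"],
  [['PUSH "hi"']],
  [{ name: "times", code: ["PUSH 3"] }]
);
console.log(callCode.join("\n"));
```

Passing `true` as the last argument emits `TAIL_CALL` instead of `CALL`, which is the only difference the guide describes between the two at the call site.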