# ReefVM Specification

Version 1.0

## Overview

The ReefVM is a stack-based bytecode virtual machine designed for the Shrimp programming language. It supports closures, tail call optimization, exception handling, variadic functions, named parameters, and Ruby-style iterators with break/continue.

## Architecture

### Components

- **Value Stack**: Operand stack for computation
- **Call Stack**: Call frames for function invocations
- **Exception Handlers**: Stack of try/catch handlers
- **Scope Chain**: Linked scopes for lexical variable resolution
- **Program Counter (PC)**: Current instruction index
- **Constants Pool**: Immutable values and function metadata
- **Native Function Registry**: External functions callable from Shrimp

### Execution Model

1. VM loads bytecode with instructions and constants
2. PC starts at instruction 0
3. Each instruction is executed sequentially (unless jumps occur)
4. Execution continues until HALT or end of instructions
5. Final value is top of stack (or null if empty)

## Value Types

All runtime values are tagged unions:

```typescript
type Value =
  | { type: 'null', value: null }
  | { type: 'boolean', value: boolean }
  | { type: 'number', value: number }
  | { type: 'string', value: string }
  | { type: 'array', value: Value[] }
  | { type: 'dict', value: Map<string, Value> }
  | { type: 'function', params: string[], defaults: Record<string, number>,
      body: number, parentScope: Scope, variadic: boolean, named: boolean }
```

### Type Coercion

**toNumber**: number → identity, string → parseFloat (or 0 if the string does not parse as a number), boolean → 1/0, others → 0

**toString**: string → identity, number → string, boolean → string, null → "null",
function → "<function>", array → "[item, item]", dict → "{key: value, ...}"

**isTrue**: Only `null` and `false` are falsy. Everything else (including `0`, `""`, empty arrays, empty dicts) is truthy.
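
The three coercion rules above can be sketched as plain functions. This is a minimal illustration, not the VM's actual code: the `Value` type is simplified (function values carry no payload), `valueToString` is a hypothetical name for the spec's toString, and falling back to 0 when `parseFloat` yields `NaN` is an assumption about what "(or 0)" means.

```typescript
// Simplified Value type for this sketch; function payload fields are omitted.
type Value =
  | { type: 'null'; value: null }
  | { type: 'boolean'; value: boolean }
  | { type: 'number'; value: number }
  | { type: 'string'; value: string }
  | { type: 'array'; value: Value[] }
  | { type: 'dict'; value: Map<string, Value> }
  | { type: 'function' }

function toNumber(v: Value): number {
  switch (v.type) {
    case 'number': return v.value
    case 'string': {
      const n = parseFloat(v.value)
      return Number.isNaN(n) ? 0 : n // assumed fallback for unparseable strings
    }
    case 'boolean': return v.value ? 1 : 0
    default: return 0
  }
}

// Hypothetical name; corresponds to the spec's toString coercion.
function valueToString(v: Value): string {
  switch (v.type) {
    case 'string': return v.value
    case 'number': return String(v.value)
    case 'boolean': return String(v.value)
    case 'null': return 'null'
    case 'function': return '<function>'
    case 'array': return `[${v.value.map(valueToString).join(', ')}]`
    case 'dict':
      return `{${Array.from(v.value, ([k, x]) => `${k}: ${valueToString(x)}`).join(', ')}}`
  }
}

// Only null and false are falsy; 0 and "" are truthy.
function isTrue(v: Value): boolean {
  return !(v.type === 'null' || (v.type === 'boolean' && v.value === false))
}
```

Note that `isTrue` deliberately differs from JavaScript truthiness, so an implementation cannot simply delegate to `Boolean(...)`.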

## Bytecode Format

```typescript
type Bytecode = {
  instructions: Instruction[]
  constants: Constant[]
}

type Instruction = {
  op: OpCode
  operand?: number | string
}

type Constant =
  | Value
  | { type: 'function_def', params: string[], defaults: Record<string, number>,
      body: number, variadic: boolean, named: boolean }
```

## Scope Chain

Variables are resolved through a linked scope chain:

```typescript
class Scope {
  locals: Map<string, Value>;
  parent?: Scope;
}
```

**Variable Resolution (LOAD)**:
1. Check current scope's locals
2. If not found, recursively check parent
3. If not found anywhere, throw error

**Variable Resolution (TRY_LOAD)**:
1. Check current scope's locals
2. If not found, recursively check parent
3. If not found anywhere, return variable name as string (no error)

**Variable Assignment (STORE)**:
1. If variable exists in current scope, update it
2. Else if variable exists in any parent scope, update it there
3. Else create new variable in current scope

This implements "assign to the nearest enclosing scope where the variable is already defined" semantics: an existing binding is updated in place, and a new variable is created only when no scope in the chain defines the name.
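
The LOAD, TRY_LOAD, and STORE rules above can be sketched as follows. This is an illustrative sketch only: the `Value` type is cut down to two variants, and the `lookup`, `store`, `load`, and `tryLoad` names are hypothetical helpers rather than the VM's real API.

```typescript
type Value = { type: 'number'; value: number } | { type: 'string'; value: string }

class Scope {
  locals = new Map<string, Value>()
  parent?: Scope
  constructor(parent?: Scope) { this.parent = parent }

  // Walk outward from this scope; undefined means "not found anywhere".
  lookup(name: string): Value | undefined {
    for (let s: Scope | undefined = this; s; s = s.parent) {
      const v = s.locals.get(name)
      if (v !== undefined) return v
    }
    return undefined
  }

  // STORE: update the first scope in the chain that defines the name,
  // otherwise create the variable in the current scope.
  store(name: string, value: Value): void {
    for (let s: Scope | undefined = this; s; s = s.parent) {
      if (s.locals.has(name)) { s.locals.set(name, value); return }
    }
    this.locals.set(name, value)
  }
}

// LOAD: throws when the variable is not defined in any scope.
function load(scope: Scope, name: string): Value {
  const v = scope.lookup(name)
  if (v === undefined) throw new Error(`Undefined variable: ${name}`)
  return v
}

// TRY_LOAD: falls back to the variable name as a string value.
function tryLoad(scope: Scope, name: string): Value {
  return scope.lookup(name) ?? { type: 'string', value: name }
}
```

With an inner scope whose parent defines `x`, `inner.store('x', ...)` updates the parent's binding and leaves `inner.locals` empty, which is what allows closures to mutate captured variables.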

## Call Frames

```typescript
type CallFrame = {
  returnAddress: number   // Where to resume after RETURN
  returnScope: Scope      // Scope to restore after RETURN
  isBreakTarget: boolean  // Can be targeted by BREAK
}
```

## Exception Handlers

```typescript
type ExceptionHandler = {
  catchAddress: number     // Where to jump on exception
  finallyAddress?: number  // Where to jump for finally block (always runs)
  callStackDepth: number   // Call stack depth when handler pushed
  scope: Scope             // Scope to restore in catch block
}
```

## Opcodes

### Stack Operations

#### PUSH
**Operand**: Index into constants pool (number)
**Effect**: Push constant onto stack
**Stack**: [] → [value]

#### POP
**Operand**: None
**Effect**: Discard top of stack
**Stack**: [value] → []

#### DUP
**Operand**: None
**Effect**: Duplicate top of stack
**Stack**: [value] → [value, value]

### Variable Operations

#### LOAD
**Operand**: Variable name (string)
**Effect**: Push variable value onto stack
**Stack**: [] → [value]
**Errors**: Throws if variable not found in scope chain

#### STORE
**Operand**: Variable name (string)
**Effect**: Store top of stack into variable (following scope chain rules)
**Stack**: [value] → []

#### TRY_LOAD
**Operand**: Variable name (string)
**Effect**: Push variable value onto stack if found, otherwise push variable name as string
**Stack**: [] → [value | name]
**Errors**: Never throws (unlike LOAD)

**Behavior**:
1. Search for variable in scope chain (current scope and all parents)
2. If found, push the variable's value onto stack
3. If not found, push the variable name as a string value onto stack

**Use Cases**:
- Shell-like behavior where strings don't need quotes

**Example**:
```
PUSH 42
STORE x
TRY_LOAD x    ; Pushes 42 (variable exists)
TRY_LOAD y    ; Pushes "y" (variable doesn't exist)
```

### Arithmetic Operations

All arithmetic operations pop two values, perform the operation, and push the result as a number.

#### ADD
**Stack**: [a, b] → [a + b]
**Note**: Numeric addition only; use STR_CONCAT for string concatenation

#### SUB
**Stack**: [a, b] → [a - b]

#### MUL
**Stack**: [a, b] → [a * b]

#### DIV
**Stack**: [a, b] → [a / b]

#### MOD
**Stack**: [a, b] → [a % b]

### Comparison Operations

All comparison operations pop two values, compare them, and push a boolean result.

#### EQ
**Stack**: [a, b] → [boolean]
**Note**: Type-aware equality (deep comparison for arrays/dicts)
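
One plausible reading of "type-aware equality" is sketched below. It is an assumption, not the documented algorithm: values of different types never compare equal (so `1` and `"1"` differ), arrays and dicts are compared element by element, and function values are left out entirely. The name `isEqual` is hypothetical.

```typescript
type Value =
  | { type: 'null'; value: null }
  | { type: 'boolean'; value: boolean }
  | { type: 'number'; value: number }
  | { type: 'string'; value: string }
  | { type: 'array'; value: Value[] }
  | { type: 'dict'; value: Map<string, Value> }

function isEqual(a: Value, b: Value): boolean {
  // Assumed: no coercion across types.
  if (a.type !== b.type) return false
  if (a.type === 'array' && b.type === 'array') {
    return a.value.length === b.value.length &&
      a.value.every((item, i) => isEqual(item, b.value[i]))
  }
  if (a.type === 'dict' && b.type === 'dict') {
    if (a.value.size !== b.value.size) return false
    return Array.from(a.value.entries()).every(([key, item]) => {
      const other = b.value.get(key)
      return other !== undefined && isEqual(item, other)
    })
  }
  // Primitives (and null) compare by value.
  return a.value === b.value
}
```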

#### NEQ
**Stack**: [a, b] → [boolean]

#### LT
**Stack**: [a, b] → [boolean]
**Note**: Numeric comparison (values coerced to numbers)

#### GT
**Stack**: [a, b] → [boolean]
**Note**: Numeric comparison (values coerced to numbers)

#### LTE
**Stack**: [a, b] → [boolean]
**Note**: Numeric comparison (values coerced to numbers)

#### GTE
**Stack**: [a, b] → [boolean]
**Note**: Numeric comparison (values coerced to numbers)

### Logical Operations

#### NOT
**Stack**: [a] → [!isTrue(a)]

**Note on AND/OR**: There are no AND/OR opcodes. Short-circuiting logical operations are implemented at the compiler level using conditional jumps. Because JUMP_IF_FALSE and JUMP_IF_TRUE pop the condition, the left value is duplicated first so that it remains on the stack as the result when the right side is skipped:

**AND pattern** (short-circuits if left side is false):
```
<evaluate left>
DUP
JUMP_IF_FALSE .end    # skip POP and <evaluate right>
POP
<evaluate right>
.end:
```

**OR pattern** (short-circuits if left side is true):
```
<evaluate left>
DUP
JUMP_IF_TRUE .end     # skip POP and <evaluate right>
POP
<evaluate right>
.end:
```

### Control Flow

#### JUMP
**Operand**: Offset (number)
**Effect**: Add offset to PC (relative jump)
**Stack**: No change

#### JUMP_IF_FALSE
**Operand**: Offset (number)
**Effect**: If top of stack is falsy, add offset to PC (relative jump)
**Stack**: [condition] → []

#### JUMP_IF_TRUE
**Operand**: Offset (number)
**Effect**: If top of stack is truthy, add offset to PC (relative jump)
**Stack**: [condition] → []
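
To make the relative-offset rule concrete, here is a toy interpreter loop covering only the stack and jump opcodes. It is a sketch under simplifying assumptions: raw JavaScript values stand in for tagged `Value`s, `PUSH` takes a literal instead of a constants-pool index, and the `Instr` and `run` names are hypothetical. The PC is incremented before the offset is applied, as described in the Notes section of this document.

```typescript
type Instr =
  | { op: 'PUSH'; operand: boolean | number }
  | { op: 'POP' }
  | { op: 'DUP' }
  | { op: 'HALT' }
  | { op: 'JUMP' | 'JUMP_IF_FALSE' | 'JUMP_IF_TRUE'; operand: number }

// Only null and false are falsy.
const isTrue = (v: unknown): boolean => v !== null && v !== false

function run(program: Instr[]): unknown {
  const stack: unknown[] = []
  let pc = 0
  while (pc < program.length) {
    const instr = program[pc]
    pc++ // increment first, so offsets are relative to the next instruction
    switch (instr.op) {
      case 'PUSH': stack.push(instr.operand); break
      case 'POP': stack.pop(); break
      case 'DUP': stack.push(stack[stack.length - 1]); break
      case 'JUMP': pc += instr.operand; break
      case 'JUMP_IF_FALSE': if (!isTrue(stack.pop())) pc += instr.operand; break
      case 'JUMP_IF_TRUE': if (isTrue(stack.pop())) pc += instr.operand; break
      case 'HALT': return stack.length ? stack[stack.length - 1] : null
    }
  }
  return stack.length ? stack[stack.length - 1] : null
}

// `left && 7`, with a one-instruction right-hand side so the offset is 2.
function andProgram(left: boolean): Instr[] {
  return [
    { op: 'PUSH', operand: left },       // <evaluate left>
    { op: 'DUP' },
    { op: 'JUMP_IF_FALSE', operand: 2 }, // skip POP and PUSH 7
    { op: 'POP' },
    { op: 'PUSH', operand: 7 },          // <evaluate right>
    { op: 'HALT' },
  ]
}
```

Running `andProgram(false)` leaves `false` on the stack, while `andProgram(true)` discards the duplicate and leaves `7`.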

#### BREAK
**Operand**: None
**Effect**: Unwind call stack until frame with `isBreakTarget = true`, resume there
**Stack**: No change
**Errors**: Throws if no break target found

**Behavior**:
1. Pop frames from call stack
2. For each frame, restore its returnScope and returnAddress
3. Stop when finding frame with `isBreakTarget = true`
4. Resume execution at that frame's return address

**Note on CONTINUE**: There is no CONTINUE opcode. Compilers implement continue behavior using JUMP with negative offsets to jump back to the loop start.
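
The BREAK unwinding steps can be sketched as a small function over the call stack. The `VMState` shape, the placeholder `Scope` type, and the `executeBreak` name are assumptions made for illustration; only the `CallFrame` fields come from this specification.

```typescript
type Scope = { name: string } // stand-in for the real Scope class
type CallFrame = { returnAddress: number; returnScope: Scope; isBreakTarget: boolean }
type VMState = { pc: number; scope: Scope; callStack: CallFrame[] }

function executeBreak(vm: VMState): void {
  while (vm.callStack.length > 0) {
    const frame = vm.callStack.pop()!
    // Restore the state saved in every frame we pass through.
    vm.scope = frame.returnScope
    vm.pc = frame.returnAddress
    // Stop at the first marked frame; execution resumes at its return address.
    if (frame.isBreakTarget) return
  }
  throw new Error('BREAK with no break target')
}
```

In this sketch the marked frame itself is popped as well, which matches step 4: execution continues at that frame's return address in its saved scope.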

### Exception Handling

#### PUSH_TRY
**Operand**: Catch block offset (number)
**Effect**: Push exception handler
**Stack**: No change

Registers a try block. If THROW occurs before POP_TRY, execution jumps to catch address.

#### PUSH_FINALLY
**Operand**: Finally block offset (number)
**Effect**: Add finally address to most recent exception handler
**Stack**: No change
**Errors**: Throws if no exception handler to modify

Adds a finally block to the current try/catch. The finally block will execute whether an exception is thrown or not.

#### POP_TRY
**Operand**: None
**Effect**: Pop exception handler (try block completed without exception)
**Stack**: No change
**Errors**: Throws if no handler to pop

**Behavior**:
1. Pop exception handler
2. Continue to next instruction

**Notes**:
- The VM does NOT automatically jump to finally blocks on POP_TRY
- The compiler must explicitly generate JUMP instructions to finally blocks when the try block completes normally
- The compiler must ensure catch blocks also jump to finally when present
- Finally blocks should end with normal control flow (no special terminator needed)

#### THROW
**Operand**: None
**Effect**: Throw exception with error value from stack
**Stack**: [errorValue] → (unwound)

**Behavior**:
1. Pop error value from stack
2. If no exception handlers, throw JavaScript Error with error message
3. Otherwise, pop most recent exception handler
4. Unwind call stack to handler's depth
5. Restore handler's scope
6. Push error value back onto stack
7. If handler has `finallyAddress`, jump there; otherwise jump to `catchAddress`

**Notes**:
- When THROW jumps to finally (if present), the error value remains on stack for the finally block
- The compiler must structure catch/finally blocks appropriately to handle the error value
- If finally is present, the catch block is typically entered via a jump from the finally block or through explicit compiler-generated control flow
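
The seven THROW steps map onto a short routine. This is a sketch, not the VM's implementation: `VMState`, the placeholder `Scope`, and `executeThrow` are hypothetical names, and stack entries are untyped for brevity.

```typescript
type Scope = { name: string } // stand-in for the real Scope class
type ExceptionHandler = {
  catchAddress: number
  finallyAddress?: number
  callStackDepth: number
  scope: Scope
}
type VMState = {
  pc: number
  scope: Scope
  stack: unknown[]
  callStack: unknown[]
  handlers: ExceptionHandler[]
}

function executeThrow(vm: VMState): void {
  const error = vm.stack.pop()                          // 1. pop error value
  const handler = vm.handlers.pop()                     // 3. pop most recent handler
  if (!handler) {
    throw new Error(`Uncaught exception: ${String(error)}`) // 2. no handlers
  }
  while (vm.callStack.length > handler.callStackDepth) {
    vm.callStack.pop()                                  // 4. unwind to handler depth
  }
  vm.scope = handler.scope                              // 5. restore scope
  vm.stack.push(error)                                  // 6. error back on stack
  vm.pc = handler.finallyAddress ?? handler.catchAddress // 7. finally wins if present
}
```

Because the handler is popped before the jump, a second THROW inside the catch or finally block propagates to the next enclosing handler rather than re-entering the same one.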

### Function Operations

#### MAKE_FUNCTION
**Operand**: Index into constants pool (number)
**Effect**: Create function value, capturing current scope
**Stack**: [] → [function]

The constant must be a `function_def` with:
- `params`: Parameter names
- `defaults`: Map of param names to constant indices for default values
- `body`: Instruction address of function body
- `variadic`: If true, second-to-last param (if `named` is also true) or last param collects remaining positional args as array
- `named`: If true, last param collects unmatched named args as dict

The created function captures `currentScope` as its `parentScope`.

#### CALL
**Operand**: None

**Stack**: [fn, arg1, arg2, ..., name1, val1, name2, val2, ..., positionalCount, namedCount] → [returnValue]

**Behavior**:
1. Pop namedCount from stack (top of stack)
2. Pop positionalCount from stack
3. Pop named arguments (name/value pairs) from stack
4. Pop positional arguments from stack
5. Pop function from stack
6. Mark current frame (if exists) as break target (`isBreakTarget = true`)
7. Push new call frame with current PC and scope
8. Create new scope with function's parentScope as parent
9. Bind parameters:
   - For regular functions: bind params by position, then by name, then defaults, then null
   - For variadic functions: bind fixed params, collect rest into array
   - For functions with `named: true`: bind fixed params by position/name, collect unmatched named args into dict
10. Set currentScope to new scope
11. Jump to function body

**Parameter Binding Priority** (for fixed params):
1. Named argument (if provided and matches param name)
2. Positional argument (if provided)
3. Default value (if defined)
4. Null

**Named Args Handling**:
- Named args that match fixed parameter names are bound to those params
- If the function has `named: true`, remaining named args (that don't match any fixed param) are collected into the last parameter as a dict
- This allows flexible calling: `fn(x=10, y=20, extra=30)` where `extra` goes to the named args dict

**Errors**: Throws if the value in the function position (below all arguments on the stack) is not a function
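
The binding priority for fixed parameters, together with the variadic and named collectors, might look like the sketch below. Several things here are assumptions: `FunctionDef` stores default values directly (the real `defaults` map holds constant-pool indices), values are untyped, positional argument `i` always belongs to fixed parameter `i` even when an earlier parameter was filled by name, and `bindParams` is a hypothetical helper name.

```typescript
type FunctionDef = {
  params: string[]
  defaults: Record<string, unknown> // simplified: values, not constant indices
  variadic: boolean
  named: boolean
}

function bindParams(
  fn: FunctionDef,
  positional: unknown[],
  namedArgs: Map<string, unknown>,
): Map<string, unknown> {
  const locals = new Map<string, unknown>()
  // Collector params sit at the end of the list: [...fixed, rest?, namedDict?]
  const fixed = [...fn.params]
  const namedParam = fn.named ? fixed.pop() : undefined
  const restParam = fn.variadic ? fixed.pop() : undefined
  const unmatched = new Map(namedArgs)

  fixed.forEach((name, i) => {
    if (unmatched.has(name)) {
      locals.set(name, unmatched.get(name)) // 1. named argument
      unmatched.delete(name)
    } else if (i < positional.length) {
      locals.set(name, positional[i])       // 2. positional argument
    } else if (name in fn.defaults) {
      locals.set(name, fn.defaults[name])   // 3. default value
    } else {
      locals.set(name, null)                // 4. null
    }
  })

  // Extra positional args go to the variadic param; leftovers are ignored otherwise.
  if (restParam !== undefined) locals.set(restParam, positional.slice(fixed.length))
  // Named args that matched no fixed param go to the named-args dict.
  if (namedParam !== undefined) locals.set(namedParam, unmatched)
  return locals
}
```

For a function with params `x`, `y`, `rest`, `opts` (variadic and named, with a default for `y`), calling it with one positional argument and `extra=30` binds `x` from position, `y` from its default, `rest` to an empty array, and `opts` to a dict containing `extra`.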

#### TAIL_CALL
**Operand**: None
**Effect**: Same as CALL, but reuses current call frame
**Stack**: [fn, arg1, arg2, ..., name1, val1, name2, val2, ..., positionalCount, namedCount] → [returnValue]

**Behavior**: Identical to CALL except:
- Does NOT push a new call frame
- Replaces currentScope instead of creating nested scope
- Enables unbounded tail recursion without stack overflow

#### RETURN
**Operand**: None
**Effect**: Return from function
**Stack**: [returnValue] → (restored stack with returnValue on top)

**Behavior**:
1. Pop return value (or null if stack empty)
2. Pop call frame
3. Restore scope from frame
4. Set PC to frame's return address
5. Push return value onto stack

**Errors**: Throws if no call frame to return from

#### TRY_CALL
**Operand**: Variable name (string)
**Effect**: Conditionally call function or push value/string onto stack
**Stack**: [] → [returnValue | value | name]
**Errors**: Never throws (unlike CALL)

**Behavior**:
1. Look up variable by name in scope chain
2. **If variable is a function**: Call it with 0 arguments (no positional, no named) and push the returned value onto the stack.
3. **If variable exists but is not a function**: Push the variable's value onto stack
4. **If variable doesn't exist**: Push the variable name as a string onto stack

**Use Cases**:
- DSL/templating languages with "call if callable, otherwise use as literal" semantics
- Shell-like behavior where unknown identifiers become strings
- Optional function hooks (call if defined, silently skip if not)

**Implementation Note**:
- Uses intentional fall-through in VM switch statement from TRY_CALL to CALL case
- When function is found, stacks are set up to match CALL's expectations exactly
- No break target marking or frame pushing occurs when non-function value is found

**Example**:
```
MAKE_FUNCTION () .body
STORE greet
PUSH 42
STORE answer
TRY_CALL greet      ; Calls function greet(), returns its value
TRY_CALL answer     ; Pushes 42 (number value)
TRY_CALL unknown    ; Pushes "unknown" (string)
HALT                ; Stop before falling through into the function body

.body:
PUSH "Hello!"
RETURN
```

### Array Operations

#### MAKE_ARRAY
**Operand**: Number of items (number)
**Effect**: Create array from N stack items
**Stack**: [item1, item2, ..., itemN] → [array]

Items are popped in reverse order (item1 is array[0]).

#### ARRAY_GET
**Operand**: None
**Effect**: Get array element at index
**Stack**: [array, index] → [value]
**Errors**: Throws if not array or index out of bounds

Index is coerced to number and floored.

#### ARRAY_SET
**Operand**: None
**Effect**: Set array element at index (mutates array)
**Stack**: [array, index, value] → []
**Errors**: Throws if not array or index out of bounds

#### ARRAY_PUSH
**Operand**: None
**Effect**: Append value to end of array (mutates array, grows by 1)
**Stack**: [array, value] → []
**Errors**: Throws if not array

#### ARRAY_LEN
**Operand**: None
**Effect**: Get array length
**Stack**: [array] → [length]
**Errors**: Throws if not array

### Dictionary Operations

#### MAKE_DICT
**Operand**: Number of key-value pairs (number)
**Effect**: Create dict from N key-value pairs
**Stack**: [key1, val1, key2, val2, ...] → [dict]

Keys are coerced to strings.

#### DICT_GET
**Operand**: None
**Effect**: Get dict value for key
**Stack**: [dict, key] → [value]

Returns null if key not found. Key is coerced to string.
**Errors**: Throws if not dict

#### DICT_SET
**Operand**: None
**Effect**: Set dict value for key (mutates dict)
**Stack**: [dict, key, value] → []

Key is coerced to string.
**Errors**: Throws if not dict

#### DICT_HAS
**Operand**: None
**Effect**: Check if key exists in dict
**Stack**: [dict, key] → [boolean]

Key is coerced to string.
**Errors**: Throws if not dict

### String Operations

#### STR_CONCAT
**Operand**: Number of values to concatenate (number)
**Effect**: Concatenate N values from stack into a single string
**Stack**: [val1, val2, ..., valN] → [string]

**Behavior**:
1. Pop N values from stack (in reverse order)
2. Convert each value to string using `toString()`
3. Concatenate all strings in order (val1 + val2 + ... + valN)
4. Push resulting string onto stack

**Type Coercion**:
- Numbers → string representation (e.g., `42` → `"42"`)
- Booleans → `"true"` or `"false"`
- Null → `"null"`
- Strings → identity
- Arrays → `"[item, item]"` format
- Dicts → `"{key: value, ...}"` format
- Functions → `"<function>"`

**Use Cases**:
- Building dynamic strings from multiple parts
- Template string interpolation
- String formatting with mixed types

**Composability**:
- Results can be concatenated again with additional STR_CONCAT operations
- Can leave values on stack (only consumes specified count)

**Example**:
```
PUSH "Hello"
PUSH " "
PUSH "World"
STR_CONCAT #3    ; → "Hello World"

PUSH "Count: "
PUSH 42
PUSH ", Active: "
PUSH true
STR_CONCAT #4    ; → "Count: 42, Active: true"
```

**Edge Cases**:
- `STR_CONCAT #0` produces empty string `""`
- `STR_CONCAT #1` converts single value to string
- If stack has fewer values than count, behavior depends on implementation (may use empty strings or throw)
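
A compact sketch of STR_CONCAT over a stack of primitive values is shown below. It is illustrative only: the `Simple` type omits arrays, dicts, and functions, `strConcat` is a hypothetical helper name, and throwing on underflow is just one of the two behaviors the edge cases above permit.

```typescript
type Simple = string | number | boolean | null // arrays, dicts, functions omitted

function strConcat(stack: Simple[], count: number): void {
  // Assumed choice: throw on underflow rather than padding with empty strings.
  if (stack.length < count) throw new Error('STR_CONCAT: stack underflow')
  // splice removes the top `count` values and keeps them in push order (val1..valN).
  const parts = stack.splice(stack.length - count, count)
  stack.push(parts.map(v => String(v)).join(''))
}
```

Values below the top `count` entries are untouched, which is what makes repeated STR_CONCAT operations composable.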

### TypeScript Interop

#### CALL_NATIVE
**Operand**: Function name (string)
**Effect**: Call registered TypeScript function
**Stack**: [...args] → [returnValue]

**Behavior**:
1. Look up function by name in registry
2. Mark current frame (if exists) as break target
3. Await function call (native function receives arguments and returns a Value)
4. Push return value onto stack

**Notes**:
- TypeScript functions are passed the raw stack values as arguments
- They must return a valid Value
- They can be async (VM awaits them)
- Like CALL, but function is from TypeScript registry instead of stack

**Errors**: Throws if function not found

**TypeScript Function Signature**:
```typescript
type TypeScriptFunction = (...args: Value[]) => Promise<Value> | Value;
```
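
A registry that satisfies this signature could be sketched as follows. The `NativeRegistry` class and its `register` and `call` methods are hypothetical names, the `Value` type is reduced to two variants, and how many stack values are gathered into `args` is left to the caller because this section does not specify it.

```typescript
type Value = { type: 'number'; value: number } | { type: 'null'; value: null }
type TypeScriptFunction = (...args: Value[]) => Promise<Value> | Value

class NativeRegistry {
  fns = new Map<string, TypeScriptFunction>()

  register(name: string, fn: TypeScriptFunction): void {
    this.fns.set(name, fn)
  }

  // Looks up the function, then awaits it so sync and async natives behave alike.
  async call(name: string, args: Value[]): Promise<Value> {
    const fn = this.fns.get(name)
    if (!fn) throw new Error(`CALL_NATIVE: unknown function '${name}'`)
    return await fn(...args)
  }
}
```

Awaiting the result unconditionally means the VM does not need to know whether a registered function is synchronous or returns a promise.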

### Special

#### HALT
**Operand**: None
**Effect**: Stop execution
**Stack**: No change

## Label Syntax

The bytecode format supports labels for improved readability:

**Label Definition**: `.label_name:` marks an instruction position
**Label Reference**: `.label_name` in operands (e.g., `JUMP .loop_start`)

Labels are resolved to numeric offsets during parsing. The original numeric offset syntax (`#N`) is still supported for backwards compatibility.

Example with labels:
```
JUMP .skip
.middle:
PUSH 999
HALT
.skip:
PUSH 42
HALT
```

Equivalent with numeric offsets:
```
JUMP #2
PUSH 999
HALT
PUSH 42
HALT
```
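
Label resolution can be sketched as a two-pass process: record the instruction index of each label, then rewrite `.label` operands as offsets relative to the instruction after the jump. This is an assumption-laden sketch, not the real parser: `resolveLabels` is a hypothetical name, comments are not handled, and non-label operands other than `#N` are kept as raw text.

```typescript
type Instruction = { op: string; operand?: number | string }

function resolveLabels(lines: string[]): Instruction[] {
  const labels = new Map<string, number>()
  const instructions: string[][] = []

  // Pass 1: a label points at the index of the next real instruction.
  for (const raw of lines) {
    const line = raw.trim()
    if (line === '') continue
    if (line.startsWith('.') && line.endsWith(':')) {
      labels.set(line.slice(0, -1), instructions.length)
    } else {
      instructions.push(line.split(/\s+/))
    }
  }

  // Pass 2: turn label references into relative offsets.
  return instructions.map(([op, arg], index): Instruction => {
    if (arg === undefined) return { op }
    if (arg.startsWith('#')) return { op, operand: Number(arg.slice(1)) }
    if (!arg.startsWith('.')) return { op, operand: arg } // left as raw text
    const target = labels.get(arg)
    if (target === undefined) throw new Error(`Unknown label: ${arg}`)
    // The PC has already moved past the jump when the offset is added.
    return { op, operand: target - (index + 1) }
  })
}
```

Applied to the labelled example above, `JUMP .skip` at index 0 targets index 3, giving an offset of 2, which agrees with the numeric form `JUMP #2`. A backward reference such as a loop jump produces a negative offset.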

## Common Bytecode Patterns

### If-Else Statement
```
LOAD 'x'
PUSH 5
GT
JUMP_IF_FALSE .else
# then block
JUMP .end
.else:
# else block
.end:
```

### While Loop
```
.loop_start:
# condition
JUMP_IF_FALSE .loop_end
# body
JUMP .loop_start
.loop_end:
```

### Function Definition
```
MAKE_FUNCTION <params> .function_body
STORE 'functionName'
JUMP .skip_body
.function_body:
# function code
RETURN
.skip_body:
```

### Try-Catch
```
PUSH_TRY .catch
; try block
POP_TRY
JUMP .end
.catch:
STORE 'errorVar'    ; Error is on stack
; catch block
.end:
```

### Try-Catch-Finally
```
PUSH_TRY .catch
PUSH_FINALLY .finally
; try block
POP_TRY
JUMP .finally
.catch:
STORE 'errorVar'    ; Error is on stack
; catch block
JUMP .finally
.finally:
; finally block (executes in both cases)
.end:
```

### Named Function Call
```
LOAD 'mkdir'
PUSH 'src/bin'      # positional arg
PUSH 'recursive'    # name
PUSH true           # value
PUSH 1              # positionalCount
PUSH 1              # namedCount
CALL
```

### Tail Recursive Function
```
MAKE_FUNCTION (n acc) .factorial_body
STORE 'factorial'
JUMP .main
.factorial_body:
LOAD 'n'
PUSH 0
EQ
JUMP_IF_FALSE .recurse
LOAD 'acc'
RETURN
.recurse:
LOAD 'factorial'
LOAD 'n'
PUSH 1
SUB
LOAD 'n'
LOAD 'acc'
MUL
PUSH 2              # positionalCount
PUSH 0              # namedCount
TAIL_CALL           # No stack growth!
.main:
LOAD 'factorial'
PUSH 5
PUSH 1
PUSH 2              # positionalCount
PUSH 0              # namedCount
CALL
```

## Error Conditions

### Runtime Errors

All of these should throw errors:

1. **Undefined Variable**: LOAD of non-existent variable
2. **Type Mismatch**: ARRAY_GET on non-array, DICT_GET on non-dict, CALL on non-function
3. **Index Out of Bounds**: ARRAY_GET/SET with invalid index
4. **Stack Underflow**: Arithmetic ops without enough operands
5. **Uncaught Exception**: THROW with no exception handlers
6. **Break Outside Loop**: BREAK with no break target
7. **Return Outside Function**: RETURN with no call frame
8. **Unknown Function**: CALL_NATIVE with unregistered function
9. **Mismatched Handler**: POP_TRY with no handler
10. **Invalid Constant**: PUSH with invalid constant index
11. **Invalid Function Definition**: MAKE_FUNCTION with non-function_def constant

There is no CONTINUE opcode, so "continue outside loop" is not a VM runtime error; the compiler must reject a `continue` that has no enclosing loop.

## Edge Cases

### Empty Stack
- Arithmetic/comparison ops on empty stack should throw
- RETURN with empty stack returns null
- HALT with empty stack returns null

### Null Values
- Arithmetic with null coerces to 0
- Comparisons with null work normally
- Null is falsy

### Scope Shadowing
- Variables in inner scopes shadow outer scopes during LOAD
- STORE updates the nearest scope in the chain where the variable is defined, or creates it in the current scope if it is defined nowhere

### Function Parameter Binding
- Missing positional args → use named args → use defaults → use null
- Extra positional args → collected by variadic parameter or ignored
- Extra named args → collected by named args parameter (if `named: true`) or ignored
- Named arg matching is case-sensitive

### Tail Call Optimization
- TAIL_CALL reuses frame, so return address is from original caller
- Multiple tail calls in sequence never grow stack
- TAIL_CALL can call different function (not just self-recursive)

### Break/Continue Semantics
- BREAK unwinds to frame that called the iterator function
- Multiple nested function calls: break exits all of them until reaching marked frame
- CONTINUE is implemented by the compiler using JUMPs

### Exception Unwinding
- THROW unwinds call stack to handler's depth
- Exception handlers form a stack (nested try blocks)
- Error value on stack is available in catch/finally blocks
- When THROW occurs and handler has finallyAddress, VM jumps to finally first
- Compiler is responsible for structuring control flow so finally executes in all cases
- Finally typically executes after try (if no exception) or after catch (if exception), but control flow is compiler-managed

## VM Initialization

```typescript
const vm = new VM(bytecode);
vm.registerFunction('add', (a, b) => {
  return { type: 'number', value: toNumber(a) + toNumber(b) }
})
const result = await vm.execute()
```

## Testing Considerations

### Unit Tests Should Cover

1. **Each opcode** individually with minimal setup
2. **Type coercion** for arithmetic, comparison, and logical ops
3. **Scope chain** resolution (local, parent, global)
4. **Call frames** (nested calls, return values)
5. **Exception handling** (nested try blocks, unwinding, finally blocks)
6. **Break/continue** (nested functions, iterator pattern)
7. **Closures** (capturing variables, multiple nesting levels)
8. **Tail calls** (self-recursive, mutual recursion)
9. **Parameter binding** (positional, named, defaults, variadic, named args collection, combinations)
10. **Array/dict operations** (creation, access, mutation)
11. **Error conditions** (all error cases listed above)
12. **Edge cases** (empty stack, null values, shadowing, etc.)

### Integration Tests Should Cover

1. **Recursive functions** (factorial, fibonacci)
2. **Iterator pattern** (each with break)
3. **Closure examples** (counters, adder factories)
4. **Exception examples** (try/catch/throw chains)
5. **Complex scope** (deeply nested functions)
6. **Mixed features** (variadic + defaults + named args)

### Property-Based Tests Should Cover

1. **Stack integrity** (stack size matches expectations after ops)
2. **Scope integrity** (variables remain accessible)
3. **Frame integrity** (call stack unwinds correctly)

## Version History

- **1.0** (2024): Initial specification

## Notes

- PC increment happens after each instruction execution
- Jump instructions use relative offsets (added to current PC after increment)
- All async operations (native functions) must be awaited
- Arrays and dicts are mutable (pass by reference)
- Functions are immutable values
- The VM is single-threaded (no concurrency primitives)