require labels for JUMP opcodes to avoid compiler bugs

Chris Wanstrath 2025-11-09 22:18:10 -08:00
parent 5350bb8c2b
commit b2a6021fb8
5 changed files with 57 additions and 43 deletions


@@ -55,7 +55,7 @@ No build step required - Bun runs TypeScript directly.
 ### Critical Design Decisions
-**Relative jumps**: All JUMP instructions use PC-relative offsets (not absolute addresses), making bytecode position-independent. PUSH_TRY/PUSH_FINALLY use absolute addresses.
+**Label-based jumps**: All JUMP instructions (`JUMP`, `JUMP_IF_FALSE`, `JUMP_IF_TRUE`) require label operands (`.label`), not numeric offsets. Labels are resolved to PC-relative offsets during compilation, making bytecode position-independent. PUSH_TRY/PUSH_FINALLY use absolute addresses and can accept either labels or numeric offsets.
 **Truthiness semantics**: Only `null` and `false` are falsy. Unlike JavaScript, `0`, `""`, empty arrays, and empty dicts are truthy.
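
As a minimal sketch of the label resolution described in this hunk (not ReefVM's actual compiler), the TypeScript below resolves `.label` operands to PC-relative offsets in two passes. It assumes offsets are measured from the instruction after the jump (PC + 1), as the Common Gotchas text states for the earlier numeric form. `resolveLabels`, `Line`, and `Resolved` are hypothetical names introduced for illustration.

```typescript
// Hypothetical shapes for illustration; not ReefVM's actual types.
type Line =
  | { kind: "label"; name: string }                  // e.g. ".skip:"
  | { kind: "instr"; op: string; operand?: string }  // e.g. "JUMP .skip"

type Resolved = { op: string; operand?: string | number | undefined }

const JUMP_OPS = new Set(["JUMP", "JUMP_IF_FALSE", "JUMP_IF_TRUE"])

function resolveLabels(lines: Line[]): Resolved[] {
  // Pass 1: record the instruction index each label points at.
  const targets = new Map<string, number>()
  let pc = 0
  for (const line of lines) {
    if (line.kind === "label") targets.set(line.name, pc)
    else pc++
  }

  // Pass 2: replace label operands on jumps with offsets from PC + 1.
  const out: Resolved[] = []
  for (const line of lines) {
    if (line.kind !== "instr") continue
    const here = out.length // index of the instruction being emitted
    if (JUMP_OPS.has(line.op)) {
      const label = line.operand
      if (label === undefined || !label.startsWith(".")) {
        throw new Error(`${line.op} requires label (.label), got: ${label}`)
      }
      const target = targets.get(label)
      if (target === undefined) throw new Error(`Undefined label: ${label}`)
      out.push({ op: line.op, operand: target - (here + 1) })
    } else {
      out.push({ op: line.op, operand: line.operand })
    }
  }
  return out
}

const program: Line[] = [
  { kind: "instr", op: "JUMP", operand: ".skip" },
  { kind: "instr", op: "PUSH", operand: "999" },
  { kind: "instr", op: "HALT" },
  { kind: "label", name: ".skip" },
  { kind: "instr", op: "PUSH", operand: "42" },
  { kind: "instr", op: "HALT" },
]

const resolved = resolveLabels(program)
console.log(resolved[0]) // the JUMP operand resolves to the offset 2
```

Because the offset is computed from label positions rather than written by hand, inserting or removing instructions between a jump and its target cannot leave a stale number behind, which is presumably the class of bug the commit title refers to.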
@@ -229,8 +229,8 @@ await vm.call('log', 'Hello!')
 - Automatically converts arguments to ReefVM Values
 - Converts result back to JavaScript types
-### Label Usage (Preferred)
-Use labels instead of numeric offsets for readability:
+### Label Usage (Required for JUMP instructions)
+All JUMP instructions must use labels:
 ```
 JUMP .skip
 PUSH 42
@@ -486,7 +486,7 @@ Run `bun test` to verify all tests pass before committing.
 ## Common Gotchas
-**Jump offsets**: JUMP/JUMP_IF_FALSE/JUMP_IF_TRUE use relative offsets from the next instruction (PC + 1). PUSH_TRY/PUSH_FINALLY use absolute instruction indices.
+**Label requirements**: JUMP/JUMP_IF_FALSE/JUMP_IF_TRUE require label operands (`.label`), not numeric offsets. The bytecode compiler resolves labels to PC-relative offsets internally. PUSH_TRY/PUSH_FINALLY can use either labels or absolute instruction indices (`#N`).
 **Stack operations**: Most binary operations pop in reverse order (second operand is popped first, then first operand).

43
SPEC.md

@@ -327,39 +327,45 @@ All comparison operations pop two values, compare, push boolean result.
 ```
 <evaluate left>
 DUP
-JUMP_IF_FALSE #2 # skip POP and <evaluate right>
+JUMP_IF_FALSE .end
 POP
 <evaluate right>
-end:
+.end:
 ```
 **OR pattern** (short-circuits if left side is true):
 ```
 <evaluate left>
 DUP
-JUMP_IF_TRUE #2 # skip POP and <evaluate right>
+JUMP_IF_TRUE .end
 POP
 <evaluate right>
-end:
+.end:
 ```
 ### Control Flow
 #### JUMP
-**Operand**: Offset (number)
-**Effect**: Add offset to PC (relative jump)
+**Operand**: Label (string)
+**Effect**: Jump to the specified label
 **Stack**: No change
+**Note**: JUMP only accepts label operands (`.label`), not numeric offsets. The VM resolves labels to relative offsets internally.
 #### JUMP_IF_FALSE
-**Operand**: Offset (number)
-**Effect**: If top of stack is falsy, add offset to PC (relative jump)
+**Operand**: Label (string)
+**Effect**: If top of stack is falsy, jump to the specified label
 **Stack**: [condition] → []
+**Note**: JUMP_IF_FALSE only accepts label operands (`.label`), not numeric offsets.
 #### JUMP_IF_TRUE
-**Operand**: Offset (number)
-**Effect**: If top of stack is truthy, add offset to PC (relative jump)
+**Operand**: Label (string)
+**Effect**: If top of stack is truthy, jump to the specified label
 **Stack**: [condition] → []
+**Note**: JUMP_IF_TRUE only accepts label operands (`.label`), not numeric offsets.
 #### BREAK
 **Operand**: None
 **Effect**: Unwind call stack until frame with `isBreakTarget = true`, resume there
@@ -814,14 +820,16 @@ CALL ; → "Hi, Bob!"
 ## Label Syntax
-The bytecode format supports labels for improved readability:
+The bytecode format requires labels for control flow jumps:
 **Label Definition**: `.label_name:` marks an instruction position
 **Label Reference**: `.label_name` in operands (e.g., `JUMP .loop_start`)
-Labels are resolved to numeric offsets during parsing. The original numeric offset syntax (`#N`) is still supported for backwards compatibility.
+Labels are resolved to relative PC offsets during bytecode compilation. All JUMP instructions (`JUMP`, `JUMP_IF_FALSE`, `JUMP_IF_TRUE`) require label operands.
-Example with labels:
+**Note**: Exception handling instructions (`PUSH_TRY`, `PUSH_FINALLY`) and function definitions (`MAKE_FUNCTION`) can use either labels or absolute instruction indices (`#N`).
+Example:
 ```
 JUMP .skip
 .middle:
@@ -832,15 +840,6 @@ JUMP .skip
 HALT
 ```
-Equivalent with numeric offsets:
-```
-JUMP #2
-PUSH 999
-HALT
-PUSH 42
-HALT
-```
 ## Common Bytecode Patterns
 ### If-Else Statement


@@ -44,9 +44,9 @@ type InstructionTuple =
   | ["NOT"]
   // Control flow
-  | ["JUMP", string | number]
-  | ["JUMP_IF_FALSE", string | number]
-  | ["JUMP_IF_TRUE", string | number]
+  | ["JUMP", string]
+  | ["JUMP_IF_FALSE", string]
+  | ["JUMP_IF_TRUE", string]
   | ["BREAK"]
   // Exception handling
@@ -56,7 +56,7 @@ type InstructionTuple =
   | ["THROW"]
   // Functions
-  | ["MAKE_FUNCTION", string[], string | number]
+  | ["MAKE_FUNCTION", string[], string]
   | ["CALL"]
   | ["TAIL_CALL"]
   | ["RETURN"]
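
The narrowed tuple types above make a numeric jump target a compile-time error for code that builds instructions as tuples. A minimal sketch, using a hypothetical three-member subset `JumpTuple` rather than the full `InstructionTuple`; whether tuple operands carry the leading dot of the text syntax is an assumption here.

```typescript
// Hypothetical subset of the tuple type, for illustration only.
type JumpTuple =
  | ["JUMP", string]
  | ["JUMP_IF_FALSE", string]
  | ["JUMP_IF_TRUE", string]

// A label operand type-checks.
const ok: JumpTuple = ["JUMP", ".skip"]

// A numeric operand is rejected by the type checker once `number` is
// removed from the union; the directive below documents that expectation.
// @ts-expect-error numeric operands no longer type-check
const bad: JumpTuple = ["JUMP", 2]

console.log(ok[0], ok[1])
```

The type-level check only covers tuples written in TypeScript; bytecode supplied as text still depends on the validator's runtime check.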


@@ -87,11 +87,15 @@ const OPCODES_WITHOUT_OPERANDS = new Set([
   OpCode.DOT_GET,
 ])
-// immediate = immediate number, eg #5
-const OPCODES_REQUIRING_IMMEDIATE_OR_LABEL = new Set([
+// JUMP* instructions require labels only (no numeric immediates)
+const OPCODES_REQUIRING_LABEL = new Set([
   OpCode.JUMP,
   OpCode.JUMP_IF_FALSE,
   OpCode.JUMP_IF_TRUE,
+])
+// PUSH_TRY/PUSH_FINALLY still allow immediate or label
+const OPCODES_REQUIRING_IMMEDIATE_OR_LABEL = new Set([
   OpCode.PUSH_TRY,
   OpCode.PUSH_FINALLY,
 ])
@@ -197,6 +201,16 @@ export function validateBytecode(source: string): ValidationResult {
 // Validate specific operand formats
 if (operand) {
+  if (OPCODES_REQUIRING_LABEL.has(opCode)) {
+    if (!operand.startsWith('.')) {
+      errors.push({
+        line: lineNum,
+        message: `${opName} requires label (.label), got: ${operand}`,
+      })
+      continue
+    }
+  }
   if (OPCODES_REQUIRING_IMMEDIATE_OR_LABEL.has(opCode)) {
     if (!operand.startsWith('#') && !operand.startsWith('.')) {
       errors.push({
@@ -310,11 +324,11 @@ export function validateBytecode(source: string): ValidationResult {
   }
 }
-// Validate body address
-if (!bodyAddr!.startsWith('.') && !bodyAddr!.startsWith('#')) {
+// Validate body address (must be a label)
+if (!bodyAddr!.startsWith('.')) {
   errors.push({
     line: lineNum,
-    message: `Invalid body address: expected .label or #offset`,
+    message: `Invalid body address: expected .label, got: ${bodyAddr}`,
   })
 }
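
The operand rules in the validator hunks above can be sketched in isolation. The snippet below is a simplified stand-in, not the project's `validateBytecode`: the function name `checkOperand`, the set names, and the use of plain opcode-name strings in place of the `OpCode` enum are assumptions made so the example runs on its own. The exact wording of the PUSH_TRY/PUSH_FINALLY message is also assumed, modeled on the text the older tests asserted.

```typescript
// Simplified stand-in for the validator's operand rules (names are hypothetical).
const LABEL_ONLY = new Set(["JUMP", "JUMP_IF_FALSE", "JUMP_IF_TRUE"])
const IMMEDIATE_OR_LABEL = new Set(["PUSH_TRY", "PUSH_FINALLY"])

// Returns an error message, or null when the operand format is acceptable.
function checkOperand(opName: string, operand: string): string | null {
  // JUMP* opcodes accept only `.label` operands.
  if (LABEL_ONLY.has(opName) && !operand.startsWith(".")) {
    return `${opName} requires label (.label), got: ${operand}`
  }
  // PUSH_TRY/PUSH_FINALLY accept `#N` immediates or `.label` operands.
  if (
    IMMEDIATE_OR_LABEL.has(opName) &&
    !operand.startsWith("#") &&
    !operand.startsWith(".")
  ) {
    // Message wording is an assumption, not copied from the project.
    return `${opName} requires immediate (#number) or label (.label), got: ${operand}`
  }
  return null
}

console.log(checkOperand("JUMP", "#2"))     // rejected: numeric immediate
console.log(checkOperand("JUMP", ".skip"))  // accepted
console.log(checkOperand("PUSH_TRY", "#5")) // accepted: immediates still allowed here
```

Keeping the two opcode sets separate lets the stricter rule apply to jumps without changing what the exception-handling opcodes accept.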


@@ -201,17 +201,17 @@ test("formatValidationErrors produces readable output", () => {
   expect(formatted).toContain("UNKNOWN")
 })
-test("detects JUMP without # or .label", () => {
+test("detects JUMP without .label", () => {
   const source = `
     JUMP 5
     HALT
   `
   const result = validateBytecode(source)
   expect(result.valid).toBe(false)
-  expect(result.errors[0]!.message).toContain("JUMP requires immediate (#number) or label (.label)")
+  expect(result.errors[0]!.message).toContain("JUMP requires label (.label)")
 })
-test("detects JUMP_IF_TRUE without # or .label", () => {
+test("detects JUMP_IF_TRUE without .label", () => {
   const source = `
     PUSH true
     JUMP_IF_TRUE 2
@@ -219,10 +219,10 @@ test("detects JUMP_IF_TRUE without # or .label", () => {
   `
   const result = validateBytecode(source)
   expect(result.valid).toBe(false)
-  expect(result.errors[0]!.message).toContain("JUMP_IF_TRUE requires immediate (#number) or label (.label)")
+  expect(result.errors[0]!.message).toContain("JUMP_IF_TRUE requires label (.label)")
 })
-test("detects JUMP_IF_FALSE without # or .label", () => {
+test("detects JUMP_IF_FALSE without .label", () => {
   const source = `
     PUSH false
     JUMP_IF_FALSE 2
@@ -230,17 +230,18 @@ test("detects JUMP_IF_FALSE without # or .label", () => {
   `
   const result = validateBytecode(source)
   expect(result.valid).toBe(false)
-  expect(result.errors[0]!.message).toContain("JUMP_IF_FALSE requires immediate (#number) or label (.label)")
+  expect(result.errors[0]!.message).toContain("JUMP_IF_FALSE requires label (.label)")
 })
-test("allows JUMP with immediate number", () => {
+test("rejects JUMP with immediate number", () => {
   const source = `
     JUMP #2
     PUSH 999
     HALT
   `
   const result = validateBytecode(source)
-  expect(result.valid).toBe(true)
+  expect(result.valid).toBe(false)
+  expect(result.errors[0]!.message).toContain("JUMP requires label (.label)")
 })
 test("detects MAKE_ARRAY without #", () => {