Compare commits

10 commits: a652f83b63 ... 0f7d3126a2

| SHA1 |
|---|
| 0f7d3126a2 |
| 78ae96fc72 |
| b0d5a7f50c |
| 290270dc7b |
| 4619791b7d |
| aee9fa0747 |
| 7de1682e91 |
| 2fc321596f |
| 1e6fabf954 |
| b2c5db77b2 |
CLAUDE.md (+12)

@@ -195,6 +195,18 @@ function parseExpression(input: string) {
 **Expression-oriented design**: Everything returns a value - commands, assignments, functions. This enables composition and functional patterns.
 
+**Scope-aware property access (DotGet)**: The parser uses Lezer's `@context` feature to track variable scope at parse time. When it encounters `obj.prop`, it checks if `obj` is in scope:
+
+- **In scope** → Parses as `DotGet(Identifier, Identifier)` → compiles to `TRY_LOAD obj; PUSH 'prop'; DOT_GET`
+- **Not in scope** → Parses as `Word("obj.prop")` → compiles to `PUSH 'obj.prop'` (treated as file path/string)
+
+Implementation files:
+- **src/parser/scopeTracker.ts**: ContextTracker that maintains immutable scope chain
+- **src/parser/tokenizer.ts**: External tokenizer checks `stack.context` to decide if dot creates DotGet or Word
+- Scope tracking: Captures variables from assignments (`x = 5`) and function parameters (`fn x:`)
+- See `src/parser/tests/dot-get.test.ts` for comprehensive examples
+
+**Why this matters**: This enables shell-like file paths (`readme.txt`) while supporting dictionary/array access (`config.path`) without quotes, determined entirely at parse time based on lexical scope.
+
 **EOF handling**: The grammar uses `(statement | newlineOrSemicolon)+ eof?` to handle empty lines and end-of-file without infinite loops.
 
 ## Compiler Architecture
docs/parser-architecture.md (new file, +557)

@@ -0,0 +1,557 @@
# Shrimp Parser Architecture

This document explains the special cases, tricks, and design decisions in the Shrimp parser and tokenizer.

## Table of Contents

1. [Token Types and Their Purpose](#token-types-and-their-purpose)
2. [External Tokenizer Tricks](#external-tokenizer-tricks)
3. [Grammar Special Cases](#grammar-special-cases)
4. [Scope Tracking Architecture](#scope-tracking-architecture)
5. [Common Pitfalls](#common-pitfalls)

---

## Token Types and Their Purpose

### Four Token Types from the External Tokenizer

The external tokenizer (`src/parser/tokenizer.ts`) emits four different token types based on context:

| Token | Purpose | Example |
|-------|---------|---------|
| `Identifier` | Regular identifiers in expressions, function calls | `echo`, `x` in `x + 1` |
| `AssignableIdentifier` | Identifiers on the LHS of `=` or in function params | `x` in `x = 5`, params in `fn x y:` |
| `Word` | Anything else: paths, URLs, @mentions, #hashtags | `./file.txt`, `@user`, `#tag` |
| `IdentifierBeforeDot` | An identifier that is in scope and followed by `.` | `obj` in `obj.prop` |

### Why We Need Both Identifier Types

**The Problem:** At the start of a statement like `x ...`, the parser doesn't yet know whether it's:

- An assignment: `x = 5` (needs `AssignableIdentifier`)
- A function call: `x hello world` (needs `Identifier`)

**The Solution:** The external tokenizer uses a three-way decision:

1. **Only `AssignableIdentifier` can shift** (e.g., in the `Params` rule) → emit `AssignableIdentifier`
2. **Only `Identifier` can shift** (e.g., in function arguments) → emit `Identifier`
3. **Both can shift** (ambiguous statement start) → peek ahead for `=` to disambiguate

See [Identifier vs AssignableIdentifier Disambiguation](#1-identifier-vs-assignableidentifier-disambiguation) below for implementation details.

---

## External Tokenizer Tricks

### 1. Identifier vs AssignableIdentifier Disambiguation

**Location:** `src/parser/tokenizer.ts` lines 88-118

**The Challenge:** When both `Identifier` and `AssignableIdentifier` are valid (at statement start), how do we choose?

**The Solution:** Three-way branching with lookahead:

```typescript
const canAssignable = stack.canShift(AssignableIdentifier)
const canRegular = stack.canShift(Identifier)

if (canAssignable && !canRegular) {
  // Only AssignableIdentifier valid (e.g., in Params)
  input.acceptToken(AssignableIdentifier)
} else if (canRegular && !canAssignable) {
  // Only Identifier valid (e.g., in function args)
  input.acceptToken(Identifier)
} else {
  // BOTH possible - peek ahead for '='
  // Skip whitespace, then check if the next char is '='
  const nextCh = getFullCodePoint(input, peekPos)
  if (nextCh === 61 /* = */) {
    input.acceptToken(AssignableIdentifier) // It's an assignment
  } else {
    input.acceptToken(Identifier) // It's a function call
  }
}
```

**Key Insight:** `stack.canShift()` returns true for BOTH token types when the grammar has multiple valid paths. We can't use `canShift()` alone - we need lookahead.

**Why This Works:**

- `fn x y: ...` → in the `Params` rule, only `AssignableIdentifier` can shift → no lookahead needed
- `echo hello` → both can shift, but no `=` ahead → emits `Identifier` → parses as `FunctionCall`
- `x = 5` → both can shift, finds `=` ahead → emits `AssignableIdentifier` → parses as `Assign`

### 2. Surrogate Pair Handling for Emoji

**Location:** `src/parser/tokenizer.ts` lines 71-84, the `getFullCodePoint()` function

**The Problem:** JavaScript strings use UTF-16, but emoji like 🍤 use code points outside the BMP (Basic Multilingual Plane), requiring surrogate pairs.

**The Solution:** When reading characters, check for high surrogates (0xD800-0xDBFF) and combine them with low surrogates (0xDC00-0xDFFF):

```typescript
const getFullCodePoint = (input: InputStream, pos: number): number => {
  const ch = input.peek(pos)

  // Check if this is a high surrogate (0xD800-0xDBFF)
  if (ch >= 0xd800 && ch <= 0xdbff) {
    const low = input.peek(pos + 1)
    // Check if the next unit is a low surrogate (0xDC00-0xDFFF)
    if (low >= 0xdc00 && low <= 0xdfff) {
      // Combine the surrogate pair into a full code point
      return 0x10000 + ((ch & 0x3ff) << 10) + (low & 0x3ff)
    }
  }

  return ch
}
```

**Why This Matters:** Without this, the 🍤 in `shrimp-🍤` would be read as two unrelated UTF-16 code units instead of a single code point, and the tokenizer would misjudge where the word ends.

### 3. Context-Aware Termination for Semicolon and Colon

**Location:** `src/parser/tokenizer.ts` lines 51-57

**The Problem:** How do we parse `basename ./cool;` vs `basename ./cool; 2`?

**The Solution:** Only treat `;` and `:` as terminators if they're followed by whitespace (or EOF):

```typescript
if (canBeWord && (ch === 59 /* ; */ || ch === 58 /* : */)) {
  const nextCh = getFullCodePoint(input, pos + 1)
  if (!isWordChar(nextCh)) break // It's a terminator
  // Otherwise, continue consuming as part of the Word
}
```

**Examples:**

- `basename ./cool;` → `;` is followed by EOF → terminates the word at `./cool`
- `basename ./cool;2` → `;` is followed by `2` → included in the word as `./cool;2`
- `basename ./cool; 2` → `;` is followed by a space → terminates at `./cool`; `2` is the next arg

### 4. Scope-Aware Property Access (DotGet)

**Location:** `src/parser/tokenizer.ts` lines 19-48

**The Problem:** How do we distinguish `obj.prop` (property access) from `readme.txt` (filename)?

**The Solution:** When we see a `.` after an identifier, check whether that identifier is in scope:

```typescript
if (ch === 46 /* . */ && isValidIdentifier) {
  // Build identifier text
  let identifierText = '...' // (surrogate-pair aware)

  const scopeContext = stack.context as ScopeContext | undefined
  const scope = scopeContext?.scope

  if (scope?.has(identifierText)) {
    // In scope - stop here, emit IdentifierBeforeDot
    // Grammar will parse as DotGet
    input.acceptToken(IdentifierBeforeDot)
    return
  }
  // Not in scope - continue consuming as Word
  // Will parse as Word("readme.txt")
}
```

**Examples:**

- `config = {path: "..."}; config.path` → `config` is in scope → parses as `DotGet(IdentifierBeforeDot, Identifier)`
- `cat readme.txt` → `readme` is not in scope → parses as `Word("readme.txt")`

---

## Grammar Special Cases

### 1. expressionWithoutIdentifier Pattern

**Location:** `src/parser/shrimp.grammar` lines 200-210

**The Problem:** A GLR conflict in the `consumeToTerminator` rule:

```lezer
consumeToTerminator {
  ambiguousFunctionCall | // → FunctionCallOrIdentifier → Identifier
  expression              // → Identifier
}
```

When parsing `my-var` at statement level, both paths want the same `Identifier` token, causing a conflict.

**The Solution:** Remove `Identifier` from the `expression` path by creating `expressionWithoutIdentifier`:

```lezer
expression {
  expressionWithoutIdentifier | DotGet | Identifier
}

expressionWithoutIdentifier {
  ParenExpr | Word | String | Number | Boolean | Regex | Null
}
```

Then use `expressionWithoutIdentifier` in places where we don't want bare identifiers:

```lezer
consumeToTerminator {
  PipeExpr |
  ambiguousFunctionCall | // ← Handles standalone identifiers
  DotGet |
  IfExpr |
  FunctionDef |
  Assign |
  BinOp |
  expressionWithoutIdentifier // ← No bare Identifier here
}
```

**Why This Works:** Standalone identifiers now MUST go through `ambiguousFunctionCall`, which is semantically what we want (they're either function calls or variable references).
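For instance, a bare identifier statement reduces through `ambiguousFunctionCall` to a `FunctionCallOrIdentifier` node. A minimal sketch, mirroring the tree shape the existing parser tests assert with the suite's `toMatchTree` helper:

```typescript
expect('my-var').toMatchTree(`
  FunctionCallOrIdentifier
    Identifier my-var`)
```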
### 2. @skip {} Wrapper for DotGet

**Location:** `src/parser/shrimp.grammar` lines 176-183

**The Problem:** DotGet needs to be whitespace-sensitive (no spaces allowed around `.`), but the global `@skip { space }` would remove them.

**The Solution:** Use an empty `@skip {}` wrapper to disable automatic whitespace skipping:

```lezer
@skip {} {
  DotGet {
    IdentifierBeforeDot "." Identifier
  }

  String { "'" stringContent* "'" }
}
```

**Why This Matters:**

- `obj.prop` → parses as `DotGet` ✓
- `obj. prop` → would parse as `obj` followed by `. prop` (error) if whitespace were skipped
- `obj .prop` → would parse as `obj` followed by `.prop` (error) if whitespace were skipped

### 3. EOF Handling in the item Rule

**Location:** `src/parser/shrimp.grammar` lines 54-58

**The Problem:** How do we handle empty lines and end-of-file without infinite loops?

**The Solution:** Use alternatives instead of repetition for EOF:

```lezer
item {
  consumeToTerminator newlineOrSemicolon | // Statement with newline/semicolon
  consumeToTerminator eof |                // Statement at end of file
  newlineOrSemicolon                       // Allow blank lines
}
```

**Why Not Just `item { (statement | newlineOrSemicolon)+ eof? }`?**

That would match EOF multiple times (once after each statement), causing parser errors. By making EOF part of an alternative, it's matched only once per item.

### 4. Params Uses AssignableIdentifier

**Location:** `src/parser/shrimp.grammar` lines 153-155

```lezer
Params {
  AssignableIdentifier*
}
```

**Why This Matters:** Function parameters are in "assignable" positions - they're being bound to values when the function is called. Using `AssignableIdentifier` here:

1. Makes the grammar explicit about which identifiers create bindings
2. Enables the tokenizer to use `canShift(AssignableIdentifier)` to detect param context
3. Allows the scope tracker to capture only `AssignableIdentifier` tokens (see the example below)
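The parser tests make the binding positions visible. For `add = fn a b: a + b end`, the params surface as `AssignableIdentifier` nodes while identifiers in the function body stay plain `Identifier` (tree excerpted from the test suite):

```
Assign
  AssignableIdentifier add
  operator =
  FunctionDef
    keyword fn
    Params
      AssignableIdentifier a
      AssignableIdentifier b
    colon :
    BinOp
      Identifier a
      ...
```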
### 5. String Interpolation Inside @skip {}

**Location:** `src/parser/shrimp.grammar` lines 181-198

**The Problem:** String contents need to preserve whitespace, but string interpolation `$identifier` needs to use the external tokenizer.

**The Solution:** Put `String` inside `@skip {}` and use the external tokenizer for `Identifier` within interpolation:

```lezer
@skip {} {
  String { "'" stringContent* "'" }
}

stringContent {
  StringFragment | // Matches literal text (preserves spaces)
  Interpolation |  // $identifier or $(expr)
  EscapeSeq        // \$, \n, etc.
}

Interpolation {
  "$" Identifier | // Uses the external tokenizer!
  "$" ParenExpr
}
```

**Key Insight:** External tokenizers work inside `@skip {}` blocks! The tokenizer still gets called even when skipping is disabled.

---

## Scope Tracking Architecture

### Overview

Scope tracking uses Lezer's `@context` feature to maintain a scope chain during parsing. This enables:

- Distinguishing `obj.prop` (property access) from `readme.txt` (filename)
- Tracking which variables are in scope at each position in the parse tree

### Architecture: Scope vs ScopeContext

**Two-Class Design:**

```typescript
// Pure, hashable scope - only variable tracking
class Scope {
  constructor(
    public parent: Scope | null,
    public vars: Set<string>
  ) {}

  has(name: string): boolean
  add(...names: string[]): Scope
  push(): Scope  // Create a child scope
  pop(): Scope   // Return to the parent
  hash(): number // For incremental parsing
}

// Wrapper with temporary state
export class ScopeContext {
  constructor(
    public scope: Scope,
    public pendingIds: string[] = []
  ) {}
}
```

**Why This Separation?**

1. **Scope is pure and hashable** - it contains only committed variable bindings, no temporary state
2. **ScopeContext holds temporary state** - the `pendingIds` array captures identifiers during parsing but isn't part of the hash
3. **The hash function only hashes Scope** - incremental parsing cares only about the actual scope, not pending identifiers (see the sketch below)
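A sketch of what that hash looks like. The parent-chain folding matches `src/parser/scopeTracker.ts`; the fold over variable names is an assumption for illustration:

```typescript
// Sketch only - the real implementation lives in src/parser/scopeTracker.ts.
// Only committed bindings participate; pendingIds never touch the hash.
class Scope {
  constructor(public parent: Scope | null, public vars = new Set<string>()) {}

  hash(): number {
    let h = 0
    for (const name of this.vars) {
      for (let i = 0; i < name.length; i++) {
        h = (h << 5) - h + name.charCodeAt(i)
        h |= 0 // Clamp to 32-bit integer range
      }
    }
    if (this.parent) {
      h = (h << 5) - h + this.parent.hash()
      h |= 0
    }
    return h
  }
}
```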
### How Scope Tracking Works

**1. Capture Phase (shift):**

When the parser shifts an `AssignableIdentifier` token, the scope tracker captures its text:

```typescript
shift(context, term, stack, input) {
  if (term === terms.AssignableIdentifier) {
    // Build the text by peeking at the input
    let text = '...' // (read from input.pos to stack.pos)

    return new ScopeContext(
      context.scope,
      [...context.pendingIds, text] // Append to pending
    )
  }
  return context
}
```

**2. Commit Phase (reduce):**

When the parser reduces to `Assign` or `Params`, the scope tracker commits the pending identifiers:

```typescript
reduce(context, term, stack, input) {
  // Assignment: pop the last identifier, add it to scope
  if (term === terms.Assign && context.pendingIds.length > 0) {
    const varName = context.pendingIds[context.pendingIds.length - 1]!
    return new ScopeContext(
      context.scope.add(varName),     // Add to scope
      context.pendingIds.slice(0, -1) // Remove from pending
    )
  }

  // Function params: add all identifiers, push a new scope
  if (term === terms.Params) {
    const newScope = context.scope.push()
    return new ScopeContext(
      context.pendingIds.length > 0
        ? newScope.add(...context.pendingIds)
        : newScope,
      [] // Clear pending
    )
  }

  // Function exit: pop the scope
  if (term === terms.FunctionDef) {
    return new ScopeContext(context.scope.pop(), [])
  }

  return context
}
```

**3. Usage in the Tokenizer:**

The tokenizer accesses the scope to check whether identifiers are bound:

```typescript
const scopeContext = stack.context as ScopeContext | undefined
const scope = scopeContext?.scope

if (scope?.has(identifierText)) {
  // Identifier is in scope - can be used in DotGet
  input.acceptToken(IdentifierBeforeDot)
}
```

### Why Only Track AssignableIdentifier?

**Before (complex):**

- Tracked ALL identifiers with `term === terms.Identifier`
- Used an `isInParams` flag to know which ones to keep
- Had to manually clear "stale" identifiers after DotGet, FunctionCall, etc.

**After (simple):**

- Only track `AssignableIdentifier` tokens
- These only appear in `Params` and `Assign` (by grammar design)
- No stale identifiers - they're consumed immediately

**Example:**

```shrimp
fn x y: echo x end
```

Scope tracking:

1. Shift `AssignableIdentifier("x")` → pending = ["x"]
2. Shift `AssignableIdentifier("y")` → pending = ["x", "y"]
3. Reduce `Params` → scope = {x, y}, pending = []
4. Shift `Identifier("echo")` → **not captured** (not an AssignableIdentifier)
5. Shift `Identifier("x")` → **not captured**
6. Reduce `FunctionDef` → pop scope

No stale-identifier clearing needed!

---

## Common Pitfalls

### 1. Forgetting Surrogate Pairs

**Problem:** Using `input.peek(i)` directly gives UTF-16 code units, not Unicode code points.

**Solution:** Always use `getFullCodePoint(input, pos)` when emoji may be involved.

**Example:**

```typescript
// ❌ Wrong - breaks on emoji
const ch = input.peek(pos)
if (isEmoji(ch)) { ... }

// ✓ Right - handles surrogate pairs
const ch = getFullCodePoint(input, pos)
if (isEmoji(ch)) { ... }
pos += getCharSize(ch) // Advance by 1 or 2 code units
```

### 2. Adding Pending State to the Hash

**Problem:** Including `pendingIds` or `isInParams` in the hash function breaks incremental parsing.

**Why?** The hash is used to decide whether a cached parse-tree node can be reused. If the hash includes temporary state that doesn't affect parsing decisions, nodes will be invalidated unnecessarily.

**Solution:** Only hash the `Scope` (vars + parent chain), not the `ScopeContext` wrapper.

```typescript
// ✓ Right
const hashScope = (context: ScopeContext): number => {
  return context.scope.hash() // Only hash the committed scope
}

// ❌ Wrong
const hashScope = (context: ScopeContext): number => {
  let h = context.scope.hash()
  h = (h << 5) - h + context.pendingIds.length // Don't do this!
  return h
}
```

### 3. Using canShift() Alone for Disambiguation

**Problem:** `stack.canShift(AssignableIdentifier)` returns true when BOTH paths are possible (e.g., at statement start).

**Why?** The GLR parser maintains multiple parse states. If any state can shift the token, `canShift()` returns true.

**Solution:** Check BOTH token types and use lookahead when both are possible:

```typescript
const canAssignable = stack.canShift(AssignableIdentifier)
const canRegular = stack.canShift(Identifier)

if (canAssignable && canRegular) {
  // Both possible - need lookahead
  const hasEquals = peekForEquals(input, pos)
  input.acceptToken(hasEquals ? AssignableIdentifier : Identifier)
}
```

### 4. Clearing Pending Identifiers Too Eagerly

**Problem:** The old code had to clear pending identifiers after DotGet, FunctionCall, etc. to prevent state leakage. This was fragile and easy to forget.

**Why This Happened:** We were tracking ALL identifiers, not just assignable ones.

**Solution:** Only track `AssignableIdentifier` tokens. They only appear in contexts where they'll be consumed (Params, Assign), so no clearing is needed.
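Concretely, this is the branch the old `reduce` handler needed and the new design deletes outright (the old code, reproduced for contrast):

```typescript
// Old design only: sweep up stale identifiers by hand after other reductions.
if (term === terms.DotGet || term === terms.FunctionCallOrIdentifier || term === terms.FunctionCall) {
  return context.clearPending()
}
```

With only `AssignableIdentifier` captured, the `Assign` and `Params` reductions consume every pending entry, so this branch has nothing left to clear.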
### 5. Line Number Confusion in the Edit Tool

**Problem:** The Edit tool displays each line with a numbered prefix (like ` 5→`). The prefix is display-only - it is not part of the file's content, so don't mistake it for text on the line.

**How to Read:**

- The number before `→` is the actual line number
- Use that number when referencing code in comments or documentation
- Example: ` 5→export const foo` means the code is on line 5

---

## Testing Strategy

### Parser Tests

Use the `toMatchTree` helper to verify parse-tree structure:

```typescript
test('assignment with AssignableIdentifier', () => {
  expect('x = 5').toMatchTree(`
    Assign
      AssignableIdentifier x
      operator =
      Number 5
  `)
})
```

**Key Testing Patterns:**

- Test both token-type expectations (Identifier vs AssignableIdentifier)
- Test scope-aware features (DotGet for in-scope vs Word for out-of-scope; see the sketch after this list)
- Test edge cases (empty lines, EOF, surrogate pairs)
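A minimal sketch of the scope-aware pattern, assuming the `toMatchTree` setup used throughout the suite (the child nodes shown are inferred from the grammar, not copied from a real test):

```typescript
// In scope: 'basename' was assigned, so the dot yields a DotGet.
expect('basename = 5; basename.prop').toMatchTree(`
  Assign
    AssignableIdentifier basename
    operator =
    Number 5
  DotGet
    IdentifierBeforeDot basename
    Identifier prop`)

// Not in scope: 'readme' is unbound, so 'readme.txt' stays one Word.
expect('cat readme.txt').toMatchTree(`
  FunctionCall
    Identifier cat
    PositionalArg
      Word readme.txt`)
```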
### Debugging Parser Issues

1. **Check token types:** Run the parser on the input and examine the tree structure (see the snippet below)
2. **Test canShift():** Add logging to the tokenizer to see what `canShift()` returns
3. **Verify scope state:** Log the scope contents during parsing
4. **Use GLR visualization:** Lezer has tools for visualizing parse states
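For step 1, a Lezer tree stringifies its node names directly, which is usually enough to spot a wrong token type. A minimal sketch, assuming the generated parser module is importable as `./shrimp`:

```typescript
import { parser } from './shrimp'

// Prints the tree's node names, e.g. something like
// "Program(Assign(AssignableIdentifier,Number))" for this input.
console.log(parser.parse('x = 5').toString())
```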
---

## Further Reading

- [Lezer System Guide](https://lezer.codemirror.net/docs/guide/)
- [Lezer API Reference](https://lezer.codemirror.net/docs/ref/)
- [CLAUDE.md](../CLAUDE.md) - General project guidance
- [Scope Tracker Source](../src/parser/scopeTracker.ts)
- [Tokenizer Source](../src/parser/tokenizer.ts)
@@ -9,6 +9,7 @@ import {
   getAllChildren,
   getAssignmentParts,
   getBinaryParts,
+  getDotGetParts,
   getFunctionCallParts,
   getFunctionDefParts,
   getIfExprParts,

@@ -17,8 +18,8 @@ import {
   getStringParts,
 } from '#compiler/utils'
 
-// const DEBUG = false
-const DEBUG = true
+const DEBUG = false
+// const DEBUG = true
 
 type Label = `.${string}`
@@ -189,6 +190,19 @@ export class Compiler {
         return [[`TRY_LOAD`, value]]
       }
 
+      case terms.Word: {
+        return [['PUSH', value]]
+      }
+
+      case terms.DotGet: {
+        const { objectName, propertyName } = getDotGetParts(node, input)
+        const instructions: ProgramItem[] = []
+        instructions.push(['TRY_LOAD', objectName])
+        instructions.push(['PUSH', propertyName])
+        instructions.push(['DOT_GET'])
+        return instructions
+      }
+
       case terms.BinOp: {
         const { left, op, right } = getBinaryParts(node)
         const instructions: ProgramItem[] = []
@@ -213,7 +213,7 @@ describe('Regex', () => {
   })
 })
 
-describe.only('native functions', () => {
+describe.skip('native functions', () => {
   test('print function', () => {
     const add = (x: number, y: number) => x + y
     expect(`add 5 9`).toEvaluateTo(14, { add })
@@ -40,9 +40,9 @@ export const getAssignmentParts = (node: SyntaxNode) => {
   const children = getAllChildren(node)
   const [left, equals, right] = children
 
-  if (!left || left.type.id !== terms.Identifier) {
+  if (!left || left.type.id !== terms.AssignableIdentifier) {
     throw new CompilerError(
-      `Assign left child must be an Identifier, got ${left ? left.type.name : 'none'}`,
+      `Assign left child must be an AssignableIdentifier, got ${left ? left.type.name : 'none'}`,
       node.from,
       node.to
     )
@@ -70,9 +70,9 @@ export const getFunctionDefParts = (node: SyntaxNode, input: string) => {
   }
 
   const paramNames = getAllChildren(paramsNode).map((param) => {
-    if (param.type.id !== terms.Identifier) {
+    if (param.type.id !== terms.AssignableIdentifier) {
       throw new CompilerError(
-        `FunctionDef params must be Identifiers, got ${param.type.name}`,
+        `FunctionDef params must be AssignableIdentifiers, got ${param.type.name}`,
         param.from,
         param.to
       )
@@ -198,3 +198,37 @@ export const getStringParts = (node: SyntaxNode, input: string) => {
 
   return { parts, hasInterpolation: parts.length > 0 }
 }
+
+export const getDotGetParts = (node: SyntaxNode, input: string) => {
+  const children = getAllChildren(node)
+  const [object, property] = children
+
+  if (children.length !== 2) {
+    throw new CompilerError(
+      `DotGet expected 2 identifier children, got ${children.length}`,
+      node.from,
+      node.to
+    )
+  }
+
+  if (object.type.id !== terms.IdentifierBeforeDot) {
+    throw new CompilerError(
+      `DotGet object must be an IdentifierBeforeDot, got ${object.type.name}`,
+      object.from,
+      object.to
+    )
+  }
+
+  if (property.type.id !== terms.Identifier) {
+    throw new CompilerError(
+      `DotGet property must be an Identifier, got ${property.type.name}`,
+      property.from,
+      property.to
+    )
+  }
+
+  const objectName = input.slice(object.from, object.to)
+  const propertyName = input.slice(property.from, property.to)
+
+  return { objectName, propertyName }
+}
@@ -1,42 +1,11 @@
-import { ContextTracker } from '@lezer/lr'
+import { ContextTracker, InputStream } from '@lezer/lr'
 import * as terms from './shrimp.terms'
 
 export class Scope {
-  constructor(
-    public parent: Scope | null,
-    public vars: Set<string>,
-    public pendingIdentifiers: string[] = [],
-    public isInParams: boolean = false
-  ) {}
+  constructor(public parent: Scope | null, public vars = new Set<string>()) {}
 
   has(name: string): boolean {
-    return this.vars.has(name) ?? this.parent?.has(name)
+    return this.vars.has(name) || (this.parent?.has(name) ?? false)
   }
-
-  add(...names: string[]): Scope {
-    const newVars = new Set(this.vars)
-    names.forEach((name) => newVars.add(name))
-    return new Scope(this.parent, newVars, [], this.isInParams)
-  }
-
-  push(): Scope {
-    return new Scope(this, new Set(), [], false)
-  }
-
-  pop(): Scope {
-    return this.parent ?? new Scope(null, new Set(), [], false)
-  }
-
-  withPendingIdentifiers(ids: string[]): Scope {
-    return new Scope(this.parent, this.vars, ids, this.isInParams)
-  }
-
-  withIsInParams(value: boolean): Scope {
-    return new Scope(this.parent, this.vars, this.pendingIdentifiers, value)
-  }
-
-  clearPending(): Scope {
-    return new Scope(this.parent, this.vars, [], this.isInParams)
-  }
 
   hash(): number {

@@ -51,76 +20,77 @@ export class Scope {
       h = (h << 5) - h + this.parent.hash()
       h |= 0
     }
-    // Include pendingIdentifiers and isInParams in hash
-    h = (h << 5) - h + this.pendingIdentifiers.length
-    h = (h << 5) - h + (this.isInParams ? 1 : 0)
-    h |= 0
     return h
   }
+
+  // Static methods that return new Scopes (immutable operations)
+
+  static add(scope: Scope, ...names: string[]): Scope {
+    const newVars = new Set(scope.vars)
+    names.forEach((name) => newVars.add(name))
+    return new Scope(scope.parent, newVars)
+  }
+
+  push(): Scope {
+    return new Scope(this, new Set())
+  }
+
+  pop(): Scope {
+    return this.parent ?? this
+  }
 }
 
-export const trackScope = new ContextTracker<Scope>({
-  start: new Scope(null, new Set(), [], false),
+// Tracker context that combines Scope with temporary pending identifiers
+class TrackerContext {
+  constructor(public scope: Scope, public pendingIds: string[] = []) {}
+}
 
-  shift(context, term, stack, input) {
-    // Track fn keyword to enter param capture mode
-    if (term === terms.Fn) {
-      return context.withIsInParams(true).withPendingIdentifiers([])
-    }
-
-    // Capture identifiers
-    if (term === terms.Identifier) {
-      // Build text by peeking backwards from stack.pos to input.pos
-      let text = ''
-      const start = input.pos
-      const end = stack.pos
-      for (let i = start; i < end; i++) {
-        const offset = i - input.pos
-        const ch = input.peek(offset)
-        if (ch === -1) break
-        text += String.fromCharCode(ch)
-      }
-
-      // Capture ALL identifiers when in params
-      if (context.isInParams) {
-        return context.withPendingIdentifiers([...context.pendingIdentifiers, text])
-      }
-      // Capture FIRST identifier for assignments
-      else if (context.pendingIdentifiers.length === 0) {
-        return context.withPendingIdentifiers([text])
-      }
-    }
-
-    return context
+// Extract identifier text from input stream
+const readIdentifierText = (input: InputStream, start: number, end: number): string => {
+  let text = ''
+  for (let i = start; i < end; i++) {
+    const offset = i - input.pos
+    const ch = input.peek(offset)
+    if (ch === -1) break
+    text += String.fromCharCode(ch)
+  }
+  return text
+}
+
+export const trackScope = new ContextTracker<TrackerContext>({
+  start: new TrackerContext(new Scope(null, new Set())),
+
+  shift(context, term, stack, input) {
+    if (term !== terms.AssignableIdentifier) return context
+
+    const text = readIdentifierText(input, input.pos, stack.pos)
+    return new TrackerContext(context.scope, [...context.pendingIds, text])
   },
 
-  reduce(context, term, stack, input) {
+  reduce(context, term) {
     // Add assignment variable to scope
-    if (term === terms.Assign && context.pendingIdentifiers.length > 0) {
-      return context.add(context.pendingIdentifiers[0]!)
+    if (term === terms.Assign) {
+      const varName = context.pendingIds.at(-1)
+      if (!varName) return context
+      return new TrackerContext(Scope.add(context.scope, varName), context.pendingIds.slice(0, -1))
     }
 
-    // Push new scope and add parameters
+    // Push new scope and add all parameters
    if (term === terms.Params) {
-      const newScope = context.push()
-      if (context.pendingIdentifiers.length > 0) {
-        return newScope.add(...context.pendingIdentifiers).withIsInParams(false)
+      let newScope = context.scope.push()
+      if (context.pendingIds.length > 0) {
+        newScope = Scope.add(newScope, ...context.pendingIds)
      }
-      return newScope.withIsInParams(false)
+      return new TrackerContext(newScope, [])
     }
 
     // Pop scope when exiting function
     if (term === terms.FunctionDef) {
-      return context.pop()
-    }
-
-    // Clear stale identifiers after non-assignment statements
-    if (term === terms.DotGet || term === terms.FunctionCallOrIdentifier || term === terms.FunctionCall) {
-      return context.clearPending()
+      return new TrackerContext(context.scope.pop(), [])
     }
 
     return context
   },
 
-  hash: (context) => context.hash(),
+  hash: (context) => context.scope.hash(),
 })
@@ -43,7 +43,7 @@
 }
 
-@external tokens tokenizer from "./tokenizer" { Identifier, Word, IdentifierBeforeDot }
+@external tokens tokenizer from "./tokenizer" { Identifier, AssignableIdentifier, Word, IdentifierBeforeDot }
 
 @precedence {
   pipe @left,

@@ -151,11 +151,11 @@ ConditionalOp {
 }
 
 Params {
-  Identifier*
+  AssignableIdentifier*
 }
 
 Assign {
-  Identifier "=" consumeToTerminator
+  AssignableIdentifier "=" consumeToTerminator
 }
 
 BinOp {
@@ -1,35 +1,36 @@
 // This file was generated by lezer-generator. You probably shouldn't edit it.
 export const
   Identifier = 1,
-  Word = 2,
-  IdentifierBeforeDot = 3,
-  Program = 4,
-  PipeExpr = 5,
-  FunctionCall = 6,
-  PositionalArg = 7,
-  ParenExpr = 8,
-  FunctionCallOrIdentifier = 9,
-  BinOp = 10,
-  ConditionalOp = 15,
-  String = 24,
-  StringFragment = 25,
-  Interpolation = 26,
-  EscapeSeq = 27,
-  Number = 28,
-  Boolean = 29,
-  Regex = 30,
-  Null = 31,
-  DotGet = 32,
-  FunctionDef = 33,
-  Fn = 34,
-  Params = 35,
-  colon = 36,
-  end = 37,
-  Underscore = 38,
-  NamedArg = 39,
-  NamedArgPrefix = 40,
-  IfExpr = 42,
-  ThenBlock = 45,
-  ElsifExpr = 46,
-  ElseExpr = 48,
-  Assign = 50
+  AssignableIdentifier = 2,
+  Word = 3,
+  IdentifierBeforeDot = 4,
+  Program = 5,
+  PipeExpr = 6,
+  FunctionCall = 7,
+  PositionalArg = 8,
+  ParenExpr = 9,
+  FunctionCallOrIdentifier = 10,
+  BinOp = 11,
+  ConditionalOp = 16,
+  String = 25,
+  StringFragment = 26,
+  Interpolation = 27,
+  EscapeSeq = 28,
+  Number = 29,
+  Boolean = 30,
+  Regex = 31,
+  Null = 32,
+  DotGet = 33,
+  FunctionDef = 34,
+  Fn = 35,
+  Params = 36,
+  colon = 37,
+  end = 38,
+  Underscore = 39,
+  NamedArg = 40,
+  NamedArgPrefix = 41,
+  IfExpr = 43,
+  ThenBlock = 46,
+  ElsifExpr = 47,
+  ElseExpr = 49,
+  Assign = 51
@@ -5,21 +5,21 @@ import {trackScope} from "./scopeTracker"
 import {highlighting} from "./highlight"
 export const parser = LRParser.deserialize({
   version: 14,
-  states: ".jQVQaOOO#UQbO'#CeO#fQPO'#CfO#tQPO'#DlO$wQaO'#CdO%OOSO'#CtOOQ`'#Dp'#DpO%^OPO'#C|O%cQPO'#DoO%zQaO'#D{OOQ`'#C}'#C}OOQO'#Dm'#DmO&SQPO'#DlO&bQaO'#EPOOQO'#DW'#DWOOQO'#Dl'#DlO&iQPO'#DkOOQ`'#Dk'#DkOOQ`'#Da'#DaQVQaOOOOQ`'#Do'#DoOOQ`'#Cc'#CcO&qQaO'#DTOOQ`'#Dn'#DnOOQ`'#Db'#DbO'OQbO,58|O'oQaO,59zO&bQaO,59QO&bQaO,59QO'|QbO'#CeO)XQPO'#CfO)iQPO,59OO)zQPO,59OO)uQPO,59OO*uQPO,59OO*}QaO'#CvO+VQWO'#CwOOOO'#Dt'#DtOOOO'#Dc'#DcO+kOSO,59`OOQ`,59`,59`O+yO`O,59hOOQ`'#Dd'#DdO,OQaO'#DPO,WQPO,5:gO,]QaO'#DfO,bQPO,58{O,sQPO,5:kO,zQPO,5:kOOQ`,5:V,5:VOOQ`-E7_-E7_OOQ`,59o,59oOOQ`-E7`-E7`OOQO1G/f1G/fOOQO1G.l1G.lO-PQPO1G.lO&bQaO,59VO&bQaO,59VOOQ`1G.j1G.jOOOO,59b,59bOOOO,59c,59cOOOO-E7a-E7aOOQ`1G.z1G.zOOQ`1G/S1G/SOOQ`-E7b-E7bO-kQaO1G0RO-{QbO'#CeOOQO,5:Q,5:QOOQO-E7d-E7dO.lQaO1G0VOOQO1G.q1G.qO.|QPO1G.qO/WQPO7+%mO/]QaO7+%nOOQO'#DY'#DYOOQO7+%q7+%qO/mQaO7+%rOOQ`<<IX<<IXO0TQPO'#DeO0YQaO'#EOO0pQPO<<IYOOQO'#DZ'#DZO0uQPO<<I^OOQ`,5:P,5:POOQ`-E7c-E7cOOQ`AN>tAN>tO&bQaO'#D[OOQO'#Dg'#DgO1QQPOAN>xO1]QPO'#D^OOQOAN>xAN>xO1bQPOAN>xO1gQPO,59vO1nQPO,59vOOQO-E7e-E7eOOQOG24dG24dO1sQPOG24dO1xQPO,59xO1}QPO1G/bOOQOLD*OLD*OO/]QaO1G/dO/mQaO7+$|OOQO7+%O7+%OOOQO<<Hh<<Hh",
+  states: ".jQVQaOOO#XQbO'#CfO$RQPO'#CgO$aQPO'#DmO$xQaO'#CeO%gOSO'#CuOOQ`'#Dq'#DqO%uOPO'#C}O%zQPO'#DpO&cQaO'#D|OOQ`'#DO'#DOOOQO'#Dn'#DnO&kQPO'#DmO&yQaO'#EQOOQO'#DX'#DXO'hQPO'#DaOOQO'#Dm'#DmO'mQPO'#DlOOQ`'#Dl'#DlOOQ`'#Db'#DbQVQaOOOOQ`'#Dp'#DpOOQ`'#Cd'#CdO'uQaO'#DUOOQ`'#Do'#DoOOQ`'#Dc'#DcO(PQbO,58}O&yQaO,59RO&yQaO,59RO)XQPO'#CgO)iQPO,59PO)zQPO,59PO)uQPO,59PO*uQPO,59PO*}QaO'#CwO+VQWO'#CxOOOO'#Du'#DuOOOO'#Dd'#DdO+kOSO,59aOOQ`,59a,59aO+yO`O,59iOOQ`'#De'#DeO,OQaO'#DQO,WQPO,5:hO,]QaO'#DgO,bQPO,58|O,sQPO,5:lO,zQPO,5:lO-PQaO,59{OOQ`,5:W,5:WOOQ`-E7`-E7`OOQ`,59p,59pOOQ`-E7a-E7aOOQO1G.m1G.mO-^QPO1G.mO&yQaO,59WO&yQaO,59WOOQ`1G.k1G.kOOOO,59c,59cOOOO,59d,59dOOOO-E7b-E7bOOQ`1G.{1G.{OOQ`1G/T1G/TOOQ`-E7c-E7cO-xQaO1G0SO!QQbO'#CfOOQO,5:R,5:ROOQO-E7e-E7eO.YQaO1G0WOOQO1G/g1G/gOOQO1G.r1G.rO.jQPO1G.rO.tQPO7+%nO.yQaO7+%oOOQO'#DZ'#DZOOQO7+%r7+%rO/ZQaO7+%sOOQ`<<IY<<IYO/qQPO'#DfO/vQaO'#EPO0^QPO<<IZOOQO'#D['#D[O0cQPO<<I_OOQ`,5:Q,5:QOOQ`-E7d-E7dOOQ`AN>uAN>uO&yQaO'#D]OOQO'#Dh'#DhO0nQPOAN>yO0yQPO'#D_OOQOAN>yAN>yO1OQPOAN>yO1TQPO,59wO1[QPO,59wOOQO-E7f-E7fOOQOG24eG24eO1aQPOG24eO1fQPO,59yO1kQPO1G/cOOQOLD*PLD*PO.yQaO1G/eO/ZQaO7+$}OOQO7+%P7+%POOQO<<Hi<<Hi",
-  stateData: "2Y~O!^OS~OPPOQUORVOlUOmUOnUOoUOrXO{]O!eSO!gTO!qaO~OPdOQUORVOlUOmUOnUOoUOrXOveOxfO!eSO!gTOZ!cX[!cX]!cX^!cXyXX~O`jO!qXX!uXXuXX~P}OZkO[kO]lO^lO~OZkO[kO]lO^lO!q!`X!u!`Xu!`X~OQUORVOlUOmUOnUOoUO!eSO!gTO~OPmO~P$]OiuO!gxO!isO!jtO~O!nyO~OZ!cX[!cX]!cX^!cX!q!`X!u!`Xu!`X~OPzOtsP~Oy}O!q!`X!u!`Xu!`X~OPdO~P$]O!q!RO!u!RO~OPdOrXOv!TO~P$]OPdOrXOveOxfOyUa!qUa!uUa!fUauUa~P$]OPPOrXO{]O~P$]O`!cXa!cXb!cXc!cXd!cXe!cXf!cXg!cX!fXX~P}O`!YOa!YOb!YOc!YOd!YOe!YOf!ZOg!ZO~OZkO[kO]lO^lO~P(mOZkO[kO]lO^lO!f![O~O!f![OZ!cX[!cX]!cX^!cX`!cXa!cXb!cXc!cXd!cXe!cXf!cXg!cX~Oy}O!f![O~OP!]O!eSO~O!g!^O!i!^O!j!^O!k!^O!l!^O!m!^O~OiuO!g!`O!isO!jtO~OP!aO~OPzOtsX~Ot!cO~OP!dO~Oy}O!qTa!uTa!fTauTa~Ot!gO~P(mOt!gO~OZkO[kO]Yi^Yi!qYi!uYi!fYiuYi~OPPOrXO{]O!q!kO~P$]OPdOrXOveOxfOyXX!qXX!uXX!fXXuXX~P$]OPPOrXO{]O!q!nO~P$]O!f_it_i~P(mOu!oO~OPPOrXO{]Ou!rP~P$]OPPOrXO{]Ou!rP!P!rP!R!rP~P$]O!q!uO~OPPOrXO{]Ou!rX!P!rX!R!rX~P$]Ou!wO~Ou!|O!P!xO!R!{O~Ou#RO!P!xO!R!{O~Ot#TO~Ou#RO~Ot#UO~P(mOt#UO~Ou#VO~O!q#WO~O!q#XO~Ol^n[n~",
+  stateData: "1v~O!_OS~OPPOQ_ORUOSVOmUOnUOoUOpUOsXO|]O!fSO!hTO!rbO~OPeORUOSVOmUOnUOoUOpUOsXOwfOygO!fSO!hTOzYX!rYX!vYX!gYXvYX~O[!dX]!dX^!dX_!dXa!dXb!dXc!dXd!dXe!dXf!dXg!dXh!dX~P!QO[kO]kO^lO_lO~O[kO]kO^lO_lO!r!aX!v!aXv!aX~OPPORUOSVOmUOnUOoUOpUO!fSO!hTO~OjtO!hwO!jrO!ksO~O!oxO~O[!dX]!dX^!dX_!dX!r!aX!v!aXv!aX~OQyOutP~Oz|O!r!aX!v!aXv!aX~OPeORUOSVOmUOnUOoUOpUO!fSO!hTO~Oa!QO~O!r!RO!v!RO~OsXOw!TO~P&yOsXOwfOygOzVa!rVa!vVa!gVavVa~P&yOa!XOb!XOc!XOd!XOe!XOf!XOg!YOh!YO~O[kO]kO^lO_lO~P(mO[kO]kO^lO_lO!g!ZO~O!g!ZO[!dX]!dX^!dX_!dXa!dXb!dXc!dXd!dXe!dXf!dXg!dXh!dX~Oz|O!g!ZO~OP![O!fSO~O!h!]O!j!]O!k!]O!l!]O!m!]O!n!]O~OjtO!h!_O!jrO!ksO~OP!`O~OQyOutX~Ou!bO~OP!cO~Oz|O!rUa!vUa!gUavUa~Ou!fO~P(mOu!fO~OQ_OsXO|]O~P$xO[kO]kO^Zi_Zi!rZi!vZi!gZivZi~OQ_OsXO|]O!r!kO~P$xOQ_OsXO|]O!r!nO~P$xO!g`iu`i~P(mOv!oO~OQ_OsXO|]Ov!sP~P$xOQ_OsXO|]Ov!sP!Q!sP!S!sP~P$xO!r!uO~OQ_OsXO|]Ov!sX!Q!sX!S!sX~P$xOv!wO~Ov!|O!Q!xO!S!{O~Ov#RO!Q!xO!S!{O~Ou#TO~Ov#RO~Ou#UO~P(mOu#UO~Ov#VO~O!r#WO~O!r#XO~Om_o]o~",
-  goto: "+v!uPPPPP!v#V#e#k#V$WPPPP$mPPPPPPPP$yP%c%cPPPP%g&RP&hPPP#ePP&kP&w&z'TP'XP&k'_'e'm's'y(S(ZPPP(a(e(y)])c*_PPP*{PPPPPP+P+PP+b+j+jd_Ocj!c!g!k!n!q#W#XRqSiZOScj}!c!g!k!n!q#W#XXgPim!d|UOPS]cfijklm!Y!Z!c!d!g!k!n!q!x#W#XR!]sdROcj!c!g!k!n!q#W#XQoSQ!WkR!XlQqSQ!Q]Q!h!ZR#P!x}UOPS]cfijklm!Y!Z!c!d!g!k!n!q!x#W#XTuTwdWOcj!c!g!k!n!q#W#XidPS]fiklm!Y!Z!d!xd_Ocj!c!g!k!n!q#W#XWePim!dR!TfR|Xe_Ocj!c!g!k!n!q#W#XR!m!gQ!t!nQ#Y#WR#Z#XT!y!t!zQ!}!tR#S!zQcOR!ScUiPm!dR!UiQwTR!_wQ{XR!b{W!q!k!n#W#XR!v!qS!O[rR!f!OQ!z!tR#Q!zTbOcS`OcQ!VjQ!j!cQ!l!gZ!p!k!n!q#W#Xd[Ocj!c!g!k!n!q#W#XQrSR!e}XhPim!ddQOcj!c!g!k!n!q#W#XWePim!dQnSQ!P]Q!TfQ!WkQ!XlQ!h!YQ!i!ZR#O!xdWOcj!c!g!k!n!q#W#XfdP]fiklm!Y!Z!d!xRpSTvTwoYOPcfijm!c!d!g!k!n!q#W#XQ!r!kV!s!n#W#Xe^Ocj!c!g!k!n!q#W#X",
+  goto: "+m!vPPPPPP!w#W#f#k#W$VPPPP$lPPPPPPPP$xP%a%aPPPP%e&OP&dPPP#fPP&gP&s&v'PP'TP&g'Z'a'h'n't'}(UPPP([(`(t)W)]*WPPP*sPPPPPP*w*wP+X+a+ad`Od!Q!b!f!k!n!q#W#XRpSiZOSd|!Q!b!f!k!n!q#W#XVhPj!czUOPS]dgjkl!Q!X!Y!b!c!f!k!n!q!x#W#XR![rdROd!Q!b!f!k!n!q#W#XQnSQ!VkR!WlQpSQ!P]Q!h!YR#P!x{UOPS]dgjkl!Q!X!Y!b!c!f!k!n!q!x#W#XTtTvdWOd!Q!b!f!k!n!q#W#XgePS]gjkl!X!Y!c!xd`Od!Q!b!f!k!n!q#W#XUfPj!cR!TgR{Xe`Od!Q!b!f!k!n!q#W#XR!m!fQ!t!nQ#Y#WR#Z#XT!y!t!zQ!}!tR#S!zQdOR!SdSjP!cR!UjQvTR!^vQzXR!azW!q!k!n#W#XR!v!qS}[qR!e}Q!z!tR#Q!zTcOdSaOdQ!g!QQ!j!bQ!l!fZ!p!k!n!q#W#Xd[Od!Q!b!f!k!n!q#W#XQqSR!d|ViPj!cdQOd!Q!b!f!k!n!q#W#XUfPj!cQmSQ!O]Q!TgQ!VkQ!WlQ!h!XQ!i!YR#O!xdWOd!Q!b!f!k!n!q#W#XdeP]gjkl!X!Y!c!xRoSTuTvmYOPdgj!Q!b!c!f!k!n!q#W#XQ!r!kV!s!n#W#Xe^Od!Q!b!f!k!n!q#W#X",
-  nodeNames: "⚠ Identifier Word IdentifierBeforeDot Program PipeExpr FunctionCall PositionalArg ParenExpr FunctionCallOrIdentifier BinOp operator operator operator operator ConditionalOp operator operator operator operator operator operator operator operator String StringFragment Interpolation EscapeSeq Number Boolean Regex Null DotGet FunctionDef keyword Params colon end Underscore NamedArg NamedArgPrefix operator IfExpr keyword ThenBlock ThenBlock ElsifExpr keyword ElseExpr keyword Assign",
+  nodeNames: "⚠ Identifier AssignableIdentifier Word IdentifierBeforeDot Program PipeExpr FunctionCall PositionalArg ParenExpr FunctionCallOrIdentifier BinOp operator operator operator operator ConditionalOp operator operator operator operator operator operator operator operator String StringFragment Interpolation EscapeSeq Number Boolean Regex Null DotGet FunctionDef keyword Params colon end Underscore NamedArg NamedArgPrefix operator IfExpr keyword ThenBlock ThenBlock ElsifExpr keyword ElseExpr keyword Assign",
-  maxTerm: 83,
+  maxTerm: 84,
   context: trackScope,
   nodeProps: [
-    ["closedBy", 36,"end"],
-    ["openedBy", 37,"colon"]
+    ["closedBy", 37,"end"],
+    ["openedBy", 38,"colon"]
   ],
   propSources: [highlighting],
   skippedNodes: [0],
   repeatNodeCount: 7,
-  tokenData: "!&X~R!SOX$_XY$|YZ%gZp$_pq$|qr&Qrt$_tu'Yuw$_wx'_xy'dyz'}z{(h{|)R|}$_}!O)l!O!P,b!P!Q,{!Q![*]![!]5j!]!^%g!^!_6T!_!`7_!`!a7x!a#O$_#O#P9S#P#R$_#R#S9X#S#T$_#T#U9r#U#X;W#X#Y=m#Y#ZDs#Z#];W#]#^JO#^#b;W#b#cKp#c#d! Y#d#f;W#f#g!!z#g#h;W#h#i!#q#i#o;W#o#p$_#p#q!%i#q;'S$_;'S;=`$v<%l~$_~O$_~~!&SS$dUiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_S$yP;=`<%l$__%TUiS!^ZOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V%nUiS!qROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V&VWiSOt$_uw$_x!_$_!_!`&o!`#O$_#P;'S$_;'S;=`$v<%lO$_V&vUaRiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~'_O!i~~'dO!g~V'kUiS!eROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V(UUiS!fROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V(oUZRiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V)YU]RiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V)sWiS^ROt$_uw$_x!Q$_!Q![*]![#O$_#P;'S$_;'S;=`$v<%lO$_V*dYiSlROt$_uw$_x!O$_!O!P+S!P!Q$_!Q![*]![#O$_#P;'S$_;'S;=`$v<%lO$_V+XWiSOt$_uw$_x!Q$_!Q![+q![#O$_#P;'S$_;'S;=`$v<%lO$_V+xWiSlROt$_uw$_x!Q$_!Q![+q![#O$_#P;'S$_;'S;=`$v<%lO$_T,iU!nPiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V-SWiS[ROt$_uw$_x!P$_!P!Q-l!Q#O$_#P;'S$_;'S;=`$v<%lO$_V-q^iSOY.mYZ$_Zt.mtu/puw.mwx/px!P.m!P!Q$_!Q!}.m!}#O4c#O#P2O#P;'S.m;'S;=`5d<%lO.mV.t^iSnROY.mYZ$_Zt.mtu/puw.mwx/px!P.m!P!Q2e!Q!}.m!}#O4c#O#P2O#P;'S.m;'S;=`5d<%lO.mR/uXnROY/pZ!P/p!P!Q0b!Q!}/p!}#O1P#O#P2O#P;'S/p;'S;=`2_<%lO/pR0eP!P!Q0hR0mUnR#Z#[0h#]#^0h#a#b0h#g#h0h#i#j0h#m#n0hR1SVOY1PZ#O1P#O#P1i#P#Q/p#Q;'S1P;'S;=`1x<%lO1PR1lSOY1PZ;'S1P;'S;=`1x<%lO1PR1{P;=`<%l1PR2RSOY/pZ;'S/p;'S;=`2_<%lO/pR2bP;=`<%l/pV2jWiSOt$_uw$_x!P$_!P!Q3S!Q#O$_#P;'S$_;'S;=`$v<%lO$_V3ZbiSnROt$_uw$_x#O$_#P#Z$_#Z#[3S#[#]$_#]#^3S#^#a$_#a#b3S#b#g$_#g#h3S#h#i$_#i#j3S#j#m$_#m#n3S#n;'S$_;'S;=`$v<%lO$_V4h[iSOY4cYZ$_Zt4ctu1Puw4cwx1Px#O4c#O#P1i#P#Q.m#Q;'S4c;'S;=`5^<%lO4cV5aP;=`<%l4cV5gP;=`<%l.mT5qUiStPOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V6[WbRiSOt$_uw$_x!_$_!_!`6t!`#O$_#P;'S$_;'S;=`$v<%lO$_V6{UcRiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V7fU`RiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V8PWdRiSOt$_uw$_x!_$_!_!`8i!`#O$_#P;'S$_;'S;=`$v<%lO$_V8pUeRiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~9XO!j~V9`UiSvROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V9w[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#b;W#b#c;{#c#o;W#o;'S$_;'S;=`$v<%lO$_U:tUxQiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_U;]YiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V<Q[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#W;W#W#X<v#X#o;W#o;'S$_;'S;=`$v<%lO$_V<}YfRiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V=r^iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#a>n#a#b;W#b#cCR#c#o;W#o;'S$_;'S;=`$v<%lO$_V>s[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#g;W#g#h?i#h#o;W#o;'S$_;'S;=`$v<%lO$_V?n^iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#X;W#X#Y@j#Y#];W#]#^Aa#^#o;W#o;'S$_;'S;=`$v<%lO$_V@qY!RPiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VAf[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#Y;W#Y#ZB[#Z#o;W#o;'S$_;'S;=`$v<%lO$_VBcY!PPiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VCW[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#W;W#W#XC|#X#o;W#o;'S$_;'S;=`$v<%lO$_VDTYiSuROt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VDx]iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#UEq#U#b;W#b#cIX#c#o;W#o;'S$_;'S;=`$v<%lO$_VEv[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aFl#a#o;W#o;'S$_;'S;=`$v<%lO$_VFq[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#g;W#g#hGg#h#o;W#o;'S$_;'S;=`$v<%lO$_VGl[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#X;W#X#YHb#Y#o;W#o;'S$_;'S;=`$v<%lO$_VHiYmRiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VI`YrRiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VJT[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#Y;W#Y#ZJy#Z#o;W#o;'S$_;'S;=`$v<%lO$_VKQY{PiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$__Kw[!kWiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#i;W#i#jLm#j#o;W#o;'S$_;'S;=`$v<%lO$_VLr[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aMh#a#o;W#o;'S$_;'S;=`$v<%lO$_VMm[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aNc#a#o;W#o;'S$_;'S;=`$v<%lO$_VNjYoRiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V! _[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#f;W#f#g!!T#g#o;W#o;'S$_;'S;=`$v<%lO$_V!![YgRiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_^!#RY!mWiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$__!#x[!lWiSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#f;W#f#g!$n#g#o;W#o;'S$_;'S;=`$v<%lO$_V!$s[iSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#i;W#i#jGg#j#o;W#o;'S$_;'S;=`$v<%lO$_V!%pUyRiSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~!&XO!u~",
+  tokenData: "!&X~R!SOX$_XY$|YZ%gZp$_pq$|qr&Qrt$_tu'Yuw$_wx'_xy'dyz'}z{(h{|)R|}$_}!O)l!O!P,b!P!Q,{!Q![*]![!]5j!]!^%g!^!_6T!_!`7_!`!a7x!a#O$_#O#P9S#P#R$_#R#S9X#S#T$_#T#U9r#U#X;W#X#Y=m#Y#ZDs#Z#];W#]#^JO#^#b;W#b#cKp#c#d! Y#d#f;W#f#g!!z#g#h;W#h#i!#q#i#o;W#o#p$_#p#q!%i#q;'S$_;'S;=`$v<%l~$_~O$_~~!&SS$dUjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_S$yP;=`<%l$__%TUjS!_ZOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V%nUjS!rROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V&VWjSOt$_uw$_x!_$_!_!`&o!`#O$_#P;'S$_;'S;=`$v<%lO$_V&vUbRjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~'_O!j~~'dO!h~V'kUjS!fROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V(UUjS!gROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V(oU[RjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V)YU^RjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V)sWjS_ROt$_uw$_x!Q$_!Q![*]![#O$_#P;'S$_;'S;=`$v<%lO$_V*dYjSmROt$_uw$_x!O$_!O!P+S!P!Q$_!Q![*]![#O$_#P;'S$_;'S;=`$v<%lO$_V+XWjSOt$_uw$_x!Q$_!Q![+q![#O$_#P;'S$_;'S;=`$v<%lO$_V+xWjSmROt$_uw$_x!Q$_!Q![+q![#O$_#P;'S$_;'S;=`$v<%lO$_T,iU!oPjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V-SWjS]ROt$_uw$_x!P$_!P!Q-l!Q#O$_#P;'S$_;'S;=`$v<%lO$_V-q^jSOY.mYZ$_Zt.mtu/puw.mwx/px!P.m!P!Q$_!Q!}.m!}#O4c#O#P2O#P;'S.m;'S;=`5d<%lO.mV.t^jSoROY.mYZ$_Zt.mtu/puw.mwx/px!P.m!P!Q2e!Q!}.m!}#O4c#O#P2O#P;'S.m;'S;=`5d<%lO.mR/uXoROY/pZ!P/p!P!Q0b!Q!}/p!}#O1P#O#P2O#P;'S/p;'S;=`2_<%lO/pR0eP!P!Q0hR0mUoR#Z#[0h#]#^0h#a#b0h#g#h0h#i#j0h#m#n0hR1SVOY1PZ#O1P#O#P1i#P#Q/p#Q;'S1P;'S;=`1x<%lO1PR1lSOY1PZ;'S1P;'S;=`1x<%lO1PR1{P;=`<%l1PR2RSOY/pZ;'S/p;'S;=`2_<%lO/pR2bP;=`<%l/pV2jWjSOt$_uw$_x!P$_!P!Q3S!Q#O$_#P;'S$_;'S;=`$v<%lO$_V3ZbjSoROt$_uw$_x#O$_#P#Z$_#Z#[3S#[#]$_#]#^3S#^#a$_#a#b3S#b#g$_#g#h3S#h#i$_#i#j3S#j#m$_#m#n3S#n;'S$_;'S;=`$v<%lO$_V4h[jSOY4cYZ$_Zt4ctu1Puw4cwx1Px#O4c#O#P1i#P#Q.m#Q;'S4c;'S;=`5^<%lO4cV5aP;=`<%l4cV5gP;=`<%l.mT5qUjSuPOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V6[WcRjSOt$_uw$_x!_$_!_!`6t!`#O$_#P;'S$_;'S;=`$v<%lO$_V6{UdRjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V7fUaRjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V8PWeRjSOt$_uw$_x!_$_!_!`8i!`#O$_#P;'S$_;'S;=`$v<%lO$_V8pUfRjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~9XO!k~V9`UjSwROt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_V9w[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#b;W#b#c;{#c#o;W#o;'S$_;'S;=`$v<%lO$_U:tUyQjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_U;]YjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V<Q[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#W;W#W#X<v#X#o;W#o;'S$_;'S;=`$v<%lO$_V<}YgRjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V=r^jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#a>n#a#b;W#b#cCR#c#o;W#o;'S$_;'S;=`$v<%lO$_V>s[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#g;W#g#h?i#h#o;W#o;'S$_;'S;=`$v<%lO$_V?n^jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#X;W#X#Y@j#Y#];W#]#^Aa#^#o;W#o;'S$_;'S;=`$v<%lO$_V@qY!SPjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VAf[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#Y;W#Y#ZB[#Z#o;W#o;'S$_;'S;=`$v<%lO$_VBcY!QPjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VCW[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#W;W#W#XC|#X#o;W#o;'S$_;'S;=`$v<%lO$_VDTYjSvROt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VDx]jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#UEq#U#b;W#b#cIX#c#o;W#o;'S$_;'S;=`$v<%lO$_VEv[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aFl#a#o;W#o;'S$_;'S;=`$v<%lO$_VFq[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#g;W#g#hGg#h#o;W#o;'S$_;'S;=`$v<%lO$_VGl[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#X;W#X#YHb#Y#o;W#o;'S$_;'S;=`$v<%lO$_VHiYnRjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VI`YsRjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_VJT[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#Y;W#Y#ZJy#Z#o;W#o;'S$_;'S;=`$v<%lO$_VKQY|PjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$__Kw[!lWjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#i;W#i#jLm#j#o;W#o;'S$_;'S;=`$v<%lO$_VLr[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aMh#a#o;W#o;'S$_;'S;=`$v<%lO$_VMm[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#`;W#`#aNc#a#o;W#o;'S$_;'S;=`$v<%lO$_VNjYpRjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_V! _[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#f;W#f#g!!T#g#o;W#o;'S$_;'S;=`$v<%lO$_V!![YhRjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$_^!#RY!nWjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#o;W#o;'S$_;'S;=`$v<%lO$__!#x[!mWjSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#f;W#f#g!$n#g#o;W#o;'S$_;'S;=`$v<%lO$_V!$s[jSOt$_uw$_x!_$_!_!`:m!`#O$_#P#T$_#T#i;W#i#jGg#j#o;W#o;'S$_;'S;=`$v<%lO$_V!%pUzRjSOt$_uw$_x#O$_#P;'S$_;'S;=`$v<%lO$_~!&XO!v~",
   tokenizers: [0, 1, 2, 3, tokenizer],
-  topRules: {"Program":[0,4]},
+  topRules: {"Program":[0,5]},
-  tokenPrec: 786
+  tokenPrec: 768
 })
@ -10,7 +10,7 @@ describe('null', () => {
|
||||||
test('parses null in assignments', () => {
|
test('parses null in assignments', () => {
|
||||||
expect('a = null').toMatchTree(`
|
expect('a = null').toMatchTree(`
|
||||||
Assign
|
Assign
|
||||||
Identifier a
|
AssignableIdentifier a
|
||||||
operator =
|
operator =
|
||||||
Null null`)
|
Null null`)
|
||||||
})
|
})
|
||||||
|
|
@@ -212,11 +212,11 @@ describe('newlines', () => {
     expect(`x = 5
 y = 2`).toMatchTree(`
       Assign
-        Identifier x
+        AssignableIdentifier x
         operator =
         Number 5
       Assign
-        Identifier y
+        AssignableIdentifier y
         operator =
         Number 2`)
   })

@@ -224,11 +224,11 @@ y = 2`).toMatchTree(`
   test('parses statements separated by semicolons', () => {
     expect(`x = 5; y = 2`).toMatchTree(`
       Assign
-        Identifier x
+        AssignableIdentifier x
         operator =
         Number 5
       Assign
-        Identifier y
+        AssignableIdentifier y
         operator =
         Number 2`)
   })

@@ -236,7 +236,7 @@ y = 2`).toMatchTree(`
   test('parses statement with word and a semicolon', () => {
     expect(`a = hello; 2`).toMatchTree(`
       Assign
-        Identifier a
+        AssignableIdentifier a
         operator =
         FunctionCallOrIdentifier
           Identifier hello

@@ -248,7 +248,7 @@ describe('Assign', () => {
   test('parses simple assignment', () => {
     expect('x = 5').toMatchTree(`
       Assign
-        Identifier x
+        AssignableIdentifier x
         operator =
         Number 5`)
   })

@@ -256,7 +256,7 @@ describe('Assign', () => {
   test('parses assignment with addition', () => {
     expect('x = 5 + 3').toMatchTree(`
       Assign
-        Identifier x
+        AssignableIdentifier x
         operator =
         BinOp
           Number 5

@@ -267,13 +267,13 @@ describe('Assign', () => {
   test('parses assignment with functions', () => {
     expect('add = fn a b: a + b end').toMatchTree(`
       Assign
-        Identifier add
+        AssignableIdentifier add
         operator =
         FunctionDef
           keyword fn
           Params
-            Identifier a
-            Identifier b
+            AssignableIdentifier a
+            AssignableIdentifier b
           colon :
           BinOp
             Identifier a

@@ -287,7 +287,7 @@ describe('DotGet whitespace sensitivity', () => {
   test('no whitespace - DotGet works when identifier in scope', () => {
     expect('basename = 5; basename.prop').toMatchTree(`
       Assign
-        Identifier basename
+        AssignableIdentifier basename
         operator =
         Number 5
       DotGet

@@ -298,7 +298,7 @@ describe('DotGet whitespace sensitivity', () => {
   test('space before dot - NOT DotGet, parses as division', () => {
     expect('basename = 5; basename / prop').toMatchTree(`
       Assign
-        Identifier basename
+        AssignableIdentifier basename
         operator =
         Number 5
       BinOp

@@ -19,7 +19,7 @@ describe('if/elsif/else', () => {
 
     expect('a = if x: 2').toMatchTree(`
       Assign
-        Identifier a
+        AssignableIdentifier a
         operator =
         IfExpr
           keyword if

@@ -17,7 +17,7 @@ describe('DotGet', () => {
   test('obj.prop is DotGet when obj is assigned', () => {
     expect('obj = 5; obj.prop').toMatchTree(`
       Assign
-        Identifier obj
+        AssignableIdentifier obj
         operator =
         Number 5
       DotGet

@@ -31,7 +31,7 @@ describe('DotGet', () => {
       FunctionDef
         keyword fn
         Params
-          Identifier config
+          AssignableIdentifier config
         colon :
         DotGet
           IdentifierBeforeDot config

@@ -45,7 +45,7 @@ describe('DotGet', () => {
       FunctionDef
         keyword fn
         Params
-          Identifier x
+          AssignableIdentifier x
         colon :
         DotGet
           IdentifierBeforeDot x

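The hunks above also pin down where scope comes from: function parameters are captured just like assigned names, so a parameter used before a dot tokenizes as IdentifierBeforeDot inside the body. A quick probe of that behavior, using the same assumed `parser` export, with the input shape inferred from the trees in these tests:

```ts
import { parser } from './src/parser/shrimp'

// Parameter `x` is in scope inside the function body, so `x.foo`
// should parse as DotGet rather than being swallowed as a Word.
const tree = parser.parse('fn x: x.foo end')
let sawDotGet = false
tree.iterate({
  enter: (node) => {
    if (node.name === 'DotGet') sawDotGet = true
  },
})
console.log(sawDotGet) // expected: true
```
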
@@ -63,8 +63,8 @@ end`).toMatchTree(`
       FunctionDef
         keyword fn
         Params
-          Identifier x
-          Identifier y
+          AssignableIdentifier x
+          AssignableIdentifier y
         colon :
         DotGet
           IdentifierBeforeDot x

@@ -84,7 +84,7 @@ end`).toMatchTree(`
       FunctionDef
         keyword fn
         Params
-          Identifier x
+          AssignableIdentifier x
         colon :
         DotGet
           IdentifierBeforeDot x

@@ -92,7 +92,7 @@ end`).toMatchTree(`
       FunctionDef
         keyword fn
         Params
-          Identifier y
+          AssignableIdentifier y
         colon :
         DotGet
           IdentifierBeforeDot y

@@ -105,7 +105,7 @@ end`).toMatchTree(`
   test('dot get works as function argument', () => {
     expect('config = 42; echo config.path').toMatchTree(`
       Assign
-        Identifier config
+        AssignableIdentifier config
         operator =
         Number 42
       FunctionCall

@@ -120,7 +120,7 @@ end`).toMatchTree(`
   test('mixed file paths and dot get', () => {
     expect('config = 42; cat readme.txt; echo config.path').toMatchTree(`
       Assign
-        Identifier config
+        AssignableIdentifier config
         operator =
         Number 42
       FunctionCall

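The mixed-path test above is the whole feature in one line: the same `name.prop` shape becomes DotGet or stays a Word purely from lexical scope. The same check as a standalone sketch (assumed `parser` export; expected values read off the trees above):

```ts
import { parser } from './src/parser/shrimp'

// `config` is assigned, so `config.path` should parse as DotGet;
// `readme.txt` is never assigned, so it should remain a single Word.
const tree = parser.parse('config = 42; cat readme.txt; echo config.path')
const names: string[] = []
tree.iterate({
  enter: (node) => {
    names.push(node.name)
  },
})
console.log(names.includes('DotGet')) // expected: true
console.log(names.includes('Word')) // expected: true (the readme.txt argument)
```
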
@@ -72,7 +72,7 @@ describe('Fn', () => {
       FunctionDef
         keyword fn
         Params
-          Identifier x
+          AssignableIdentifier x
         colon :
         BinOp
           Identifier x

@@ -86,8 +86,8 @@ describe('Fn', () => {
       FunctionDef
         keyword fn
         Params
-          Identifier x
-          Identifier y
+          AssignableIdentifier x
+          AssignableIdentifier y
         colon :
         BinOp
           Identifier x

@@ -104,8 +104,8 @@ end`).toMatchTree(`
       FunctionDef
         keyword fn
         Params
-          Identifier x
-          Identifier y
+          AssignableIdentifier x
+          AssignableIdentifier y
         colon :
         BinOp
           Identifier x

@@ -21,16 +21,16 @@ describe('multiline', () => {
 add 3 4
 `).toMatchTree(`
       Assign
-        Identifier add
+        AssignableIdentifier add
         operator =
         FunctionDef
           keyword fn
           Params
-            Identifier a
-            Identifier b
+            AssignableIdentifier a
+            AssignableIdentifier b
           colon :
           Assign
-            Identifier result
+            AssignableIdentifier result
             operator =
             BinOp
               Identifier a

@@ -63,8 +63,8 @@ end
       FunctionDef
         keyword fn
         Params
-          Identifier x
-          Identifier y
+          AssignableIdentifier x
+          AssignableIdentifier y
         colon :
         FunctionCallOrIdentifier
           Identifier x

@@ -50,7 +50,7 @@ describe('pipe expressions', () => {
   test('pipe expression in assignment', () => {
     expect('result = echo hello | grep h').toMatchTree(`
       Assign
-        Identifier result
+        AssignableIdentifier result
         operator =
         PipeExpr
           FunctionCall

@@ -77,7 +77,7 @@ describe('pipe expressions', () => {
       FunctionDef
         keyword fn
         Params
-          Identifier x
+          AssignableIdentifier x
         colon :
         FunctionCallOrIdentifier
           Identifier x

@@ -1,63 +1,107 @@
 import { ExternalTokenizer, InputStream, Stack } from '@lezer/lr'
-import { Identifier, Word, IdentifierBeforeDot } from './shrimp.terms'
-import type { Scope } from './scopeTracker'
+import { Identifier, AssignableIdentifier, Word, IdentifierBeforeDot } from './shrimp.terms'
 
 // The only chars that can't be words are whitespace, apostrophes, closing parens, and EOF.
 
 export const tokenizer = new ExternalTokenizer(
   (input: InputStream, stack: Stack) => {
-    let ch = getFullCodePoint(input, 0)
-    console.log(`🌭 checking char ${String.fromCodePoint(ch)}`)
+    const ch = getFullCodePoint(input, 0)
     if (!isWordChar(ch)) return
 
-    let pos = getCharSize(ch)
-    let isValidIdentifier = isLowercaseLetter(ch) || isEmoji(ch)
+    const isValidStart = isLowercaseLetter(ch) || isEmoji(ch)
     const canBeWord = stack.canShift(Word)
 
-    while (true) {
-      ch = getFullCodePoint(input, pos)
+    // Consume all word characters, tracking if it remains a valid identifier
+    const { pos, isValidIdentifier, stoppedAtDot } = consumeWordToken(
+      input,
+      isValidStart,
+      canBeWord
+    )
 
-      // Check for dot and scope - property access detection
-      if (ch === 46 /* . */ && isValidIdentifier) {
-        // Build identifier text by peeking character by character
-        let identifierText = ''
-        for (let i = 0; i < pos; i++) {
+    // Check if we should emit IdentifierBeforeDot for property access
+    if (stoppedAtDot) {
+      const dotGetToken = checkForDotGet(input, stack, pos)
+
+      if (dotGetToken) {
+        input.advance(pos)
+        input.acceptToken(dotGetToken)
+      } else {
+        // Not in scope - continue consuming the dot as part of the word
+        const afterDot = consumeRestOfWord(input, pos + 1, canBeWord)
+        input.advance(afterDot)
+        input.acceptToken(Word)
+      }
+
+      return
+    }
+
+    // Advance past the token we consumed
+    input.advance(pos)
+
+    // Choose which token to emit
+    if (isValidIdentifier) {
+      const token = chooseIdentifierToken(input, stack)
+      input.acceptToken(token)
+    } else {
+      input.acceptToken(Word)
+    }
+  },
+  { contextual: true }
+)
+
+// Build identifier text from input stream, handling surrogate pairs for emoji
+const buildIdentifierText = (input: InputStream, length: number): string => {
+  let text = ''
+  for (let i = 0; i < length; i++) {
     const charCode = input.peek(i)
     if (charCode === -1) break
-          // Handle surrogate pairs for emoji
-          if (charCode >= 0xd800 && charCode <= 0xdbff && i + 1 < pos) {
+    // Handle surrogate pairs for emoji (UTF-16 encoding)
+    if (charCode >= 0xd800 && charCode <= 0xdbff && i + 1 < length) {
       const low = input.peek(i + 1)
       if (low >= 0xdc00 && low <= 0xdfff) {
-              identifierText += String.fromCharCode(charCode, low)
+        text += String.fromCharCode(charCode, low)
         i++ // Skip the low surrogate
         continue
       }
     }
-          identifierText += String.fromCharCode(charCode)
+    text += String.fromCharCode(charCode)
   }
+  return text
+}
 
-        const scope = stack.context as Scope | undefined
-
-        if (scope?.has(identifierText)) {
-          // In scope - stop here, let grammar parse property access
-          input.advance(pos)
-          input.acceptToken(IdentifierBeforeDot)
-          return
-        }
-        // Not in scope - continue consuming as Word (fall through)
+// Consume word characters, tracking if it remains a valid identifier
+// Returns the position after consuming, whether it's a valid identifier, and if we stopped at a dot
+const consumeWordToken = (
+  input: InputStream,
+  isValidStart: boolean,
+  canBeWord: boolean
+): { pos: number; isValidIdentifier: boolean; stoppedAtDot: boolean } => {
+  let pos = getCharSize(getFullCodePoint(input, 0))
+  let isValidIdentifier = isValidStart
+  let stoppedAtDot = false
+
+  while (true) {
+    const ch = getFullCodePoint(input, pos)
+
+    // Stop at dot if we have a valid identifier (might be property access)
+    if (ch === 46 /* . */ && isValidIdentifier) {
+      stoppedAtDot = true
+      break
     }
 
+    // Stop if we hit a non-word character
     if (!isWordChar(ch)) break
 
-    // Certain characters might end a word or identifier if they are followed by whitespace.
-    // This allows things like `a = hello; 2` of if `x: y` to parse correctly.
+    // Context-aware termination: semicolon/colon can end a word if followed by whitespace
+    // This allows `hello; 2` to parse correctly while `hello;world` stays as one word
    if (canBeWord && (ch === 59 /* ; */ || ch === 58) /* : */) {
      const nextCh = getFullCodePoint(input, pos + 1)
      if (!isWordChar(nextCh)) break
    }
 
-    // Track identifier validity
-    if (!isLowercaseLetter(ch) && !isDigit(ch) && ch !== 45 && !isEmoji(ch)) {
+    // Track identifier validity: must be lowercase, digit, dash, or emoji
+    if (!isLowercaseLetter(ch) && !isDigit(ch) && ch !== 45 /* - */ && !isEmoji(ch)) {
      if (!canBeWord) break
      isValidIdentifier = false
    }

@@ -65,21 +109,73 @@ export const tokenizer = new ExternalTokenizer(
    pos += getCharSize(ch)
  }
 
-    input.advance(pos)
-    input.acceptToken(isValidIdentifier ? Identifier : Word)
-  },
-  { contextual: true }
-)
+  return { pos, isValidIdentifier, stoppedAtDot }
+}
+
+// Consume the rest of a word after we've decided not to treat a dot as DotGet
+// Used when we have "file.txt" - we already consumed "file", now consume ".txt"
+const consumeRestOfWord = (input: InputStream, startPos: number, canBeWord: boolean): number => {
+  let pos = startPos
+  while (true) {
+    const ch = getFullCodePoint(input, pos)
+
+    // Stop if we hit a non-word character
+    if (!isWordChar(ch)) break
+
+    // Context-aware termination for semicolon/colon
+    if (canBeWord && (ch === 59 /* ; */ || ch === 58) /* : */) {
+      const nextCh = getFullCodePoint(input, pos + 1)
+      if (!isWordChar(nextCh)) break
+    }
+
+    pos += getCharSize(ch)
+  }
+  return pos
+}
+
+// Check if this identifier is in scope (for property access detection)
+// Returns IdentifierBeforeDot token if in scope, null otherwise
+const checkForDotGet = (input: InputStream, stack: Stack, pos: number): number | null => {
+  const identifierText = buildIdentifierText(input, pos)
+  const context = stack.context as { scope: { has(name: string): boolean } } | undefined
+
+  // If identifier is in scope, this is property access (e.g., obj.prop)
+  // If not in scope, it should be consumed as a Word (e.g., file.txt)
+  return context?.scope.has(identifierText) ? IdentifierBeforeDot : null
+}
+
+// Decide between AssignableIdentifier and Identifier using grammar state + peek-ahead
+const chooseIdentifierToken = (input: InputStream, stack: Stack): number => {
+  const canAssignable = stack.canShift(AssignableIdentifier)
+  const canRegular = stack.canShift(Identifier)
+
+  // Only one option is valid - use it
+  if (canAssignable && !canRegular) return AssignableIdentifier
+  if (canRegular && !canAssignable) return Identifier
+
+  // Both possible (ambiguous context) - peek ahead for '=' to disambiguate
+  // This happens at statement start where both `x = 5` (assign) and `echo x` (call) are valid
+  let peekPos = 0
+  while (true) {
+    const ch = getFullCodePoint(input, peekPos)
+    if (isWhiteSpace(ch)) {
+      peekPos += getCharSize(ch)
+    } else {
+      break
+    }
+  }
+
+  const nextCh = getFullCodePoint(input, peekPos)
+  return nextCh === 61 /* = */ ? AssignableIdentifier : Identifier
+}
+
+// Character classification helpers
 const isWhiteSpace = (ch: number): boolean => {
-  return ch === 32 /* space */ || ch === 10 /* \n */ || ch === 9 /* tab */ || ch === 13 /* \r */
+  return ch === 32 /* space */ || ch === 9 /* tab */ || ch === 13 /* \r */
 }
 
 const isWordChar = (ch: number): boolean => {
-  const closingParen = ch === 41 /* ) */
-  const eof = ch === -1
-
-  return !isWhiteSpace(ch) && !closingParen && !eof
+  return !isWhiteSpace(ch) && ch !== 10 /* \n */ && ch !== 41 /* ) */ && ch !== -1 /* EOF */
 }
 
 const isLowercaseLetter = (ch: number): boolean => {

@@ -103,7 +199,7 @@ const getFullCodePoint = (input: InputStream, pos: number): number => {
    }
  }
 
-  return ch // Single code unit
+  return ch
 }
 
 const isEmoji = (ch: number): boolean => {

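With the refactor in place, the tokenizer's three-way split (AssignableIdentifier vs Identifier vs Word) is easy to probe from outside. A sketch under the same assumptions as above (`parser` export path; printed shapes approximate):

```ts
import { parser } from './src/parser/shrimp'

// chooseIdentifierToken: a name followed (after spaces/tabs) by `=`
// becomes AssignableIdentifier; in any other position it stays Identifier.
console.log(parser.parse('x = 5').toString())
// expected to contain: Assign(AssignableIdentifier, ...)
console.log(parser.parse('echo x').toString())
// expected to contain: FunctionCall(Identifier, Identifier)
```

Two design choices worth noting: the `=` peek-ahead only runs when the grammar can shift both token types, so the common case stays a pair of `canShift` checks; and because `isWhiteSpace` no longer matches `\n`, the lookahead stops at a newline, so an identifier at the end of one line never fuses with an `=` at the start of the next.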