Chester Programming Language
Welcome to the Chester Programming Language documentation! Chester is a statically-typed functional language that compiles to TypeScript and features an effect system with CPS-based compilation of effectful code.
[!WARNING] Development Status: Chester is under active development and not ready for production use. Many features are still being implemented and the language design may change.
What is Chester?
Chester is a statically-typed language designed to bring dependent types, algebraic data types, and effect tracking to the JavaScript/TypeScript ecosystem. It combines ideas from functional languages like Haskell and from dependently-typed languages with practical interoperability with the JavaScript ecosystem.
Key Features:
- TypeScript Backend: Compiles Chester code to readable, idiomatic TypeScript
- Effect System: Track and manage side effects with CPS transformation
- Strong Type System: Type inference, dependent types, and algebraic data types
- JavaScript/TypeScript Interop: Import and use npm packages with automatic type signature extraction
- Multi-Platform: Runs on JVM, JavaScript (via Scala.js), and Native (via Scala Native)
- Developer Tools: REPL, LSP server, IntelliJ plugin, and browser-based REPL
Quick Example
module example;
def greet(name: String): String =
"Hello, " ++ name ++ "!";
def main: String = greet("World");
Getting Started
To start using Chester:
- Getting Started Guide - Build from source and run your first program
- CLI Usage - Learn the command-line interface
- Language Guide - Understand Chester’s syntax and features
Architecture
Chester is implemented in Scala 3 and consists of several modules:
- CLI: Command-line interface with REPL
- Core: Parser, elaborator, type checker, and backends
- LSP: Language Server Protocol implementation
- Web REPL: Browser-based REPL (Scala.js)
- IntelliJ Plugin: IDE integration
For detailed internals, see the Development Documentation.
Current Backends
- TypeScript ✅ - Fully implemented and tested
- Go 🚧 - Type signatures implemented, codegen in progress
Website & REPL
Try Chester in your browser at the interactive REPL (see the site/ directory for the Next.js-based website).
Chester Guide
Welcome to the Chester language guide! This section covers practical usage of Chester.
Getting Started
If you’re new to Chester, start here:
- Getting Started - Build Chester and run your first program
- CLI Usage - Command-line interface reference
Language Features
- Record Syntax - Defining and using records
- Statements - `let` and `def` bindings, scoping rules
- Syntax Grammar - BNF-like grammar reference
More Resources
For implementation details and architecture, see the Development Documentation.
Getting Started with Chester
This guide will help you build Chester from source and write your first program.
Prerequisites
Before building Chester, ensure you have:
- Java 11 or later (for running sbt and the JVM build)
- sbt (Scala Build Tool) - Installation guide
- Node.js 18+ and pnpm (optional, for the website and web REPL)
Building from Source
Clone the Chester repository:
git clone https://github.com/chester-lang/chester.git
cd chester
Build the CLI (JVM)
sbt cliJVM/assembly
This creates an executable JAR at modules/cli/jvm/target/scala-3.7.4/cli-assembly-*.jar.
For convenience, create an alias:
alias chester='java -jar modules/cli/jvm/target/scala-3.7.4/cli-assembly-*.jar'
Build the Native Binary (Optional)
For faster startup, build a native executable:
sbt cliNative/nativeLink
The binary will be at modules/cli/native/target/scala-3.7.4/cli-out.
Your First Chester Program
Create a file called hello.chester:
module hello;
def greet(name: String): String =
"Hello, " ++ name ++ "!";
def main: IO Unit =
println(greet("Chester"));
Run it:
chester hello.chester
Using the REPL
Start the interactive REPL:
chester
Try some expressions:
chester> 2 + 3
=> 5
chester> def double(x: Int): Int = x * 2
chester> double(21)
=> 42
chester> :t "hello"
Type: String
REPL Commands
- `:t <expr>` - Show the type of an expression
- `:l <file>` - Load a Chester file
- `:q` or `:quit` - Exit the REPL
Compiling to TypeScript
Chester’s primary backend compiles to TypeScript. Create example.chester:
module math;
def add(a: Int, b: Int): Int = a + b;
def multiply(a: Int, b: Int): Int = a * b;
export def calculate: Int =
add(10, multiply(5, 3));
Compile to TypeScript:
chester ts example.chester --output ts-out
This generates ts-out/example.ts:
export function add(a: number, b: number): number {
return a + b;
}
export function multiply(a: number, b: number): number {
return a * b;
}
export const calculate: number = add(10, multiply(5, 3));
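The generated module can be consumed like any other TypeScript code. A minimal sketch, assuming your build resolves the `ts-out/` directory:

```typescript
// Hypothetical consumer of the generated module; the import path depends
// on where ts-out/example.ts sits relative to your TypeScript build.
import { add, multiply, calculate } from "./ts-out/example";

console.log(add(2, 3));      // 5
console.log(multiply(4, 5)); // 20
console.log(calculate);      // 25, i.e. add(10, multiply(5, 3))
```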
Using JavaScript/TypeScript Libraries
Chester can import npm packages automatically. Create web-demo.chester:
module web;
import "chalk";
def main: IO Unit = {
let red = chalk.red("Error:");
println(red ++ " Something went wrong!")
};
Before running, install the package:
npm install chalk @types/chalk
Then run:
chester web-demo.chester
Chester will:
- Detect the `import "chalk"` statement
- Find `node_modules/chalk` and read its TypeScript definitions
- Generate appropriate Chester type signatures
- Type-check your code against the library
Next Steps
- CLI Usage Guide - Learn all CLI commands and options
- Language Syntax - Understand Chester’s grammar
- Effect System - Learn about effect tracking
- TypeScript Backend - How compilation works
Troubleshooting
“Command not found: chester”
Make sure you’ve created the alias or use the full path to the JAR/binary.
TypeScript Import Errors
Ensure the package has TypeScript definitions:
- Install `@types/<package>` if available
- Or use packages that bundle their own `.d.ts` files
Build Errors
Try cleaning and rebuilding:
sbt clean
sbt cliJVM/assembly
Development Setup
For contributors, see the Development Documentation for information on:
- Module structure
- Running tests
- LSP and IntelliJ plugin setup
- Building the website
CLI Usage
The Chester command-line interface provides tools for running, compiling, and formatting Chester code.
Installation
After building Chester (see Getting Started), you can run the CLI via:
# Using the JAR
java -jar modules/cli/jvm/target/scala-3.7.4/cli-assembly-*.jar
# Or create an alias
alias chester='java -jar modules/cli/jvm/target/scala-3.7.4/cli-assembly-*.jar'
# Or use the native binary
modules/cli/native/target/scala-3.7.4/cli-out
Commands
REPL (Interactive Mode)
Start the Read-Eval-Print Loop:
chester
REPL Commands:
| Command | Description | Example |
|---|---|---|
| `:t <expr>` or `:type <expr>` | Show the type of an expression | `:t 42` |
| `:l <file>` or `:load <file>` | Load and evaluate a Chester file | `:l hello.chester` |
| `:q` or `:quit` | Exit the REPL | `:q` |
Example REPL Session:
chester> def square(x: Int): Int = x * x
=> [function]
chester> square(7)
=> 49
chester> :t square
Type: Int -> Int
chester> :l examples/math.chester
Loaded examples/math.chester.
chester> :quit
Goodbye.
Run a File
Execute a Chester file:
chester <file>
Example:
chester hello.chester
This will:
- Parse and elaborate the file
- Type-check the code
- Apply effect CPS transformation if needed
- Evaluate the result
- Print the output
Compile to TypeScript
Generate TypeScript code from Chester source:
chester ts <input> [--output <directory>]
Arguments:
- `<input>` - Chester source file or directory
- `--output <directory>` - Output directory (default: `ts-out` for directories, or `<filename>.ts` for files)
Examples:
# Compile a single file
chester ts math.chester --output lib
# Compile all .chester files in a directory
chester ts src/ --output dist
Output Structure:
For a file src/utils.chester, the output will be:
- Single file: `dist/utils.ts`
- Directory: Each `.chester` file becomes a `.ts` file
Compile and Show AST
Elaborate code and save the AST representation:
chester compile <input> [--output <file>]
This command:
- Parses and elaborates the input
- Type-checks the AST
- Optionally saves the AST to a file
Example:
chester compile example.chester --output example.ast
The output shows:
Type: String
AST:
(def greet (lambda (name String) ...))
Format Code
Format a Chester source file in-place:
chester format <file>
Example:
chester format messy.chester
This will:
- Parse the file (preserving comments)
- Pretty-print according to Chester style
- Overwrite the original file
[!WARNING] `format` overwrites the file in-place. Ensure you have backups or use version control.
Version and Help
# Show version
chester --version
# Show help
chester --help
Advanced Features
JavaScript/TypeScript Imports
Chester automatically resolves TypeScript definitions for imported packages:
import "lodash";
def doubled: Array[Int] = lodash.map([1, 2, 3], (x) => x * 2);
Requirements:
- Package must be in `node_modules/`
- TypeScript definitions available (either bundled or via `@types/*`)
Supported Import Styles:
import "package-name"; // NPM package
import "package/submodule"; // Submodule
import "@scope/package"; // Scoped package
import "fs"; // Node.js built-in (via @types/node)
Chester will:
- Locate the package in `node_modules/`
- Extract TypeScript definitions (`.d.ts` files)
- Parse the definitions into Chester types
- Make them available for use
Effect CPS Transformation
When a function’s type includes IO effects, Chester applies a Continuation-Passing Style transformation:
def main: IO Unit = {
println("Starting...");
let x = readLine();
println("You entered: " ++ x)
};
The TypeScript output uses async/await or callbacks to properly sequence effects.
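As a rough sketch of the async/await flavour of this lowering (illustrative only, not the backend's exact output; `println` and `readLine` stand in for whatever runtime primitives the generated code actually binds to):

```typescript
// Hypothetical async lowering of the IO program above.
declare function println(s: string): Promise<void>;
declare function readLine(): Promise<string>;

async function main(): Promise<void> {
  await println("Starting...");
  const x = await readLine();
  await println("You entered: " + x);
}
```

Each effectful step becomes an awaited call, which preserves the sequential order the Chester program specifies.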
Configuration and Cache
TypeScript Definitions Cache
Downloaded npm package metadata and tarballs are cached in:
<working-directory>/js-typings/npm-cache/
To clear the cache:
rm -rf js-typings/
Node.js Type Definitions
Chester looks for @types/node in:
- `node_modules/@types/node/` (direct dependency)
- `node_modules/.pnpm/@types+node@*/` (pnpm)
For Node.js built-ins to work, install:
npm install --save-dev @types/node
# or
pnpm add -D @types/node
Troubleshooting
“Input file does not exist”
Ensure the path is correct and the file has a .chester extension.
TypeScript Import Errors
Problem: Cannot find module 'some-package'
Solutions:
- Install the package: `npm install some-package`
- Install types: `npm install @types/some-package`
- Check that `node_modules/` exists in the working directory
Performance Tips
For faster startup, use the native binary instead of the JAR:
sbt cliNative/nativeLink
REPL Not Working
If the REPL doesn’t start, check:
- Java version (requires Java 11+)
- Terminal supports ANSI escape codes
- Try running with `--no-colors` (future flag)
Examples
Complete Workflow
# 1. Create a Chester file
cat > calculate.chester << 'EOF'
module calculate;
import "chalk";
def add(a: Int, b: Int): Int = a + b;
def main: IO Unit = {
let result = add(10, 32);
let message = chalk.green("Result: " ++ toString(result));
println(message)
};
EOF
# 2. Install dependencies
npm install chalk @types/chalk
# 3. Run it
chester calculate.chester
# 4. Compile to TypeScript
chester ts calculate.chester --output dist
# 5. Run the TypeScript output
node dist/calculate.ts
Next Steps
- Language Syntax - Learn Chester’s full grammar
- Effect System - Understand IO and effect tracking
- TypeScript Backend - How compilation works
Record Syntax in Chester
Chester provides a concise syntax for defining records, which are similar to structs or classes in other languages. Records in Chester are immutable by default and provide a convenient way to group related data.
Basic Record Syntax
The basic syntax for defining a record in Chester is as follows:
record RecordName(field1: Type1, field2: Type2, ...) { ... }
Here’s a simple example of a Person record:
record Person(name: String, age: Int);
Using Records
Once defined, you can create instances of records and access their fields:
let alice = Person("Alice", 30);
println(alice.name); // Outputs: Alice
println(alice.age); // Outputs: 30
Statements
Scope of let and def
In Chester, let and def are used to declare bindings, but they differ in how they handle scoping and forward references. Understanding these differences is crucial for writing correct and efficient Chester programs.
let Bindings
- Local Scope: `let` bindings are only visible after their declaration within the current block.
- No Forward References: You cannot reference a `let` binding before it’s declared.
- Type Inference: If no type annotation is provided, the compiler infers the type from the binding’s body.
Example:
// Correct usage of 'let'
let x = 5;
let y = x; // 'x' is defined before use
// Incorrect usage of 'let'
let y = x + 2; // Error: 'x' is not defined yet
let x = 5;
def Bindings
- Global Scope: `def` bindings are visible throughout the entire block, even before their declaration.
- Allows Forward References: You can reference a `def` binding before it’s declared.
- Type Annotation Required for Forward References: If you use a `def` binding before its declaration, you must provide a type annotation.
Example:
// Correct usage of 'def' with type annotation
def y = square(5); // 'square' is used before its declaration
def square(n: Int) = n * n; // Type annotation for 'n' is required
// Incorrect usage of 'def' without type annotation
def y = increment(5); // 'increment' is used before its declaration
def increment(n) = n + 1; // Error: Missing type annotation for 'n'
Summary of Scoping Rules
- `let` Bindings:
  - Visible only after their declaration within the current block.
  - Do not allow forward references.
  - Type annotations are optional if the type can be inferred.
- `def` Bindings:
  - Visible throughout the entire block.
  - Allow forward references.
  - Require type annotations when used before their declarations.
Compiler Behavior
When processing a block, the Chester compiler handles `let` and `def` bindings differently to manage scope and type checking; a sketch of this two-phase approach appears after the lists below.
Processing def Bindings
- Collection Phase:
  - The compiler collects all `def` bindings, noting their names, type annotations, and identifiers.
  - It tracks forward references to detect usages before declarations.
- Type Annotation Checks:
  - For forward-referenced `def` bindings without type annotations, the compiler reports a `MissingTypeAnnotationError`.
- Context Updates:
  - The compiler adds placeholders or inferred types to the context, allowing forward-referenced `def` bindings to be used.
Processing let Bindings
- Sequential Processing:
  - `let` bindings are processed in order of their appearance.
  - Each `let` binding is added to the context after its declaration.
- No Forward References:
  - Referencing a `let` binding before its declaration results in an error.
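A minimal sketch of this two-phase approach, written in TypeScript with hypothetical names (the real implementation lives in the Scala elaborator):

```typescript
// Hypothetical two-phase binder illustrating the scoping rules above.
type Stmt =
  | { kind: "let"; name: string; refs: string[] } // refs: names the body mentions
  | { kind: "def"; name: string; annotated: boolean; refs: string[] };

function checkBlock(stmts: Stmt[]): string[] {
  const errors: string[] = [];
  // Phase 1: collect every `def` so forward references can resolve.
  const defs = new Map(
    stmts.flatMap(s => (s.kind === "def" ? [[s.name, s.annotated] as const] : []))
  );
  // Phase 2: process statements in order.
  const inScope = new Set<string>();
  for (const s of stmts) {
    for (const r of s.refs) {
      if (inScope.has(r)) continue; // declared earlier in the block
      if (!defs.has(r)) errors.push(`'${r}' is not defined yet`); // invalid `let` forward reference
      else if (!defs.get(r)) errors.push(`missing type annotation for '${r}'`); // forward `def` needs annotation
    }
    inScope.add(s.name);
  }
  return errors;
}
```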
Best Practices
- Use `let` when you don’t need to reference the binding before its declaration.
- Use `def` when you need forward references or are defining recursive functions.
- Always provide type annotations for `def` bindings that are forward-referenced to avoid compilation errors.
By understanding these scoping rules, you can write more predictable and maintainable Chester code.
Chester Syntax Grammar (BNF-like)
IMPORTANT NOTE: This document provides only a rough description of the Chester language syntax and might be highly incorrect or not fully aligned with the current implementation. It should be used as a general guide rather than a precise specification.
This document attempts to describe the Chester language syntax using a BNF-like notation.
Note: Whitespace and comments (// ...) are generally allowed between tokens and are not explicitly shown in the grammar rules unless significant (like newlines in blocks).
Top Level
Toplevel ::= Statement* (Expr | ';')?
Statement ::= Expr (';' | Newline)
Note: In many contexts (like blocks or top-level), newlines can act as statement separators similar to semicolons. The last expression in a block acts as its return value unless followed by a semicolon.
Expressions
Expr ::= Atom | Expr Operator Expr | Operator Expr | Expr Operator
Atom ::= ID
| Literal
| Tuple
| List
| Block
| Object
| Application
| MethodCall
| SymbolLookup
| Keyword
Tuple ::= '(' ExprList? ')'
List ::= '[' ExprList? ']'
Block ::= '{' Statement* Expr? '}'
ExprList ::= Expr (',' Expr)* ','?
Application ::= GenericApplication | FunctionCall | BlockApplication | CombinedApplication
GenericApplication ::= Atom '[' ExprList ']'
FunctionCall ::= Atom '(' ExprList? ')'
BlockApplication ::= Atom Block
CombinedApplication ::= GenericApplication FunctionCall
| GenericApplication BlockApplication
| FunctionCall BlockApplication
| GenericApplication FunctionCall BlockApplication
MethodCall ::= Atom '.' ID ('(' ExprList? ')')? // Parentheses are optional for field access
Attribute ::= '@' ID // NOT IMPLEMENTED YET
Keyword ::= '#' ID (GenericApplication | FunctionCall)*
Note: All identifiers are treated uniformly at the syntax level, with no distinction between different kinds of identifiers. Operator precedence and associativity are handled during semantic analysis, not in the grammar.
Object Literals
Object ::= '{' ObjectClauseList? '}'
ObjectClauseList ::= ObjectClause (',' ObjectClause)* ','?
ObjectClause ::= ObjectKey ('=' | '=>') Expr
ObjectKey ::= ID
| QualifiedName
| STRING_LITERAL
| SYMBOL_LITERAL
QualifiedName ::= ID ('.' ID)*
Literals
Literal ::= INT_LITERAL
| RAT_LITERAL
| STRING_LITERAL
| SYMBOL_LITERAL
SYMBOL_LITERAL ::= "'" ID
Terminals
ID ::= /* Starts with letter, _, or emoji; contains letters, digits, _, emoji */
Operator ::= /* Sequence of operator symbols like +, -, *, /, =, ->, =>, :, ., etc. */
INT_LITERAL ::= /* Integer in decimal, hex (0x...), or binary (0b...) format */
RAT_LITERAL ::= /* Rational/float literal (e.g., 1.2, 3.14e-5) */
STRING_LITERAL ::= /* Double-quoted string "..." with escapes */
Newline ::= /* \n or \r\n */
Punctuation and Symbols (Terminals)
'(' | ')' | '{' | '}' | '[' | ']' | ',' | ';' | ':' | '.' | '=' | '=>' | '->' | '@' | '#'
Chester Development Documentation
This section contains technical documentation for Chester’s implementation details and development notes.
[!NOTE] These documents are intended for both humans and LLMs to understand Chester’s internals.
Quick Links
- Module Structure - Codebase organization and architecture
- TypeScript Backend - How Chester compiles to TypeScript
- Go Backend - Go backend status and implementation
- Type Checking System - Type inference and checking
- Elaboration System - The elaboration algorithm
- Effect System - Effect tracking and CPS transformation
Current Implementation
Chester is implemented in Scala 3.7.4 with multi-platform support:
- JVM: Standard Java Virtual Machine target
- JavaScript: Via Scala.js for browser/Node.js
- Native: Via Scala Native for compiled binaries
Key Components:
- Parser and tokenizer
- Type checker with constraint solving
- TypeScript code generator (fully implemented)
- Go code generator (in progress)
- LSP server for IDE integration
- IntelliJ IDEA plugin
Contributing
When adding development documentation:
- Create markdown files in `docs/src/dev/`
- Follow existing documentation style
- Include code examples where appropriate
- Link to relevant source files using `file://` links
- Update `SUMMARY.md` to include new pages
Chester Module Structure
This document explains the organization of the Chester codebase, which is structured as a multi-platform Scala 3 project.
Overview
Chester uses sbt (Scala Build Tool) with cross-compilation for three platforms:
- JVM - Standard Java Virtual Machine
- JavaScript - Via Scala.js
- Native - Via Scala Native
Top-Level Directory Structure
chester/
├── modules/ # All source code modules
├── vendor/ # Vendored dependencies (Kiama)
├── site/ # Next.js website
├── docs/ # Documentation (mdbook)
├── build.sbt # Build configuration
└── project/ # sbt build definition
Modules
Core Modules
modules/cli/
Command-line interface for Chester.
Platforms: JVM, JavaScript, Native
Key Components:
- `CLI.scala` - Main CLI logic (REPL, file execution, compilation)
- `Config.scala` - Command-line argument parsing
- `Main.scala` - Entry point
- `Evaluator.scala` - AST evaluation
Responsibilities:
- Parse command-line arguments
- Run the REPL
- Execute and compile Chester files
- Coordinate TypeScript code generation
- Manage import resolution (JS/TypeScript packages)
modules/core/
Core compiler components.
Platforms: JVM, JavaScript, Native
Key Components:
Syntax & Parsing:
- `CST.scala` - Concrete Syntax Tree
- `AST.scala` - Abstract Syntax Tree
- `Parser.scala` - Recursive descent parser
- `Tokenizer.scala` - Lexical analysis
Type System:
- `tyck/` - Type checking and elaboration
- `ElabContext.scala` - Elaboration context
- `CoreTypeChecker.scala` - Core type checker
- `GoImportSignature.scala` - Go package signatures
- `JSImportSignature.scala` - JavaScript/TypeScript signatures
Backends:
- `backend/TypeScriptBackend.scala` - TypeScript code generation
- `backend/GoBackend.scala` - Go code generation (in progress)
Transforms:
- `transform/EffectCPS.scala` - Effect CPS transformation
Interop:
- `interop/typescript/` - TypeScript `.d.ts` parsing
- `interop/golang/` - Go type extraction
Responsibilities:
- Parse Chester source to CST/AST
- Elaborate and type-check programs
- Lower to target backends
- Apply transformations (CPS, etc.)
modules/utils/
Shared utilities used across modules.
Platforms: JVM, JavaScript, Native
Key Components:
- `doc/` - Pretty-printing and document rendering
- `elab/` - Elaboration solver infrastructure
- `io/` - Cross-platform I/O abstractions
- `term/` - Terminal/REPL utilities
Responsibilities:
- Provide platform-agnostic I/O
- Document pretty-printing
- Constraint solver for elaboration
modules/lsp/
Language Server Protocol implementation.
Platforms: JVM
Key Components:
- LSP server handlers (not yet fully implemented)
Responsibilities:
- Provide IDE integration via LSP
- Support VS Code, IntelliJ, and other LSP clients
modules/web-repl/
Browser-based REPL.
Platforms: JavaScript (Scala.js)
Key Components:
- Web entry point that reuses CLI logic
Build:
sbt webRepl/copyWebRepl
This generates JavaScript bundles for the browser.
Responsibilities:
- Provide in-browser Chester REPL
- Integrate with the Next.js website
modules/intellij-plugin/
IntelliJ IDEA plugin.
Platform: JVM
Responsibilities:
- Syntax highlighting
- Basic IDE features for Chester
Platform-Specific Code
Each cross-compiled module has platform-specific subdirectories:
modules/cli/
├── shared/ # Shared code (all platforms)
├── jvm/ # JVM-specific
├── js/ # JavaScript-specific (Scala.js)
└── native/ # Native-specific (Scala Native)
Shared code goes in shared/src/main/scala/.
Platform code goes in <platform>/src/main/scala/.
Vendored Dependencies
vendor/kiama/
Chester vendors a modified version of the Kiama library for:
- Tree manipulation
- Attribute grammars
- Rewriting
This is cross-compiled for JVM, JS, and Native.
Build System (sbt)
Key Build Definitions
build.sbt:
- Defines all modules and their dependencies
- Configures cross-platform settings
- Sets Scala version (3.7.4)
Common Settings:
- Semantic DB enabled (for Metals/LSP)
- UTF-8 encoding
- Explicit nulls (`-Yexplicit-nulls`)
Useful sbt Commands
# Compile all modules
sbt compile
# Run JVM CLI
sbt "cliJVM/run"
# Build CLI assembly (fat JAR)
sbt cliJVM/assembly
# Build native binary
sbt cliNative/nativeLink
# Run tests
sbt test
# Build web REPL
sbt webRepl/fullOptJS
# Copy web REPL to website
sbt webRepl/copyWebRepl
Data Flow
Compilation Pipeline
Source File (.chester)
↓
CharReader → Tokenizer → Parser
↓
CST (Concrete Syntax Tree)
↓
Elaborator (with type inference)
↓
AST (Abstract Syntax Tree)
↓
Core Type Checker
↓
Transformations (Effect CPS, etc.)
↓
Backend (TypeScript, Go)
↓
Target Code (.ts, .go)
Import Resolution (TypeScript)
import "package-name" in Chester source
↓
CLI detects import
↓
Resolve package in node_modules/
↓
Extract .d.ts files
↓
TypeScriptDeclParser parses definitions
↓
TypeScriptToChester converts to Chester signatures
↓
ElabContext includes import signatures
↓
Type-check against imported types
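For reference, the input to this pipeline is an ordinary TypeScript declaration file. A heavily simplified, hypothetical example of what gets parsed:

```typescript
// node_modules/some-package/index.d.ts (hypothetical package, for illustration)
export declare function greet(name: string): string;
export declare const version: string;
```

Each export would surface on the Chester side as a typed member of the imported module.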
Module Dependencies
cli → core, utils
core → utils
lsp → core, utils
web-repl → cli (reuses CLI for browser)
intellij-plugin → (standalone)
All modules depend on vendored Kiama.
Testing
Tests are colocated with source code in src/test/scala/.
Test Framework: MUnit
Running Tests:
# All tests
sbt test
# Specific module
sbt core/test
# Specific test
sbt "core/testOnly chester.backend.TypeScriptBackendTest"
Development Workflow
- Edit code in `modules/<module>/shared/src/main/scala/`
- Compile with `sbt compile`
- Run tests with `sbt test`
- Test CLI with `sbt "cliJVM/run <args>"`
- Build assembly with `sbt cliJVM/assembly`
Next Steps
- Development Guide - Contribution guidelines
- Type Checking System - How type inference works
- Elaboration System - The elaboration algorithm
- TypeScript Backend - Code generation details
TODO
parse a.b.
import a.b.{x,y};
prompt
create a concise prompt introducing the language so that a large language model can read and write Chester
abstract syntax
maybe create a trait and let graalvm’s syntax tree variant implement it
block type leak
{let a = Int; let b: a = 1; b}
Chester Compiler Backend Architecture
Overview
This document outlines the backend architecture of the Chester compiler system. The backend is responsible for transforming Chester’s internal representation into executable code for various target platforms.
Supported Compiler Targets
Chester currently supports two compiler backends with different maturity levels:
TypeScript Backend ✅ FULLY IMPLEMENTED
Status: Production-ready, actively used
The TypeScript backend (TypeScriptBackend.scala) transforms Chester code into readable TypeScript.
Features:
- Complete AST lowering from Chester to TypeScript
- Type annotation generation
- Function declarations and arrow functions
- Record → interface transformation
- Enum support
- ES module import/export
- Effect CPS transformation (optional)
Usage:
chester ts input.chester --output output.ts
For detailed documentation, see TypeScript Backend Implementation.
Go Backend 🚧 IN PROGRESS
Status: Type signatures implemented, code generation in progress
The Go backend (GoBackend.scala) aims to compile Chester to Go.
Current State:
- ✅ Go import signatures (`GoImportSignature.scala`)
- ✅ Package path normalization
- ✅ Go AST structure defined
- 🚧 Code generation partially implemented
- ⏳ CLI integration pending
For detailed documentation, see Go Backend.
Backend Pipeline
The Chester compiler backend follows a multi-phase code generation pipeline:
Core AST (type-checked)
↓
Backend Lowering (target-specific)
↓
Target AST (TypeScript AST / Go AST)
↓
Pretty Printing
↓
Source Code (.ts / .go)
Multi-Platform Execution
Chester itself is implemented in Scala 3 and runs on multiple platforms:
- JVM: Standard Java Virtual Machine (primary development platform)
- JavaScript: Via Scala.js (for browser-based REPL)
- Native: Via Scala Native (for fast CLI startup)
This is separate from the compilation targets (TypeScript, Go) which are what Chester programs compile to.
Type System Mapping
Chester’s rich type system needs careful mapping to target language types:
| Chester Type | JavaScript/TypeScript | JVM | Native (Planned) |
|---|---|---|---|
| Integer | number | scala.BigInt | int64_t |
| Natural | number | scala.BigInt | uint64_t |
| Boolean | boolean | scala.Boolean | bool |
| String | string | java.lang.String | std::string |
| Union Types (A \| B) | A \| B | Specialized classes | |
| Record | interface/class | case class | struct |
| Functions | function | Function objects | Function pointers |
Effects Handling
Chester’s effect system is implemented differently for each target language:
- JavaScript/TypeScript: Using promises or custom effect handlers (see the sketch after this list)
- JVM: Using exceptions and monadic structures
- Native: Using error codes or custom effect handling
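For the JavaScript/TypeScript case, one possible promise-based encoding looks like the following (a sketch under the assumption of a thunk-based runtime, not the shipped implementation):

```typescript
// Sketch: IO as a lazily-evaluated async thunk.
type IO<A> = () => Promise<A>;

const pure = <A>(a: A): IO<A> => () => Promise.resolve(a);
const bind = <A, B>(io: IO<A>, f: (a: A) => IO<B>): IO<B> =>
  () => io().then(a => f(a)());

// An example effect and a two-step program:
const printLine = (s: string): IO<void> => () => Promise.resolve(console.log(s));
const program: IO<void> = bind(printLine("Hello"), () => printLine("World"));
program(); // effects run only when the thunk is invoked
```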
Implementation Example: JavaScript Backend
JavaScript/TypeScript AST Example
The JavaScript target provides a good example of target-specific AST:
// Example: Function declaration in JavaScript AST
FunctionDeclaration(
id = Some(Identifier("greet")),
params = List(Parameter(TypedIdentifier("name", StringTypeAnnotation()))),
returnType = Some(StringTypeAnnotation()),
body = BlockStatement(List(
ReturnStatement(Some(
BinaryExpression(
BinaryOperator.Plus,
StringLiteral("Hello, "),
Identifier("name")
)
))
))
)
This represents the TypeScript function:
function greet(name: string): string {
return "Hello, " + name;
}
JavaScript AST Node Categories
The JavaScript AST supports a wide range of node types (a small data-type sketch follows the category lists below):
Expressions
- Literals: Numbers, strings, booleans, null, BigInt, RegExp
- Identifiers: Named references (typed and untyped)
- Operators: Binary, logical, assignment, unary, update
- Function Expressions: Regular functions and arrow functions
- Object and Array Expressions: Object literals and array literals
- Class Expressions: Class definitions with inheritance and method definitions
Statements
- Block Statements: Groups of statements
- Expression Statements: Expressions used as statements
- Control Flow Statements: if/else, while, do-while, for, switch
- Declaration Statements: let, const, var declarations
TypeScript Features
- Type Annotations: For variables, parameters, return types
- Interface and Type Declarations: For defining complex types
- Generics: Type parameters for functions and classes
- Union and Intersection Types: Type combinations
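To make these categories concrete, here is a tiny ESTree-flavoured sketch of such nodes as plain data types (an illustrative subset, not Chester's actual AST definitions):

```typescript
// Illustrative subset of an ESTree-style AST.
type Expression =
  | { type: "StringLiteral"; value: string }
  | { type: "Identifier"; name: string }
  | { type: "BinaryExpression"; operator: "+" | "-" | "*"; left: Expression; right: Expression }
  | { type: "ArrowFunctionExpression"; params: string[]; body: Expression };

type Statement =
  | { type: "ExpressionStatement"; expression: Expression }
  | { type: "ReturnStatement"; argument: Expression | null }
  | { type: "VariableDeclaration"; kind: "let" | "const" | "var"; name: string; init: Expression };
```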
Build System Integration
The Chester compiler backend integrates with build systems through:
- SBT Plugin: For JVM builds
- NPM Package: For JavaScript/TypeScript integration
- CLI Interface: For command-line usage
Future Directions
Planned improvements to the compiler backend include:
- WebAssembly Support: Direct compilation to WebAssembly
- More Native Targets: Support for various native platforms
- Interoperability Enhancements: Better interop with target languages
- Performance Optimizations: Target-specific optimizations
- Cross-Compilation: Single-command compilation to multiple targets
- Advanced Optimizations: Target-specific performance improvements
References
- JavaScript AST is inspired by the ESTree Spec
- JVM codegen draws from Scala 3 compiler techniques
- LLVM-based compilation follows the LLVM Language Reference
TypeScript Backend
The Chester TypeScript backend transforms type-checked Chester AST into readable TypeScript code.
[!NOTE] Status: ✅ Fully Implemented and Working
Overview
Located at modules/core/shared/src/main/scala/chester/backend/TypeScriptBackend.scala, the TypeScript backend provides a complete lowering from Chester’s core AST to TypeScript.
Features
| Feature | Status | TypeScript Output |
|---|---|---|
| Functions (`def`) | ✅ | Function declarations |
| Lambdas | ✅ | Arrow functions |
| Records | ✅ | Interface declarations |
| Enums | ✅ | TypeScript enums |
| Coenums | ✅ | TypeScript enums |
| Let bindings | ✅ | IIFEs (Immediately Invoked Function Expressions) |
| Blocks | ✅ | IIFEs for expression contexts |
| Literals | ✅ | Number, string, boolean literals |
| Lists/Tuples | ✅ | Arrays |
| Field access | ✅ | Property access |
| Binary operators | ✅ | TypeScript operators |
| Type annotations | ✅ | TypeScript type annotations |
| JS/TS imports | ✅ | Import declarations |
API
Main Entry Point
def lowerProgram(ast: AST, config: Config = Config()): TypeScriptAST.Program
Converts a Chester AST into a complete TypeScript program.
Configuration
case class Config(
applyEffectCPS: Boolean = false,
cpsConfig: EffectCPS.Config = EffectCPS.Config()
)
- `applyEffectCPS`: Enable CPS transformation for effect types
- `cpsConfig`: Configuration for effect CPS transformation
How It Works
Lowering Pipeline
Chester AST → TypeScriptBackend.lowerProgram → TypeScript AST → Pretty Print → .ts file
Statement Lowering
Chester statements are transformed as follows:
def Declarations:
def greet(name: String): String = "Hello, " ++ name
↓
function greet(name: string): string {
return "Hello, " + name;
}
Records:
record Person(name: String, age: Int);
↓
interface Person {
name: string;
age: number;
}
Enums:
enum Color { Red, Green, Blue };
↓
enum Color {
Red,
Green,
Blue
}
JS/TS Imports:
import "lodash";
↓
import * as lodash from "lodash";
Expression Lowering
Lambdas:
(x: Int) => x * 2
↓
(x: number) => x * 2
Let Bindings (as IIFEs):
let x = 5; x + 10
↓
(() => {
const x = 5;
return x + 10;
})()
List Literals:
[1, 2, 3]
↓
[1, 2, 3]
Type Lowering
| Chester Type | TypeScript Type |
|---|---|
| `String` | `string` |
| `Int` / `Natural` | `number` |
| `Boolean` | `boolean` |
| `Any` | `any` |
| `Type` / `TypeOmega` | `any` |
| `List[T]` | `Array<T>` |
| `(T1, T2)` | `[T1, T2]` (tuple) |
| `(A) -> B` | `(A) => B` (function) |
| Record types | Type references |
| Enum types | Type references |
Effect System Integration
The backend supports optional CPS transformation for Chester’s effect system:
val config = Config(applyEffectCPS = true)
TypeScriptBackend.lowerProgram(ast, config)
When enabled, effect types are transformed before lowering (handled by EffectCPS.transformType).
Usage in CLI
The TypeScript backend is integrated into the Chester CLI via the ts command:
chester ts input.chester --output output.ts
This command:
- Parses and elaborates the Chester source
- Type-checks the AST
- Calls
TypeScriptBackend.lowerProgram - Pretty-prints the TypeScript AST to a
.tsfile
See CLI.scala for implementation.
Example: Complete Transformation
Chester Input
module math;
record Point(x: Int, y: Int);
def distance(p1: Point, p2: Point): Int = {
let dx = p1.x - p2.x;
let dy = p1.y - p2.y;
dx * dx + dy * dy
};
def main: Int = {
let origin = Point(0, 0);
let point = Point(3, 4);
distance(origin, point)
};
TypeScript Output
interface Point {
x: number;
y: number;
}
function distance(p1: Point, p2: Point): number {
return (() => {
const dx = p1.x - p2.x;
return (() => {
const dy = p1.y - p2.y;
return dx * dx + dy * dy;
})();
})();
}
function main(): number {
return (() => {
const origin = { _tag: "Point", _1: 0, _2: 0 };
return (() => {
const point = { _tag: "Point", _1: 3, _2: 4 };
return distance(origin, point);
})();
})();
}
Design Decisions
Why IIFEs for Let Bindings?
Chester’s let bindings are expressions that return values. TypeScript’s let statements don’t return values, so we wrap them in immediately-invoked arrow functions to preserve expression semantics.
Best-Effort Lowering
The backend uses a “best-effort” approach: if a Chester construct has no obvious TypeScript equivalent, it falls back to undefined or an identifier rather than failing. This ensures partial code generation even for experimental features.
Blocks as Statements vs. Expressions
- Top-level blocks: Lowered as statement sequences
- Expression blocks: Wrapped in IIFEs to preserve expression context
Implementation Details
Key Functions
- `lowerProgram`: Entry point, converts AST to TypeScript program
- `lowerStmt`: Converts Chester statements to TypeScript statements
- `lowerExpr`: Converts Chester expressions to TypeScript expressions
- `lowerType`: Maps Chester types to TypeScript types
- `lowerParam`: Converts function parameters
AST Representation
The backend uses a TypeScript-specific AST defined in syntax/TypeScriptAST.scala, which includes:
- Expression nodes (literals, identifiers, calls, arrows, etc.)
- Statement nodes (declarations, blocks, returns, etc.)
- Type annotation nodes
- TypeScript-specific constructs (interfaces, enums, namespaces)
Testing
The TypeScript backend is tested via:
sbt "core/testOnly chester.backend.TypeScriptBackendTest"
Tests cover:
- Basic type lowering
- Function declarations
- Expression transformations
- Edge cases
Future Enhancements
Potential improvements:
- More idiomatic TypeScript output (avoid excessive IIFEs)
- Better pattern matching → switch/if-else lowering
- Optimization passes to simplify generated code
- Source map generation for debugging
- Module system improvements
Related Documentation
- CLI Usage - How to use the `chester ts` command
- Go Backend - Similar backend for Go (in progress)
- Module Structure - Codebase organization
Go Backend
Chester includes a Go code generation backend, currently in development.
[!NOTE] Status: The Go backend type signature system is implemented, but code generation and CLI integration are still in progress.
Overview
The Go backend aims to allow Chester code to compile to Go, similar to how the TypeScript backend works. This enables Chester to target Go’s ecosystem while maintaining type safety.
Current Implementation
Type Signatures (GoImportSignature)
Located in modules/core/shared/src/main/scala/chester/tyck/GoImportSignature.scala.
Purpose: Represent Go package exports for type checking Chester imports.
final case class GoImportSignature(
fields: Vector[Param],
packageName: String
)
Features:
- Package path normalization
- Type name generation for imports
- Parameter freshening for safe reuse
Example:
For a Go package go:fmt with exports like Println, Printf, etc., Chester generates a record type signature:
// When you write:
import "go:fmt";
// Chester treats it as:
type GoImport_fmt = {
Println: (a: Any) -> IO Unit,
Printf: (format: String, args: Array[Any]) -> IO Unit,
...
};
Code Generation (GoBackend)
Located in modules/core/shared/src/main/scala/chester/backend/GoBackend.scala.
Current Status: Structure defined, code generation in progress.
Planned Features:
- AST → Go AST lowering
- Function and type declarations
- Module/package generation
- Import management
Architecture
Import Resolution Flow
Chester source with `import "go:..."`
↓
Extract Go import specifiers
↓
Call go-type-extractor tool
↓
Parse Go package types
↓
Generate GoImportSignature
↓
Add to ElabContext
↓
Type-check Chester code against Go types
Code Generation Flow (Planned)
Type-checked Chester AST
↓
GoBackend.lowerProgram
↓
Go AST representation
↓
Pretty-print to .go file
Comparison with TypeScript Backend
| Feature | TypeScript Backend | Go Backend |
|---|---|---|
| Import signatures | ✅ Fully implemented | ✅ Implemented |
| CLI integration | ✅ chester ts command | ⏳ Not yet |
| Code generation | ✅ Complete | ⏳ In progress |
| npm integration | ✅ Auto-resolve packages | N/A |
| Testing | ✅ Comprehensive tests | ⏳ Basic tests |
Go Type Extraction Tool
Chester includes a Go type extraction utility in tools/go-type-extractor/ (planned).
Purpose: Analyze Go packages and extract type information.
Requirements:
- Go toolchain installed
- Access to Go package sources
Usage (planned):
# Extract type signatures from a Go package
go-type-extractor go:fmt > fmt-signature.json
# Use in Chester
import "go:fmt";
fmt.Println("Hello from Chester!");
Example Usage (Future)
Once complete, the Go backend will work like this:
module hello;
import "go:fmt";
import "go:time";
def greet(name: String): IO Unit = {
let now = time.Now();
fmt.Printf("Hello, %s! Time: %v\n", [name, now])
};
def main: IO Unit = greet("Gopher");
Compile to Go:
chester go hello.chester --output hello.go
Generated hello.go:
package hello
import (
"fmt"
"time"
)
func greet(name string) {
now := time.Now()
fmt.Printf("Hello, %s! Time: %v\n", name, now)
}
func main() {
greet("Gopher")
}
Implementation Roadmap
Phase 1: Type Signatures ✅
- `GoImportSignature` data structure
- Package path normalization
- Type name generation
- Integration with elaboration context
Phase 2: Type Extraction ⏳
- Build `go-type-extractor` tool
- Parse Go type declarations
- Handle Go generics (latest Go versions)
- Extract function signatures
- Extract struct definitions
Phase 3: Code Generation ⏳
- Lower Chester AST to Go
- Function declarations
- Variable bindings
- Control flow (if, loops)
- Type definitions
- Import management
Phase 4: CLI Integration ⏳
- Add `chester go` command
- Auto-resolve Go packages
- Directory compilation
- Integration tests
Phase 5: Advanced Features 📋
- Go generics mapping
- Interface implementation
- Goroutines and channels (effects)
- CGo interop
- Build tool integration
Technical Challenges
Type System Differences
Go:
- Structural typing for interfaces
- No sum types (use interfaces + type assertions)
- Simpler generics (type parameters)
Chester:
- Dependent types
- Sum types (algebraic data types)
- Advanced type inference
Strategy: Lower Chester’s richer types to Go patterns (interface{}, type assertions, etc.).
Effect System
Go doesn’t have an effect system. Chester’s IO effects need to map to:
- Functions with side effects
- Error handling (return multiple values)
- Goroutines for concurrency
Memory Management
- Go: Garbage collected
- Chester: Abstract (platform-dependent)
Strategy: Rely on Go’s GC, avoid manual memory management in generated code.
Testing
Run Go backend tests:
sbt "core/testOnly chester.backend.GoBackendTest"
Current tests cover:
- Import signature generation
- Package path normalization
- Type name generation
Contributing
To contribute to the Go backend:
- Type Extraction: Implement `go-type-extractor` in Go
- Code Generation: Extend `GoBackend.scala` with lowering logic
- Testing: Add test cases for various Chester constructs
- Documentation: Keep this doc updated with progress
See the TypeScript Backend for a reference implementation.
Development Memo
NOTE THAT ALL CODE AND DOCUMENTS CAN AND WILL BE OUTDATED OR CONFLICTING. ANALYSE INFORMATION AVAILABLE TO YOU CAREFULLY
Documentation Guidelines
- Avoid Meaningless Adjectives
- ❌ NEVER USE subjective qualifiers like: “better”, “improved”, “good”, “great”, “enhanced”, “advanced”, “beautiful”, “powerful”, “robust”, “excellent”, “high-quality”
- ✅ ALWAYS USE factual, specific, measurable descriptions
- Describe concrete characteristics and behaviors
- Focus on correctness and functionality first
- Focus on Facts
- Document what something IS, not how “good” you think it is
- Identify concrete capabilities and limitations
- Omit subjective assessments and emotional language (BUT EMPHASIZING WHAT DEVELOPERS REPEATEDLY GET WRONG IS OK, E.G. WITH EMOJI, UPPERCASE, AND BOLD)
- Avoid superlatives and value judgments
- Eliminate Fluff Phrases
- Remove sentences that don’t add information
- Avoid concluding paragraphs that just say “this is useful”
- Don’t add generic statements about quality or value
- Delete phrases like “comprehensive framework”, “elegant solution”, etc.
- Be Specific and Concrete
- Instead of “improved performance”, describe the specific optimization technique
- Instead of “enhanced error reporting”, specify exactly what information is included in errors
- Replace “powerful features” with a specific list of capabilities
Development Practices
Planning Changes
- Document Before Implementing
- Always document the steps you plan to take BEFORE making any code changes
- Break down complex changes into clearly defined smaller steps
- For each step, explain:
- What will be changed
- Why the change is needed
- How the change relates to the larger goal
- What tests will verify the change
- Review your plan for completeness before starting implementation
- Document any deviations from the plan that occur during implementation
- Use Step-by-Step Implementation
- After documenting your plan, implement one step at a time
- Run the full test suite (`sbt rootJVM/test`) after each step
- Commit logical units of work with clear messages
- Do not proceed to the next step until the current step passes all tests
Making Changes
- Keep Changes Small and Focused
- Make one logical change at a time
- Break down large changes into smaller, independent steps
- Each change should be easily reviewable and testable
- Testing Requirements
- ALWAYS use the following commands for running tests:
  # Run all tests from the root project
  sbt rootJVM/test
  # Run a specific test class from the root project
  sbt "rootJVM/testOnly chester.elab.ElabHoleTest"
  # You can also run tests for specific modules when needed
  sbt reader/test
  sbt semantics/test
  # Run specific test classes in modules
  sbt "reader/testOnly chester.reader.FileParserTest"
  sbt "semantics/testOnly chester.elab.ElabLiteralAndListTest"
- DO NOT navigate into subdirectories to run tests (e.g., `cd reader && sbt test`)
- ALWAYS run tests from the root project directory
- ⚠️ CRITICAL: NEVER use the `-z` test filter option ⚠️
  - `-z` is NOT the correct syntax for filtering tests in MUnit
  - This option is broken and produces unreliable results
  - Tests may appear to pass when they actually fail
  - This can lead to false confidence in your changes
  - ScalaTest uses `-z` for filtering, but MUnit uses `--tests=` instead
- ⚠️ IMPORTANT: Only use correct MUnit filter syntax with `--` ⚠️
  - When filtering tests, always use proper MUnit syntax
  - CORRECT MUnit syntax examples:
    # Filter by test name
    sbt "rootJVM/testOnly -- --tests=myTestName"
    # Filter by glob pattern
    sbt "rootJVM/testOnly -- *MyTest"
  - INCORRECT syntax from other frameworks (DO NOT USE):
    # ScalaTest style (WRONG with MUnit)
    sbt "rootJVM/testOnly -- -t MyTest"
    # JUnit style (WRONG with MUnit)
    sbt "rootJVM/testOnly -- -n MyTest"
    # Custom incorrect style
    sbt "rootJVM/testOnly -- -only testname"
- ALWAYS run `sbt rootJVM/test` before committing changes
- Add new tests for new functionality
- Update existing tests when modifying behavior
- Test both success and failure cases
- 💡 Development Tip: For quickly adding and testing new elaboration scenarios during development, you can add test cases to existing test files in the `semantics/shared/src/test/scala/chester/elab/` directory or create new test files following the existing patterns. To run specific elaboration tests for rapid feedback, use commands like:
  sbt "semantics/testOnly chester.elab.ElabLiteralAndListTest" | cat
  sbt "semantics/testOnly chester.elab.ElabHoleTest" | cat
  This approach targets elaboration tests directly within the semantics module, providing a faster feedback loop than running the full suite via `sbt rootJVM/test`. Remember to use `sbt rootJVM/test | cat` for final verification before committing.
- Many tests now run against both old and new readers (V1 and V2)
- Some complex tests currently only run against V1 (original reader)
- When adding new parser tests:
- Use `parseAndCheckBoth` by default for new tests
- Only use `parseAndCheck` if testing V1-specific features
- Document if test is V1-only and why
- Plan to migrate V1-only tests to V2 when ready
- Test function usage:
- `parseAndCheck`: V1 parser only
- `parseAndCheckBoth`: Both V1 and V2 parsers
- `parseAndCheckV1`: Deprecated alias for `parseAndCheckBoth`
- Recently migrated tests:
- Basic operator sequence tests
- Pattern matching tests with uniform symbol treatment
- Simple expression tests
- Function call tests
- Dot notation tests
- Object tests
- Tuple tests
- Vararg tests
- Floating-point number parsing tests
- List tests with mixed types
- Tests still needing migration:
- Complex operator sequences (prefix, mixfix)
- Telescope parsing
- Error handling
- Source position tracking
- For type checking changes:
- Test term preservation in elaborated results
- Test type-level computation works correctly
- Test error reporting is accurate
- Test edge cases and corner cases
- Verify Changes with Git
  # After each change - ALWAYS use | cat to prevent terminal control issues:
  git diff | cat        # Review what changed
  git add <files>       # Stage specific files
  git status | cat      # Verify staged changes
  git commit -m "..."   # Commit with clear message
  ⚠️ Always append `| cat` to git diff commands to avoid paging issues.
Change Verification Checklist
-
Changes are minimal and focused
-
Git diff shows only intended changes
-
Tests pass after changes
-
Changes align with existing code style
-
Review the git diff output carefully
# Before committing, ALWAYS verify changes with: git diff | cat💡 WHY THIS MATTERS: Failure to review diffs properly is the #1 cause of accidental code deletions and introduction of subtle bugs.
-
Reviewing git diff output is essential for catching:
- Accidental deletions of important methods or logic
- Unintended modification of critical code
- Formatting changes that might impact behavior
- Changes to files you didn’t intend to modify
-
Pay special attention to large diffs that might hide important changes
-
Verify no unrelated changes were included
-
When making multiple changes, review each file’s diff separately for clarity
-
- Post-Commit Verification
  - ⚠️ MANDATORY: Always verify your changes after committing with `git diff HEAD^ HEAD | cat`
- No unintended changes were included
- All intended changes were properly committed
- File renames and deletions are correctly reflected
- No sensitive or debug code was accidentally committed
- No accidental deletions of important logic
- Verify the commit message accurately describes the changes
- For complex changes involving multiple files, check each file’s changes individually
- ⚠️ MANDATORY: Always verify your changes after committing with
- Git Command Tips
- Always use `| cat` with git commands that might trigger paging:
  git diff | cat
  git log | cat
  git show | cat
- Always use
Terminal Control with Git Commands
- ⚠️ CRITICAL: ALWAYS Use the `| cat` Suffix
  - Git commands that might trigger paging or interactive prompts MUST ALWAYS end with `| cat`
  - This is a MANDATORY practice, not a suggestion
  - This ensures consistent output and prevents terminal control issues
  - Failure to use `| cat` is the leading cause of incomplete reviews and missed errors
  - Examples:
    git checkout main | cat
    git merge --no-ff branch | cat
    git log | cat
    git diff | cat
    git show | cat
    git branch | cat
- Common Git Operations
  # Switching branches
  git checkout main | cat
  git checkout -b new-branch | cat
  # Merging
  git merge --no-ff feature-branch | cat
  git merge --abort | cat   # If merge conflicts occur
  # Viewing changes
  git status | cat
  git log --oneline | cat
  git show HEAD | cat
  # Committing
  git add . | cat
  git commit -m "type: description" | cat
Why This Matters
- Prevents terminal from entering interactive mode
- Ensures consistent output formatting
- Avoids getting stuck in pagers like
less - Makes automation and scripting more reliable
Troubleshooting Development Issues
- Recovering from Broken Edit Tools
- If edit tools in your IDE or development environment are broken/malfunctioning, you can use git to recover:
  # Discard changes to a specific file
  git checkout -- path/to/file | cat
  # Discard all changes in the working directory
  git checkout -- . | cat
  # Revert to a specific commit
  git checkout [commit-hash] -- path/to/file | cat
- Always verify what you’re checking out before executing the command to avoid losing important changes
- If edit tools in your IDE or development environment are broken/malfunctioning, you can use git to recover:
AI Agent Testing Instructions
- Terminal Interruption Issues
- If you are an AI agent working on Chester code and notice:
  - Frequent `^C` characters appearing in command output
  - Commands being interrupted prematurely
- Test results not displaying properly
- Terminal output being cut off
- STOP attempting to run tests and:
- Inform the user about the terminal connection issues
- Ask the user to run the tests manually
- Request that the user provide the test results
- This indicates a problem with the terminal connection, not with the code itself
- Test Running Best Practices for AI Agents
- ALWAYS use these exact commands for running tests:
  # Run all tests
  sbt rootJVM/test | cat
  # Run a specific test class (include quotation marks)
  sbt "semantics/testOnly chester.elab.ElabLiteralAndListTest" | cat
- NEVER attempt to run tests with other project paths like `cli/test`, `semantics/test`, etc.
-ztest filter option ⚠️- Example of what NOT to do:
sbt "semantics/testOnly chester.elab.ElabLiteralAndListTest -z myTest" - The
-zflag is completely broken and will cause misleading results - Tests might appear to pass when they should fail
- Using
-zwill lead to incorrect conclusions about code behavior
- Example of what NOT to do:
- ⚠️ CRITICAL: NEVER use
--to pass arguments to tests ⚠️- Example of what NOT to do:
sbt "rootJVM/testOnly -- -t MyTest" - This will cause tests to run incorrectly or not at all
- No arguments should be passed after the test class name
- Always run full test suites rather than individual tests when possible
- Verify that terminal commands execute completely before proceeding
- If a test command produces an error about not finding the test class:
  - First try the full `rootJVM/test` command to run all tests
  - Then check if the test class path is correct
- Do not experiment with different project paths
- If tests are taking too long to complete, inform the user and suggest they run the tests locally
Term System Architecture
Chester uses a unified term representation architecture to support multiple platforms:
Term Definition Structure
- Unified Term Definition
  - All term types are defined in a single file: `syntax/shared/src/main/scala/chester/syntax/core/Term.scala`
  - This approach simplifies the codebase and eliminates the need for separate platform-specific implementations
- Each term type follows a consistent pattern with standard methods and field annotations
Import Guidelines
- DO use `import chester.syntax.core.*` - This will give you access to all term implementations
// CORRECT
import chester.syntax.core.*
// INCORRECT - unnecessarily specific imports
import chester.syntax.core.BlockTerm
import chester.syntax.core.FCallTerm
Pattern Matching and Type Usage
Use concrete term types directly for pattern matching:
// CORRECT
case t: BlockTerm => {
val reducedStatements = t.statements.map(stmt => r.reduce(stmt))
val reducedResult = r.reduce(t.result)
BlockTerm(reducedStatements, reducedResult, t.meta)
}
Term Type Implementation Pattern
All term types follow a consistent implementation pattern:
case class ExampleTerm(
@child var field1: Term, // Use @child for term fields that should be traversed
@const val field2: String, // Use @const for non-term fields
@const meta: OptionTermMeta
) extends BaseTerm {
override type ThisTree = ExampleTerm
// Pretty printing method
override def toDoc(using PrettierOptions): Doc =
Doc.text("ExampleTerm(") <> field1.toDoc <> Doc.text(")")
// Tree traversal method
override def descent(f: Term => Term, g: TreeMap[Term]): Term =
thisOr(copy(field1 = g(field1)))
}
Adding New Term Types
When adding a new term type:
- Add it directly to `Term.scala`
- Follow the existing pattern for similar term types
- Implement all required methods (`toDoc`, `descent`, etc.)
- Use correct field annotations (`@child`, `@const`)
- Extend the appropriate base type (e.g., `TypeTerm`, `ExprTerm`)
Example: Adding a New Term Type
For example, to add a new term type for union types:
case class UnionTypeTerm(
@child var types: Vector[Term],
@const meta: OptionTermMeta
) extends TypeTerm {
override type ThisTree = UnionTypeTerm
override def toDoc(using PrettierOptions): Doc =
Doc.text("UnionType(") <> Doc.join(Doc.text(", "), types.map(_.toDoc)) <> Doc.text(")")
override def descent(f: Term => Term, g: TreeMap[Term]): Term =
thisOr(copy(types = types.map(g)))
}
Key Term Types
The system includes several important term categories:
- Expression Terms: Represent runtime values (variables, function calls, literals)
- Type Terms: Represent type information (primitive types, function types, union types)
- Statement Terms: Represent declarations and control flow (let/def bindings, trait definitions)
- Pattern Terms: Represent pattern matching constructs
- Special Terms: Represent special language constructs (holes, placeholders)
Each category has a base trait that defines its common behavior.
Why This Matters
- Simplified Architecture: The unified term definition makes the codebase more maintainable
- Cross-Platform Compatibility: All platforms use the same term representation
- Consistent Patterns: All term types follow the same implementation pattern
- Easier Extensions: Adding new term types follows a clear and consistent approach
Elaboration and Reduction Strategy
Reduction During Type Checking
- Keep Original Forms
- The elaborator MUST preserve original terms in the elaborated result
- NEVER reduce during elaboration
- Only use reduction internally during type checking when absolutely necessary
- This makes the elaborated code identical to source code, making it:
- Easier to debug
- Easier to understand
- Better for error messages
- More suitable for further transformations
- When to Reduce
- Only TWO places should use reduction:
- Type equality checking in unification
- Field access checking on type-level terms
  - Use `ReduceMode.TypeLevel` for these internal reductions
  - NEVER use reduction in elaborated results
- Only TWO places should use reduction:
Example:
// Original code
def idType(x: Type): Type = x;
let aT = idType(A);
def getA(x: aT): Integer = x.a;
// WRONG - reducing during elaboration:
LetStmtTerm(localv, reducer.reduce(idType(A)), ty, meta)
// RIGHT - keeping original term:
LetStmtTerm(localv, idType(A), ty, meta)
// RIGHT - internal reduction only for field checking:
def checkFieldAccess(recordTy: Term, field: Name): Term = {
// Use type-level reduction only for checking field existence
// Keep original term in result
// ...
}
Reduction Context and Type Checking
- Reduction Context Setup
  - Each Context instance provides its own reduction context via toReduceContext
  - This ensures consistent reduction behavior during type checking
  - Allows for future extensions to reduction context
- Type-Level Reduction
  - Only reduce type-level terms when necessary for type checking
  - Keep original terms in elaborated results
  - Use ReduceMode.TypeLevel to control reduction behavior (see the sketch after this list)
- Field Access Checking
  - Use type-level reduction to verify field existence
  - Keep original terms in field access expressions
  - Report errors using original terms for better error messages
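A minimal sketch of how these pieces fit together, assuming the toReduceContext and ReduceMode.TypeLevel names described above; exact signatures in the codebase may differ, and fieldsOf is a hypothetical helper:

// Sketch only: reduce internally for checking, keep the original term in the result
def checkFieldExists(recordTy: Term, field: Name)(using ctx: Context, r: Reducer): Boolean = {
  given ReduceContext = ctx.toReduceContext // each Context supplies its own reduction context
  val reduced = r.reduce(recordTy, ReduceMode.TypeLevel) // internal, type-level-only reduction
  fieldsOf(reduced).contains(field) // fieldsOf is hypothetical; elaborated output keeps recordTy
}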
Common Pitfalls
- Over-reduction
  - Don’t reduce terms during elaboration
  - Don’t reduce terms when adding to context
  - Only reduce when needed for type checking
- Loss of Original Terms
  - Always preserve original terms in elaborated results
  - Don’t reflect internal reductions in output
  - Keep source code structure intact
- Incorrect Reduction Context
  - Always use the proper reduction context from the current context
  - Don’t create new reduction contexts unnecessarily
  - Use a consistent reduction mode for type checking
Coding Conventions
Imports
- Document Utilities: When using utilities from the chester.utils.doc package (such as Doc, PrettierOptions, or extension methods like render), prefer using a single wildcard import: import chester.utils.doc.*.
String Formatting and Internationalization
- Use Template Strings for User-Facing Text
  - ALWAYS use template strings (t"") for user-facing messages, not string interpolation (s"")
  - ALWAYS use template strings (t"") for plain user-facing text, even without variables
  - Always import the internationalization package: import chester.i18n.*
  - This ensures proper internationalization and localization support

// CORRECT - using template strings for user-facing text
import chester.i18n.*
val username = "Alice"
val message = t"Hello $username, welcome to Chester!"

// CORRECT - using template strings for plain text without variables
val errorMessage = t"Operation failed. Please try again."

// INCORRECT - using string interpolation for user-facing text
val message = s"Hello $username, welcome to Chester!"

// INCORRECT - using regular string literals for user-facing text
val errorMessage = "Operation failed. Please try again."
- String Interpolation for Internal Use Only
  - Only use string interpolation (s"") for internal, non-user-facing strings
  - Examples include debug logging, internal identifiers, and non-displayed text

// CORRECT - using string interpolation for internal/technical content
val logMessage = s"DEBUG: Processing request from $username with params $params"
val technicalId = s"${prefix}_${uuid}"

- Why This Matters
  - Template strings enable automatic translation and localization
  - They maintain consistent messaging across the application
  - They allow for future language additions without code changes
  - They ensure a better experience for non-English users
Core Principles
- Use C-style Braces
  - Always use braces for control structures, even for single-line blocks
  - Opening brace on the same line
  - Closing brace on its own line

// Good
if (condition) {
  doSomething()
} else {
  doSomethingElse()
}

// Bad - No braces
if (condition) doSomething()

// Bad - Indentation-based syntax
if (condition)
  doSomething()
  andThenThis() // Unclear scope

- No Indentation-Based Syntax
  - Do not rely on indentation for scope
  - Always use explicit braces to define scope

// Good
def method() = {
  val result = {
    val x = compute()
    transform(x)
  }
  result
}

// Bad - Indentation-based
def method() =
  val result =
    val x = compute()
    transform(x)
  result

- Function Definitions
  - Opening brace on the same line as the function definition
  - Use explicit return types

// Good
def process(input: String): Result = {
  // implementation
}

// Bad
def process(input: String): Result =
  // implementation

- Pattern Matching
  - Use braces for case blocks
  - Align case statements

// Good
expr match {
  case Literal(value) => {
    process(value)
  }
  case Identifier(name) => {
    lookup(name)
  }
}

// Bad
expr match
  case Literal(value) => process(value)
  case Identifier(name) => lookup(name)

- For Comprehensions
  - Use braces instead of indentation

// Good
for {
  x <- xs
  y <- ys
} yield {
  combine(x, y)
}

// Bad
for
  x <- xs
  y <- ys
yield combine(x, y)
Additional Guidelines
- Use parentheses for method calls even when they could be omitted
- Prefer multi-line formatting with braces for complex expressions
- Use explicit type annotations for public APIs
- Keep line length reasonable (max 120 characters)
- Use two-space indentation within braces
Enum Usage
- Prefer Enums Over String Literals
  - Use enums for representing categories, types, or states
  - Never use string literals as pseudo-enums

// Good
enum DebugCategory {
  case Cell
  case Tyck
  case Reducer
}

// Bad
val category = "CELL" // Using string literals as enum values

- Enum Naming Conventions
  - Use PascalCase for enum type names
  - Use PascalCase for enum values
  - Keep enum names clear and descriptive
- Enum Usage
  - Import enum values when needed
  - Use qualified access for clarity in other contexts
  - Use pattern matching for exhaustive handling

// Good usage
import DebugCategory.*
val category = Cell
category match {
  case Cell => handleCell()
  case Tyck => handleTyck()
  case Reducer => handleReducer()
}
Debugging Practices
- Understand Before Fixing
  - Always understand the root cause of an issue before attempting to fix it
  - Use the Debug utility with appropriate categories to trace program execution
  - Analyze call stacks to identify where issues occur
  - Create minimal test cases that reproduce the issue
- Systematic Debugging Process
  - Enable relevant debug logging (Debug.enable(DebugCategory.XXX)); see the sketch after this list
  - Add strategic logging points to track object state and method execution
  - Verify assumptions about code behavior using logs and assertions
  - Isolate the issue by creating focused test cases
  - Document your findings to help others understand the problem
- Debug-First Approach
  - When facing complex issues, prioritize debugging over immediate fixes
  - Add temporary debugging code when needed, but mark it clearly and remove when done
  - Consider adding permanent debugging hooks for areas prone to issues
  - Document debugging insights even if they seem obvious
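A minimal sketch of the first step of this process; the import paths for the Debug utility and its category enum are assumptions:

// Assumed import paths for the Debug utility and DebugCategory
import chester.utils.Debug
import chester.utils.Debug.DebugCategory

// Enable tracing for the area under investigation, then run the minimal failing case
Debug.enable(DebugCategory.Tyck)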
Development Log
2025-04-24
LexerV2 Parser Completion
- Completed Implementation:
  - Block Termination Pattern
    - Implemented robust }\n pattern detection with context awareness
    - Added state tracking via the newLineAfterBlockMeansEnds flag in LexerState
    - Ensured consistent handling between V1/V2 parsers
    - Preserved uniform symbol treatment principles
  - Object Expression Enhancements
    - Added comprehensive support for multiple key types:
      - Identifier keys (e.g., { x = 1 })
      - String literal keys (e.g., { "x" = 1 })
      - Symbol literal keys (e.g., { 'x = 1 })
    - Implemented support for both = and => operators in object clauses
    - Added proper field parsing with comma-separated clauses
    - Enhanced error reporting for malformed objects
  - Comment Handling Optimization
    - Replaced recursive comment collection with more efficient methods
    - Implemented skipComments() and pullComments() for better performance
    - Added metadata merging for proper comment preservation
    - Ensured consistent handling across different parsing contexts
  - Function Call and Block Argument Improvements
    - Added special handling for blocks used as function arguments
    - Implemented context-aware parsing of function calls with block arguments
    - Ensured proper metadata preservation for source positions
    - Added consistent handling of nested function calls
- Implementation Strategy:
  - Used a token-based state machine approach for better performance
  - Maintained uniform symbol treatment for all operators
  - Implemented comprehensive error handling
  - Added extensive debug logging for easier maintenance
- Completed Features:
  - ✅ Block termination with context tracking
  - ✅ Object expressions with multiple key types
  - ✅ Comment preservation and optimization
  - ✅ Function call and block argument handling
  - ✅ Flat operator sequence representation without precedence handling (⚠️ IMPORTANT: intentional design principle, often misunderstood)
- Files Modified:
  - reader/shared/src/main/scala/chester/readerv2/LexerV2.scala
  - Documentation files
2025-04-21
Simplified Type Constraint System by Removing Hacky Approach
- Problem: The previous approach for handling type constraints used a complicated “cell coverage” system that felt like a hack. The AutoConnect propagator and createTermStructureConstraints method added unnecessary complexity and indirection to the type system.
- Solution: Completely redesigned the approach to directly handle type relationships without intermediate propagators.
- Implementation Details:
  - Removed Hacky Components:
    - Eliminated the AutoConnect propagator entirely
    - Removed establishTypeConstraints and all related “cell coverage” methods
    - Simplified the code by removing several layers of indirection
  - Direct Type Relationship Handling (see the sketch after this list):
    - Added connections between types directly during unification
    - Created explicit relationships between union types and their components
    - Connected function call components immediately when encountered
    - Simplified the codebase by handling type constraints at the point where types are created or unified
  - Improved Union Type Management:
    - Enhanced handling of union-to-union subtyping with direct component connections
    - Improved union-to-specific and specific-to-union handling
    - Created clearer, more maintainable code for union type relationships
  - Function Call Processing:
    - Implemented direct processing of function call components
    - Added recursive handling for nested function calls
    - Eliminated redundant constraint establishment code
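A rough sketch of the direct approach, mirroring the helper shown in the 2025-03-25 entry later in this log; exact names may differ:

// Connect a union type cell straight to its component cells during unification,
// with no intermediate propagator layer in between
private def connectUnionToComponents(
  unionCell: CellId[Term],
  componentIds: Vector[CellId[Term]],
  cause: Expr
)(using state: StateAbility[Tyck], ctx: Context, ck: Tyck): Unit =
  state.addPropagator(UnionOf(unionCell, componentIds, cause))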
- Benefits:
  - Cleaner Code: Removed a complicated system that was difficult to reason about
  - More Direct: Handles type constraints at the point where types are created or unified
  - Better Maintainability: Simpler, more understandable code with fewer moving parts
  - Less Indirection: Eliminated multiple layers of function calls for basic operations
  - Same Functionality: Maintains the same type checking capabilities with a cleaner implementation
- Files Modified:
  - semantic/shared/src/main/scala/chester/tyck/TyckPropagator.scala
  - semantic/shared/src/main/scala/chester/tyck/Elaborater.scala
- Verification:
  - All existing tests continue to pass
  - The compiler successfully builds the project
  - Type error reporting works as expected
Replaced EnsureCellCoverage Hack with AutoConnect Propagator
- Implemented Improvements:
  - Replaced EnsureCellCoverage with AutoConnect
    - Removed the old placeholder EnsureCellCoverage propagator that only marked cells as covered
    - Implemented a new AutoConnect propagator that establishes proper type connections based on term structure
    - Added logic to analyze different term types (unions, intersections, function calls) and create appropriate connections
  - Enhanced Cell Coverage Mechanisms
    - Added support for creating proper connections between terms and their components
    - Implemented smart term structure analysis to establish meaningful type relationships
    - Added default value support for unconstrained type variables using the Any type
  - Improved Function Call Handling
    - Added recursive connection for function calls and their arguments
    - Special handling for complex argument terms, including nested function calls
    - Improved handling of CallingArgTerm to ensure all components are properly connected
  - Implementation Approach:
    - Replaced all EnsureCellCoverage instances with AutoConnect calls
    - Updated Elaborater and TyckPropagator to use the new approach
    - Added cell management helper methods and proper zonking support
    - Added support for default values in unconstrained cells
- Files Modified:
  - semantic/shared/src/main/scala/chester/tyck/TyckPropagator.scala
  - semantic/shared/src/main/scala/chester/tyck/Elaborater.scala
  - semantic/shared/src/main/scala/chester/utils/propagator/ProvideCellId.scala
- Benefits:
  - Eliminated artificial cell coverage without meaningful constraints
  - Improved type error detection through proper constraint checking
  - Reduced conceptual complexity in the propagator network
  - Enhanced maintainability as the type system evolves
  - Fixed cell coverage issues for union types and other complex types
2025-04-05
LexerV2 Parser Enhancements
- Block Termination: Implemented context tracking for }\n pattern detection, ensuring consistent handling between V1/V2 parsers while preserving uniform symbol treatment
- Object Expressions: Added support for identifier, string, and symbol keys with both = and => operators
- Token Optimization:
  - Simplified token extractors using common helper methods
  - Enhanced comment collection and attachment
  - Improved handling of leading/trailing comments
- Implementation Strategy:
  - Added context tracking for block termination
  - Maintained uniform symbol treatment for all operators
  - Enhanced object expression parsing
  - Optimized token handling for better maintainability
- Migration Status:
  - ✅ Pattern matching with proper block handling
  - ✅ Block termination with context tracking
  - ✅ Basic object expressions
  - ✅ Comment preservation
  - 🟡 Complex object expressions (in progress)
  - 🔴 Source maps and error recovery (planned)
- Modified: LexerV2.scala, documentation files
2025-03-14
Parser Improvements and Refactoring
- Completed Improvements:
  - Operator Precedence Enhancement
    - Issue: Complex operator sequences not handled correctly
    - Improvement: Added operator precedence table and parsing logic
    - Benefits: Better handling of complex expressions
    - Implementation: Added precedence table and parsing logic
  - Whitespace Handling Enhancement
    - Issue: Inconsistent whitespace handling
    - Improvement: Added dedicated whitespace parsing
    - Benefits: More consistent parsing behavior
    - Implementation: Added whitespace parsing logic
  - Error Recovery Enhancement
    - Issue: Parser failed on first error
    - Improvement: Added error recovery mechanisms
    - Benefits: Better error reporting and recovery
    - Implementation: Added error recovery logic
  - Token Type Enhancement
    - Issue: Limited token type support
    - Improvement: Added more token types
    - Benefits: Better token categorization
    - Implementation: Added new token types
  - Source Position Tracking
    - Issue: Inaccurate error locations
    - Improvement: Enhanced position tracking
    - Benefits: Better error messages
    - Implementation: Added position tracking
  - Test Coverage Enhancement
    - Issue: Limited test coverage
    - Improvement: Added more test cases
    - Benefits: Better code quality
    - Implementation: Added test cases
  - Uniform Operator Handling
    - Issue: Special case handling for “=>” operator
    - Improvement: Removed special cases, unified operator parsing
    - Benefits: More consistent operator handling
    - Implementation:
      - Removed special case in parseOperator()
      - Now using StringBuilder for all operators
      - All tests passing
  - LexerV2 Optimization and Refactoring
    - Issue: Code duplication and maintainability issues
    - Improvement: Restructured for modularity and conciseness
    - Benefits: Better code organization and maintainability
    - Implementation:
      - Extracted repeated logic into helper methods
      - Improved state management
      - Fixed compilation error (replaced advance() with state.advance())
      - Remaining warnings about unused private members and pattern match exhaustiveness
- Files Modified:
  - reader/src/main/scala/chester/readerv2/LexerV2.scala
  - reader/src/main/scala/chester/readerv2/Tokenizer.scala
- Tests: All existing tests passing
V1/V2 Semantic Consistency
- Implementation Plan:
  - Analyze test files still using parseAndCheck to identify semantic differences
  - Prioritize addressing the complex operator sequence handling first
  - Implement proper handling for prefix and mixfix operators in V2
  - Test and verify with existing test cases
  - Update tests to use parseAndCheckBoth once they pass
  - Document any intentional semantic differences that won’t be addressed
- Benefits:
  - More consistent parsing behavior between V1 and V2
  - Higher confidence in the V2 parser for all use cases
  - Easier migration path from V1 to V2
  - More tests running against both parsers
Object Expressions
- Implementation Plan:
- Review current object parsing implementation
- Identify missing features compared to V1
- Implement support for complex object syntax
- Test with a variety of object expressions
Telescope Parsing
- Issue: Telescope parsing is not yet implemented in V2
- Improvement: Implement telescope parsing in V2 to match V1 semantics
- Implementation Plan:
- Analyze V1 telescope parsing implementation
- Design and implement equivalent functionality in V2
- Test with existing telescope tests
Block Termination and Newline Handling in V2 Parser
Problem Analysis:
When examining why the pattern matching test fails with V2 parser, I identified several issues:
- Newline Handling:
  - V1 parser has implicit newline handling that affects expression termination
  - This is particularly important for blocks that end with }
  - V2 parser needs to check for Token.Newline after a block and terminate expressions appropriately
  - This affects the parseRest method in LexerV2.scala
- Pattern Matching Block Structure:
  - Pattern matching has a unique structure: identifier match { ... }
  - The V2 parser needs a general approach to handle this construct without introducing special cases
  - The challenge is maintaining uniform handling while correctly parsing pattern matching
- Test Compatibility:
  - Many tests use parseAndCheckBoth, which runs both V1 and V2 parsers
  - Tests with newlines after blocks fail because V2 doesn’t terminate expressions correctly
  - Pattern matching tests are particularly affected by this issue
- StringIndexOutOfBoundsException in Error Reporting:
  - When using parseAndCheckBoth, error reporting code in parseAndCheck.scala can throw StringIndexOutOfBoundsException
  - This happens when trying to extract line information for error messages
  - Requires bounds checking to prevent exceptions
- Parser Architecture Tradeoffs:
  - We need to balance flexibility with consistency
  - A simple tokenization approach makes it hard to handle significant whitespace/newlines
  - Excessive special cases make the parser harder to maintain and reason about
  - Context-free parsing is strongly preferred over context-dependent approaches
  - A simple, uniform rule (like always ending an OpSeq when seeing }\n) is better than complex contextual rules
Possible Approaches:
- Context-Free Newline Handling (PREFERRED):
  - Always end an OpSeq expression when encountering }\n (closing brace followed by newline)
  - Apply this rule uniformly regardless of surrounding context
  - Uniform treatment of all block terminations without special cases
  - No need to track or analyze the meaning of identifiers like “match”
  - Simple, predictable parsing behavior that aligns with Chester’s design principles
- Token Differentiation Strategy:
  - Enhance the tokenizer to differentiate between whitespace and newlines
  - This allows the parser to recognize expression boundaries better
  - Requires minimal special-casing in the parser
- Whitespace with Newline Flag (a sketch follows this list):
  - Instead of creating a separate Token.Newline class, enhance Token.Whitespace with a boolean flag
  - Add a canActAsNewline flag to indicate if this whitespace contains characters that can terminate expressions
  - This simplifies tokenization while still providing the necessary information to the parser
  - Reduces token type proliferation and maintains a cleaner token hierarchy
  - Parser can check token.isWhitespace && token.canActAsNewline when making termination decisions
  - Avoids the overhead of creating a completely new token type while gaining the same benefits
- Enhanced Block Parsing:
  - Modify block parsing to handle different types of blocks in a more general way
  - Use structural information rather than keyword recognition
  - This approach maintains parser consistency while handling pattern matching
- Contextual Parsing (LEAST PREFERRED):
  - Use context information to parse expressions differently in different situations
  - For pattern matching, recognize the context and adjust parsing rules
  - More complex and violates the preference for context-free parsing
  - Harder to maintain and reason about
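A small sketch of the preferred combination (approaches 1 and 3 above); the token names follow the discussion, everything else is illustrative:

// Token.Whitespace carries a flag instead of introducing a separate Token.Newline
enum Token {
  case Whitespace(canActAsNewline: Boolean)
  case RBrace
  case Identifier(name: String)
  // ... other token kinds elided
}

// Context-free termination rule: an OpSeq ends on `}` followed by newline-capable whitespace
def terminatesOpSeq(prev: Token, curr: Token): Boolean = (prev, curr) match {
  case (Token.RBrace, Token.Whitespace(canActAsNewline)) => canActAsNewline
  case _ => false
}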
Recommended Approach: The Context-Free Newline Handling approach combined with the Whitespace with Newline Flag provides the simplest and most maintainable solution. This approach:
- Maintains Chester’s core design principles of uniform symbol treatment
- Preserves strict separation of parsing from semantic analysis
- Applies a consistent rule for all block terminations without special cases
- Avoids context-dependent parsing which is harder to maintain
- Treats }\n as a syntactic boundary in all contexts, which is simpler and more predictable
The parser should simply terminate an OpSeq when encountering a }\n pattern, regardless of what identifiers (like “match”) may be present in the sequence. This maintains the context-free nature of the parser and avoids the complexity of context-dependent rules.
Integration with Existing Code:
The proposed changes will affect several components of the current codebase:
- Consistency with Operator Handling:
  - The parser will continue to treat all symbols uniformly, including ‘match’
  - No special precedence rules will be added in the parser itself
  - Pattern matching will be represented as a standard OpSeq in the AST
  - Any special handling of ‘match’ will occur in subsequent passes, not in the parser
- Interaction with Block Parsing:
  - Block parsing will remain unchanged
  - The parser will create a standard OpSeq structure for match expressions
  - Semantic analysis of pattern matching occurs after parsing, not during
Performance Considerations:
- Token Differentiation Impact:
  - Adding Token.Newline will slightly increase token count but with negligible memory overhead
  - Parsing performance should not be significantly affected
  - May improve performance by reducing backtracking and error recovery needs
- Operator-Based Solution Efficiency:
  - Leverages existing operator handling machinery
  - No additional parsing passes required
  - Consistent with the current performance profile of operator parsing
Examples:
Current Parsing Result (V1):
// Input:
notification match {
case Email(sender, _) => handleEmail(sender)
case SMS(number, _) => handleSMS(number)
}
// AST (simplified):
OpSeq([
Identifier("notification"),
Identifier("match"),
Block([
OpSeq([Identifier("case"), FunctionCall("Email", ...), Identifier("=>"), ...]),
OpSeq([Identifier("case"), FunctionCall("SMS", ...), Identifier("=>"), ...])
])
])
Desired V2 Parsing Result:
// Same input should produce identical AST structure with flat OpSeq
// The parser has no knowledge of what 'match' means - it's just an identifier
// Structure interpretation happens in later passes, not during parsing
OpSeq([
Identifier("notification"),
Identifier("match"),
Block([
OpSeq([Identifier("case"), FunctionCall("Email", ...), Identifier("=>"), ...]),
OpSeq([Identifier("case"), FunctionCall("SMS", ...), Identifier("=>"), ...])
])
])
Reference Implementation Strategy:
- Phased Approach:
  - First implement the whitespace enhancement with the newline flag
  - Ensure the parser treats ‘match’ just like any other identifier
  - Verify match expressions produce standard OpSeq nodes
  - Test with existing pattern matching tests to ensure correct AST structure
- Validation Criteria:
  - All existing tests should pass when using both parsers
  - Parser should produce identical AST structures for both V1 and V2
  - No special handling for any identifiers, including ‘match’, in the parser
  - Maintain uniform treatment of symbols throughout the parser
  - Preserve strict separation between parsing and semantic analysis
Learning from Other Languages:
- Scala’s Approach:
  - Scala treats ‘match’ as a special keyword with defined precedence
  - Pattern matching is handled as a distinct grammar construct
  - This differs from Chester’s uniform symbol treatment philosophy
- Rust’s Approach:
  - Rust uses match expressions with block-based syntax
  - The parser explicitly recognizes the ‘match’ keyword
  - Arms of match expressions have specific parsing rules
  - Chester can adapt Rust’s block structure handling while maintaining uniform symbol treatment
Backward Compatibility Guarantees:
- Parsing Output Compatibility:
  - The V2 parser will produce ASTs semantically equivalent to V1 for pattern matching
  - Existing code that consumes ASTs will continue to work without modification
  - The structure of OpSeq nodes for pattern matching will be preserved
- What Might Change:
  - Internal source position information might be slightly different
  - Comment attachment points could vary in edge cases
  - Error messages may be more precise or different in wording
Transition Plan:
- For Test Code:
  - Gradually migrate tests from parseAndCheck to parseAndCheckBoth
  - Document any tests that must remain on the V1 parser temporarily
  - Add specific tests for pattern matching edge cases
- For Production Code:
  - The V2 parser implementation can be introduced behind a feature flag
  - Allow both parsers to run in parallel initially for validation
  - Collect metrics on parsing compatibility and performance
  - Full migration only after all tests pass with both parsers
- For Documentation:
  - Update parser documentation to reflect the new approach
  - Provide migration notes for any edge cases
  - Document the rationale behind the design decisions
Implementation Plan:
- Whitespace Enhancement:
  - Enhance Token.Whitespace with a canActAsNewline flag
  - Modify the tokenizer to set this flag appropriately when encountering newline characters
  - Keep token handling simple and uniform
- Context-Free Expression Termination:
  - Update LexerV2.parseRest() to implement the simple }\n termination rule
  - Add condition: if (previousToken == "}" && currentToken.isWhitespace && currentToken.canActAsNewline)
  - Always terminate the OpSeq when this pattern is encountered, regardless of context
  - No special cases or context-dependent decisions
  - Consistent rule application across all expressions
- Uniform Symbol Treatment:
  - Maintain the flat OpSeq production for all expressions, including pattern matching
  - No special handling for any identifiers (including ‘match’)
  - Apply termination rules based purely on token patterns, not semantic meaning
  - Let later passes handle pattern matching semantics
- Error Handling Improvements:
  - Add bounds checking in parseAndCheck.scala to prevent StringIndexOutOfBoundsException
  - Ensure safe substring extraction for error messages
- Testing Strategy:
  - Fix the core expression termination in the V2 parser using the context-free approach
  - Verify pattern matching tests pass with both parsers
  - Gradually migrate more tests to use parseAndCheckBoth
Current Status:
- Need to implement newline token handling
- Need to enhance operator-based approach for pattern matching
- Need to improve error reporting with bounds checking
- Pattern matching test runs with V1 parser but fails with V2
- More work needed on general parsing of pattern matching without special cases
The }\n Pattern Problem
The Chester parser treats the }\n pattern (closing brace followed by newline) as a significant syntax element for terminating expressions in specific contexts. This pattern plays a crucial role in:
- Function/Method Definitions

def factorial(n) {
  if n <= 1 then 1
  else n * factorial(n - 1)
} // <- newline here ends the function definition
val result = factorial(5); // Next statement

- Match Expressions

val result = notification match {
  case Email(sender, _) => {
    println(sender);
    "Email received"
  } // <- block for this case
  case SMS(number, _) => "SMS received";
} // <- newline here ends the match expression
println(result); // Next statement
Current Implementation Issues
In the V2 parser:
- The parseBlock method in LexerV2.scala recognizes the closing brace (RBrace) as terminating a block but doesn’t consider what follows it (newline or not)
- This causes inconsistencies between V1 and V2 parsers in how expressions are terminated
- The V1 parser considers what comes after the closing brace, but the V2 parser currently doesn’t
Proposed Solution
To address this issue while maintaining context-free parsing principles:
- Extend Token State Tracking (see the sketch after this list)
  - Modify the LexerState to track if the previous token was an RBrace
  - Add a helper method like isAfterClosingBrace() to check this state
- Update Expression Termination Logic
  - In key expression parsing methods, check for the }\n pattern by testing if:
    - The previous token was RBrace
    - The current token is Whitespace containing a newline, or is EOF
  - This check should be made in both the parseExpr and parseExprList methods
- Ensure Uniform Treatment
  - Apply the same termination rules consistently across all expression contexts
  - This maintains the context-free parsing principle while addressing the termination issue
- Add Test Cases
  - Create specific test cases for the }\n pattern in different contexts
  - Verify that both parsers (V1 and V2) handle the pattern identically
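A minimal sketch of steps 1 and 2, with hypothetical field and method names:

// Hypothetical sketch: LexerState remembers whether the previous token was RBrace
case class LexerState(
  tokens: Vector[Token], // assumed existing fields
  index: Int,
  previousToken: Option[Token] = None
) {
  def isAfterClosingBrace(): Boolean = previousToken.exists {
    case Token.RBrace => true
    case _            => false
  }
}

// }\n check: previous token was RBrace, current token is newline-capable whitespace or EOF
def checkRBraceNewline(state: LexerState, current: Token): Boolean =
  state.isAfterClosingBrace() && (current match {
    case Token.Whitespace(canActAsNewline) => canActAsNewline
    case Token.EOF                         => true // EOF assumed as a token kind here
    case _                                 => false
  })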
This solution preserves the uniform symbol treatment principle while ensuring that the }\n pattern is properly handled as a syntactic terminator where appropriate.
Implementation Strategy
- Start with smaller, isolated improvements that don’t affect the overall architecture
- Add comprehensive tests before making significant changes
- Update one component fully before moving to the next
- Prioritize improvements that enhance maintainability first
- Verify each change with existing tests before proceeding to the next improvement
- Complete high-priority features like comment preservation
- Update documentation to reflect implementation progress
2025-03-09
Fixed OnceCell Concurrent Write Issue
- Root Cause: Type-level function applications attempted multiple writes to same OnceCell
- Solution: Added an existence check before cell population (see the sketch below)
- Files Modified:
  - tyck/src/main/scala/chester/tyck/TyckPropagator.scala
  - tyck/src/test/scala/chester/tyck/TypeCellSpec.scala
- Tests Added:
- Concurrent type-level function application collisions
- Stress test with 100 parallel cell writes
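A minimal sketch of the guard, assuming a simple OnceCell interface; the real cell API may differ:

// Hypothetical OnceCell interface for illustration
trait OnceCell[A] {
  def readStable: Option[A]
  def fill(value: A): Unit
}

// The fix: check for an existing value before populating, so concurrent
// type-level function applications cannot write the same cell twice
def fillIfEmpty[A](cell: OnceCell[A], value: A): Unit =
  if (cell.readStable.isEmpty) cell.fill(value)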
2025-03-15
Type System Improvements Completed
- Implemented Improvements:
  - Enhanced Type Structure Reduction in DefaultReducer
    - Improved the reduceTypeStructure method to properly handle:
      - Union types, by recursively reducing their component types
      - Intersection types, with proper reduction
      - Type-level function applications, with recursive reduction for complex result types
    - Enhanced handling of nested function applications
  - Alpha-Equivalence Checking in TyckPropagator
    - Enhanced the areAlphaEquivalent method to:
      - Properly handle function types with bound variables
      - Compare union and intersection types correctly
      - Fall back to regular equality for other cases
    - Added bound variable tracking in alpha-equivalence
  - Enhanced Type Level Comparison
    - Improved type level comparison in the tryUnify method:
      - Implemented more flexible compatibility rules between different level types
      - Allow finite level types to be compatible with unrestricted level types
      - Maintain controlled asymmetric compatibility
  - Cell Coverage Mechanisms
    - Added a dedicated helper method to ensure cell coverage
    - Implemented self-coverage for union type components
    - Fixed early returns that left cells uncovered
  - TraitCallTerm Implementation
    - Added TraitCallTerm in Term.scala
    - Laid groundwork for trait-record subtyping relationships
- Files Modified:
  - semantic/shared/src/main/scala/chester/tyck/TyckPropagator.scala
  - semantic/shared/src/main/scala/chester/tyck/Elaborater.scala
  - semantic/shared/src/main/scala/chester/reduce/Reducer.scala
  - syntax/shared/src/main/scala/chester/syntax/core/Term.scala
- Next Steps:
  - Complete trait-record subtyping implementation
  - Implement the union-to-union subtyping case
  - Fix remaining cell coverage issues in union-subtype.chester
  - Add a comprehensive test suite for traits and union types
2025-03-16
Term Definition Refactoring
- Implemented Changes:
  - Unified Term Definitions
    - Consolidated all Term definitions into a single Term.scala file
    - Eliminated the separate spec/simple/truffle files
    - Simplified the codebase by removing the need for converters
  - Updated Documentation
    - Updated development.md with the new Term implementation approach
    - Updated tyck-improvement-proposal.md to reflect Term changes
    - Updated type-checking-system.md with current Term usage examples
  - Simplified Type System
    - Removed the need for trait interfaces with *C and *F suffixes
    - Streamlined the inheritance hierarchy
    - Made the codebase more maintainable with simplified Term definitions
- Files Modified:
  - syntax/shared/src/main/scala/chester/syntax/core/Term.scala
  - docs/src/dev/development.md
  - docs/src/dev/tyck-improvement-proposal.md
  - docs/src/dev/type-checking-system.md
  - docs/src/dev/devlog.md
Historical: Chester.tyck System (Deprecated and Removed)
Note: The chester.tyck system has been removed from Chester. This content is preserved for historical reference and understanding of the evolution to the current chester.elab system.
Original System Architecture (chester.tyck)
The original type checking system was built around propagator networks with cells and constraints:
- Used TyckPropagator for constraint propagation
- Had a monolithic Elaborater class with specialized components (ElaboraterBlock, ElaboraterFunction, etc.)
- Relied on CellId references for tracking types
- Used a stateful approach for tracking and resolving constraints
Historical Development Commands
Commands that were used with the chester.tyck system:
# Historical test commands for chester.tyck (no longer applicable)
sbt "rootJVM/testOnly chester.tyck.FilesTyckTest"
sbt "semantic/testOnly chester.tyck.TheTyckTest"
sbt "rootJVM/testOnly chester.tyck.FilesTyckTest -- -only add.chester"
Historical Issues and Solutions
The chester.tyck system had several architectural issues that led to its replacement:
1. Complex Propagator Networks
- Constraint propagation was difficult to debug and reason about
- Cell coverage issues led to frequent “cells not covered by any propagator” errors
- Complex interdependencies between propagators made the system brittle
2. Monolithic Architecture
- The large Elaborater class was difficult to maintain and extend
- Specialized components were tightly coupled
- Adding new language features required extensive modifications
3. Type System Limitations
- Union type handling was complex and error-prone
- Type-level function applications had limited support
- Alpha-equivalence checking was incomplete for dependent types
Migration to chester.elab
The transition from chester.tyck to chester.elab addressed these issues:
- Modular Design: Handler-based architecture replaced monolithic elaborator
- Cleaner Constraints: Simplified constraint system with dedicated solver
- Better Type Support: Improved union types, type-level functions, and dependent types
- Maintainability: Clearer separation of concerns and more focused components
Legacy Test File Patterns
The chester.tyck system used different test patterns:
// Old chester.tyck test pattern
class FilesTyckTest extends FunSuite {
test("some feature") {
// Test logic using chester.tyck components
val elaborater = new Elaborater(...)
val result = elaborater.elaborate(expr)
// Assertions
}
}
Compare with current chester.elab pattern:
// Current chester.elab test pattern
class ElabHoleTest extends FunSuite {
test("feature description") {
// Setup
val reporter = new VectorReporter[TyckProblem]()
val elabOps = ElabOps(reporter, NoopSemanticCollector)
// Test
val judge = DefaultElaborator.inferPure(expr)(using elabOps)
// Assertions
assertEquals(reporter.getReports.isEmpty, true)
assert(judge.wellTyped.isInstanceOf[ExpectedTerm])
}
}
Historical Documentation References
Content that previously referenced chester.tyck has been updated:
- Development commands in development.md updated to focus on chester.elab
- Architecture documentation in elaboration-system.md simplified to focus on the current system
- Type checking improvement proposals moved to the historical section
This historical information is preserved to help developers understand the evolution of Chester’s type system and the rationale behind the current chester.elab architecture.
- Files Removed:
  - syntax/shared/src/main/scala/chester/syntax/core/spec/Term.scala
  - syntax/shared/src/main/scala/chester/syntax/core/simple.scala
  - syntax/jvm/src/main/scala/chester/syntax/core/truffle.scala
  - syntax/jvm/src/main/scala/chester/syntax/core/convertToTruffle.scala
  - syntax/shared/src/main/scala/chester/syntax/core/convertToSimple.scala
- Benefits:
  - Simplified codebase structure
  - Reduced code duplication
  - Eliminated the need for converters
  - Made adding new Term types easier and less error-prone
  - Improved maintainability
2025-03-19
Trait Implementation Completed
- Implemented Improvements:
  - Basic Trait Support
    - Added support for empty traits and record extension using <: syntax
    - Implemented the trait-record subtyping relation in the type system
    - Added proper trait type representation with TraitTypeTerm
    - Added appropriate error reporting for trait implementation issues
  - Modified Elaborater for Trait Handling
    - Enhanced ElaboraterBlock.processTraitStmt to handle trait bodies properly
    - Updated processRecordStmt to elaborate the extendsClause for traits
    - Added handling for trait extension in subtyping relationships
    - Implemented trait-to-trait inheritance checks in TyckPropagator
  - Trait Field Handling
    - Added special handling for field declarations within trait bodies
    - Implemented context tracking to recognize the trait processing context
    - Added a withProcessingType method to track context in elaboration
    - Created a system to handle trait field requirements (future work)
  - Error Types for Traits
    - Added a NotATrait error type for using non-traits in an extends clause
    - Added NotImplementingTrait for trait implementation errors
    - Added MissingTraitField for missing required trait fields
    - Enhanced error reporting for trait-related issues
- Files Modified:
  - semantic/shared/src/main/scala/chester/tyck/TyckPropagator.scala
  - semantic/shared/src/main/scala/chester/tyck/ElaboraterBlock.scala
  - semantic/shared/src/main/scala/chester/tyck/Context.scala
  - semantic/shared/src/main/scala/chester/tyck/Context2.scala
  - tests/tyck/basic-trait.chester
- Implementation Approach:
  - Created a temporary solution for trait field declarations to get basic traits working
  - Added a field for tracking the processing context in the Context class
  - Simplified trait checking to focus on the basic extension relationship
  - Used the propagator network for maintaining trait subtyping constraints
- Test Status:
  - All tests passing, including the basic-trait test
  - Added support for more complex traits with field requirements
  - Test coverage for basic trait extension and trait field access
- Next Steps:
  - Enhance trait field checking for complete field requirement verification
  - Add support for multiple trait inheritance
  - Implement trait methods and default implementations
  - Add more comprehensive trait test cases
2025-03-22
Pattern Matching Fix Implementation for V2 Parser
Problem Analysis
The V2 parser was failing to correctly parse pattern matching expressions with blocks after the => operator. This issue was particularly visible in the match2 test in PatternMatchingTest, which showed a mismatch between expected and actual AST structures.
Root Cause
- Missing Context Tracking:
  - The V1 parser used ParsingContext(newLineAfterBlockMeansEnds = true) for contextual parsing
  - The V2 parser lacked this contextual awareness for block termination after newlines
- AST Structure Discrepancies:
  - V1 produces consistent OpSeq structures with Identifiers for operators
  - V2 wasn’t properly maintaining this structure in pattern matching contexts
Critical Insight: Uniform Symbol Treatment
The key insight that guided our solution was the need to maintain Chester’s uniform symbol treatment:
- V1 parser treats ALL operators uniformly with no special cases
- => is handled as a plain identifier, not a special operator
- Context affects only block termination, not token parsing
Implementation Approach
We implemented a 3-step fix that maintains uniform symbol treatment:
- Added Context to LexerState:

case class LexerState(
  // Existing fields...
  newLineAfterBlockMeansEnds: Boolean = false
) {
  def withNewLineTermination(enabled: Boolean): LexerState =
    if (this.newLineAfterBlockMeansEnds == enabled) this
    else copy(newLineAfterBlockMeansEnds = enabled)
}

- Updated checkForRBraceNewlinePattern:
  - Added context-awareness to only terminate expressions in the right context
  - Maintained the existing newline detection logic

def checkForRBraceNewlinePattern(state: LexerState): Boolean = {
  // Only consider }\n as terminating if we're in the right context
  if (!state.newLineAfterBlockMeansEnds) return false
  // Rest of existing implementation
  // ...
}

- Enabled Context for All Blocks:

def parseBlock(state: LexerState): Either[ParseError, (Block, LexerState)] = {
  val contextState = state.withNewLineTermination(true)
  // Rest of implementation using contextState
  // ...
}
AST Structure Matching
While the block termination fix allows basic pattern matching to work, there remain differences in the AST structure between V1 and V2 parsers:
=> Diff (- expected, + obtained)
meta = None
- )
- ),
- meta = None
- ),
- OpSeq(
- seq = Vector(
+ ),
Identifier(
Identifier(
+ name = "name",
+ meta = None
+ ),
+ Identifier(
name = ...,
meta = ...
- ),
- ...
+ )
),
These structural differences need to be resolved to ensure full compatibility between parsers. Current theories:
- Different handling of nested OpSeq structures
- Variance in how block expressions are attached to pattern matches
- Potential issues with comment attachment or source positions
Testing Approach
We’re using a phased testing approach:
// Current test approach - used during development
parseAndCheck(input, expected) // Tests with V1 parser only
// Goal after full AST compatibility is achieved
parseAndCheckBoth(input, expected) // Tests with both V1 and V2 parsers
Current tests in PatternMatchingTest show:
- All tests using parseAndCheck pass with the V1 parser
- Simple pattern matching (no blocks after =>) passes with parseAndCheckBoth
- Complex pattern matching with blocks still shows AST differences
Next Steps
- Investigate exact AST structural differences
  - Run detailed tests with AST structure dumps
  - Compare parsing behavior for complex pattern matching
- Enhance debug output
  - Add more detailed logging of AST structures
  - Enable easier comparison between V1 and V2 outputs
- Add targeted fixes for AST compatibility
  - Maintain uniform symbol treatment
  - Ensure consistent structure for nested expressions
- Update tests to use parseAndCheckBoth when fully fixed
  - Migrate tests incrementally as compatibility issues are resolved
  - Document any intentional differences
Files Modified
reader/src/main/scala/chester/readerv2/LexerV2.scala
This implementation represents significant progress in aligning V1 and V2 parser behaviors while maintaining Chester’s core design principles of uniform symbol treatment and context-free parsing.
2025-03-23
Comprehensive Type System Improvements Summary
The type system for Chester has undergone significant improvements, particularly in the areas of union types, cell coverage, and trait implementation. Key completed improvements include:
1. Union Type Subtyping Implementation
Completed Features:
- Implemented three key union subtyping scenarios:
  - Union-to-Union Subtyping: (A|B) <: (C|D) with proper component type relationships
  - Specific-to-Union Subtyping: A <: (B|C) for cases like passing Integer to Integer|String
  - Union-to-Specific Subtyping: (A|B) <: C for returning a union from a function with a specific return type
- Added cell coverage for all union types and their components:

private def ensureCellCoverage(cell: CellId[Term], cause: Expr)(using
  state: StateAbility[Tyck],
  ctx: Context,
  ck: Tyck
): Unit = {
  // Connect cell to itself to ensure it's covered by at least one propagator
  state.addPropagator(UnionOf(cell, Vector(cell), cause))
}

- Implemented proper connections in the propagator network:
  - Added direct connections between union types
  - Added connections from union types to their components
  - Ensured all cells are covered by propagators during unification
2. Cell Coverage Mechanisms
Implemented Solutions:
- Added self-coverage mechanism to prevent “cells not covered” errors during zonking
- Implemented comprehensive coverage for complex types and their components
- Added safeguards to avoid early returns that could leave cells uncovered
- Added debugging support for cell coverage issues
This solution systematically addresses cell coverage issues by ensuring every cell in the propagator network is properly connected, which is essential for the constraint-based type checking system to function correctly.
3. Enhanced Type Level Comparison
Completed Improvements:
- Enhanced how type levels are compared during unification with asymmetric compatibility:

case (Type(level1, _), Type(level2, _)) => {
  (level1, level2) match {
    case (LevelFinite(_, _), LevelUnrestricted(_)) => true  // Finite is compatible with unrestricted
    case (LevelUnrestricted(_), LevelFinite(_, _)) => false // Unrestricted is not compatible with finite
    case _ => level1 == level2 // For other cases, keep exact equality
  }
}

- Added recursive reduction for type-level function applications
- Improved alpha-equivalence checking for dependent types
4. Trait Implementation
Implemented Features:
- Added basic trait definition and record extension with <: syntax
- Implemented the trait-record subtyping relation in the type system
- Added trait type representation with TraitTypeTerm
- Added trait-to-trait inheritance checking
- Implemented context tracking for trait field declarations
- Added appropriate error reporting for trait-related issues
The trait implementation provides a solid foundation for more advanced features planned in future work, such as complete field requirement verification, multiple trait inheritance, and trait methods with default implementations.
5. Type Structure Reduction Improvements
- Enhanced the reducer to properly handle union and intersection types:

private def reduceTypeStructure(term: Term)(using ctx: ReduceContext, r: Reducer): Term = {
  term match {
    case Union(types, meta) => {
      val reducedTypes = types.map(ty => reduceTypeStructure(r.reduce(ty)))
      Union(reducedTypes, meta)
    }
    // Similar handling for Intersection and function calls
    // ...
  }
}

- Added special handling for type-level function applications within type comparisons
Next Steps
While significant progress has been made, some areas still need work:
- Fix remaining edge cases in union-subtype.chester.todo test
- Complete type-level function application enhancement for nested applications
- Enhance trait field requirement verification
- Implement multiple trait inheritance support
- Add trait methods and default implementations
Files Modified:
- semantic/shared/src/main/scala/chester/tyck/TyckPropagator.scala
- semantic/shared/src/main/scala/chester/tyck/Elaborater.scala
- semantic/shared/src/main/scala/chester/reduce/Reducer.scala
- syntax/shared/src/main/scala/chester/syntax/core/Term.scala
- semantic/shared/src/main/scala/chester/tyck/ElaboraterBlock.scala
2025-03-24
Parser Improvements Completed
Uniform Operator Handling
- Issue: Special case handling for the “=>” and “=” operators in the parseOperator() method
- Improvement:
  - Removed special case handling for the “=>” operator
  - Ensured operators are treated uniformly in the tokenizer
  - Treated “=>” like any other operator in the tokenizing process
- Benefits:
  - More consistent operator handling
  - Simplified code in the parseOperator() method
  - Reduced special cases, making the code more maintainable
  - Better alignment with Chester’s design principles of uniform symbol treatment
- Implementation:
  - Removed special case code for the “=>” operator in the parseOperator() method
  - Modified the method to uniformly parse all operators using a StringBuilder
  - Verified all tests pass with the change, including operator tests
  - Ensured consistent behavior with the original implementation
LexerV2 Optimization and Refactoring
- Issue: LexerV2.scala had redundant code and a missing state.advance() method reference
- Optimized and refactored the code structure for better maintainability
- Fixed compilation errors by replacing advance() with state.advance()
- Improved modularity by extracting repeated logic into helper methods
- Enhanced state management for better consistency across the codebase
- Benefits:
- Improved maintainability and readability of the lexer code
- Fixed compilation errors resulting in more stable code
- Better organization of related functionality
- Reduced duplication for easier future updates
- Implementation:
- Replaced direct advance() calls with state.advance() where appropriate
- Restructured code for better organization and clarity
- Maintained functionality while improving code quality
- Ensured all tests continued to pass after changes
Comment Preservation Implementation
- Issue: V2 parser didn’t preserve comments in the AST, unlike V1
- Improvement:
- Added comment collection and attachment similar to V1 parser
- Implemented support for both leading and trailing comments
- Created mechanism for handling comments in blocks and at block boundaries
- Benefits:
- Full feature parity with V1 parser for comment handling
- Source code formatting preservation
- Support for documentation generation tools
- Implementation:
- Added methods to collect and categorize comments during parsing
- Integrated with ExprMeta and CommentInfo structures
- Enhanced expression creation to include comment attachment
- Added test cases with various comment placements
TokenExtractors Refinement
- Issue: Verbose and redundant token extractors made the code harder to maintain
- Improvement:
- Simplified token extractors using a common helper function
- Reduced code duplication for source position extraction
- Made the code more maintainable with concise patterns
- Benefits:
- More readable and maintainable token handling
- Less code duplication
- Better abstraction of common patterns
- Implementation:
- Created a posExtract function to centralize extraction logic
- Refactored individual token extractors to use the helper
- Maintained the same semantics with less code
Pattern Matching Block Termination Fix
- Issue: Inconsistent handling of the }\n pattern between V1 and V2 parsers in pattern matching
- Added context tracking to LexerState
- Implemented context-aware block termination checks
- Enabled context for all blocks uniformly
- Benefits:
- Consistent behavior between V1 and V2 parsers
- Maintained uniform symbol treatment principle
- Fixed pattern matching tests
- Implementation:
- Added a newLineAfterBlockMeansEnds flag to LexerState
- Created a withNewLineTermination helper method
- Updated checkForRBraceNewlinePattern to consider context
- Enabled context for all blocks in parseBlock
Previously Completed Improvements
- Number Parsing Refactoring
  - Extracted specialized methods for different number formats
  - Improved error handling for number parsing
- Enhanced Escape Character Handling
  - Extended support for escape sequences (Unicode, octal, hex)
  - Improved error reporting for invalid escape sequences
- Basic Operator Parsing Clean-Up
  - Extracted comment parsing to a separate method
  - Improved structure of the parseOperator() method
- Identifier Parsing Correctness
  - Aligned with IdentifierRules for consistent character validation
  - Improved handling of Unicode characters and emoji
- SourcePos Creation Efficiency (see the sketch after this list)
  - Implemented caching for UTF-16 offset calculations
  - Reduced tokenization time for complex expressions
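A rough sketch of the offset caching idea; the helper and field names are assumptions, while offsetByCodePoints is the standard Java String API:

// Cache codepoint-index -> UTF-16 offset so repeated SourcePos creation stays cheap
private val utf16Offsets = scala.collection.mutable.HashMap.empty[Int, Int]

private def utf16Offset(source: String, codepointIndex: Int): Int =
  utf16Offsets.getOrElseUpdate(
    codepointIndex,
    source.offsetByCodePoints(0, codepointIndex) // counts surrogate pairs correctly
  )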
Ultra-Compact Tokenizer Implementation
Tokenizer Size Reduction
- Issue: The Tokenizer.scala implementation was longer than necessary
- Improvement:
- Dramatically reduced code size (>25% reduction)
- Consolidated similar methods
- Simplified UTF-16 position tracking
- Enhanced token generation pipeline
- Benefits:
- More maintainable codebase
- Better readability
- Easier to extend with new token types
- More focused implementation
- Implementation:
- Created lookup tables for token types
- Used more functional patterns for token creation
- Streamlined number parsing logic
- Improved string processing with boundary control
- Consolidated position tracking logic
Functional Style Enhancement
- Issue: Imperative style code was harder to maintain
- Improvement:
- Added more functional approach to tokenization
- Implemented lazy stream-based token generation
- Created more concise token construction helpers
- Improved pattern matching throughout the codebase
- Benefits:
- Code more closely aligned with Scala best practices
- Better composability
- More declarative implementation
- Easier to test individual components
- Implementation:
- Used LazyList for token stream generation
- Implemented functional helpers for token creation
- Enhanced pattern matching for token type dispatch
- Added more concise function definitions
Unicode and Emoji Support Enhancements
- Issue: Complex Unicode handling with surrogate pairs needed improvement
- Improvement:
- Enhanced support for supplementary characters
- Improved UTF-16 position mapping
- Streamlined surrogate pair handling
- Added more comprehensive emoji support
- Benefits:
- Better internationalization support
- Correct handling of modern emoji
- Proper source position mapping for all character types
- More robust parsing for all Unicode scripts
- Implementation:
- Used Java’s Character API for proper codepoint handling (see the example after this list)
- Added special cases for supplementary characters
- Improved UTF-16 position calculation
- Enhanced identifier parsing with Unicode support
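For reference, a small self-contained example of surrogate-pair-aware iteration with Java’s Character API, as described above:

// Iterate over full Unicode codepoints (handles emoji and other supplementary characters)
def codepoints(s: String): Iterator[Int] = new Iterator[Int] {
  private var i = 0
  def hasNext: Boolean = i < s.length
  def next(): Int = {
    val cp = s.codePointAt(i)    // reads both halves of a surrogate pair
    i += Character.charCount(cp) // advance by 1 or 2 UTF-16 units
    cp
  }
}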
Performance Optimization
- Issue: Tokenizer performance could be improved
- Improvement:
- Reduced memory allocations
- Simplified position tracking
- Optimized string building
- Enhanced token stream generation
- Benefits:
- Faster tokenization
- Lower memory usage
- Better scalability for large files
- More efficient pipeline
- Implementation:
- Used StringBuilder for efficient string concatenation
- Implemented smarter UTF-16 position tracking
- Optimized character and token processing
- Enhanced error detection and handling
Updated Feature Coverage
The V2 parser now has complete implementations for:
- Basic literals (integers, floating-point numbers)
- Function calls (including nested and with type parameters)
- Pattern matching (with correct block termination)
- Operator sequences (with uniform treatment)
- Generic type parameters (including complex and nested generics)
- Block arguments
- Lists with mixed types
- Comment preservation (leading and trailing)
Next Steps
Focus is now shifting to:
- Object expressions implementation
- Source maps support
- Error recovery mechanisms
- Migrating remaining V1-only tests
2025-03-25
Union Type Subtyping Implementation Details
Following the comprehensive type system improvements from March 23rd, here are the detailed implementation specifics for union type subtyping and cell coverage mechanisms:
1. Enhanced Union Subtyping Implementation
The implementation now fully supports all three union subtyping scenarios with proper propagator connections:
- Union-to-Union Subtyping (A|B <: C|D):

// For each type in RHS union, at least one type in LHS union must accept it
for (t1 <- types1; t2 <- types2) {
  if (tryUnify(t1, t2)) {
    val t1Cell = toId(t1)
    val t2Cell = toId(t2)
    state.addPropagator(Unify(t1Cell, t2Cell, cause))
  }
}

- Specific-to-Union Subtyping (A <: B|C):

// Delegate to the connectSpecificAndUnion helper method
connectSpecificAndUnion(
  specificId = specificCellId,
  specificType = specificType,
  unionId = unionCellId,
  unionTypes = unionTypes,
  cause = cause
)

- Union-to-Specific Subtyping (A|B <: C):

// A union can be used where a specific type is expected if all components match it
unionToSpecific(union, unionTypes, specificType, cause)
2. Advanced Cell Coverage Mechanisms
To solve the “cells not covered by any propagator” error, several key mechanisms have been implemented:
- EnsureCellCoverage Propagator:

  case class EnsureCellCoverage(
    cell: CellId[Term],
    cause: Expr
  ) extends Propagator[Tyck] {
    override val readingCells = Set(cell.asInstanceOf[CIdOf[CellRW[?, ?]]])
    override val writingCells = Set.empty
    override val defaultingCells = Set(cell.asInstanceOf[CIdOf[CellRW[?, ?]]])

    // Always succeeds - just ensures the cell is covered
    override def run(using StateAbility[Tyck], Tyck): Boolean = true

    override def naiveZonk(needed: Vector[CellIdAny])(using StateAbility[Tyck], Tyck): ZonkResult =
      ZonkResult.Done
  }

- Helper Methods for Union Component Coverage:

  // Ensures coverage for all cells within a union type
  private def ensureUnionComponentsCoverage(
    unionTypes: NonEmptyVector[Term],
    cause: Expr
  )(using StateAbility[Tyck], Context, Tyck): Vector[CellId[Term]] = {
    unionTypes.map { typ =>
      val cellId = toId(typ)
      ensureCellCoverage(cellId, cause)
      cellId
    }.toVector
  }

- Connection of Union Types to Components:

  // Creates UnionOf propagator to connect union cell to its components
  private def connectUnionToComponents(
    unionCell: CellId[Term],
    componentIds: Vector[CellId[Term]],
    cause: Expr
  )(using StateAbility[Tyck], Context, Tyck): Unit = {
    state.addPropagator(UnionOf(unionCell, componentIds, cause))
  }

- Special Handling for Function Calls:

  // Recursively processes function calls to ensure cell coverage
  private def processFunctionCall(term: Term, cause: Expr)(using StateAbility[Tyck], Context, Tyck): Unit = {
    val cellId = toId(term)
    ensureCellCoverage(cellId, cause)
    term match {
      case fcall: FCallTerm =>
        processFunctionCall(fcall.f, cause)
        fcall.args.foreach(arg => processFunctionCall(arg, cause))
      case Union(types, _)        => types.foreach(t => processFunctionCall(t, cause))
      case Intersection(types, _) => types.foreach(t => processFunctionCall(t, cause))
      case _ => // No further processing for simple terms
    }
  }
3. Improvements to Type Compatibility Checking
The implementation now includes enhanced compatibility checks for working with union types:
- Union Compatibility Methods:
  - Added specialized methods for checking compatibility between union and non-union types
  - Implemented bidirectional compatibility checking for union types
  - Enhanced subtyping relationships with proper union type handling
- Special Handling for Literal Types with Unions:
  - Added special case handling for integer literals with union types
  - Implemented type compatibility checking for literals against union types
  - Added support for both direct and indirect type matching
These improvements ensure that union types work seamlessly with the rest of the type system, including with literals, function types, and in both widening and narrowing contexts.
4. Type-Level Function Application Enhancements
The implementation includes improvements to how type-level function applications are handled:
- Reduction for Type Checking:
  - Added specialized reduction mode for type-level expressions
  - Implemented proper context handling for type equality checking
  - Enhanced type comparison with reduction-based equality
- Term Reference Resolution:
  - Added recursive reference resolution for deeper type analysis
  - Implemented proper handling of nested references in types
  - Enhanced type comparison with fully resolved references
All these implementations follow the design principles outlined in the type improvement proposal, ensuring that:
- Original terms are preserved in elaborated results
- Reduction is only used for type checking, not for elaboration
- Union types behave correctly in all subtyping scenarios
- Cell coverage is guaranteed to prevent “cells not covered by propagator” errors
2025-03-25
Enhanced Trait Implementation Details
Building on the basic trait implementation completed on March 19, several enhancements have been added to the trait system:
1. Advanced Trait Context Tracking
To properly handle trait fields and method declarations, the Context system now includes special tracking for the current declaration context:
case class Context(
// Existing fields
currentProcessingType: String = "" // Can be "trait", "record", etc.
) {
// Helper method to create a new context with a specific processing type
def withProcessingType(typeStr: String): Context = copy(currentProcessingType = typeStr)
// Rest of the implementation
}
This allows the elaborator to recognize when it’s processing fields inside a trait definition versus a record definition, which affects how those fields are processed.
2. Trait Statement Elaboration
The processTraitStmt method now fully handles trait declarations with proper context management:
def processTraitStmt(
expr: TraitStmt,
ctx: Context,
declarationsMap: Map[Expr, DeclarationInfo],
effects: CIdOf[EffectsCell]
)(using
parameter: SemanticCollector,
ck: Tyck,
state: StateAbility[Tyck]
): (Seq[StmtTerm], Context) = {
val traitInfo = declarationsMap(expr).asInstanceOf[TraitDeclaration]
// Process extends clause if present
val elaboratedExtendsClause = expr.extendsClause.map { case clause @ ExtendsClause(superTypes, _) =>
superTypes.headOption
.map {
case Identifier(traitName, _) =>
ctx.getTypeDefinition(traitName) match {
case Some(traitDef: TraitStmtTerm) =>
TraitTypeTerm(traitDef, convertMeta(clause.meta))
case _ =>
ck.reporter.report(NotATrait(superTypes.head))
ErrorTerm(NotATrait(superTypes.head), convertMeta(clause.meta))
}
// Other cases
}
// More processing
}
// Elaborate the body within a trait-specific context
val elaboratedBody = expr.body.map { body =>
elabBlock(body, newTypeTerm, effects)(using ctx.withProcessingType("trait"), parameter, ck, state)
}
// Create and return the TraitStmtTerm
val traitStmtTerm = TraitStmtTerm(
name = traitInfo.name,
uniqId = traitInfo.uniqId,
extendsClause = elaboratedExtendsClause,
body = elaboratedBody,
meta = convertMeta(expr.meta)
)
(Seq(traitStmtTerm), ctx.addTypeDefinition(traitStmtTerm))
}
The key improvement is using ctx.withProcessingType("trait") to indicate when we’re elaborating a trait body.
3. Record-Trait Subtyping Verification
The trait implementation now includes proper record-trait subtyping relationship verification:
private def checkTraitImplementation(
recordDef: RecordStmtTerm,
traitDef: TraitStmtTerm,
cause: Expr
)(using
localCtx: Context,
ck: Tyck,
state: StateAbility[Tyck]
): Boolean = {
// Check for a direct extension relationship
val hasExtendsClause = recordDef.extendsClause.exists {
case traitCall: TraitTypeTerm =>
traitCall.traitDef.uniqId == traitDef.uniqId
case _ => false
}
if (!hasExtendsClause) {
// Report error if record doesn't explicitly extend the trait
ck.reporter.report(NotImplementingTrait(recordDef.name, traitDef.name, cause))
false
} else {
true
}
}
4. Trait-Trait Extension Checking
Similarly, trait-trait inheritance is now properly verified:
private def checkTraitExtends(
childTraitDef: TraitStmtTerm,
parentTraitDef: TraitStmtTerm,
cause: Expr
)(using
Context,
Tyck,
StateAbility[Tyck]
): Boolean = {
// Check if they're the same trait (reflexivity)
if (childTraitDef.uniqId == parentTraitDef.uniqId) {
return true
}
// Check direct parent
val directParent = childTraitDef.extendsClause match {
case Some(traitCall: TraitTypeTerm) =>
traitCall.traitDef.uniqId == parentTraitDef.uniqId
case _ => false
}
directParent
}
5. Special Handling in Type Unification
The trait subtyping rules are now properly integrated into the type unification system:
// In unify method
(lhsResolved, rhsResolved) match {
// Record implementing trait (structural subtyping)
case (RecordTypeTerm(recordDef, _, _), TraitTypeTerm(traitDef, _)) =>
checkTraitImplementation(recordDef, traitDef, cause); ()
// Trait extending trait (structural subtyping)
case (TraitTypeTerm(childTraitDef, _), TraitTypeTerm(parentTraitDef, _)) =>
checkTraitExtends(childTraitDef, parentTraitDef, cause); ()
// Other cases
}
These implementations provide a solid foundation for trait-based programming in Chester, with support for basic field requirements and type inheritance. Future work will focus on more advanced trait features like method implementations, default values, and multiple inheritance.
2025-04-01
Generalized Block Termination Implementation for V2 Parser
Problem Analysis
- Issue: V2 parser needed a general solution for the }\n pattern without special-casing keywords
- Core Principle Violation: Previous implementation relied on checking if expressions were Blocks
- Design Requirement: Need context-free parsing with uniform token pattern detection
Implementation Approach
- Solution Strategy: Implemented a generalized }\n pattern detection mechanism
- Key Changes:
  - Modified checkForRBraceNewlinePattern to check the previous token instead of the expression type
  - Added support for EOF as an implicit newline terminator
  - Renamed separateCaseStatements to processMixedStatements
  - Made statement splitting logic more general while preserving behavior
Technical Implementation
- Token Pattern Detection:
- Check if previous token was a closing brace (RBrace)
- Verify if current token is whitespace containing a newline or EOF
- Apply termination rules based purely on syntax, not semantics
- Statement Processing:
- Preserve existing multiple statements without changes
- Split single OpSeq statements when they contain multiple logical statements
- Detect natural statement boundaries at certain identifiers like “case”
- Maintain consistent behavior with V1 parser
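A self-contained sketch of the token-pattern check described above (the token model here is a simplified assumption, not the actual LexerV2 types):
// Minimal token model for illustration
sealed trait PTok
case object RBrace extends PTok
case class Whitespace(text: String) extends PTok
case object EndOfFile extends PTok
case class OtherTok(text: String) extends PTok

// `}` followed by whitespace containing a newline (or EOF) ends the expression;
// the decision is purely token-based, with no inspection of parsed expressions.
def rbraceNewlineTerminates(prev: Option[PTok], current: PTok): Boolean = {
  val prevIsRBrace = prev.contains(RBrace)
  val newlineOrEof = current match {
    case Whitespace(text) => text.contains('\n')
    case EndOfFile        => true
    case _                => false
  }
  prevIsRBrace && newlineOrEof
}
Because the check never asks what kind of expression preceded the brace, it applies identically to blocks, match expressions, and any future construct that ends in `}`.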
Benefits
- Alignment with Core Principles:
- Maintains strict context-free parsing
- Treats all blocks uniformly
- Applies consistent rules for expression termination
- Better separation between syntax and semantics
- Technical Improvements:
- More maintainable parser with fewer special cases
- Simplified codebase with clearer termination rules
- Better alignment between V1 and V2 parsers
- All relevant tests now pass with identical behavior
Files Modified
reader/src/main/scala/chester/readerv2/LexerV2.scala
This implementation properly adheres to Chester’s core parsing principles by treating all }\n patterns uniformly, regardless of their context.
2025-04-02
Fixed Outdated Pattern in Reducer.scala
During a comprehensive code review to align with Chester’s term architecture principles:
- Issue Identified: Found an outdated pattern in the reduceStandard method in Reducer.scala that used explicit type casting with pattern matching on StmtTerm
- Fix Applied: Updated the code to maintain type safety while avoiding unnecessary pattern matching
- Before:

  val reducedStatements = statements.map { case stmt: StmtTerm =>
    r.reduce(stmt).asInstanceOf[StmtTerm]
  }

- After:

  val reducedStatements = statements.map(stmt =>
    // Keep the type information while reducing
    r.reduce(stmt).asInstanceOf[StmtTerm]
  )

- Alignment with Principles: The solution balances Chester's design principles with type safety requirements by:
  - Avoiding pattern matching with *C and *T suffix traits
  - Keeping necessary type casts for type safety
  - Using more direct and readable code
- Benefits:
- More consistent codebase that follows established design principles
- Type-safe implementation
- Clearer intent with inline comments
- Better alignment with the unified Term definition approach
2025-04-27
LexerV2 Refactoring and Plan Cleanup
- LexerV2 Code Refinements:
  - Introduced a peek method in LexerState to look ahead without consuming tokens, simplifying logic.
  - Refactored parseAtom to use peek for cleaner handling of empty objects {} vs block/object start.
  - Introduced a withModifiedState helper in LexerState to encapsulate temporary state changes (like newLineAfterBlockMeansEnds), simplifying parseBlock.
  - Minor cleanup in expectedError using f-interpolators.
- Parser Plan Update:
  - Marked Phase 2 (Advanced Features) as fully complete in parser-plan.md.
  - Marked "V1/V2 Semantic Consistency", "Object Expressions", and "Block Termination" priorities as complete.
  - Updated parser-implementation.md to reflect the completed status of these features.
  - Consolidated completed tasks from planning/implementation docs into this devlog entry.
  - Current focus remains on Phase 3: Error Handling (Recovery, Messages, Debug Info), Source Maps, Test Migration, and Performance Optimization.
- Files Modified:
  - reader/shared/src/main/scala/chester/readerv2/LexerV2.scala
  - docs/src/dev/parser-plan.md
  - docs/src/dev/parser-implementation.md
  - docs/src/dev/devlog.md
Historical: Chester.tyck System Improvement Proposals (Archived)
⚠️ IMPORTANT: This content was originally from tyck-improvement-proposal.md and has been moved here for historical reference. The chester.tyck system described below has been completely removed and replaced with chester.elab. This content is preserved to understand the evolution of Chester's type system.
Original Context and Background
Chester’s type system was originally based on a constraint propagation network where:
- Type constraints were represented by propagators
- Cells held type information and tracked their propagators
- Two types of propagator connections:
- Reading: Propagators that read from a cell
- Zonking: Propagators that could write to or resolve a cell
The original system focused on enhancing support for dependent types, which required:
- Types that could depend on terms
- Variable bindings in types with alpha-equivalence checking
- Sophisticated reduction strategies for type equality
Original Implementation Plans and Issues
Union Type Subtyping Implementation Plans
The original chester.tyck system had detailed plans for union type implementation:
- Union-to-Union subtyping (A|B <: C|D)
- Specific-to-Union subtyping (A <: B|C)
- Union-to-Specific subtyping (A|B <: C)
- Cell coverage mechanisms for union types
- Proper type checking for union types in all contexts
EnsureCellCoverage Hack Removal Plans
The original system included plans to replace the EnsureCellCoverage hack with proper AutoConnect propagators:
- Analysis of term structure to create proper type connections
- Smart handling of union and intersection types
- Specialized support for function calls and their arguments
- Default value support for truly unconstrained type variables
Enhanced Type-Level Function Application
Original plans included:
- Better handling of nested type-level function applications
- Improved DefaultReducer for composed functions
- Enhanced tryUnify method for complex function call terms
- Testing with examples like:

  // Test enhanced type-level function application
  record A(a: Integer);
  record B(b: String);

  // Basic identity function for types
  def idType(x: Type): Type = x;

  // Function composition at the type level
  def composeTypes(f: Type -> Type, g: Type -> Type, x: Type): Type = f(g(x));

  // Test basic composition
  let aT = composeTypes(idType, idType, A);
  def getA(x: aT): Integer = x.a; // Should work via reduction
Original Testing Strategy
The chester.tyck system used different testing commands:
# Historical test commands for chester.tyck (no longer applicable)
sbt "rootJVM/testOnly chester.tyck.FilesTyckTest"
sbt "semantic/testOnly chester.tyck.TheTyckTest"
sbt "rootJVM/testOnly chester.tyck.FilesTyckTest -- -only add.chester"
Design Principles from Original System
The original chester.tyck system followed these principles:
- Term Preservation: Keep original terms in elaborated results
- Reduction Strategy: Only reduce during type equality checking using ReduceMode.TypeLevel
- Documentation: Maintain clear test cases for each feature
Experimental Implementation Notes
During the chester.tyck era, experimental changes were made to improve union type handling:
Union-to-Specific Type Relationship Changes:
- Modified the unionSpecificCompatible method to check ALL components
- Changed the logic from unionTypes.exists(compatible) to !unionTypes.exists(!compatible)
- Added explicit error reporting in the handleUnionSpecific method
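Note that the rewritten condition is simply the universal ("for all") form of the old existential check; by De Morgan's laws the two formulations below (generic stand-in code, not the actual tyck source) always agree:
// "no component is incompatible" is the same as "every component is compatible"
def noIncompatibleComponent[A](unionTypes: Vector[A], compatible: A => Boolean): Boolean =
  !unionTypes.exists(t => !compatible(t))

def allComponentsCompatible[A](unionTypes: Vector[A], compatible: A => Boolean): Boolean =
  unionTypes.forall(compatible) // identical result: the union matches only if every component does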
DefaultValuePropagator Implementation:
- Implemented dedicated propagators to solve “cells not covered” errors
- Added a DefaultValuePropagator[T] case class with a high priority score
- Implemented proper cell tracking with readingCells, writingCells, and defaultingCells
Infinite Recursion Prevention:
- Added guard conditions in the UnionOf propagator
- Used early returns to prevent cyclic dependencies
- Implemented filtered component selection before creating propagator connections
Migration to chester.elab
The transition from chester.tyck to chester.elab addressed these issues:
- Simplified constraint system without cell coverage hacks
- More direct type relationship handling
- Better union type management
- Cleaner function call processing
- Eliminated complicated propagator networks
Comparison: Old vs New Test Patterns
Old chester.tyck test pattern:
class MyTest extends FunSuite {
test("my test") {
// Test logic using chester.tyck components
// Complex setup with propagators, cells, etc.
}
}
Current chester.elab test pattern:
class ElabHoleTest extends FunSuite {
test("?hole should produce HoleTerm") {
platformInfo.withValue(TypescriptPlatformInfo) {
// Simpler, more direct test logic
val elabOps = ElabOps(reporter, NoopSemanticCollector)
val judge = DefaultElaborator.inferPure(expr)(using elabOps)
assert(judge.wellTyped.isInstanceOf[HoleTerm])
}
}
}
Legacy Documentation Impact
Content that previously referenced chester.tyck was updated when the system was replaced:
- Development commands in development.md updated to focus on chester.elab
- Test commands updated to use current chester.elab test suites
This historical information is preserved to help developers understand the evolution of Chester’s type system and the rationale behind the current chester.elab architecture.
Chester Elaboration System (chester.elab)
Introduction
Chester’s type checking system uses a modernized elaboration approach with the chester.elab package. This system provides a more modular, constraint-based approach to type checking and elaboration.
Architecture Overview
Core System (chester.elab)
The elaboration system is built around a constraint-based solver with dedicated handlers:
- Uses a dedicated solver architecture for tracking and resolving typing constraints
- Features a modular handler system where each elaboration concern is handled by a dedicated, composable handler
- Uses cells to represent type variables and constraints in a structured way
- Infers more specific types (e.g., IntTerm instead of IntegerTerm)
- Employs a handler-driven architecture with components like BlockElabHandler and ListOfHandler
Key Components
1. Core Interfaces
Elab Trait (Elab.scala) serves as the primary interface for elaboration operations. It provides three key methods:
- elab: Elaborates an expression against a given type, returning a cell containing the elaborated term
- infer: Infers both the term and type for an expression, returning them as a pair
- inferType: Specializes in type-checking expressions that should be types themselves
All these methods require appropriate context, effects tracking, and solver operations to function.
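A rough sketch of the shape this implies (the stand-in types and exact signatures below are assumptions for illustration; see Elab.scala for the real definitions):
// Stand-in declarations so the sketch is self-contained
trait Expr; trait Term; trait Context; trait EffectsCell; trait SolverOps
case class CellId[T](id: Int)

trait Elab {
  // Elaborate `expr` against an expected type, yielding a cell with the elaborated term
  def elab(expr: Expr, ty: CellId[Term])(using Context, EffectsCell, SolverOps): CellId[Term]
  // Infer both the term and its type for an expression
  def infer(expr: Expr)(using Context, EffectsCell, SolverOps): (CellId[Term], CellId[Term])
  // Check an expression that should itself denote a type
  def inferType(expr: Expr)(using Context, EffectsCell, SolverOps): CellId[Term]
}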
DefaultElab Implementation provides the core elaboration logic for different expression types. It uses pattern matching to handle different kinds of expressions, dispatching each to an appropriate constraint handler:
- Integer literals are handled by the IntegerLit constraint
- String literals are handled by the StringLit constraint
- List expressions are handled by the ListOf constraint
- Blocks are handled by the BlockElab constraint
For each expression type, a corresponding constraint is created and passed to the solver through SolverOps.
2. Entry Point
DefaultElaborator (Elaborator.scala) is the main entry point for using the new elaboration system. It’s configured with:
- A default Elab implementation (DefaultElabImpl)
- A SolverFactory (ConcurrentSolverFactory)
- A handler configuration (DefaultSolverConf)
This setup provides the inferPure() method that is used by the REPL and tests as the primary entry point for type checking expressions.
3. Constraint Handlers
The system uses a handler-based architecture where each type of constraint has a dedicated handler:
- Literal Handlers: IntegerLitHandler, StringLitHandler, SymbolLitHandler
- Block Handler: BlockElabHandler for elaborating code blocks
- List Handler: ListOfHandler for list expressions
- Unification Handlers: UnifyHandler and UnifyMultipleHandler for type compatibility
- Type Handlers: IsTypeHandler for type checking, SimplifyUnionHandler for union types
Each handler implements the Handler trait with a run method that processes a specific type of constraint.
4. Operations Interface
ElabOps (ElabOps.scala) provides operations for error reporting and semantic collection:
case class ElabOps(reporter: Reporter[TyckProblem], collector: SemanticCollector) extends Reporter[TyckProblem] {
// Delegated reporter methods
override def report(problem: TyckProblem): Unit = reporter.report(problem)
}
Current Implementation Status
Features Supported
The elaboration system currently supports:
- Basic literals (integers, strings, symbols)
- Lists (including heterogeneous and nested lists with correct union typing)
- Code blocks with statements and expressions
- Type unification and compatibility checking
- Pure expressions (without effects)
REPL Integration
The REPL uses the elaboration system by default, as seen in REPLEngine.scala:
The REPL implementation calls DefaultElaborator.inferPure() to type check expressions. This provides the main entry point for the elaboration system.
Test Coverage
Test coverage for the new system is implemented in ElabLiteralAndListTest.scala, which verifies:
- Integer literals: Correctly elaborated to IntTerm with IntType
- Heterogeneous lists: Elaborated to ListTerm with a union type for elements
- Empty lists: Properly typed as ListTerm[NothingType]
- Nested lists: Correctly handle nested list structures and their type relationships
These tests demonstrate the system’s ability to:
- Infer precise types (using IntTerm instead of the more general IntegerTerm)
- Handle heterogeneity through proper union type creation
- Maintain correct type relationships in nested structures
Using the Elaboration System
Basic Usage
To use the elaboration system to type check an expression, you'll need to:
- Parse an expression using ChesterReaderV2 or another parser
- Create a reporter and ElabOps for error collection
- Call DefaultElaborator.inferPure() to obtain a Judge containing the elaborated term and type
- Check for errors and access the elaborated term and inferred type
This process will properly handle parsing and type checking of various expressions, including heterogeneous lists that will be typed with appropriate union types.
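A sketch of this flow (ElabOps, NoopSemanticCollector, DefaultElaborator.inferPure, and judge.wellTyped appear elsewhere in this documentation; the parser entry point, the reporter class, and the judge.ty accessor are assumptions for illustration):
val reporter = new VectorReporter[TyckProblem]()          // assumed reporter implementation
val elabOps  = ElabOps(reporter, NoopSemanticCollector)

val expr  = ChesterReaderV2.parseExpr("""[1, "two"]""")   // assumed parse entry point
val judge = DefaultElaborator.inferPure(expr)(using elabOps)

if (reporter.getReports.isEmpty) {                        // assumed error accessor
  println(judge.wellTyped)  // the elaborated term
  println(judge.ty)         // the inferred type: here a list with a union element type
}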
Extending the System with New Expression Types
To add support for a new expression type, you need to follow these steps:
1. Define a Constraint Kind
Create a Kind object in the chester.elab package that defines the type of your constraint. This serves as a unique identifier for your constraint type in the system.
2. Create a Constraint Class
Define a constraint class for your expression type that takes:
- Your expression type as a parameter
- The target type cell
- Required implicit parameters (effects, elab, ops, ctx)
The constraint class should extend the Constraint abstract class with your Kind.
3. Implement a Handler
Create a handler that processes your constraint by implementing the Handler trait. The handler needs to:
- Override the run method to implement the elaboration logic
- Create appropriate cells for your results
- Connect your result to the target type using constraints like Unify
- Optionally implement defaulting behavior for when type information is incomplete
4. Register the Handler
Add your handler to DefaultSolverConf.scala so the system knows how to process your constraint. This involves adding your handler to the list of handlers in the DefaultSolverConf value.
5. Update DefaultElab
Finally, extend the elab() method in DefaultElab to handle your expression type by adding a pattern matching case for your expression type that calls your constraint.
Example: Adding Support for Boolean Literals
A practical example would be adding support for boolean literals, which would require:
- Defining a BooleanLit Kind to identify the boolean literal constraint
- Creating a BooleanLit constraint class that takes a BooleanLiteral expression and a target type
- Implementing a BooleanLitHandler that:
  - Creates a BooleanTerm with the appropriate value
  - Adds a cell containing that term
  - Creates a BooleanType cell
  - Unifies the target type with the boolean type
  - Connects the boolean term to the output cell
- Registering the handler in DefaultSolverConf
- Adding a case for BooleanLiteral expressions in the DefaultElab.elab method
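Sketched with stand-in types below (the real Kind/Constraint/Handler/SolverOps signatures in chester.elab are not reproduced here; only the step sequence mirrors the list above):
// Stand-ins so the sketch is self-contained; the real chester.elab types differ.
case class BooleanLiteral(value: Boolean)            // AST node for `true` / `false`
sealed trait Term
case class BooleanTerm(value: Boolean) extends Term
case object BooleanType extends Term
type CellId = Int

trait SolverOps {
  def newCell(t: Term): CellId                       // allocate a cell holding a term
  def unify(a: CellId, b: CellId): Unit              // record a Unify constraint
  def fill(cell: CellId, t: Term): Unit              // write a result into a cell
}

// The handler's core logic for a BooleanLit constraint (the steps listed above)
def runBooleanLit(expr: BooleanLiteral, targetTy: CellId, output: CellId)(solver: SolverOps): Unit = {
  val term   = BooleanTerm(expr.value)               // create the BooleanTerm
  val boolTy = solver.newCell(BooleanType)           // cell containing BooleanType
  solver.unify(targetTy, boolTy)                     // unify the target type with Boolean
  solver.fill(output, term)                          // connect the term to the output cell
}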
Development Guidelines
Follow these guidelines when working with the codebase:
For New Development
- Use DefaultElaborator.inferPure() as the primary entry point for new typechecking code
- Implement new features and extensions in the chester.elab package
- Add handlers for new expression types following the established patterns
- Write tests specifically for the elaboration system
Testing Approach
- Use ElabLiteralAndListTest.scala as a reference for test structure and pattern
- Test both success and failure cases to ensure robust error handling
Best Practices
1. Preserve Original Terms
Follow the existing guidelines for the elaboration system:
- The elaboration system should preserve the original structure of terms
- Avoid reduction during elaboration
- Keep source code structure intact in elaborated results
- Only use reduction internally during type checking when absolutely necessary
2. Error Reporting
- Use the ElabOps reporter for consistent error messages
- Match the existing error format for consistency
- Include source position information when available
- Use internationalized messages (with t"" string templates)
3. Testing
- Test both success and failure cases
- Verify the structure of elaborated terms
- Check inferred types carefully, especially for complex cases like union types
- Test with heterogeneous data to verify union type handling
- Ensure tests cover edge cases like empty collections and nested structures
Current Limitations and Future Work
The elaboration system is under active development and doesn’t yet support the full range of Chester language features. Current limitations include:
- Limited support for complex expressions and statements
- Incomplete handling of advanced type features like traits and interfaces
- Partial support for effects system
- Incomplete support for pattern matching
Future development should focus on:
- Extending the system to support all Chester language features
- Improving error messages and diagnostics
- Enhancing performance and error recovery
- Implementing more advanced type system features
Chester Reader Architecture
Design of Chester’s parsers (“readers”) that transform source code into abstract syntax trees.
Overview
Chester currently has two parser implementations:
- ReaderV1: The original parser using FastParse combinators
- ReaderV2: The newer implementation using a token-based state machine
Both parsers produce semantically identical ASTs using different internal approaches.
Core Design Principles
- Context-Free Parsing: Uniform rules for all expressions; identifiers treated consistently
- Separation of Concerns: Parse syntax without imposing semantics
- Uniform Symbol Treatment: No special keywords - just identifiers and operators
- Flat Operator Sequences: Operator precedence handled later in the semantic phase
- Newline Significance: }\n terminates expressions in blocks
- Block Return Values: The last expression in a block is its return value
ReaderV1 Implementation
ReaderV1 uses the FastParse library to implement a parser combinator approach.
Key Components
- TokenParsers: Small parsers for basic lexemes (identifiers, literals, operators)
- Combinators: Composable functions that build larger parsers from smaller ones
- ParsingContext: Tracks parsing state (e.g., whether currently in an operator sequence)
- ExprMeta: Metadata handling for source positions and comments
Characteristics
- Declarative grammar definitions
- FastParse-based error reporting
- Recursive descent parsing model
Implementation Structure
ReaderV1 consists of:
- Expression Parsers: Methods like parseExpr, parseAtom, and parseOperator form the core of the parser. They use FastParse combinators to build complex parsers from simpler ones.
- Context Tracking: A ParsingContext object tracks the current parsing state, including whether we're in an operator sequence, a block, or other specialized contexts.
- Source Position Tracking: Dedicated methods map character positions to line/column positions for error reporting, with special handling for UTF-16 surrogate pairs.
- Whitespace and Comment Handling: Dedicated parsers for whitespace, line endings, and comments ensure these elements are preserved in the AST.
- Parser Extensions: Custom extension methods for FastParse parsers add support for metadata attachment, relaxed parsing, and error recovery.
- Parser Composition: The implementation composes smaller parsers into larger ones, following FastParse's combinator approach.
ReaderV2 Implementation
ReaderV2 uses a custom tokenizer and a state machine-based approach for parsing, with significant improvements to block termination detection and object expression parsing.
Key Components
- Lexer: Converts source code into a stream of tokens for efficient parsing
- ReaderState: Tracks current token position, history, and pending whitespace/comments
- ReaderContext: Contains context flags like newLineAfterBlockMeansEnds for parsing decisions
- Token: Represents tokens like identifiers, operators, literals, with source position information
- Token Handlers: Specialized methods for parsing different token types and structures
Characteristics
- Pre-tokenization for efficient token stream processing
- Separate lexing and parsing phases for cleaner code organization
- Context-aware parsing with explicit state tracking
- Enhanced UTF-16 aware Unicode and emoji handling
- Robust block termination detection with the }\n pattern
- Comprehensive object expression support with multiple key types
- Optimized comment handling and attachment
Implementation Structure
ReaderV2 consists of:
- Two-Phase Parsing: Separates tokenization from parsing, with a dedicated Tokenizer creating a stream of tokens before parsing begins.
- State Management: The parser maintains state through two complementary objects:
  - ReaderState: Tracks token position, history, and pending whitespace/comments
  - ReaderContext: Contains context flags like newLineAfterBlockMeansEnds for syntactic decisions
  - Together they enable precise tracking of parser state and contextual information
- Context-Aware Processing: Context flags enable important syntactic decisions like proper block termination with the }\n pattern, while maintaining uniform symbol treatment.
- Optimized Comment Handling: Non-recursive methods like skipComments() and pullComments() efficiently manage comment attachment, replacing the previous recursive approach.
- Robust Block Termination: The special }\n pattern detection is implemented in the checkForRBraceNewlinePattern() method, which uses the newLineAfterBlockMeansEnds flag from ReaderContext to determine when blocks should end.
- Enhanced Object Expressions: Support for multiple key types:
  - Identifier keys (e.g., { x = 1 })
  - String literal keys (e.g., { "x" = 1 })
  - Symbol literal keys (e.g., { 'x = 1 })
  - Both = and => operators in object clauses
- Error Handling: The parser produces structured ParseError objects with detailed source position information and recovery mechanisms.
- Bottom-Up Construction: Parsing builds expressions from atoms and then extends them through continuation-based parsing in parseRest().
Key Similarities Between Implementations
Both parsers:
- Track source positions for error reporting
- Preserve comments in the AST
- Handle the }\n block termination pattern
- Produce flat operator sequences without precedence handling
- Parse the same language syntax
- Use context tracking for parsing decisions
- Generate identical AST structures
Key Differences Between Implementations
| Feature | ReaderV1 | ReaderV2 |
|---|---|---|
| Parsing Approach | Parser combinators (FastParse) | Token-based state machine |
| Error Recovery | Limited | Enhanced with token-based recovery |
| Token Creation | On-demand during parsing | Separate tokenization phase |
| State Handling | Implicit in parse context | Explicit in ReaderState |
| Code Structure | Grammar-centric | Process-centric |
| Performance | Good | Better (especially on large files) |
| Unicode Support | Basic | Enhanced with better UTF-16 handling |
Testing Infrastructure
Chester’s test framework validates parser correctness and compatibility between V1 and V2 implementations. This framework, defined in reader/shared/src/test/scala/chester/reader/parseAndCheck.scala, provides several key testing functions:
Core Testing Functions
- Parser-Specific Testing:
  - parseV1(input): Parses input with the V1 parser only and returns the result
  - parseV2(input): Parses input with the V2 parser only and returns the result
  - parseAndCheckV1(input, expected): Tests the V1 parser against expected output
  - parseAndCheckV2(input, expected): Tests the V2 parser against expected output
- Cross-Parser Verification:
  - parseAndCheckBoth(input, expected): Tests both parsers and ensures they produce identical results
  - Tests backward compatibility and feature parity
- Top-Level Parsing:
  - parseTopLevelV1/V2 and parseAndCheckTopLevelV1/V2/Both: Similar functions for testing top-level parsing
  - Handle file-level parsing with multiple expressions
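For instance, a cross-parser test might look like this (illustrative only: the expected-AST constructors and their meta arguments are simplified assumptions, not the exact Expr definitions):
class OpSeqParityTest extends FunSuite {
  test("operator sequence parses identically in V1 and V2") {
    // Both parsers should yield the same flat OpSeq with no precedence applied
    parseAndCheckBoth(
      "1 + 2",
      OpSeq(Vector(IntegerLiteral(1, None), Identifier("+", None), IntegerLiteral(2, None)), None)
    )
  }
}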
Error Reporting
The testing framework provides error reporting with:
- Detailed error messages showing exact failure position
- Visual pointer to error location in source code
- Context-aware error descriptions
- Comparison between expected and actual AST structures
Serialization Verification
The framework also tests that parsed expressions can be correctly serialized and deserialized:
- Verifies JSON serialization with read[Expr](write[Expr](value))
- Confirms binary serialization with readBinary[Expr](writeBinary[Expr](value))
- Ensures AST structures maintain integrity through serialization cycles
Test Organization
Parser tests are organized into several categories:
- Expression Tests: Verify parsing of individual expression types
- Integration Tests: Test combined language features
- Regression Tests: Ensure previously fixed issues don’t reoccur
- Migration Tests: Track progress in supporting V1 features in V2
File-Based Testing
In addition to the core testing functions, Chester implements file-based integration tests:
- FileParserTest.scala: Tests ReaderV2 against a suite of test files in the tests/parser directory
- FileParserTestV1.scala: Tests ReaderV1 against the same test suite for comparison
These file-based tests:
- Ensure consistency when parsing complete Chester files
- Verify parser behavior across a wide range of syntax combinations
- Automatically generate expected output for regression testing
- Maintain backward compatibility during parser evolution
Future Development
ReaderV2 is the focus of ongoing development, with priorities including:
- Completing error recovery implementation
- Adding source maps support
- Migrating any remaining V1-only tests
- Expanding test coverage
- Optimizing token handling for better performance
See devlog.md for chronological implementation details.
IntelliJ IDEA Configuration
- Use the nightly update channel of the Scala plugin
- Use "Compiler" for Scala 2
- In sbt settings, enable "Create separate modules for main and test" and "Use separate compiler output paths"