Language Design
Baseline combines refinement types, algebraic effects, and machine-actionable diagnostics. This page explains the design decisions, how they interact, and where the language stands today.
Refinement Types
Baseline's type system sits between two extremes:
- TypeScript-level types are advisory. `type Port = number` compiles but doesn't prevent `listen(-1)`. Developers bolt on Zod, branded types, and runtime assertions to enforce constraints the type system ignores.
- Theorem provers (Idris, Agda, Lean) enforce arbitrary invariants but require expertise in dependent types and proof tactics. Most teams can't afford the learning curve.
Refinement types check real constraints in a familiar syntax:
```
type Port = Int where 1 <= self <= 65535
type Percentage = Int where self >= 0 && self <= 100

fn listen!(port: Port) -> {Net} Server =
  // port is guaranteed 1..65535. No validation code needed.
  ...
```

The compiler verifies at every call site that the value satisfies the constraint. Pass a literal 0 and you get a compile error. Pass a variable and the compiler checks whether the constraint can be proven from context. This is refinement checking, not runtime validation.
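To illustrate what "proven from context" means, here is a minimal interval-analysis sketch in Python. This is not Baseline's implementation and all names are hypothetical; it only shows the core idea: each value carries a known integer interval, and a refinement is accepted only when the known interval is contained in the required one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Known integer range of a value: lo <= value <= hi."""
    lo: int
    hi: int

    def proves(self, required: "Interval") -> bool:
        # Provable only if every possible value in this interval
        # also lies inside the required interval.
        return required.lo <= self.lo and self.hi <= required.hi

# type Port = Int where 1 <= self <= 65535
PORT = Interval(1, 65535)

literal_zero = Interval(0, 0)       # listen(0): the literal's interval is [0, 0]
checked_var = Interval(1024, 8080)  # a variable known from context to be 1024..8080

print(literal_zero.proves(PORT))    # rejected: would be a compile error
print(checked_var.proves(PORT))     # accepted: constraint provable from context
```

A literal fails or passes immediately; a variable passes only if the surrounding code has already narrowed its range enough.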
Compare the same constraint in TypeScript:
```typescript
// TypeScript: the type is a lie
type Port = number;

function listen(port: Port) {
  if (port < 1 || port > 65535) throw new Error("invalid port");
  // ...
}
```

The TypeScript version compiles with `listen(-1)`. The Baseline version does not.
A function that receives `Port` never needs to check the range. A function that receives `Percentage` never needs to clamp. The constraint is encoded once in the type definition and enforced everywhere by the compiler.
Refinements + Effects Together
Two verification systems, each useful alone, become significantly more useful together.
Refinements check data invariants: is this port in range? Is this string non-empty? Effects check behavior invariants: does this function touch the filesystem? Does it make network requests?
Together, they answer a question no other practical language can answer at compile time: is this function honest about what it accepts AND what it does?
```
fn agent_task!(input: Input) -> {Log, Db.read} Output =
  Log.info!("Processing ${input}")
  let data = Db.query!("SELECT * FROM items")
  // Fs.delete!("important.txt")  COMPILE ERROR: Fs not in effect set
  transform(data)
```

This is compile-time capability enforcement. The untrusted code never runs because the compiler rejects it before execution. Compare this to runtime sandboxing, where the code executes and the sandbox intercepts the call.
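The check itself is set containment over effect sets. Here is a minimal Python sketch of that idea (not Baseline's checker; the function and data names are hypothetical): a call is rejected when its required effects are not covered by the caller's declared set.

```python
def check_effects(declared: set[str], calls: dict[str, set[str]]) -> list[str]:
    """Return CAP-style violations: callees whose required effects
    are not covered by the caller's declared effect set."""
    violations = []
    for callee, required in calls.items():
        missing = required - declared
        if missing:
            violations.append(
                f"CAP_001: '{callee}' requires "
                f"{{{', '.join(sorted(missing))}}}, not in declared effect set"
            )
    return violations

declared = {"Log", "Db.read"}        # agent_task!'s declared effects
calls = {
    "Log.info!": {"Log"},            # covered
    "Db.query!": {"Db.read"},        # covered
    "Fs.delete!": {"Fs"},            # not covered: rejected
}

for violation in check_effects(declared, calls):
    print(violation)
```

Transitive checking is the same containment test applied through the call graph: a caller's declared set must cover the union of its callees' sets.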
Who else has what
| Language | Refinements | Effects | Both |
|---|---|---|---|
| Liquid Haskell | SMT-backed (powerful) | No | No |
| Koka | No | Row-polymorphic (gold standard) | No |
| OCaml 5 | No | Effect handlers | No |
| TypeScript | No (Zod is runtime) | No | No |
| Baseline | Integer intervals (growing) | Declared + transitively checked | Yes |
Liquid Haskell's refinements are more expressive. Koka's effect system is more mature. Neither combines both, and neither ships as a standalone language with a server framework.
Machine-Actionable Diagnostics
Every language has error messages. Baseline's are structured data.
```
$ blc check app.bl --json
{
  "code": "CAP_001",
  "message": "Unauthorized Side Effect: 'Http.get!'",
  "context": "Function 'process!' declares effects {Log}, but calls 'Http.get!' which requires {Http}.",
  "suggestions": [{
    "strategy": "escalate_capability",
    "patch": {
      "original_text": "fn process!(data: String) -> {Log} String",
      "replacement_text": "fn process!(data: String) -> {Log, Http} String"
    }
  }]
}
```

This is not a human-readable error message with a suggestion. It is a machine-actionable patch with a named strategy (`escalate_capability`), the exact text to find, and the exact replacement. An LLM receives this, applies the patch, resubmits. One-shot fix.
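To make the "applies the patch" step concrete, here is a sketch of a consumer of this format in Python. The JSON shape follows the example above (abridged); the helper name and the source snippet are hypothetical.

```python
import json

# An abridged diagnostic in the format shown above.
diagnostic = json.loads("""{
  "code": "CAP_001",
  "message": "Unauthorized Side Effect: 'Http.get!'",
  "suggestions": [{
    "strategy": "escalate_capability",
    "patch": {
      "original_text": "fn process!(data: String) -> {Log} String",
      "replacement_text": "fn process!(data: String) -> {Log, Http} String"
    }
  }]
}""")

def apply_first_patch(source: str, diag: dict) -> str:
    """Apply the first suggested patch by exact text replacement."""
    patch = diag["suggestions"][0]["patch"]
    assert patch["original_text"] in source, "patch target not found in source"
    return source.replace(patch["original_text"], patch["replacement_text"], 1)

source = "fn process!(data: String) -> {Log} String =\n  Http.get!(url)"
fixed = apply_first_patch(source, diagnostic)
print(fixed.splitlines()[0])  # signature now declares {Log, Http}
```

Because `original_text` and `replacement_text` are exact strings rather than prose advice, the fix is a deterministic string operation rather than an interpretation task.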
Three verification layers, one format
Every diagnostic follows the same structure across all three checkers:
| Layer | Code prefix | Example |
|---|---|---|
| Types | TYP_xxx | Type mismatch, undefined variable, missing field |
| Effects | CAP_xxx | Unauthorized side effect, missing declaration |
| Refinements | REF_xxx | Constraint violation, unprovable refinement |
Each includes error codes, source locations, context, and (where applicable) patch suggestions with confidence scores.
The generate-check-fix loop
The compiler becomes a collaboration partner, not an obstacle. An LLM generating Baseline code enters a feedback loop:
- Generate code from a type signature
- Run `blc check --json`
- Parse structured errors
- Apply suggested patches
- Repeat until clean
The compiler guides the model to correctness through structured feedback. LLMs don't need millions of Baseline programs to generate correct code if the compiler teaches them through actionable diagnostics.
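The loop can be sketched in a few lines of Python. Here a stub stands in for both the LLM and for `blc check --json` (in practice you would invoke the real compiler as a subprocess); everything in this sketch is illustrative.

```python
def generate(signature: str) -> str:
    # Stand-in for an LLM call: emit a first draft from the signature.
    return signature + " =\n  Http.get!(url)"

def check(source: str) -> list[dict]:
    # Stand-in for `blc check --json`: reports CAP_001 until Http is declared.
    if "Http.get!" in source and "{Log, Http}" not in source:
        return [{"code": "CAP_001",
                 "patch": {"original_text": "{Log}",
                           "replacement_text": "{Log, Http}"}}]
    return []

source = generate("fn fetch!(url: String) -> {Log} String")
for _ in range(5):                  # bounded retries
    errors = check(source)
    if not errors:
        break                       # clean: the loop converged
    patch = errors[0]["patch"]      # apply the first suggested patch
    source = source.replace(patch["original_text"],
                            patch["replacement_text"], 1)

print("clean" if not check(source) else "still failing")
```

The key property is that every iteration either terminates clean or produces a mechanical edit, so the loop needs no interpretation of prose error messages.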
Comparison
How Baseline compares to established languages on each axis.
vs. TypeScript
TypeScript's type system is structurally sound in many cases but fundamentally advisory. `any` exists. Type assertions exist. Runtime and compile-time types can disagree. TypeScript has npm (2M+ packages) and massive LLM training data. Baseline has 38 stdlib modules and near-zero training data. TypeScript is the pragmatic choice for most teams today. Baseline's value is for teams that need compile-time correctness guarantees TypeScript cannot provide.
vs. Rust
Rust has ownership, borrow checking, and fearless concurrency. Its type system prevents memory bugs and data races. Baseline has refinements and effects, which Rust lacks. Rust's error messages are the gold standard for human readability but are not structured for machine consumption. Rust is a better choice for systems programming. Baseline targets application-level code where data invariants and effect tracking matter more than memory safety.
vs. OCaml 5
OCaml 5 added effect handlers. It has a mature type system, great performance, and a real ecosystem (opam). Baseline adds refinement types and structured diagnostics on top of effects. OCaml's effect handlers are lower-level (untyped in the initial implementation). Baseline's are declared in function signatures and checked transitively. OCaml is the more mature choice. Baseline's advantage is the combination of refinements + effects + diagnostics.
vs. Koka
Koka is the gold standard for algebraic effects with row polymorphism and the Perceus reference counting system. Baseline's effect system is simpler (declared sets, not row-polymorphic). Koka lacks refinement types and structured machine-actionable diagnostics. Koka is a research language. Baseline targets practical use with a server framework and JIT compiler.
Performance
Baseline compiles to native code via Cranelift JIT. All times are median of 3 runs on Apple Silicon (arm64).
Hanabi Suite
| Benchmark | C -O2 | Baseline | Node.js | Python |
|---|---|---|---|---|
| nbody (5M) | 0.21s | 0.94s | 0.53s | 21.06s |
| binarytrees (18) | 1.07s | 4.63s | 0.69s | 5.09s |
| fasta (2.5M) | 0.29s | 0.68s | 0.003s | 4.50s |
| fannkuch (10) | 0.006s | 2.46s | 1.97s | 3.85s |
| spectral-norm (500) | 0.004s | 0.72s | 0.07s | 1.31s |
CPU Micro-Benchmarks
| Benchmark | Rust | Go | Baseline | OCaml | Node.js | Python |
|---|---|---|---|---|---|---|
| tak | 0.098s | 0.098s | 0.084s | 0.125s | 0.353s | 3.164s |
| fib (35) | 0.044s | 0.046s | 0.047s | 0.050s | 0.146s | 1.016s |
| divsum | 0.060s | 0.057s | 0.059s | 0.062s | 0.142s | 1.474s |
| primes | 0.020s | 0.018s | 0.023s | 0.009s | 0.085s | 0.424s |
| mergesort | 0.011s | 0.015s | 0.021s | 0.009s | 0.075s | 0.048s |
| mapbuild | 0.014s | 0.016s | 0.027s | 0.009s | 0.080s | 0.049s |
| treemap | 0.119s | 0.028s | 0.320s | 0.009s | 0.107s | 0.226s |
Current Limitations
As of v0.3, these are the areas where Baseline is incomplete.
- Refinement types are integer intervals only. `Int where self > 0` works. String refinements (regex patterns, length constraints) are not yet supported.
- No concurrency. There is no async/await, no task spawning, no parallelism. Structured concurrency is on the roadmap.
- Small standard library. 38 modules covering core operations, HTTP, database, and JSON. You will encounter gaps. There is no package registry.
- No embedding API. Baseline runs as a standalone compiler and runtime. There is no C API or way to embed it in a host application.
Roadmap
Planned work beyond v0.3. Designs are subject to change.
- Concurrency. Structured concurrency modeled through the effect system: async/await, task spawning, supervision, and capability-based fiber sandboxing.
- Rust interop. FFI layer for calling Rust libraries from Baseline and embedding Baseline in Rust applications.
- Memory management. Perceus reuse analysis, ownership-based RC elimination, and arena allocation for request-scoped data.
- Numeric performance. Monomorphization, unboxed arrays, effect erasure, and SIMD emission for compute-heavy workloads.
- String and compound refinements. Regex patterns, length constraints, and refinements on non-integer types.
- Agent tooling. Agent convergence benchmarks, `blc init` scaffolding, and Context7 distribution for LLM context retrieval.
- Extended standard library. CLI framework, UUID generation, checked arithmetic, async I/O, and broader ecosystem coverage.
See the changelog for what shipped and when.