Language Design
Baseline combines refinement types, algebraic effects, and machine-actionable diagnostics. This page explains the design decisions, how they interact, and where the language stands today.
Refinement Types
Baseline's type system sits between two extremes:
- TypeScript-level types are advisory. `type Port = number` compiles but doesn't prevent `listen(-1)`. Developers bolt on Zod, branded types, and runtime assertions to enforce constraints the type system ignores.
- Theorem provers (Idris, Agda, Lean) enforce arbitrary invariants but require expertise in dependent types and proof tactics. Most teams can't afford the learning curve.
Refinement types check real constraints in a familiar syntax:
```
type Port = Int where 1 <= self <= 65535
type Percentage = Int where self >= 0 && self <= 100

fn listen!(port: Port) -> {Net} Server =
  // port is guaranteed 1..65535. No validation code needed.
  ...
```

The compiler verifies at every call site that the value satisfies the constraint. Pass a literal 0 and you get a compile error. Pass a variable and the compiler checks whether the constraint can be proven from context. This is refinement checking, not runtime validation.
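To illustrate what "proven from context" means, here is a minimal interval-analysis sketch in Python. This is not Baseline's implementation and all names are hypothetical; it only shows the core idea: each value carries a known integer interval, and a refinement is accepted only when the known interval is contained in the required one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Known integer range of a value: lo <= value <= hi."""
    lo: int
    hi: int

    def proves(self, required: "Interval") -> bool:
        # Provable only if every possible value in this interval
        # also lies inside the required interval.
        return required.lo <= self.lo and self.hi <= required.hi

# type Port = Int where 1 <= self <= 65535
PORT = Interval(1, 65535)

literal_zero = Interval(0, 0)       # listen(0): the literal's interval is [0, 0]
checked_var = Interval(1024, 8080)  # a variable known from context to be 1024..8080

print(literal_zero.proves(PORT))    # rejected: would be a compile error
print(checked_var.proves(PORT))     # accepted: constraint provable from context
```

A literal fails or passes immediately; a variable passes only if the surrounding code has already narrowed its range enough.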
Compare the same constraint in TypeScript:
```typescript
// TypeScript: the type is a lie
type Port = number;

function listen(port: Port) {
  if (port < 1 || port > 65535) throw new Error("invalid port");
  // ...
}
```

The TypeScript version compiles with `listen(-1)`. The Baseline version does not.
A function that receives `Port` never needs to check the range. A function that receives `Percentage` never needs to clamp. The constraint is encoded once in the type definition and enforced everywhere by the compiler.
Refinements + Effects Together
Two verification systems, each useful alone, become significantly more useful together.
Refinements check data invariants: is this port in range? Is this string non-empty? Effects check behavior invariants: does this function touch the filesystem? Does it make network requests?
Together, they answer a question no other practical language can answer at compile time: is this function honest about what it accepts AND what it does?
```
fn agent_task!(input: Input) -> {Log, Db.read} Output =
  Log.info!("Processing ${input}")
  let data = Db.query!("SELECT * FROM items")
  // Fs.delete!("important.txt")  COMPILE ERROR: Fs not in effect set
  transform(data)
```

This is compile-time capability enforcement. The untrusted code never runs because the compiler rejects it before execution. Compare this to runtime sandboxing, where the code executes and the sandbox intercepts the call.
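The check itself is set containment over effect sets. Here is a minimal Python sketch of that idea (not Baseline's checker; the function and data names are hypothetical): a call is rejected when its required effects are not covered by the caller's declared set.

```python
def check_effects(declared: set[str], calls: dict[str, set[str]]) -> list[str]:
    """Return CAP-style violations: callees whose required effects
    are not covered by the caller's declared effect set."""
    violations = []
    for callee, required in calls.items():
        missing = required - declared
        if missing:
            violations.append(
                f"CAP_001: '{callee}' requires "
                f"{{{', '.join(sorted(missing))}}}, not in declared effect set"
            )
    return violations

declared = {"Log", "Db.read"}        # agent_task!'s declared effects
calls = {
    "Log.info!": {"Log"},            # covered
    "Db.query!": {"Db.read"},        # covered
    "Fs.delete!": {"Fs"},            # not covered: rejected
}

for violation in check_effects(declared, calls):
    print(violation)
```

Transitive checking is the same containment test applied through the call graph: a caller's declared set must cover the union of its callees' sets.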
Who else has what
| Language | Refinements | Effects | Both |
|---|---|---|---|
| Liquid Haskell | SMT-backed (powerful) | No | No |
| Koka | No | Row-polymorphic (gold standard) | No |
| OCaml 5 | No | Effect handlers | No |
| TypeScript | No (Zod is runtime) | No | No |
| Baseline | Integer intervals (growing) | Declared + transitively checked | Yes |
Liquid Haskell's refinements are more expressive. Koka's effect system is more mature. Neither combines both, and neither ships as a standalone language with a server framework.
Machine-Actionable Diagnostics
Every language has error messages. Baseline's are structured data.
```
$ blc check app.bl --json
{
  "code": "CAP_001",
  "message": "Unauthorized Side Effect: 'Http.get!'",
  "context": "Function 'process!' declares effects {Log}, but calls 'Http.get!' which requires {Http}.",
  "suggestions": [{
    "strategy": "escalate_capability",
    "patch": {
      "original_text": "fn process!(data: String) -> {Log} String",
      "replacement_text": "fn process!(data: String) -> {Log, Http} String"
    }
  }]
}
```

This is not a human-readable error message with a suggestion. It is a machine-actionable patch with a named strategy (`escalate_capability`), the exact text to find, and the exact replacement. An LLM receives this, applies the patch, resubmits. One-shot fix.
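To make the "applies the patch" step concrete, here is a sketch of a consumer of this format in Python. The JSON shape follows the example above (abridged); the helper name and the source snippet are hypothetical.

```python
import json

# An abridged diagnostic in the format shown above.
diagnostic = json.loads("""{
  "code": "CAP_001",
  "message": "Unauthorized Side Effect: 'Http.get!'",
  "suggestions": [{
    "strategy": "escalate_capability",
    "patch": {
      "original_text": "fn process!(data: String) -> {Log} String",
      "replacement_text": "fn process!(data: String) -> {Log, Http} String"
    }
  }]
}""")

def apply_first_patch(source: str, diag: dict) -> str:
    """Apply the first suggested patch by exact text replacement."""
    patch = diag["suggestions"][0]["patch"]
    assert patch["original_text"] in source, "patch target not found in source"
    return source.replace(patch["original_text"], patch["replacement_text"], 1)

source = "fn process!(data: String) -> {Log} String =\n  Http.get!(url)"
fixed = apply_first_patch(source, diagnostic)
print(fixed.splitlines()[0])  # signature now declares {Log, Http}
```

Because `original_text` and `replacement_text` are exact strings rather than prose advice, the fix is a deterministic string operation rather than an interpretation task.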
Three verification layers, one format
Every diagnostic follows the same structure across all three checkers:
| Layer | Code prefix | Example |
|---|---|---|
| Types | TYP_xxx | Type mismatch, undefined variable, missing field |
| Effects | CAP_xxx | Unauthorized side effect, missing declaration |
| Refinements | REF_xxx | Constraint violation, unprovable refinement |
Each includes error codes, source locations, context, and (where applicable) patch suggestions with confidence scores.
The generate-check-fix loop
The compiler becomes a collaboration partner, not an obstacle. An LLM generating Baseline code enters a feedback loop:
- Generate code from a type signature
- Run `blc check --json`
- Parse structured errors
- Apply suggested patches
- Repeat until clean
The compiler guides the model to correctness through structured feedback. LLMs don't need millions of Baseline programs to generate correct code if the compiler teaches them through actionable diagnostics.
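The loop can be sketched in a few lines of Python. Here a stub stands in for both the LLM and for `blc check --json` (in practice you would invoke the real compiler as a subprocess); everything in this sketch is illustrative.

```python
def generate(signature: str) -> str:
    # Stand-in for an LLM call: emit a first draft from the signature.
    return signature + " =\n  Http.get!(url)"

def check(source: str) -> list[dict]:
    # Stand-in for `blc check --json`: reports CAP_001 until Http is declared.
    if "Http.get!" in source and "{Log, Http}" not in source:
        return [{"code": "CAP_001",
                 "patch": {"original_text": "{Log}",
                           "replacement_text": "{Log, Http}"}}]
    return []

source = generate("fn fetch!(url: String) -> {Log} String")
for _ in range(5):                  # bounded retries
    errors = check(source)
    if not errors:
        break                       # clean: the loop converged
    patch = errors[0]["patch"]      # apply the first suggested patch
    source = source.replace(patch["original_text"],
                            patch["replacement_text"], 1)

print("clean" if not check(source) else "still failing")
```

The key property is that every iteration either terminates clean or produces a mechanical edit, so the loop needs no interpretation of prose error messages.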
Comparison
How Baseline compares to established languages on each axis.
vs. TypeScript
TypeScript's type system is structurally sound in many cases but fundamentally advisory. `any` exists. Type assertions exist. Runtime and compile-time types can disagree. TypeScript has npm (2M+ packages) and massive LLM training data. Baseline has 38 stdlib modules and near-zero training data. TypeScript is the pragmatic choice for most teams today. Baseline's value is for teams that need compile-time correctness guarantees TypeScript cannot provide.
vs. Rust
Rust has ownership, borrow checking, and fearless concurrency. Its type system prevents memory bugs and data races. Baseline has refinements and effects, which Rust lacks. Rust's error messages are the gold standard for human readability but are not structured for machine consumption. Rust is a better choice for systems programming. Baseline targets application-level code where data invariants and effect tracking matter more than memory safety.
vs. OCaml 5
OCaml 5 added effect handlers. It has a mature type system, great performance, and a real ecosystem (opam). Baseline adds refinement types and structured diagnostics on top of effects. OCaml's effect handlers are lower-level (untyped in the initial implementation). Baseline's are declared in function signatures and checked transitively. OCaml is the more mature choice. Baseline's advantage is the combination of refinements + effects + diagnostics.
vs. Koka
Koka is the gold standard for algebraic effects with row polymorphism and the Perceus reference counting system. Baseline's effect system is simpler (declared sets, not row-polymorphic). Koka lacks refinement types and structured machine-actionable diagnostics. Koka is a research language. Baseline targets practical use with a server framework and JIT compiler.
Performance
Baseline compiles to native code via Cranelift JIT. All times are median of 3 runs on Apple Silicon (arm64).
Hanabi Suite
| Benchmark | C -O2 | Baseline | Node.js | Python |
|---|---|---|---|---|
| nbody (5M) | 0.21s | 0.94s | 0.53s | 21.06s |
| binarytrees (18) | 1.07s | 4.63s | 0.69s | 5.09s |
| fasta (2.5M) | 0.29s | 0.68s | 0.003s | 4.50s |
| fannkuch (10) | 0.006s | 2.46s | 1.97s | 3.85s |
| spectral-norm (500) | 0.004s | 0.72s | 0.07s | 1.31s |
CPU Micro-Benchmarks
| Benchmark | Rust | Go | Baseline | OCaml | Node.js | Python |
|---|---|---|---|---|---|---|
| tak | 0.098s | 0.098s | 0.084s | 0.125s | 0.353s | 3.164s |
| fib (35) | 0.044s | 0.046s | 0.047s | 0.050s | 0.146s | 1.016s |
| divsum | 0.060s | 0.057s | 0.059s | 0.062s | 0.142s | 1.474s |
| primes | 0.020s | 0.018s | 0.023s | 0.009s | 0.085s | 0.424s |
| mergesort | 0.011s | 0.015s | 0.021s | 0.009s | 0.075s | 0.048s |
| mapbuild | 0.014s | 0.016s | 0.027s | 0.009s | 0.080s | 0.049s |
| treemap | 0.119s | 0.028s | 0.320s | 0.009s | 0.107s | 0.226s |
Current Limitations
As of v0.3, these are the areas where Baseline is incomplete.
- Refinement types are integer intervals only. `Int where self > 0` works. String refinements (regex patterns, length constraints) are not yet supported.
- No concurrency. There is no async/await, no task spawning, no parallelism. Structured concurrency is on the roadmap.
- Small standard library. 38 modules covering core operations, HTTP, database, and JSON. You will encounter gaps. There is no package registry.
- No embedding API. Baseline runs as a standalone compiler and runtime. There is no C API or way to embed it in a host application.
Roadmap
Planned work beyond v0.3. Designs are subject to change.
- Concurrency. Structured concurrency modeled through the effect system: async/await, task spawning, supervision, and capability-based fiber sandboxing.
- Rust interop. FFI layer for calling Rust libraries from Baseline and embedding Baseline in Rust applications.
- Memory management. Perceus reuse analysis, ownership-based RC elimination, and arena allocation for request-scoped data.
- Numeric performance. Monomorphization, unboxed arrays, effect erasure, and SIMD emission for compute-heavy workloads.
- String and compound refinements. Regex patterns, length constraints, and refinements on non-integer types.
- Agent tooling. Agent convergence benchmarks, `blc init` scaffolding, and Context7 distribution for LLM context retrieval.
- Extended standard library. CLI framework, UUID generation, checked arithmetic, async I/O, and broader ecosystem coverage.
See the changelog for what shipped and when.