Type Systems for Software Developers: One Tool, Four Ways

Table of Contents

1. One spec, four type disciplines

This is a point-in-time note to get up to speed – not on proofs, on types. The question is narrow: what does a more expressive type system buy a working programmer? The four implementations already exist. They are the evidence.

The transform tool is one spec – a reversible codec pipeline, where the URL is the sole source of truth (reversible pipeline transforms). It has been rebuilt from that one spec four times, each in a language with a different type discipline:

Implementation Type discipline What the type carries
transform-tool (TypeScript) structural, erasable shape, as documentation
transform-tool-rs (Rust) affine + ownership total vs partial, failure
transform-tool-dafny (Dafny) refinement the round-trip law
transform-tool-racket (Racket) contracts + blame the law, checked at the boundary

All four pass the same acceptance tests against the same spec. They differ only in how much of the codec's structure they can state before the program runs. That difference is the whole subject.

The thesis in one line: the algebraic shape of a transform – whether it is an involution, a bijection, a parameterized family, or a lossy surjection – is a type, and each language lets you lift more of that shape out of runtime data and into the static type (Pierce 2002).

2. The codec taxonomy is a type

The spec sorts every codec into one of four patterns. Stated informally:

  • Involutionf(f(x)) = x. Encoding and decoding are the same function (rot13, reverse, xor:k). Direction is irrelevant.
  • Bijectiondecode(encode(x)) = x, but encode ≠ decode (base64, caesar:n). Direction matters; the inverse exists.
  • Parameterized – a factory make(args, direction) produces the codec (caesar:13, xor:42). The arguments are part of the identity.
  • Surjection – information-destroying. No inverse (upper, strip). You can go forward but not back.

This is not documentation prose; it is a sum type. Rust writes it down directly:

pub enum CodecPattern {
    /// `f(f(x)) = x`.  Direction irrelevant.
    Involution,
    /// `decode(encode(x)) = x`, `encode ≠ decode`.
    Bijection,
    /// Factory receives `(args, direction)` → codec function.
    Parameterized,
    /// Information-destroying.  No true inverse.
    Surjection,
}

Once the taxonomy is a type, the compiler can hold you to it. The interesting move is what each language does with the surjection case – the one codec family that has no inverse – because that is where "can I reverse this?" stops being a comment and becomes a thing the type system can enforce, or fail to.

3. Structural and erasable: TypeScript

The TypeScript build types the pipeline with structural interfaces. Direction is a union; reversibility is a field:

export type Direction = "encode" | "decode";
export type ApplyDirection = "apply"; // for involutions and surjections

export interface TraceEntry {
  stepIndex: number;
  // ...
  success: boolean;
  reversible: boolean;          // <- runtime data, not a type
}

export interface PipelineResult {
  isReversible: boolean;
  firstIrreversibleStep?: number;  // <- discovered by running
}

Read the last two fields carefully. reversible and firstIrreversibleStep are results the program computes by executing the pipeline. The type PipelineResult cannot prevent you from building a pipeline that loses information; it can only report, afterward, that you did. The union "encode" | "decode" is real and useful – it rules out a misspelled direction at compile time, and the IDE autocompletes it – but the deeper invariant, "this chain round-trips," lives in a boolean.

This is the floor, and it is not nothing. Structural types are erasable (they vanish at runtime), cheap, and catch the large class of shape errors – wrong field, wrong arity, wrong direction string. What they do not catch is the property the whole tool is about. For that you need to say more.

4. Affine and total: Rust

Rust lifts the surjection case into the type. The registry entry stores encode and decode as optional functions:

pub type CodecFn =
    Box<dyn Fn(String) -> Result<String, TransformError> + Send + Sync>;

pub struct Codec {
    pub pattern: CodecPattern,
    pub has_inverse: bool,
    pub encode: Option<CodecFn>,
    pub decode: Option<CodecFn>,   // None for surjections
    pub make_fn: Option<MakeFn>,   // Some only for parameterized
}

Two type-system facts are doing work here that the TypeScript version cannot.

First, decode: Option<CodecFn> makes "this codec has no inverse" a value of a type, not a boolean flag. A surjection's decode is None. The pipeline runner must pattern-match the Option before it can call decode – the compiler will not let it forget. "Partial" is in the signature.

Second, CodecFn returns Result<String, TransformError>, not String. Failure is in the type. There is "no .unwrap() in library code" (the repo's own rule), which means the type system tracks every way a codec can fail and forces a caller to handle it. The Send + Sync bounds then make the codec map safe to share across threads – again, enforced, not hoped for.

Affine typing (each owned String has one owner, moved not aliased) is the substrate that makes the boxed closures sound. The developer payoff is concrete: the class of bug "we called decode on something that has no decode" is not a test you have to remember to write – it is a program that does not compile.

5. Refinement: Dafny

Rust can say a decode may be absent. It cannot say decode is the inverse of encode. Dafny can, because in a refinement type the law is part of the type (Jhala and Vazou 2021). The involution pattern becomes a predicate:

type String = seq<char>
type Fn = String -> String

datatype Involution = Involution(id: string, label: string, fn: Fn)

// The core algebraic property every involution must satisfy.
predicate IsInvolution(inv: Involution)
{
  forall x: String :: inv.fn(inv.fn(x)) == x
}

forall x :: inv.fn(inv.fn(x)) == x is the definition of involution, written as code the verifier checks for all inputs, not the few in a test suite. The bijection round-trip decode(encode(x)) == x becomes an ensures clause the prover must discharge.

The more useful payoff is what happens when the type cannot be stated. The Dafny port is littered with markers like [SG-1], [SG-2] – "spec gaps." The character model is undefined (Unicode scalar or UTF-16 code unit?); several codecs have unstated domain preconditions (base64:d requires valid base64). The repository's discipline is explicit: "Where the spec is insufficiently precise to state a formal property, we file GitHub issues rather than write unprovable theorems." A refinement type system does not just verify what is true – it forces the imprecisions in your spec to the surface, because you cannot write the type until you resolve them. That is a design tool, not just a verification tool.

6. Contracts and blame: Racket

Dafny moves the law to compile time and pays in proof effort. Racket keeps the law but checks it at runtime – at the module boundary – and adds the one thing none of the others give you: blame (Findler and Felleisen 2002).

The motivating bug is real. In an earlier ClojureScript build, caesar:3:d shifted forward instead of backward. The symptom was "wrong output" – a failing end-to-end test that tells you something is wrong, somewhere. With a boundary contract, the symptom is instead:

caesar/decode: contract violation
  blame: codec 'caesar' (decode direction)
  expected: decode(encode(x)) = x
  given: a function that shifts forward instead of backward

The contract states decode(encode(x)) = x once, at the boundary of the codec module. The pipeline runner needs to know nothing about reversibility; the contract enforces it and, when it breaks, names the culprit – the codec, the direction, the violated property. This is Findler and Felleisen's insight: put the specification at the module boundary, not in the test suite, and let the runtime assign blame to the party that broke it.

For a developer the trade against Dafny is legible. Dafny: the law cannot be violated, but you pay the prover. Racket: the law can be violated, but the instant it is, you get a precise accusation instead of a debugging session.

6.1. Flat contracts and higher-order contracts

The central problem Findler and Felleisen solve is this: a contract on a plain value – "this number is positive," "this string is valid base64" – can be checked immediately by applying a predicate. They call these flat contracts. The codec pipeline is full of them:

(define positive-length/c (and/c string? (λ (s) (> (string-length s) 0))))

But a contract on a function – "this codec round-trips" – cannot be checked by running a predicate, because you would have to call the function on every possible input. Predicates on functions are, in general, undecidable. The type systems community responded to this by building static approximations (ML types, Hindley–Milner). Findler and Felleisen's move was different: instead of approximating, defer and wrap.

A higher-order contract does not check the function when it crosses the boundary. It wraps the function in a monitor that checks the contract each time the function is called. The wrapping is the key mechanism in their λcon calculus (Findler and Felleisen 2002):

;; A codec contract: encode takes a string and returns a string,
;; decode takes a string and returns a string,
;; and round-tripping through both is identity.
(define codec/c
  (->i ([input string?])
       [result string?]
       #:post (input result)
       (equal? (codec-decode (codec-encode input)) input)))

When the pipeline runner receives a codec through a module boundary, it gets back not the raw function but a wrapped function. Each call checks the domain contract (blaming the caller if the input is bad) and the range contract (blaming the codec if the output is bad). The function is never "verified" in full – it is monitored, call by call, and the first violation is reported with blame attached.

This is the practical difference from testing: a test suite checks specific inputs you thought of; a contract checks every input the program actually encounters, for the entire run.

6.2. Blame assignment for higher-order arguments

Blame becomes subtle when functions are passed as arguments to other functions – which is exactly what happens in a codec pipeline. The pipeline runner receives a codec (a function), then passes strings through it. Who is at fault if something goes wrong?

Findler and Felleisen's rule: when a function crosses a boundary, the contract decomposes into a negative position (the domain – what the caller promises to supply) and a positive position (the range – what the callee promises to return). Blame attaches to the party responsible for each position. The key insight for higher-order contracts is that blame alternates: when a function is passed as an argument into another module, the positive and negative parties swap (Findler and Felleisen 2002; Dimoulas et al. 2011).

In the codec pipeline, this plays out concretely:

Boundary Positive (provides value) Negative (receives value)
Pipeline → Codec.encode pipeline (supplies input) codec (returns output)
Codec → Pipeline (result) codec (supplies encoded) pipeline (consumes it)
Pipeline → Codec.decode pipeline (supplies encoded) codec (returns decoded)

If caesar:encode returns a non-string, blame falls on the codec module (positive position). If the pipeline passes a non-string to caesar:encode, blame falls on the pipeline (negative position). For the round-trip property, blame falls on whichever codec breaks the invariant – not on the pipeline runner that composed the chain.

This blame-tracking structure is what Eiffel's Design by Contract (Meyer 1992) could not provide for higher-order programs: Eiffel contracts sit on methods, not on function values that flow between modules. The λcon calculus extended contracts to the higher-order, functional setting where values are functions and boundaries are module interfaces, not class hierarchies.

6.3. Contract combinators

Racket's contract library builds on the flat/higher-order foundation with combinators that compose (Strickland et al. 2012):

;; Flat combinators
(and/c string? (λ (s) (regexp-match? #rx"^[A-Za-z0-9+/=]+$" s)))  ; valid base64
(or/c string? bytes?)                                                ; either type

;; Function contract: string → string, with blame
(-> string? string?)

;; Dependent contract: result depends on input
(->i ([s string?]) [r (s) (λ (r) (= (string-length r) (string-length s)))])

;; Struct contracts (for the codec registry)
(struct/c codec [name string?] [encode (-> string? string?)] [decode (-> string? string?)])

The combinators matter because they are the developer-facing API – the contract language is embedded in Racket, not a separate annotation syntax. A codec author writes contracts in Racket; the contract system compiles them into wrappers that the runtime inserts at provide boundaries.

Every combinator in the list above is a direct descendant of the two primitives in Findler and Felleisen's paper: flat contracts (predicates) and function contracts (domain/range with blame). and/c, or/c, struct/c are compositions; ->i is the dependent generalization of ->. The design has remained stable since 2002, extended to first-class classes (Strickland et al. 2012), temporal properties (Dimoulas et al. 2011), and – for performance – collapsible contract wrappers that prevent accumulation overhead (Feltey et al. 2018).

7. Where dependent types go

There is a limit case to this progression. In a dependently typed language – Lean 4, Idris (Brady 2017) – types are expressive enough that they are proofs: decode(encode(x)) = x is not a contract or an ensures clause but a proposition you inhabit with a term, propositions-as-types (Wadler 2015). The boundary between "type that constrains" and "proof that obligates" disappears.

That side is its own subject and its own tutorial – see jwalsh/lean4-workshop, and the structured-proof pedagogy of Macbeth's Mechanics of Proof (Macbeth 2024) and the Lean companion to Tao's Analysis I (Tao 2022). The blueprint methodology behind large formalizations ((Buzzard and The FLT contributors 2024)) is, to a developer's eye, just a dependency graph over obligations – a build system for proofs. This note stops here intentionally: the developer payoff from types arrives well before full dependent types, and most of it is collected in the four implementations above.

8. A fifth view (speculative): Eiffel, and where the contracts came from

A bookend to the other direction. If 7 looks past refinement, this looks behind contracts – to where the require / ensure / invariant vocabulary was minted. Eiffel's Design by Contract (Meyer 1992) is the ancestor of the Racket cell, not a peer of it: a contract is a runtime obligation, and the codec's law goes in the postcondition.

rot13 (s: STRING): STRING
    do
        Result := rot_n (s, 13)
    ensure
        involution: rot_n (Result, 13) ~ s        -- the law, at the boundary
    end

hex_decode (s: STRING): STRING
    require
        even_length: s.count \\ 2 = 0             -- the [SG-2] domain gap, named
    do ... end

Compiled and run (EiffelStudio 25.02), the placement is exact. The ensure holds for the inputs actually run – not proven for all of them – and a deliberately broken decode trips it at runtime, naming the culprit:

CODEC  bad_caesar @2  round_trip:  Postcondition violated.  Fail

That is the round-trip law checked at the boundary with two-party blame – precondition violated → blame the caller, postcondition violated → blame the routine. Which is the Racket cell (Findler and Felleisen 2002), minus the higher-order story: Eiffel's blame is first-order, where Racket carries it across function arguments. And it is strictly below Dafny, which proves the law for all inputs rather than checking the ones that ran.

So Eiffel does not extend the spectrum; it dates it. Two things it alone makes vivid: (1) writing require even_length is the same act Dafny forces when you try to state ensures – the contract is where the hidden precondition becomes explicit, and DbC got there first; (2) contract inheritance (subcontracting: weaken preconditions, strengthen postconditions down a hierarchy) makes Liskov substitution a thing the compiler checks – a lens neither Racket nor Dafny foregrounds, and the natural place a codec taxonomy would live. A full Eiffel port is left as future work; this section records the placement, not the port.

9. When not to

The progression TypeScript → Rust → Dafny → Lean is a progression in expressiveness, not a recommendation to always climb. Each step up costs: proof effort, compile times, a smaller hiring pool, a steeper on-ramp. The companion lab aygp-dr/petclinic-lab exists to measure exactly this – "the cost of over-formalization," running the same domain through TLA+, Lean 4, core.logic, and more, deliberately too wide, with a refutation log per artifact.

There are axes the four leave untouched – invertibility by construction (lenses), the optics lattice as a reframing of the codec taxonomy, partiality as an effect. See The Cells the Four Do Not Occupy for the companion analysis, with executable Haskell proofs.

The engineering rule that falls out: pick the weakest type discipline that makes the bug you actually fear unrepresentable. If the fear is "wrong direction string," a union type pays for itself. If it is "we shipped a codec that cannot round-trip," you want Option in the signature (Rust) or a contract at the boundary (Racket). Only if the fear is "the inverse is subtly wrong for some inputs we will never test" do you pay for refinement or dependent types. The transform tool is small enough to show all four answers to the same question, side by side, – four answers to one question, side by side.

10. Reading map

For a developer getting up to speed, in rough order of payoff-per-page:

  • Types – Pierce, Types and Programming Languages (Pierce 2002) is the spine; read for the vocabulary (soundness, progress, preservation, structural vs nominal).
  • Contracts and blame – Findler & Felleisen (Findler and Felleisen 2002); short (12 pages), and directly the design of the Racket port. The tech report version has fuller examples. Follow-up: Dimoulas et al. (Dimoulas et al. 2011) on correct blame; Feltey et al. (Feltey et al. 2018) on performance (collapsible wrappers). Meyer (Meyer 1992) is the Eiffel origin – read it to see what higher-order contracts had to extend beyond.
  • Refinement types – Jhala & Vazou (Jhala and Vazou 2021) for the FP-side framing of what Dafny's requires~/~ensures are doing.
  • Dependent types, for programmers – Brady, Type-Driven Development with Idris (Brady 2017); the closest existing book to "this note, but a whole course."
  • Why types can be specs – Wadler, Propositions as Types (Wadler 2015); the one-sitting essay that ties it together.
  • The proof end of the spectrum – Macbeth's Mechanics of Proof (Macbeth 2024), Tao's Analysis I (Tao 2022) and its Lean companion, the FLT blueprint (Buzzard and The FLT contributors 2024), and Quanta's The Proof in the Code (Hartnett 2026) for the popular framing.

The four repositories – transform-tool (TS), transform-tool-rs (Rust), transform-tool-dafny (Dafny), transform-tool-racket (Racket) – are the lab. Read any two of them against the same spec.org and the trade-off reads itself off the diff.

11. References

Brady, Edwin. 2017. Type-Driven Development with Idris. Manning Publications.
Buzzard, Kevin, and The FLT contributors. 2024. “Fermat’s Last Theorem: A Lean Formalization (Blueprint).” Lean 4 / leanblueprint project. https://imperialcollegelondon.github.io/FLT/.
Dimoulas, Christos, Robert Bruce Findler, Cormac Flanagan, and Matthias Felleisen. 2011. “Correct Blame for Contracts: No More Scapegoating.” In Proceedings of the 38th Acm Sigplan-Sigact Symposium on Principles of Programming Languages (Popl ’11), 215–26. ACM. https://doi.org/10.1145/1926385.1926410.
Feltey, Daniel, Ben Greenman, Christophe Scholliers, Robert Bruce Findler, and Vincent St-Amour. 2018. “Collapsible Contracts: Fixing a Pathology of Gradual Typing.” In Proceedings of the Acm on Programming Languages (Oopsla). Vol. 2. ACM. https://doi.org/10.1145/3276503.
Findler, Robert Bruce, and Matthias Felleisen. 2002. “Contracts for Higher-Order Functions.” In Proceedings of the Seventh Acm Sigplan International Conference on Functional Programming (Icfp ’02), 48–59. ACM. https://doi.org/10.1145/581478.581484.
Hartnett, Kevin. 2026. “The Proof in the Code: How a Truth Machine Is Transforming Math and Ai.” Book (Farrar, Straus and Giroux / Quanta Books); Quanta Books. https://www.quantabooks.org/books/the-proof-in-the-code/.
Jhala, Ranjit, and Niki Vazou. 2021. “Refinement Types: A Tutorial.” Foundations and Trends in Programming Languages 6 (3--4): 159–317. https://doi.org/10.1561/2500000032.
Macbeth, Heather. 2024. “The Mechanics of Proof (Math2001).” Online textbook. https://hrmacbeth.github.io/math2001/.
Meyer, Bertrand. 1992. “Applying ``Design by Contract’’.” Computer 25 (10): 40–51. https://doi.org/10.1109/2.161279.
Pierce, Benjamin C. 2002. Types and Programming Languages. Cambridge, MA: MIT Press.
Strickland, T. Stephen, Sam Tobin-Hochstadt, Robert Bruce Findler, and Matthew Flatt. 2012. “Chaperones and Impersonators: Run-Time Support for Reasonable Interposition.” In Proceedings of the Acm International Conference on Object-Oriented Programming, Systems, Languages, and Applications (Oopsla ’12). ACM. https://doi.org/10.1145/2384616.2384685.
Tao, Terence. 2022. Analysis I. Fourth. Hindustan Book Agency. https://terrytao.wordpress.com/books/analysis-i/.
Wadler, Philip. 2015. “Propositions as Types.” Communications of the Acm 58 (12): 75–84. https://doi.org/10.1145/2699407.