Transform Tool — Rebuild Specification

Axiom: The URL is the sole source of truth. ?chain=rot13+base64:d&text=Hello fully determines the output. No server state. No session. Pure function from URL → rendered page.

1. Confirmation Gate

Before writing code, the implementing agent MUST state:

  1. How many codecs are in the registry (answer: 24)
  2. The four codec patterns and an example of each
  3. What caesar:3:d produces on input HW WX EUXWH (answer: ET TU BRUTE)
  4. Why rot13 does not need encode/decode but caesar:3 does

If any answer is wrong, stop and re-read this spec.

2. URL Contract

2.1. Base URL

https://wal.sh/tools/transform/

2.2. Query Parameters

Param | Required | Format | Example
chain | no | +-delimited codec steps | rot13+base64:d
text | no | input string (spaces as + or %20) | Hello+World

Default (no params): input = Hello, World!, chain = identity encode.

2.3. Chain Syntax

Each step in the + delimited chain follows:

<codec-id>[:<arg1>[:<arg2>...]][:<direction>]

Direction tokens: d, decode, e, encode. Default: encode. Everything between the codec ID and the direction token is positional args.

Chain token | Codec | Args | Direction
rot13 | rot13 | [] | encode (irrelevant: involution)
base64:d | base64 | [] | decode
caesar:3 | caesar | [3] | encode (shift +3)
caesar:3:d | caesar | [3] | decode (shift +23 = -3 mod 26)
xor:42:d | xor | [42] | decode
hex+sort | hex, sort | [], [] | encode, encode (two steps)

2.4. Critical: No %2B

The + delimiter MUST NOT be URL-encoded as %2B. The chain uses literal + in the URL. This is intentional — the pipeline reads like addition. Do NOT use URLSearchParams for chain construction; build the query string by raw concatenation.
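
The rule above can be sketched as follows. This is a minimal Python illustration, not the reference implementation: `build_url` and `BASE_URL` are hypothetical names, and it assumes codec IDs and args contain only URL-safe characters so the chain can be emitted verbatim.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://wal.sh/tools/transform/"

def build_url(chain: str, text: str) -> str:
    """Build a shareable URL by raw concatenation, keeping + literal in chain."""
    # Percent-encode the text fully, then write spaces as + (section 2.2).
    # A literal + inside the text becomes %2B, which is correct for text.
    encoded_text = quote(text, safe="").replace("%20", "+")
    # Assumption: chain holds only URL-safe characters, so it is not encoded.
    return f"{BASE_URL}?chain={chain}&text={encoded_text}"

# A generic form encoder escapes the delimiter, which this spec forbids.
broken = urlencode({"chain": "rot13+base64:d"})  # 'chain=rot13%2Bbase64%3Ad'
url = build_url("rot13+base64", "Hello World")
```

Python's `urlencode` stands in here for any generic query-string builder; it shows the same failure mode the spec attributes to URLSearchParams.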

3. Codec Registry

24 codecs in four patterns.

3.1. Pattern 1: Involution (:fn)

f(f(x)) = x. One function. Direction is irrelevant. Verb: apply.

ID | Label | Notes
identity | identity | f(x) = x. The unit.
reverse | reverse | String reversal. Anti-homomorphism: f(ab) = f(b)f(a).
rot13 | rot13 | Caesar shift 13. The only non-trivial Caesar shift that is its own inverse.
jump5 | jump5 | Digit substitution: 0↔5 1↔6 2↔7 3↔8 4↔9. Non-digits pass through.
atbash | atbash | Mirror alphabet: A↔Z B↔Y C↔X. Hebrew cipher.

Registry shape: {:id :rot13 :label "rot13" :fn rot13 :involution? true :inverse? true}
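
A minimal Python sketch of the five involutions, using translation tables. The names `INVOLUTIONS` and `is_involution_on` are hypothetical, and case preservation for atbash is an assumption (the table above only shows uppercase pairs).

```python
import string

LOWER, UPPER = string.ascii_lowercase, string.ascii_uppercase

# Translation tables: each maps a character to its partner and back.
ROT13 = str.maketrans(LOWER + UPPER,
                      LOWER[13:] + LOWER[:13] + UPPER[13:] + UPPER[:13])
ATBASH = str.maketrans(LOWER + UPPER, LOWER[::-1] + UPPER[::-1])  # assumed case-preserving
JUMP5 = str.maketrans("0123456789", "5678901234")

INVOLUTIONS = {
    "identity": lambda s: s,
    "reverse": lambda s: s[::-1],
    "rot13": lambda s: s.translate(ROT13),
    "jump5": lambda s: s.translate(JUMP5),
    "atbash": lambda s: s.translate(ATBASH),
}

def is_involution_on(f, sample: str) -> bool:
    """Check f(f(x)) == x for one sample input (a spot check, not a proof)."""
    return f(f(sample)) == sample
```

Because each function is its own inverse, a single `:fn` entry is enough and the direction token can be ignored.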

3.2. Pattern 2: Bijection (:encode / :decode)

decode(encode(x)) = x but encode ≠ decode. Two separate functions. Verb: encode or decode.

ID | Label | Notes
1337 | 1337 | Leet speak. A↔4 E↔3 I↔1 O↔0 S↔5 T↔7 B↔8 G↔6. Uppercase only.
xor | xor 0x42 | XOR each byte with 0x42. Output is hex string.
base64 | base64 | RFC 4648. Uses TextEncoder/TextDecoder (NOT btoa/encodeURIComponent).
url-encode | url-encode | Percent-encoding per RFC 3986.
hex | hex | Each byte → 2-char hex.
a1z26 | A1Z26 | A=1 … Z=26. Spaces → /. Gravity Falls cipher.
binary | binary | Each char → 8-bit binary, space-separated.
char-codes | char-codes | Each char → decimal ordinal, space-separated.
morse | morse | ITU Morse code. Words separated by /.
ipv4-int | ipv4→int | Dotted-quad ↔ uint32. Domain-restricted.
hamming | hamming | Hamming(7,4). Input is hex. Corrects single-bit errors on decode.
f-to-c | °F→°C | (f - 32) * 5/9.
c-to-f | °C→°F | c * 9/5 + 32.
c-to-k | °C→K | c + 273.15.

Registry shape: {:id :base64 :label "base64" :encode encode-fn :decode decode-fn :inverse? true}
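
As an illustration of the encode/decode pair shape, here is a Python sketch of two bijections. It assumes strings are converted to UTF-8 bytes first, the Python analogue of the TextEncoder requirement; the function names and the `BIJECTIONS` dict are hypothetical.

```python
import base64

def base64_encode(s: str) -> str:
    # Encode the UTF-8 bytes of the string, not its code units.
    return base64.b64encode(s.encode("utf-8")).decode("ascii")

def base64_decode(s: str) -> str:
    # validate=True rejects characters outside the RFC 4648 alphabet.
    return base64.b64decode(s, validate=True).decode("utf-8")

def hex_encode(s: str) -> str:
    return s.encode("utf-8").hex()

def hex_decode(s: str) -> str:
    return bytes.fromhex(s).decode("utf-8")

# Two separate functions per codec, selected by direction.
BIJECTIONS = {
    "base64": {"encode": base64_encode, "decode": base64_decode, "inverse?": True},
    "hex": {"encode": hex_encode, "decode": hex_decode, "inverse?": True},
}
```

Going through bytes keeps non-ASCII input round-trippable, which is the reason the spec rules out the btoa approach.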

3.3. Pattern 3: Parameterized (:make-fn)

A factory function receives [args] dir and returns (string → string). The same arg means different things depending on direction.

ID | Label | Args | Encode | Decode
caesar | caesar | shift N (default 13) | rot-n N | rot-n (26-N)

Registry shape:

{:id :caesar :label "caesar"
 :make-fn (fn [args dir]
            (let [n     (if (seq args) (parse (first args)) 13)   ; shift N, default 13
                  shift (mod (if (= dir :decode) (- n) n) 26)]    ; decode uses -N, normalized to 0..25
              #(rot-n shift %)))
 :inverse? true :involution? false}

Note: the shift N is applied mod 26, so every integer is a valid argument and no range check is needed: caesar:0 and caesar:26 are both the identity, caesar:13 is rot13, and caesar:-1 equals caesar:25. The test suite in section 7 covers caesar:0, caesar:26, and caesar:3; the formal domain is left for the reimplementation to synthesize.

Case is preserved: [A-Z] shift within uppercase, [a-z] within lowercase, and every other character (digits, punctuation, spaces, non-ASCII) passes through unchanged. So rot13 applied to [VADER] No. I am your father. yields [INQRE] Ab. V nz lbhe sngure. — brackets, period, spaces, and case all intact.

caesar is the only codec where direction changes the shift amount rather than selecting a different function. (For caesar:13 the two shifts coincide, which is why rot13 is an involution.)
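
A Python sketch of the factory and of the rot-n helper it relies on. The spec does not define rot-n, so `rot_n` below is an assumed implementation that follows the case-preservation and pass-through rules stated above; `make_caesar` is a hypothetical name for the `:make-fn`.

```python
def rot_n(shift: int, s: str) -> str:
    """Shift ASCII letters by `shift` within their own case; pass everything else through."""
    out = []
    for ch in s:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        elif "a" <= ch <= "z":
            out.append(chr((ord(ch) - 97 + shift) % 26 + 97))
        else:
            out.append(ch)  # digits, punctuation, spaces, non-ASCII
    return "".join(out)

def make_caesar(args: list, direction: str):
    """Factory: the same arg N yields shift N for encode and -N (mod 26) for decode."""
    n = int(args[0]) if args else 13            # default shift is 13
    shift = (-n if direction == "decode" else n) % 26
    return lambda s: rot_n(shift, s)

decode3 = make_caesar(["3"], "decode")
encode3 = make_caesar(["3"], "encode")
```

Reducing the shift mod 26 inside the factory is what makes every integer argument valid without a range check.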

3.4. Pattern 4: Surjection (no inverse)

Information is destroyed. Verb: apply. Decode is nil or cosmetic. Pipeline turns amber (ONE-WAY) at this step.

ID | Label | Notes
sort | sort | Sort characters. listen → eilnst. No left inverse exists.
upper | upper | toUpperCase. decode maps to lower for convenience but lower(upper("Hello")) = "hello" ≠ "Hello".
lower | lower | toLowerCase. Same caveat.
corrupt | corrupt | Flip one random bit. Non-deterministic.

Registry shape: {:id :sort :label "sort" :encode sort-fn :decode nil :inverse? false}
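
A short Python illustration of why these codecs have no inverse: distinct inputs collide on the same output, so no decode function can recover both. `sort_chars` is a hypothetical name.

```python
def sort_chars(s: str) -> str:
    """Sort the characters of a string; the original order is lost."""
    return "".join(sorted(s))

# Two different inputs map to one output, so sort cannot be undone.
a = sort_chars("listen")   # 'eilnst'
b = sort_chars("silent")   # 'eilnst'

# The cosmetic decode for upper does not restore the original casing.
lossy = "Hello".upper().lower()   # 'hello'
```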

4. Pipeline Execution

4.1. Threading Model

Identical to Clojure's (-> input f1 f2 f3).

  1. Start with text param as initial value
  2. For each step left-to-right, resolve the function and apply it
  3. Record the intermediate value after each step (the trace)
  4. Halt on first error

The trace is the core data structure — it drives both the visual overlay and the output.
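
The four steps above might look like this in Python. The trace entry shape (`step`, `value`, `error`) is an assumption; the spec only requires that each intermediate value be recorded and that execution halt on the first error. Steps are passed in as already-resolved `(label, function)` pairs.

```python
def thread(text, steps):
    """Thread text through steps left to right, returning the trace.

    steps: list of (label, fn) pairs, already resolved to functions.
    """
    trace = [{"step": "input", "value": text, "error": None}]
    value = text
    for label, fn in steps:
        try:
            value = fn(value)
        except Exception as exc:
            # Record the failure and halt: later steps never run.
            trace.append({"step": label, "value": None, "error": str(exc)})
            break
        trace.append({"step": label, "value": value, "error": None})
    return trace

ok = thread("Hello", [("upper", str.upper), ("reverse", lambda s: s[::-1])])
bad = thread("xyz", [("hex:d", lambda s: bytes.fromhex(s).decode("utf-8")),
                     ("upper", str.upper)])
```

In `ok` the final value is the last trace entry; in `bad` the trace ends at the failing hex decode and the `upper` step is never applied.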

4.2. Function Resolution

Given a step {:id :base64 :direction :encode :args []}:

  1. Look up codec by :id
  2. If codec has :make-fn: call (make-fn args direction) → function
  3. If codec has :fn: use it (involution, direction irrelevant)
  4. Otherwise: (get codec direction) → the :encode or :decode function
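
A Python sketch of this lookup order, with the registry modelled as a dict of dicts. The four-codec `REGISTRY`, the `resolve` name, and the choice to raise `ValueError` when no function exists for the requested direction are all assumptions made for illustration.

```python
import string

def make_caesar(args, direction):
    # Compact caesar factory used only to exercise the make-fn branch.
    n = int(args[0]) if args else 13
    k = (-n if direction == "decode" else n) % 26
    lo, up = string.ascii_lowercase, string.ascii_uppercase
    table = str.maketrans(lo + up, lo[k:] + lo[:k] + up[k:] + up[:k])
    return lambda s: s.translate(table)

REGISTRY = {
    "caesar": {"make-fn": make_caesar},
    "reverse": {"fn": lambda s: s[::-1]},
    "hex": {"encode": lambda s: s.encode("utf-8").hex(),
            "decode": lambda s: bytes.fromhex(s).decode("utf-8")},
    "sort": {"encode": lambda s: "".join(sorted(s)), "decode": None},
}

def resolve(registry, step):
    """Return the string -> string function for one pipeline step."""
    codec = registry[step["id"]]                 # KeyError for an unknown codec id
    if "make-fn" in codec:                       # parameterized: build from args + direction
        return codec["make-fn"](step["args"], step["direction"])
    if "fn" in codec:                            # involution: direction is irrelevant
        return codec["fn"]
    fn = codec.get(step["direction"])            # bijection or surjection
    if fn is None:
        raise ValueError(f"{step['id']} has no {step['direction']} function")
    return fn
```

Checking `make-fn` before `fn` matches the order of the list above, so a codec that somehow carried both keys would be treated as parameterized.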

4.3. Reversibility

A step is reversible when:

(and (:inverse? codec)
     (or (:involution? codec)      ; f = f⁻¹
         (:make-fn codec)          ; factory handles both directions
         (some? (get codec (opposite direction)))))  ; partner fn exists

A pipeline is reversible when every step is reversible.
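
The predicate translates directly to Python. In this sketch the registry carries only flags, with placeholder functions, because the check is structural; `first_irreversible` is a hypothetical helper that also reports which step breaks reversibility.

```python
def opposite(direction):
    return "decode" if direction == "encode" else "encode"

def step_reversible(codec, direction):
    """Structural check on registry flags, mirroring the rule above."""
    if not codec.get("inverse?"):
        return False
    return bool(
        codec.get("involution?")                        # f is its own inverse
        or "make-fn" in codec                           # factory handles both directions
        or codec.get(opposite(direction)) is not None   # partner function exists
    )

def first_irreversible(registry, steps):
    """Index of the first step that breaks reversibility, or None if all pass."""
    for i, step in enumerate(steps):
        if not step_reversible(registry[step["id"]], step["direction"]):
            return i
    return None

noop = lambda s: s  # placeholder: only the flags matter here
REGISTRY = {
    "rot13": {"fn": noop, "involution?": True, "inverse?": True},
    "base64": {"encode": noop, "decode": noop, "inverse?": True},
    "hex": {"encode": noop, "decode": noop, "inverse?": True},
    "caesar": {"make-fn": lambda args, d: noop, "inverse?": True, "involution?": False},
    "sort": {"encode": noop, "decode": None, "inverse?": False},
    "upper": {"encode": noop, "decode": noop, "inverse?": False},
}

def steps(*ids):
    return [{"id": i, "direction": "encode", "args": []} for i in ids]
```

Note that `upper` has a decode function yet is still reported as one-way, because the `:inverse?` flag is checked first.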

4.4. Reverse Pipeline

To invert a pipeline:

  1. Reverse the step order
  2. Flip each step's direction (:encode ↔ :decode)
  3. Involutions keep :encode (direction is irrelevant)
  4. Preserve :args — parameterized codecs need their arguments

;; Forward:  [{:id :caesar :direction :encode :args ["3"]}]
;; Reversed: [{:id :caesar :direction :decode :args ["3"]}]
;; The "3" stays — make-fn interprets it as shift-23 when dir=:decode
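
A Python sketch of the four inversion rules. Steps are modelled as dicts with `id`, `direction`, and `args`; the `INVOLUTIONS` set is hard-coded from section 3.1 for brevity, where a real implementation would presumably consult the registry flag.

```python
INVOLUTIONS = frozenset({"identity", "reverse", "rot13", "jump5", "atbash"})

def reverse_pipeline(steps):
    """Invert a pipeline: reverse the order, flip directions, keep args."""
    inverted = []
    for step in reversed(steps):
        if step["id"] in INVOLUTIONS:
            direction = "encode"                 # direction is irrelevant
        elif step["direction"] == "encode":
            direction = "decode"
        else:
            direction = "encode"
        inverted.append({"id": step["id"],
                         "direction": direction,
                         "args": list(step["args"])})  # args survive unchanged
    return inverted

forward = [
    {"id": "caesar", "direction": "encode", "args": ["3"]},
    {"id": "rot13", "direction": "encode", "args": []},
    {"id": "base64", "direction": "encode", "args": []},
]
backward = reverse_pipeline(forward)
```

Copying `args` rather than dropping them is the point of rule 4: without the "3", the caesar factory would fall back to its default shift.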

5. Serialization

5.1. steps->str (canonical form)

Pipeline → URL chain param:

  • Join steps with +
  • Each step: <id> (encode) or <id>:d (decode)
  • With args: <id>:<arg1>:<arg2> or <id>:<arg1>:d
  • Encode direction is the default — the :e suffix is DROPPED. rot13 means encode. rot13:d means decode. There is no rot13:e in canonical output.
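
The canonical form can be sketched in a few lines of Python. Steps are assumed to be dicts with `id`, `direction`, and `args` (a list of strings); the function names are Python spellings of steps->str.

```python
def step_to_str(step):
    """Serialize one step as <id>[:<args>...][:d]; encode is implicit."""
    parts = [step["id"], *step["args"]]
    if step["direction"] == "decode":
        parts.append("d")        # only decode is written out
    return ":".join(parts)

def steps_to_str(steps):
    """Join steps with a literal + to form the chain param."""
    return "+".join(step_to_str(s) for s in steps)

chain = steps_to_str([
    {"id": "rot13", "direction": "encode", "args": []},
    {"id": "caesar", "direction": "decode", "args": ["3"]},
])
```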

5.2. str->steps (liberal parsing)

URL chain param → pipeline:

  • Split on +, comma (,), or space
  • Each token: split on :
  • First segment: codec ID
  • Last segment: direction if it matches d/e/encode/decode
  • Middle segments: positional args
  • :e and :encode MUST be accepted as explicit encode direction, even though steps->str never emits them.
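
A Python sketch of the liberal parser described by these rules. It assumes a lone trailing segment such as `d` is always a direction and never an argument, which is how the rules above read; `DIRECTIONS` and `str_to_steps` are hypothetical names.

```python
import re

DIRECTIONS = {"d": "decode", "decode": "decode", "e": "encode", "encode": "encode"}

def str_to_steps(chain):
    """Parse a chain string into a list of step dicts."""
    steps = []
    for token in re.split(r"[+, ]+", chain.strip()):
        if not token:
            continue                              # empty chain or stray delimiter
        segments = token.split(":")
        codec_id, rest = segments[0], segments[1:]
        direction = "encode"                      # default when no direction token
        if rest and rest[-1] in DIRECTIONS:
            direction = DIRECTIONS[rest.pop()]    # last segment is the direction
        steps.append({"id": codec_id, "direction": direction, "args": rest})
    return steps
```

For example, `str_to_steps("rot13:e")` and `str_to_steps("rot13")` produce the same step, which is what lets the serializer drop the explicit `:e`.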

5.3. Canonicalization invariant

str->steps(steps->str(pipeline)) = pipeline for all valid pipelines. The reverse does NOT hold: steps->str(str->steps("rot13:e")) = "rot13" (the explicit :e is normalized away). Implementations MUST accept both forms on input but produce only canonical form on output.

6. Visual Contract

The UI renders:

  1. Threading form: (-> input rot13 base64/decode) — Clojure syntax
  2. Reversibility badge: green REVERSIBLE or amber ONE-WAY
  3. Input: textarea with source buttons (Hello World, Date.now(), etc.)
  4. Chain: vertical pipeline with intermediate values after each arrow
  5. Step node: codec name, direction tag, intermediate value, flip/remove buttons
  6. Add step: dropdown (all 24 codecs) + direction dropdown + blue "add" button
  7. Reverse button: always visible. Green when reversible, amber when not.
  8. Output: dark block with final value, copy button, shareable link button

Direction dropdown shows:

  • apply: for involutions (direction irrelevant) and surjections (no inverse)
  • encode / decode: for bijections and parameterized codecs

Surjections: decode option is disabled in the direction dropdown.

7. Test Suite (URL → Expected Output)

These URLs are the acceptance tests. The output is the final value after threading through the chain.

URL chain+text | Expected output
chain=rot13&text=V+NZ+LBHE+SNGURE | I AM YOUR FATHER
chain=rot13&text=PNXR+VF+N+YVR | CAKE IS A LIE
chain=rot13&text=[VADER]+No.+I+am+your+father. | [INQRE] Ab. V nz lbhe sngure. (case + punctuation preserved)
chain=caesar:26&text=Hello,+World! | Hello, World! (identity, mod 26)
chain=caesar:0&text=Hello,+World! | Hello, World! (identity)
chain=reverse&text=REDRUM | MURDER
chain=atbash&text=SVOOL | HELLO
chain=base64:d&text=TmV2ZXIgZ29ubmEgZ2l2ZSB5b3UgdXA= | Never gonna give you up
chain=caesar:3:d&text=HW+WX+EUXWH | ET TU BRUTE
chain=caesar:3&text=ET+TU+BRUTE | HW WX EUXWH
chain=hex&text=Hello | 48656c6c6f
chain=hex:d&text=48656c6c6f | Hello
chain=rot13+base64&text=Hello | VXJ5eWI= (trace: Hello → Uryyb → VXJ5eWI=)
chain=upper&text=Hello | HELLO
chain=upper:d&text=HELLO | hello (lossy: not a true inverse)
chain=identity&text=anything | anything
(no params) | Hello, World! (default input Hello, World!, default chain identity)

7.1. Reversibility Tests

Chain | Reversible? | Why
rot13 | yes | involution
base64 | yes | bijection
caesar:3 | yes | parameterized, :inverse? true
rot13+base64 | yes | all steps reversible
hex+sort | no | sort is surjection
upper | no | :inverse? false
identity+identity+identity | yes | all identity

7.2. Round-Trip Tests

For every reversible chain, threading input through the chain and then through the reversed chain MUST return the original input:

(let [steps  (str->steps "caesar:3")
      result (thread "ET TU BRUTE" steps)           ; final value → "HW WX EUXWH"
      rev    (reverse-pipeline steps)
      back   (thread (:value (last result)) rev)]   ; feed the forward output back in → "ET TU BRUTE"
  (assert (= "ET TU BRUTE" (:value (last back)))))

8. Failure Handler

If any test in the suite above fails:

  1. STOP — do not proceed to UI implementation
  2. Report: which test, expected vs actual, the codec involved
  3. The most likely failure modes:
    • make-fn ignoring direction (caesar decode = encode)
    • reverse-pipeline dropping :args (parameterized codecs lose config)
    • step-reversible? not recognizing :make-fn pattern
    • base64 using btoa(encodeURIComponent(s)) instead of TextEncoder
    • URL construction using URLSearchParams (encodes + as %2B)

9. Formal Precision (from Dafny review)

The following clarifications were added after a Dafny agent attempted to formalize this spec as ensures / requires clauses (aygp-dr/transform-tool-dafny).

9.1. Structural vs semantic reversibility

step-reversible? checks registry flags (:inverse? true), not runtime behavior. A codec could claim :inverse? true while its decode function is broken. Implementations MUST NOT rely on the flag alone for round-trip guarantees. The flag is a UI hint for the badge; the actual round-trip property is a semantic invariant that the spec asserts but does not prove.

9.2. Domain predicates

Several codecs have restricted domains. The spec now requires these predicates for decode direction:

Codec | Domain predicate for decode
hex | even-length string of [0-9a-fA-F]
base64 | valid RFC 4648 alphabet + padding
ipv4-int | integer string in [0, 4294967295]
hamming | binary string with length divisible by 7
binary | space-separated 8-bit binary strings
char-codes | space-separated decimal integers

Attempting to decode invalid input MUST produce an error in the trace, not undefined behavior.
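
One way to express these predicates is as a table of validators that run before decode, sketched here in Python. The regular expressions are one reading of the table above; how the empty string is treated, and the names `DECODE_DOMAIN` and `decode_error`, are assumptions.

```python
import re

# Padded RFC 4648 base64: groups of 4, optionally ending in xx== or xxx=.
B64 = re.compile(r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?")

DECODE_DOMAIN = {
    "hex": lambda s: len(s) % 2 == 0 and re.fullmatch(r"[0-9a-fA-F]*", s) is not None,
    "base64": lambda s: B64.fullmatch(s) is not None,
    "ipv4-int": lambda s: re.fullmatch(r"[0-9]+", s) is not None and int(s) <= 4294967295,
    "hamming": lambda s: len(s) % 7 == 0 and re.fullmatch(r"[01]+", s) is not None,
    "binary": lambda s: re.fullmatch(r"[01]{8}( [01]{8})*", s) is not None,
    "char-codes": lambda s: re.fullmatch(r"[0-9]+( [0-9]+)*", s) is not None,
}

def decode_error(codec_id, text):
    """Return an error message for the trace, or None if the input is in the domain."""
    check = DECODE_DOMAIN.get(codec_id)
    if check is not None and not check(text):
        return f"{codec_id}: input is outside the decode domain"
    return None
```

Running the predicate first turns a malformed input into an explicit trace error instead of whatever the underlying decoder happens to do.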

9.3. Character model

The spec assumes strings are sequences of Unicode code points. Implementations in UTF-16 languages (JS, Java) and UTF-8 languages (Rust, Go) MUST agree on output for ASCII input. Non-ASCII behavior is implementation-defined and should be documented per-codec.
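
A small Python demonstration of where implementations can diverge. It compares UTF-8 bytes, code points, and UTF-16 code units for the same string; the variable names are illustrative only.

```python
s = "\u00e9"                                   # é, one code point
utf8_hex = s.encode("utf-8").hex()             # bytes as a UTF-8 language sees them
code_point_hex = format(ord(s), "x")           # value a charCodeAt-style API returns

emoji = "\U0001F600"                           # outside the Basic Multilingual Plane
code_points = len(emoji)                       # Python counts code points
utf16_units = len(emoji.encode("utf-16-le")) // 2   # JS/Java count code units

# For ASCII all three views agree, which is the only case the spec pins down.
ascii_agrees = "Hello".encode("utf-8").hex() == "".join(
    format(ord(c), "02x") for c in "Hello")
```

A byte-oriented hex codec would emit `c3a9` for é while a code-unit-oriented one would emit `e9`, which is why non-ASCII behavior needs per-codec documentation.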

9.4. Caesar fixed points

For caesar:N, encode = decode (the codec is an involution) exactly when N ≡ 0 or N ≡ 13 (mod 26), since applying the shift twice must give 2N ≡ 0 (mod 26). N ≡ 0 is the identity (caesar:0, caesar:26, caesar:-26 …). N ≡ 13 is rot13. The unique non-trivial fixed point among N ∈ {1,…,25} is N = 13.
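
The claim can be checked by brute force. This Python sketch applies each shift twice to the uppercase alphabet and keeps the shifts that return it unchanged; `rot` is a throwaway helper, not the spec's rot-n.

```python
import string

A = string.ascii_uppercase

def rot(n, s):
    """Shift uppercase letters by n (mod 26)."""
    return "".join(A[(A.index(c) + n) % 26] for c in s)

# Shifts in 0..25 that are their own inverse.
involutive = [n for n in range(26) if rot(n, rot(n, A)) == A]

# The same holds for any integer: only residues 0 and 13 qualify.
residues = {n % 26 for n in range(-52, 53) if rot(n, rot(n, A)) == A}
```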

9.5. Non-determinism

corrupt is the only non-deterministic codec. Pipelines containing corrupt violate the purity axiom ("URL → deterministic output"). The spec's axiom applies to all pipelines EXCEPT those containing corrupt. Implementations SHOULD seed the RNG from the input for reproducibility in tests, or exclude corrupt from round-trip assertions.
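
One way to follow the seeding suggestion, sketched in Python. The spec does not say how corrupt represents its input or output, so this version works on raw bytes and derives the seed from a SHA-256 of the input; both choices are assumptions.

```python
import hashlib
import random

def corrupt(data: bytes) -> bytes:
    """Flip exactly one bit, chosen by an RNG seeded from the input itself."""
    if not data:
        return data                       # nothing to flip
    seed = int.from_bytes(hashlib.sha256(data).digest(), "big")
    rng = random.Random(seed)             # private RNG: no global state touched
    bit = rng.randrange(len(data) * 8)    # index of the bit to flip
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

first = corrupt(b"Hello")
second = corrupt(b"Hello")
```

With this seeding the same input always yields the same corrupted output, so a URL containing corrupt becomes reproducible in tests even though the codec remains one-way.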

9.6. Surjection terminology

The spec uses "surjection" loosely. Formally, sort is non-injective (many inputs map to one output) but whether it is surjective depends on the codomain definition. The spec means: information-destroying, no left-inverse exists. The UI label ONE-WAY is more precise than the mathematical term.

10. Non-Functional Requirements

  • No server: pure client-side. The URL is the API.
  • No framework: vanilla DOM or a compile-to-JS language (ClojureScript, Elm, etc.)
  • CSS: monospace font (IBM Plex Mono), circuit-trace aesthetic. Green (#059669) = reversible. Amber (#d97706) = warning. Red (#dc2626) = error.
  • #tool-root: min-height: 675px to prevent layout shift during hydration.
  • URL sync: typing updates the URL. Loading a URL hydrates the state. No %2B.
  • localStorage: save/load named chains. Keys: transform-chains (JSON object mapping name → chain string), transform-customer-id (UUID v4, created on first visit).

11. URL Parsing Clarification (from TS review)

+ is overloaded: in the chain param it delimits pipeline steps; in the text param it represents a space (standard URL encoding). Implementations MUST parse the raw query string manually, splitting on & first and then handling each param. Do NOT use URLSearchParams for either reading or writing: on read it decodes + as a space in both params, which corrupts the step delimiters in chain; on write it encodes + as %2B, which section 2.4 forbids.
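
A Python sketch of the manual parse. `parse_query` and `DEFAULTS` are hypothetical names; the defaults come from section 2.2, and percent-decoding the chain value is an assumption. Python's `parse_qs` is shown only as a stand-in for a generic parser with the same + handling as URLSearchParams.

```python
from urllib.parse import parse_qs, unquote

DEFAULTS = {"chain": "identity", "text": "Hello, World!"}

def parse_query(raw: str) -> dict:
    """Parse the raw query string, treating + differently per param."""
    params = dict(DEFAULTS)
    for pair in raw.lstrip("?").split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")    # first '=' only: base64 padding survives
        if key == "chain":
            params["chain"] = unquote(value)   # + is left alone: it is the step delimiter
        elif key == "text":
            params["text"] = unquote(value.replace("+", " "))  # + means space
    return params

good = parse_query("?chain=rot13+base64:d&text=Hello+World")
generic = parse_qs("chain=rot13+base64:d")     # generic parser turns + into a space
```

Replacing + before percent-decoding matters for text: a `%2B` in the URL still decodes to a literal plus sign.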

12. Instrumentation

Each of these MUST be observable in the browser console or via the trace:

  1. Input value at each step boundary
  2. Whether each step succeeded or errored
  3. Whether the pipeline is reversible (and which step breaks it if not)
  4. The serialized chain string matching the URL
  5. Round-trip identity: for reversible chains, reverse(forward(x)) = x

13. Spec Review History

Date | Agent | Language | Tests | Issues | Key finding
2026-06-07 | builder | TypeScript | 40/40 | 9 | wrong test vector (VXJ5Yg==)
2026-06-07 | lambda | Dafny | - | 9 | structural ≠ semantic reversibility
2026-06-07 | rust-pro | Rust/WASM | 37/37 | 10 | UTF-8 vs UTF-16 charCodeAt: "spec lie"
2026-06-07 | zero | Racket | 33/33 | 7 | contracts enforce round-trip at boundary, not in tests