REPL-Driven Compliance: Multi-Language Gate Pipeline Development
The flight-tracking note demonstrated REPL-driven development against external data sources. This note extends the methodology to a different problem: implementing the same protocol compliance spec in 29 languages, verified against a single production access log.
The system under test is an RSS reader with plumbing. Every implementation does the same thing: fetch a URL politely. The differentiation is not what data you get — it is how you get it compliantly.
Update (2026-05-25): the corpus reached 29 languages (see the experience report), adding the spec-stressors Haskell, Zig, Erlang and C alongside a libcurl-FFI tier (Chez, SBCL, Janet) for languages with no native HTTP/TLS. A human-directed --spider mode exercises R9 (structured-format link extraction) live and stays compliant: same gate pipeline, R13-tagged, refuses to scrape HTML.
1. The stack
elfeed (Elisp)          ─┐
tech-crawler (Clojure)  ─┤── same feeds, same Walsh-Research/1.2 UA
@walsh/research (TS)    ─┤── same robots.txt compliance (RFC 9309)
compliance-harness (29) ─┘── same gate pipeline, verified via access log
One spec (walsh-research-compliance/v1.3). One production access log. 29 implementations that must all produce the same observable behavior.
2. Gate pipeline
The pre-request gate pipeline is the core invariant. Order is normative and non-commutative:
R3 (blocklist) → R2 (robots.txt) → R4 (throttle) → fetch (R5 backoff)
First gate to DENY stops the request. No HTTP request is made for denied URLs. This ordering is the same across all 29 implementations.
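The ordering rule can be made concrete with a minimal Python sketch. It is not the harness's code: `run_gates`, `Decision`, and the injected `robots_allows`, `throttle`, and `fetch` callables are hypothetical names chosen for illustration, and the blocklist check is simplified to an exact host match.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlsplit


@dataclass
class Decision:
    kind: str               # "FETCH", "R3-DENY", or "R2-DENY"
    status: Optional[int]   # HTTP status, or None when no request was made


def run_gates(
    url: str,
    blocklist: set,
    robots_allows: Callable[[str], bool],
    throttle: Callable[[str], None],
    fetch: Callable[[str], int],
) -> Decision:
    """Apply R3 -> R2 -> R4 -> fetch in order; the first DENY returns early."""
    host = urlsplit(url).hostname or ""
    if host in blocklist:                   # R3: operator blocklist (exact match only here)
        return Decision("R3-DENY", None)
    if not robots_allows(url):              # R2: robots.txt
        return Decision("R2-DENY", None)
    throttle(host)                          # R4: per-host delay before the request
    return Decision("FETCH", fetch(url))    # R5 backoff would live inside fetch


if __name__ == "__main__":
    made = []                               # records every URL that reached fetch

    def fake_fetch(url: str) -> int:
        made.append(url)
        return 200

    d = run_gates("https://example.com/", {"example.com"},
                  lambda u: True, lambda h: None, fake_fetch)
    print(d, made)
```

Because each gate returns before the next one runs, a blocklisted host never triggers a robots.txt lookup, a throttle sleep, or a request, which is the property the access log can confirm from the outside.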
3. REPL-first development
For each language with a genuine interactive evaluator, the implementation grew from a working 10-line REPL session:
| Language | REPL | First expression |
|---|---|---|
| Python | ipython | urllib.request.urlopen("https://wal.sh/robots.txt").read() |
| Clojure | clj -M -r | (http/get "https://wal.sh/robots.txt") |
| Ruby | irb | Net::HTTP.get(URI("https://wal.sh/robots.txt")) |
| Guile | guile | (http-get (string->uri "https://wal.sh/robots.txt")) |
| Elisp | ielm | (url-retrieve-synchronously "https://wal.sh/robots.txt") |
| Elixir | iex | :httpc.request(:get, {~c"https://wal.sh/robots.txt", []}, [], []) |
| Racket | racket -i | (port->string (get-pure-port (string->url "https://wal.sh/robots.txt"))) |
| Chez | scheme | (libcurl-get "https://wal.sh/robots.txt") |
| Perl | perl -de0 | HTTP::Tiny->new->get("https://wal.sh/robots.txt") |
| Tcl | tclsh | ::http::geturl "https://wal.sh/robots.txt" |
| Deno | deno | await fetch("https://wal.sh/robots.txt") |
For compiled languages (Go, Rust, Java, Swift), the REPL is the compile-run cycle. The methodology still applies: write the smallest thing that fetches a URL, observe the result, then grow the gate pipeline around it.
4. Verification: the access log as oracle
Every implementation tags its compliance fixture requests with R13:
GET /bot/?impl=walsh/compliance-harness/python&sha=7a01aea&spec=walsh-research-compliance/v1.3
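A short Python sketch of how such a tagged URL might be assembled. The helper name `tagged_url` and its parameters are assumptions for illustration; only the query layout is taken from the request line above.

```python
from urllib.parse import urlencode


def tagged_url(base: str, lang: str, sha: str, spec: str) -> str:
    """Append the R13 identification parameters to a fixture URL."""
    params = {
        "impl": f"walsh/compliance-harness/{lang}",
        "sha": str(sha),    # keep the SHA a string end to end
        "spec": spec,
    }
    # safe="/" leaves the slashes unescaped, matching the logged request line
    return f"{base}?{urlencode(params, safe='/')}"


if __name__ == "__main__":
    print(tagged_url("https://wal.sh/bot/", "python", "7a01aea",
                     "walsh-research-compliance/v1.3"))
    # https://wal.sh/bot/?impl=walsh/compliance-harness/python&sha=7a01aea&spec=walsh-research-compliance/v1.3
```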
The production access log is the single source of truth. After gmake test-all, gmake logs shows which languages checked in:
c chez clojure d deno elisp elixir erlang go guile haskell janet java julia kotlin lua nim perl php python racket ruby rust sbcl scala swift tcl typescript zig
No mocks. No local assertions about what the server saw. The real server saw it or it did not.
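What a check like gmake logs has to do can be sketched in a few lines of Python. This is an illustration, not the harness's actual target: the log lines below are made-up samples in an assumed combined log format, and `languages_seen` is a hypothetical helper.

```python
import re

# Matches the R13 tag inside a logged request line
IMPL_RE = re.compile(r"[?&]impl=walsh/compliance-harness/([a-z0-9]+)")


def languages_seen(log_lines):
    """Return the set of implementation names that appear in tagged requests."""
    seen = set()
    for line in log_lines:
        match = IMPL_RE.search(line)
        if match:
            seen.add(match.group(1))
    return seen


# Hypothetical sample lines (assumed combined log format, documentation IP range)
SAMPLE_LOG = [
    '192.0.2.10 - - [25/May/2026:10:00:00 +0000] "GET /bot/?impl=walsh/compliance-harness/python&sha=7a01aea&spec=walsh-research-compliance/v1.3 HTTP/1.1" 200 6721',
    '192.0.2.10 - - [25/May/2026:10:00:02 +0000] "GET /bot/?impl=walsh/compliance-harness/zig&sha=7a01aea&spec=walsh-research-compliance/v1.3 HTTP/1.1" 200 6721',
    '192.0.2.10 - - [25/May/2026:10:00:04 +0000] "GET /robots.txt HTTP/1.1" 200 312',
]

if __name__ == "__main__":
    expected = {"python", "zig", "tcl"}
    seen = languages_seen(SAMPLE_LOG)
    print("seen:", sorted(seen))
    print("missing:", sorted(expected - seen))
```

An implementation that was denied by its own gates, crashed, or sent an untagged request simply does not appear in the set, so the absence is the failure signal.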
5. CPRR cycle per language
Each implementation follows the CPRR cycle:
- Conjecture: open a REPL, fetch /robots.txt, parse it
- Proof: write the gate pipeline (<200 lines), commit
- Refinement: run against production, check access log
- Refutation: if it fails, git notes add documents why
The history holds 29 progressive feat(lang) commits (one per language) plus scaffold/docs, and 30 git notes as the failure ledger. Bugs found by *running*, not reading: Tcl's SHA float coercion (6062e32 → 6.062e+35) and a return inside catch that swallowed the blocklist; Erlang's fold dropping the final (Walsh-Research) group, plus illegal regex escapes; a Zig use-after-free where robots patterns pointed into a freed body; and three found live in the --spider path (finditer shape, robots-at-target-scheme, and transport-error fail-open; see clarification C5).
6. What the REPL teaches that tests do not
- Tcl coerces 6062e32 to 6.062e+35 (scientific notation). No unit test catches this because the test input would need to be a real git SHA that happens to look like a float.
- Chez Scheme has no native HTTP/TLS. The REPL session reveals this in 10 seconds. A CI pipeline reveals it in 10 minutes.
- Guile's (web request) module requires GnuTLS, which has a different CA path on macOS vs FreeBSD. The REPL shows the TLS handshake error immediately.
The REPL compresses the feedback loop from "write 200 lines, run tests, read error" to "evaluate one expression, observe, adjust."
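The SHA hazard from the first bullet is easy to reproduce at a Python prompt. Python's float parser stands in for Tcl's number parsing here, which is an approximation; `looks_numeric` is a hypothetical helper, not part of the harness.

```python
def looks_numeric(sha: str) -> bool:
    """True when a short SHA would also parse as a number."""
    try:
        float(sha)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(float("6062e32"))            # 6.062e+35: digits, 'e', digits is a valid float
    print(looks_numeric("6062e32"))    # True
    print(looks_numeric("7a01aea"))    # False: other hex letters break the parse
```

A guard like this could flag risky SHAs in a fixture, but the safer habit in any dynamically typed language is to keep the SHA a string at every step.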
7. Stretch goals
Three implementations that stress the spec's assumptions about what "native HTTP" means:
Realized in the 29-language corpus:
- libcurl-FFI tier: Chez, SBCL, Janet have no native HTTP/TLS, so they link libcurl through FFI (a 7-line C shim once, for the variadic curl_easy_setopt ABI). Not a subprocess: the honest answer to "is FFI-to-libcurl still native?"
- No runtime safety net: Zig and C; manual memory exposed a use-after-free the GC'd languages hid. Zig's 0.16 Io-based std.http.Client was the most churn.
- Different control model: Erlang/Elixir on OTP (gates kept deliberately sequential, not a supervision tree); Haskell's monadic IO making the DENY short-circuit structurally safe (no early-return footgun).
Still genuinely future: WASM/WASI-HTTP (fetch from the runtime), Make (the gate pipeline as a dependency graph), and a post-fetch X-Robots-Tag/Content-Signal gate (R14); see the harness's spec-clarifications.org.
8. Related
- REPL-Driven Flight Tracking — the methodology applied to data exploration
- CPRR Methodology — the lifecycle model
- Walsh-Research Compliance Spec (v1.3) — the normative contract
- Orchestrator Pattern — multi-agent decomposition
- walsh/compliance-harness — the 29-language implementation
- Experience report — per-language compliant-fetching notes (this corpus)
- kanaka/mal — prior art: same Lisp, 97 languages
9. Lisp-family REPL exercise: sbcl, ecl, clisp, fennel (2026-05-25)
Four Lisp-family implementations exercised in parallel tmux sessions
(repl-sbcl, repl-ecl, repl-clisp, repl-fennel) against the v1.3
harness on FreeBSD 15.0 / nexus.lan.
9.1. SBCL — existing impl, live redefinition
The SBCL implementation required one FreeBSD port: .dylib → .so
(shared library extension) and -dynamiclib → -shared -fPIC in the
compiler invocation. Everything else compiled unchanged.
Setup at the REPL: sbcl --load /tmp/sbcl-repl-setup.lisp
=== SBCL FreeBSD REPL setup loaded ===
libcurl FFI shim: /tmp/whc-sbcl-curlshim.so
spec: walsh-research-compliance/v1.3
Positive case:
(multiple-value-bind (status body hdrs)
    (http-get "https://wal.sh/bot/")
  (format t "status=~a body-length=~a~%" status (length body)))
;; status=200 body-length=6721
Full gate pipeline — named group found with crawl-delay=2 and 3 rules:
group-named=T delay=2 rules=3
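For comparison, the same R2 questions (is there a named group, what is its crawl delay, is a path allowed) can be asked of Python's standard urllib.robotparser. The robots.txt body below is a made-up example, not the real wal.sh file. The stdlib parser also applies rules in file order rather than RFC 9309's longest-match rule, so treat this as an illustration rather than a compliant R2 gate.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a default group plus a named group with 3 rules
ROBOTS = """\
User-agent: *
Disallow: /research/bots/dogfood-walsh-only

User-agent: Walsh-Research
Crawl-delay: 2
Allow: /research/bots/dogfood-walsh-only
Disallow: /research/bots/dogfood-disallow
Disallow: /private/
"""

UA = "Walsh-Research/1.2"

rp = RobotFileParser()
rp.parse(ROBOTS.splitlines())   # parse from memory; no network request is made

if __name__ == "__main__":
    print("delay:", rp.crawl_delay(UA))
    for path in ("/bot/", "/research/bots/dogfood-walsh-only",
                 "/research/bots/dogfood-disallow"):
        print(path, rp.can_fetch(UA, "https://wal.sh" + path))
```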
All 5 compliance targets passed:
[R5] FETCH https://wal.sh/bot/ -> 200
PASS allow FETCH https://wal.sh/bot/
[R4] wal.sh: wait 1.89s (interval 2.0s)
[R5] FETCH https://wal.sh/research/bots/dogfood-allow -> 200
PASS allow FETCH https://wal.sh/research/bots/dogfood-allow
[R4] wal.sh: wait 1.86s (interval 2.0s)
[R5] FETCH https://wal.sh/research/bots/dogfood-walsh-only -> 200
PASS allow FETCH https://wal.sh/research/bots/dogfood-walsh-only
[R2] DENY https://wal.sh/research/bots/dogfood-disallow (robots)
PASS deny R2-DENY https://wal.sh/research/bots/dogfood-disallow
[R3] DENY https://example.com/ (operator blocklist)
PASS deny R3-DENY https://example.com/
T
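The [R4] lines are consistent with a per-host throttle that sleeps for the interval minus the time already elapsed since the previous request. A minimal Python sketch of that idea follows. `HostThrottle` is a hypothetical name, and measuring from the start of the previous request is an assumption; the spec may define the reference point differently.

```python
import time


class HostThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock     # injectable so the logic can be tested without sleeping
        self._sleep = sleep
        self._last = {}         # host -> time the previous request was released

    def wait(self, host: str) -> float:
        """Sleep for the remainder of the interval, if any; return the delay used."""
        now = self._clock()
        last = self._last.get(host)
        delay = 0.0 if last is None else max(0.0, self.interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last[host] = now + delay
        return delay


if __name__ == "__main__":
    throttle = HostThrottle(interval=2.0)
    print(f"first wait: {throttle.wait('wal.sh'):.2f}s")    # no earlier request, no wait
    print(f"second wait: {throttle.wait('wal.sh'):.2f}s")   # close to the full interval
```

Keying the state by host means a slow interval on one site never delays requests to another, and the first request to any host goes out immediately.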
9.1.1. Live redefinition: deny-all mode
;; Override host-blocked? to always return T
(defun host-blocked? (host blocked)
  (declare (ignore host blocked))
  (format *error-output* " [R3-OVERRIDE] deny-all active~%")
  t)
;; WARNING: redefining COMMON-LISP-USER::HOST-BLOCKED? in DEFUN
SBCL emits a style warning but the redefinition takes effect immediately.
No restart required. The gate now denies every URL including /bot/:
 [R3-OVERRIDE] deny-all active
[R3] DENY https://wal.sh/bot/ (operator blocklist)
kind=R3-DENY status=NIL
Restoring the original works the same way: re-evaluate the original defun. SBCL's debugger never
fired during normal compliance testing; the condition system is quiet for
non-exceptional operations. The style warning on redefinition is the only
condition raised.
9.1.2. New function added at the REPL
(defun count-gates-passed (url blocked rules) "Count gates a URL clears before deny or fetch." ...)
Results:
| URL | gates-cleared | log |
|---|---|---|
| https://wal.sh/bot/ | 4 | (R3-PASS R2-PASS R4-throttle R5-fetch) |
| https://example.com/ | 0 | (R3-BLOCKED) |
| https://wal.sh/research/bots/dogfood-disallow | 1 | (R3-PASS R2-BLOCKED) |
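The counting rule behind this table can be sketched in Python. This is not a translation of the elided Lisp body: the signature differs (a `robots_allows` predicate replaces the rules argument), the blocklist check is an exact host match, and all names are hypothetical.

```python
from urllib.parse import urlsplit


def count_gates_passed(url, blocked, robots_allows):
    """Return (gates_cleared, log) for the R3 -> R2 -> R4 -> R5 order."""
    host = urlsplit(url).hostname or ""
    log = []
    if host in blocked:
        log.append("R3-BLOCKED")
    else:
        log.append("R3-PASS")
        if not robots_allows(url):
            log.append("R2-BLOCKED")
        else:
            log += ["R2-PASS", "R4-throttle", "R5-fetch"]
    # Every entry that is not a block counts as a cleared gate
    cleared = sum(1 for entry in log if not entry.endswith("BLOCKED"))
    return cleared, log


if __name__ == "__main__":
    blocked = {"example.com"}
    allows = lambda u: not u.endswith("/dogfood-disallow")   # stand-in for R2
    for url in ("https://wal.sh/bot/", "https://example.com/",
                "https://wal.sh/research/bots/dogfood-disallow"):
        print(url, *count_gates_passed(url, blocked, allows))
```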
9.2. ECL — no existing impl, exploration
ECL 24.5.10 has TCP sockets via the built-in SB-BSD-SOCKETS module:
(require 'sockets) ; => ("SB-BSD-SOCKETS" "SOCKETS")
DNS resolution works:
(sb-bsd-sockets:host-ent-address (sb-bsd-sockets:get-host-by-name "wal.sh")) ;; => #(173 236 197 209)
A plain TCP connection succeeds and an HTTP/1.0 GET gets a response, but
wal.sh answers it with a 400: the target requires HTTPS, and the
sb-bsd-sockets module provides no TLS layer.
Path to compliance: ECL has ffi:load-foreign-library. Loading libcurl
succeeded:
(ffi:load-foreign-library "/usr/local/lib/libcurl.so.4") ;; => #<codeblock "/usr/local/lib/libcurl.so.4" 0x833e10d80>
The same shim strategy used by SBCL applies — compile a 7-line C shim
for the variadic curl_easy_setopt ABI, load it via ffi:load-foreign-library.
All gate logic (R2/R3/R4/R5) is pure Lisp: no obstacles.
Condition system note: interrupting the blocking socket-connect call
produced an INTERACTIVE-INTERRUPT with numbered restarts:
Condition of type: INTERACTIVE-INTERRUPT
Console interrupt.
1. (CONTINUE) CONTINUE
2. (RESTART-TOPLEVEL) Go back to Top-Level REPL.
ECL and SBCL share the CL restart protocol for async interrupts.
9.3. CLISP — no existing impl, exploration
GNU CLISP 2.49.95+ bundles a SOCKET package directly in its image —
require fails but the package is already present. TCP via
socket:socket-connect:
(defvar *cs* (socket:socket-connect 80 "wal.sh"))
;; *CS*: returns a bidirectional stream, not a socket object
The return value is a stream, not a socket struct. HTTP/1.0 GET returned 400 for the same reason (HTTPS required).
CLISP's FFI confirmed libcurl is loadable and curl_easy_init returns a
valid handle:
(ffi:default-foreign-library "/usr/local/lib/libcurl.so.4")
(ffi:def-call-out curl-easy-init
  (:name "curl_easy_init")
  (:library "/usr/local/lib/libcurl.so.4")
  (:return-type ffi:c-pointer)
  (:arguments))
(curl-easy-init)
;; => #<FOREIGN-ADDRESS #x00003A7F906DD800>
CLISP gotcha: in the format string "~10 lines", the "~10" is read as a
directive with a field-width parameter (10), not as literal text. Put an
explicit "~%" before the number or escape the tilde as "~~". SBCL also
signals an error for (format t "~10 foo"), but with a different message.
9.4. Fennel/LuaJIT — no existing impl, exploration
Fennel 1.6.0 on PUC Lua 5.4 has neither socket (luasocket not installed)
nor ffi (that is a LuaJIT extension). However, Fennel can target
LuaJIT:
fennel --lua luajit
Welcome to Fennel 1.6.0 on LuaJIT 2.1.1765228720 BSD/x64!
On LuaJIT, (require :ffi) works. libcurl loaded and curl_easy_init
returned a valid pointer:
(local ffi (require :ffi))
(local lc (ffi.load "/usr/local/lib/libcurl.so.4"))
(local h (lc.curl_easy_init))
(print (tostring h))
;; cdata<void *>: 0x0a9c41095000
Full HTTPS GET of /bot/ at the REPL, using the same pre-compiled shim:
;; Assumes the ffi.cdef declarations for the libcurl, libc, and shim
;; functions were evaluated earlier in the session.
(do
  (local ffi (require :ffi))
  (local lc (ffi.load "/usr/local/lib/libcurl.so.4"))
  (local sh (ffi.load "/tmp/whc-sbcl-curlshim.so"))
  (local lbc (ffi.load "/lib/libc.so.7"))
  (lc.curl_global_init 3)                  ; 3 = CURL_GLOBAL_ALL
  (local h (lc.curl_easy_init))
  (local fp (lbc.fopen "/tmp/body.txt" "wb"))
  ;; 10002 = CURLOPT_URL, 10018 = CURLOPT_USERAGENT
  (sh.shim_setopt_str h 10002 "https://wal.sh/bot/?impl=walsh/compliance-harness/fennel")
  (sh.shim_setopt_str h 10018 "Mozilla/5.0 (compatible; Walsh-Research/1.2; +https://wal.sh/bot/)")
  (sh.shim_setopt_long h 52 1)             ; CURLOPT_FOLLOWLOCATION
  (sh.shim_setopt_long h 13 30)            ; CURLOPT_TIMEOUT, in seconds
  (sh.shim_setopt_ptr h 10001 fp)          ; CURLOPT_WRITEDATA: body goes to the file
  (local rc (lc.curl_easy_perform h))
  (lbc.fclose fp)
  (local cb (ffi.new "long[1]"))
  (sh.shim_getinfo_long h 2097154 cb)      ; CURLINFO_RESPONSE_CODE
  (local status (tonumber (. cb 0)))
  (lc.curl_easy_cleanup h)
  (print (string.format "[fennel/repl] rc=%d status=%d" rc status)))
;; [fennel/repl] rc=0 status=200
R3 gate sketched in Fennel:
(fn host-blocked? [host blocked]
(var found false)
(each [_ d (ipairs blocked)]
(when (or (= host d)
(= (string.sub host (- (+ 1 (length d)))) (.. "." d)))
(set found true)))
found)
(print (host-blocked? "example.com" ["example.com" "badactor.net"]))
;; true
(print (host-blocked? "wal.sh" ["example.com" "badactor.net"]))
;; false
9.5. Summary: libcurl shim as a compliance primitive
All four Lisp environments can load /usr/local/lib/libcurl.so.4 via
their respective FFI mechanisms. The variadic ABI shim (7 lines of C,
compiled once at startup to /tmp/whc-*-curlshim.so) is the only
platform-specific artifact. Gate logic is portable across CL dialects
and Fennel with minimal adaptation.
| Impl | Runtime | FFI | libcurl | HTTPS fetch | Gate logic |
|---|---|---|---|---|---|
| sbcl | SBCL 2.5.7 | sb-alien | WORKS | WORKS (200) | all gates |
| ecl | ECL 24.5.10 | ffi:load-foreign-library | LOADS | shim needed | all gates |
| clisp | CLISP 2.49.95+ | ffi:def-call-out | WORKS (handle ok) | shim needed | all gates |
| fennel | LuaJIT 2.1 | ffi (LuaJIT) | WORKS | WORKS (200) | R3 sketched |
The SBCL condition/restart system was silent during normal compliance
runs. The only conditions raised were style warnings on defun
redefinition — non-fatal, non-interactive, proceed automatically.
ECL's INTERACTIVE-INTERRUPT on Ctrl-C is the same protocol with
numbered restarts: the CL restart vocabulary is consistent across
implementations.