TLA+ for Build Pipeline Verification: When Make Targets Are State Machines
Table of Contents
1. The observation
A build pipeline is a distributed system. The wal.sh publish pipeline touches four machines (FreeBSD workstation, Mac image generator, DreamHost VPS, GitHub), makes network calls that can fail independently, and has ordering constraints between steps. A banner generation alone crosses three failure domains:
workstation → ollama on mac.lan:11434 → workstation → scp to VPS (shell) (HTTP/model inference) (magick) (SSH/scp)
Each arrow can fail: the Mac can be asleep, ollama can OOM, magick can
be missing (as we discovered when pkg remove gnutls cascaded through
323 packages), or scp can timeout. The pipeline has no memory of where
it failed. Re-running gmake deploy starts from scratch.
This is the same class of problem TLA+ was designed for: concurrent processes with partial failure, where "just retry from the top" is wasteful and "resume from where we stopped" requires tracking state.
2. What Make gets right
GNU Make already models a subset of this. A non-PHONY target with a file dependency is a state machine transition:
banner.png: index.org
./scripts/gen-banner "$(PROMPT)" $@
If banner.png exists and is newer than index.org, Make skips the step.
This is idempotency via filesystem timestamps – the banner.png file is
the state. Make's dependency graph is a DAG of such transitions, and gmake
evaluates it correctly for the local filesystem.
The problem is that Make's state model is files on one machine. It has no model for:
- Network calls that succeed on the remote but fail on write-back
- Multi-machine state (image generated on Mac but not yet on workstation)
- Partial products (raw 1024x1024 PNG exists but 728x90 crop doesn't)
- External service availability (ollama loaded but model not pulled)
- ABI compatibility (
magickbinary exists butliblcms2_fast_float.so.1doesn't)
Each of these is a state the pipeline can be in, and Make cannot represent it.
3. What went wrong: a session log
On 2026-06-14, the wal.sh publish pipeline hit every failure mode in a single session:
| Step | Expected | Actual | Root cause |
|---|---|---|---|
gmake publish |
org-publish via emacs | liblcms2_fast_float.so.1 not found |
pkg remove gnutls cascaded 323 packages |
gmake deploy |
scp .htaccess + HTML | deploy-preflight refused (ouroboros targets missing) | Preflight checks file existence for redirect targets |
gen-banner |
728x90 PNG | magick: command not found |
ImageMagick7 removed in cascade |
gen-banner (retry) |
use Pillow fallback | success | Pillow was not in the cascade |
git commit |
signed commit | cannot run gpg |
gnupg removed in cascade |
guile3 HTTPS |
test redirect chain | gnutls-not-available |
guile-gnutls not in FreeBSD ports |
emacs --batch |
30.2 from pkg | libtree-sitter.so.0.25 not found |
tree-sitter upgraded 0.25→0.26 |
Every failure is a state that Make cannot see. The Makefile says "run emacs"
but doesn't know that emacs needs liblcms2_fast_float.so.1 which needs
lcms2 at a specific ABI version.
4. The TLA+ model
A build pipeline modeled as a state machine has these components:
4.1. State
PipelineState == [
step : {"idle", "generating", "cropping", "converting",
"deploying", "verifying", "done", "failed"},
artifacts : SUBSET {"raw_png", "cropped_png", "greyscale_png",
"deployed_png", "html_published", "htaccess_deployed",
"sitemap_deployed", "index_deployed"},
services : [ollama: BOOLEAN, emacs: BOOLEAN, magick: BOOLEAN,
gpg: BOOLEAN, ssh: BOOLEAN],
errors : Seq(STRING)
]
4.2. Transitions
Each build step is a transition that:
- Checks a precondition (service available, input artifact exists)
- Performs an action (network call, shell command)
- Produces an output artifact or an error
GenerateBanner ==
/\ step = "generating"
/\ services.ollama = TRUE
/\ "raw_png" \notin artifacts
/\ artifacts' = artifacts \union {"raw_png"}
/\ step' = "cropping"
CropBanner ==
/\ step = "cropping"
/\ services.magick = TRUE \/ services.pillow = TRUE
/\ "raw_png" \in artifacts
/\ artifacts' = artifacts \union {"cropped_png"}
/\ step' = "converting"
CropBannerFallback ==
/\ step = "cropping"
/\ services.magick = FALSE
/\ services.pillow = TRUE
/\ "raw_png" \in artifacts
/\ artifacts' = artifacts \union {"cropped_png"}
/\ step' = "converting"
4.3. Failure and recovery
The key insight: failure is not "the pipeline stopped" but "the pipeline is in a state where some artifacts exist and some don't." Recovery means finding a transition from that state to the next step, not restarting from scratch.
RecoverFromCropFailure ==
/\ step = "failed"
/\ "raw_png" \in artifacts
/\ "cropped_png" \notin artifacts
/\ step' = "cropping" \* retry just the crop step
This is what gmake does for file targets – if banner.png is missing
but raw.png exists, Make re-runs only the crop step. The extension is
modeling non-file state: service availability, ABI compatibility, network
reachability.
5. The beads connection
The beads issue tracker already tracks tasks with IDs and status. Extending beads to track artifact generation creates a content-addressable pipeline:
bd create --title="Banner: redirect-client-limits" \
--type=task --priority=3 \
--notes="prompt_hash=sha256:a3f2..., model=flux2-klein:4b, target=728x90-grey"
The bead ID becomes the pipeline state. Each step updates the bead:
(-> (bd/create {:title "Banner: redirect-client-limits" :prompt-hash "sha256:a3f2..." :model "x/flux2-klein:4b"}) (generate-raw {:host "mac.lan:11434"}) ;; ollama call (crop {:tool :pillow :size [728 90]}) ;; fallback chain (greyscale) ;; colorspace conversion (validate {:format :png :depth 8}) ;; check output (deploy {:target "vps:~/wal.sh/research/"}) ;; scp (bd/close))
If generate-raw fails (Mac asleep), the bead stays at status
in_progress with :step :generate-raw and :error :connection-refused.
The REPL can retry from that point:
(-> (bd/show "www.wal.sh-oj8") (retry-from-step :generate-raw {:host "mac.lan:11434"}))
This is the threading macro as a pipeline checkpoint. Each step is idempotent (re-running with the same input produces the same output), and the bead tracks which steps have completed.
6. Non-PHONY Make targets as verification
The practical lesson: use non-PHONY targets in the Makefile wherever the output is a file. This gives Make's built-in state tracking for free.
# BAD: PHONY target, always re-runs .PHONY: banner banner: ./scripts/gen-banner "$(PROMPT)" site/research/$(SLUG)/banner.png # GOOD: file target, skips if newer than source site/research/%/banner.png: site/research/%/index.org ./scripts/gen-banner "$$(head -1 $</scripts/prompt-from-org $<)" $@ # GOOD: deploy only if local file changed deploy-banner-%: site/research/%/banner.png scp $< www_wal_sh@vps:~/wal.sh/research/$*/banner.png touch $@
The deploy-banner-% target creates a stamp file. If the stamp exists and
is newer than the local banner.png, the deploy is skipped. This is Make's
timestamp DAG doing exactly what the TLA+ model describes: tracking which
artifacts exist and which transitions have completed.
The gap remains for non-file state (service availability, ABI compat), but for the file-artifact portion of the pipeline, non-PHONY targets are the correct and sufficient mechanism.
7. The pkg remove cascade as a safety violation
The gnutls cascade removal (audit) is a safety violation in the TLA+ sense:
the system transitioned to a state where services.emacs = FALSE and
services.magick = FALSE simultaneously, making multiple pipeline steps
unreachable.
In a TLA+ model of the package manager:
PkgRemove(pkg) ==
/\ pkg \in installed
/\ LET deps == ReverseDeps(pkg)
IN installed' = installed \ ({pkg} \union deps)
Safety == services.emacs = TRUE /\ services.gpg = TRUE
The model checker would find that PkgRemove("gnutls") violates
Safety because ReverseDeps("gnutls") includes emacs via the
gnutls → glib-networking → gtk3 → emacs chain.
This is not hypothetical – it happened. The fix is either:
- Check
pkg info -r gnutls | wc -lbefore removing (manual guard) - Use
pkg remove --dry-runand parse the output (automated guard) - Model the package dependency graph and verify safety before any removal
Option 3 is what TLA+ provides. Whether it's worth the effort for package management is debatable; for build pipelines that cross machine boundaries, it clearly is.