TLA+ for Build Pipeline Verification: When Make Targets Are State Machines

Table of Contents

1. The observation

A build pipeline is a distributed system. The wal.sh publish pipeline touches four machines (FreeBSD workstation, Mac image generator, DreamHost VPS, GitHub), makes network calls that can fail independently, and has ordering constraints between steps. A banner generation alone crosses three failure domains:

workstation → ollama on mac.lan:11434 → workstation → scp to VPS
   (shell)     (HTTP/model inference)     (magick)     (SSH/scp)

Each arrow can fail: the Mac can be asleep, ollama can OOM, magick can be missing (as we discovered when pkg remove gnutls cascaded through 323 packages), or scp can timeout. The pipeline has no memory of where it failed. Re-running gmake deploy starts from scratch.

This is the same class of problem TLA+ was designed for: concurrent processes with partial failure, where "just retry from the top" is wasteful and "resume from where we stopped" requires tracking state.

2. What Make gets right

GNU Make already models a subset of this. A non-PHONY target with a file dependency is a state machine transition:

banner.png: index.org
    ./scripts/gen-banner "$(PROMPT)" $@

If banner.png exists and is newer than index.org, Make skips the step. This is idempotency via filesystem timestamps – the banner.png file is the state. Make's dependency graph is a DAG of such transitions, and gmake evaluates it correctly for the local filesystem.

The problem is that Make's state model is files on one machine. It has no model for:

  • Network calls that succeed on the remote but fail on write-back
  • Multi-machine state (image generated on Mac but not yet on workstation)
  • Partial products (raw 1024x1024 PNG exists but 728x90 crop doesn't)
  • External service availability (ollama loaded but model not pulled)
  • ABI compatibility (magick binary exists but liblcms2_fast_float.so.1 doesn't)

Each of these is a state the pipeline can be in, and Make cannot represent it.

3. What went wrong: a session log

On 2026-06-14, the wal.sh publish pipeline hit every failure mode in a single session:

Step Expected Actual Root cause
gmake publish org-publish via emacs liblcms2_fast_float.so.1 not found pkg remove gnutls cascaded 323 packages
gmake deploy scp .htaccess + HTML deploy-preflight refused (ouroboros targets missing) Preflight checks file existence for redirect targets
gen-banner 728x90 PNG magick: command not found ImageMagick7 removed in cascade
gen-banner (retry) use Pillow fallback success Pillow was not in the cascade
git commit signed commit cannot run gpg gnupg removed in cascade
guile3 HTTPS test redirect chain gnutls-not-available guile-gnutls not in FreeBSD ports
emacs --batch 30.2 from pkg libtree-sitter.so.0.25 not found tree-sitter upgraded 0.25→0.26

Every failure is a state that Make cannot see. The Makefile says "run emacs" but doesn't know that emacs needs liblcms2_fast_float.so.1 which needs lcms2 at a specific ABI version.

4. The TLA+ model

A build pipeline modeled as a state machine has these components:

4.1. State

PipelineState == [
    step     : {"idle", "generating", "cropping", "converting",
                "deploying", "verifying", "done", "failed"},
    artifacts : SUBSET {"raw_png", "cropped_png", "greyscale_png",
                        "deployed_png", "html_published", "htaccess_deployed",
                        "sitemap_deployed", "index_deployed"},
    services  : [ollama: BOOLEAN, emacs: BOOLEAN, magick: BOOLEAN,
                 gpg: BOOLEAN, ssh: BOOLEAN],
    errors    : Seq(STRING)
]

4.2. Transitions

Each build step is a transition that:

  1. Checks a precondition (service available, input artifact exists)
  2. Performs an action (network call, shell command)
  3. Produces an output artifact or an error
GenerateBanner ==
    /\ step = "generating"
    /\ services.ollama = TRUE
    /\ "raw_png" \notin artifacts
    /\ artifacts' = artifacts \union {"raw_png"}
    /\ step' = "cropping"

CropBanner ==
    /\ step = "cropping"
    /\ services.magick = TRUE \/ services.pillow = TRUE
    /\ "raw_png" \in artifacts
    /\ artifacts' = artifacts \union {"cropped_png"}
    /\ step' = "converting"

CropBannerFallback ==
    /\ step = "cropping"
    /\ services.magick = FALSE
    /\ services.pillow = TRUE
    /\ "raw_png" \in artifacts
    /\ artifacts' = artifacts \union {"cropped_png"}
    /\ step' = "converting"

4.3. Failure and recovery

The key insight: failure is not "the pipeline stopped" but "the pipeline is in a state where some artifacts exist and some don't." Recovery means finding a transition from that state to the next step, not restarting from scratch.

RecoverFromCropFailure ==
    /\ step = "failed"
    /\ "raw_png" \in artifacts
    /\ "cropped_png" \notin artifacts
    /\ step' = "cropping"        \* retry just the crop step

This is what gmake does for file targets – if banner.png is missing but raw.png exists, Make re-runs only the crop step. The extension is modeling non-file state: service availability, ABI compatibility, network reachability.

5. The beads connection

The beads issue tracker already tracks tasks with IDs and status. Extending beads to track artifact generation creates a content-addressable pipeline:

bd create --title="Banner: redirect-client-limits" \
          --type=task --priority=3 \
          --notes="prompt_hash=sha256:a3f2..., model=flux2-klein:4b, target=728x90-grey"

The bead ID becomes the pipeline state. Each step updates the bead:

(-> (bd/create {:title "Banner: redirect-client-limits"
                :prompt-hash "sha256:a3f2..."
                :model "x/flux2-klein:4b"})
    (generate-raw {:host "mac.lan:11434"})   ;; ollama call
    (crop {:tool :pillow :size [728 90]})     ;; fallback chain
    (greyscale)                                ;; colorspace conversion
    (validate {:format :png :depth 8})         ;; check output
    (deploy {:target "vps:~/wal.sh/research/"}) ;; scp
    (bd/close))

If generate-raw fails (Mac asleep), the bead stays at status in_progress with :step :generate-raw and :error :connection-refused. The REPL can retry from that point:

(-> (bd/show "www.wal.sh-oj8")
    (retry-from-step :generate-raw {:host "mac.lan:11434"}))

This is the threading macro as a pipeline checkpoint. Each step is idempotent (re-running with the same input produces the same output), and the bead tracks which steps have completed.

6. Non-PHONY Make targets as verification

The practical lesson: use non-PHONY targets in the Makefile wherever the output is a file. This gives Make's built-in state tracking for free.

# BAD: PHONY target, always re-runs
.PHONY: banner
banner:
        ./scripts/gen-banner "$(PROMPT)" site/research/$(SLUG)/banner.png

# GOOD: file target, skips if newer than source
site/research/%/banner.png: site/research/%/index.org
        ./scripts/gen-banner "$$(head -1 $</scripts/prompt-from-org $<)" $@

# GOOD: deploy only if local file changed
deploy-banner-%: site/research/%/banner.png
        scp $< www_wal_sh@vps:~/wal.sh/research/$*/banner.png
        touch $@

The deploy-banner-% target creates a stamp file. If the stamp exists and is newer than the local banner.png, the deploy is skipped. This is Make's timestamp DAG doing exactly what the TLA+ model describes: tracking which artifacts exist and which transitions have completed.

The gap remains for non-file state (service availability, ABI compat), but for the file-artifact portion of the pipeline, non-PHONY targets are the correct and sufficient mechanism.

7. The pkg remove cascade as a safety violation

The gnutls cascade removal (audit) is a safety violation in the TLA+ sense: the system transitioned to a state where services.emacs = FALSE and services.magick = FALSE simultaneously, making multiple pipeline steps unreachable.

In a TLA+ model of the package manager:

PkgRemove(pkg) ==
    /\ pkg \in installed
    /\ LET deps == ReverseDeps(pkg)
       IN installed' = installed \ ({pkg} \union deps)

Safety == services.emacs = TRUE /\ services.gpg = TRUE

The model checker would find that PkgRemove("gnutls") violates Safety because ReverseDeps("gnutls") includes emacs via the gnutls → glib-networking → gtk3 → emacs chain.

This is not hypothetical – it happened. The fix is either:

  1. Check pkg info -r gnutls | wc -l before removing (manual guard)
  2. Use pkg remove --dry-run and parse the output (automated guard)
  3. Model the package dependency graph and verify safety before any removal

Option 3 is what TLA+ provides. Whether it's worth the effort for package management is debatable; for build pipelines that cross machine boundaries, it clearly is.