Modeling Apache Redirect Rules in TLA+

Table of Contents

1. The problem: redirect rules compose into state machines

An .htaccess file is a program. Each RewriteRule and Redirect directive is a transition in a state machine where the state is a URL and the transitions are rewrites. When rules interact – one rule's output is another rule's input – the composition can loop.

The wal.sh .htaccess has grown to include:

  • HTTPS enforcement (http → https)
  • Canonical URL enforcement (.html → extensionless, 301)
  • Ghost page removal (README.html → 410)
  • Content negotiation (extensionless → .md when Accept: text/markdown)
  • Extensionless fallback (extensionless → .html, internal rewrite)
  • Legacy path redirects (encode → transform, graphql/* → *, etc.)
  • Explicit per-page redirects (flat file → directory, etc.)

The dangerous interaction is between rules 2 and 5:

Rule 2: /research/apl.html → /research/apl  (301 external)
Rule 5: /research/apl     → /research/apl.html  (internal rewrite)

On paper, this loops. In practice, Apache's THE_REQUEST variable – which freezes the original client request and never changes on internal rewrites – prevents rule 2 from re-firing. But "in practice" is not a proof. The question is whether this holds across all URL permutations: with query params, with trailing slashes, with README.html fragments, with the Accept header varying, with the legacy redirects interleaving.

That question is a model-checking question.

2. Why TLA+ and not just testing

You could curl every URL. We have ~682 pages, 2 schemes, 2 extensions, an Accept header toggle, and query params – call it 5,000 test cases. But:

  1. You only test the URLs that exist today. A future page at a new path might hit a rule interaction you never tested.
  2. Redirect chains are sequential – a 3-hop chain means 3 HTTP requests. At scale, this is slow and flaky.
  3. The interesting bugs are in the rule interaction, not the individual rules. Testing individual URLs does not prove the absence of loops in the composition.

TLA+ explores the state space exhaustively. The model has a finite set of URL shapes, a finite set of rules, and a bounded hop count. The model checker (TLC) enumerates every reachable state and verifies two properties:

  • Safety: hops < MaxHops= – no redirect storm.
  • Liveness: <>(phase = "done") – every request eventually terminates.

If either property fails, TLC produces a counterexample: the exact sequence of rule firings that violates the property. That counterexample is the bug.

3. The model

The full spec is at docs/htaccess-redirect-spec.tla. The key design decisions:

3.1. State

A request is a record:

request == [scheme, path, ext, query, fragment]

the_request is a separate variable that freezes at connection time and never changes on internal rewrites. This models Apache's THE_REQUEST server variable, which is the linchpin of loop prevention.

3.2. Phases

Apache processes rules in two phases. External redirects ([R=301]) send the client a new URL; the client makes a new request and all rules re-evaluate. Internal rewrites ([L]) change the served file without a new HTTP request; rules in the current pass stop but do not restart from the top.

The model has three phases: redirect (external rules), rewrite (internal rules), and done (terminal).

3.3. The critical invariant

Rule 2 (.html → extensionless 301) uses THE_REQUEST:

RewriteCond %{THE_REQUEST} \s/(.+)\.html[\s?] [NC]

After rule 2 fires and the client follows the 301 to the extensionless URL, rule 5 internally rewrites it back to .html. But THE_REQUEST for the new request is the extensionless URL – it does not contain .html – so rule 2's condition fails. No loop.

In the TLA+ model, this is expressed as:

HtmlToExtensionless ==
    /\ phase = "redirect"
    /\ request.ext = ".html"
    \* THE_REQUEST must contain .html — loop-prevention guard
    /\ the_request contains ".html"
    /\ request' = [request EXCEPT !.ext = ""]
    /\ the_request' = the_request    \* frozen, unchanged
    ...

The internal rewrite does not update the_request, so the guard fails on re-evaluation.

4. A real bug found

Before modeling, the .htaccess contained:

Redirect 301 /research/local-first/ /research/local-first.html

This redirects a trailing-slash request to the .html file. Combined with the new global .html → extensionless rule, the chain becomes:

GET /research/local-first/
  → 301 /research/local-first.html    (explicit Redirect, line 142)
  → 301 /research/local-first         (global .html→extensionless rule)
  → internal rewrite to local-first.html
  → 200

Three hops instead of one. Not a loop, but a wasteful chain. The TLA+ model flagged this as a state with hops = 2 before reaching done. The fix: redirect directly to the extensionless canonical:

Redirect 301 /research/local-first/ /research/local-first

Now: one hop, then internal rewrite, then 200.

4.1. Bug 2: prefix match shadows canary routes

The /ouroboros landing page initially used Apache's Redirect directive:

Redirect 301 /ouroboros /research/tla-plus-htaccess-redirects

This silently broke the canaries. Apache's Redirect is a prefix match: /ouroboros matches /ouroboros/a, /ouroboros/b, and /ouroboros/c – all three test paths were redirected to the research note instead of looping. The swap canary became dead code.

The symptom in production:

GET /ouroboros/a
  → 301 /research/tla-plus-htaccess-redirects/a   (prefix match, wrong!)
  → 404

The TLA+ model should have caught this, but the spec used exact string equality (=) for path matching, which does not model Apache's prefix semantics. A PrefixMatch operator would have surfaced the conflict: both OuroborosLanding and OuroborosA enabled for the same input, with the landing taking priority by rule order.

The fix: RedirectMatch with an anchored regex that matches only the bare path:

RedirectMatch 301 ^/ouroboros/?$ /research/tla-plus-htaccess-redirects

This is the more interesting bug class. The local-first chain (Bug 1) is a rule-interaction problem – two correct rules compose badly. The prefix shadow is a semantic mismatch – the developer's mental model (Redirect is exact) differs from Apache's actual behavior (Redirect is prefix). TLA+ catches rule interactions naturally but catches semantic mismatches only if the model faithfully represents the directive semantics. Model the wrong thing and you verify the wrong thing.

5. The webring trap: intentional non-termination for bots

The ?w= and ?t= query parameters are Winding Web Ring (WWN) torus coordinates. They generate an infinite URL space by design – a honeypot that traps aggressive crawlers in an endless traversal. Amazonbot's 28K requests/day on these coordinates is the trap working.

The TLA+ model explicitly excludes these parameters from redirect rules. The invariant is:

Query params ?w= and ?t= MUST pass through all redirect rules unchanged.
No rule may strip, rewrite, or redirect based on w or t values.

This is a safety property of the redirect system, not a liveness property. The webring's non-termination is in the crawl graph, not in the redirect chain. Any single request with ?w=3&t=1 terminates in one hop (200). The non-termination is that following the links on the served page leads to ?w=4&t=1, then ?w=5&t=1, forever. That is the webring, not a redirect loop.

6. Practical workflow

The .htaccess header now mandates:

Any change to redirect or rewrite rules MUST be modeled in TLA+
before deployment to prove termination across all URL permutations.

In practice:

  1. Edit docs/htaccess-redirect-spec.tla to add the new rule as a transition.
  2. Run TLC: tlc HtaccessRedirects -config HtaccessRedirects.cfg
  3. If TLC finds a counterexample, the rule interaction is unsafe. Fix it.
  4. If TLC passes, the composition terminates. Deploy.

The initial state space is small (~5,000 states for 6 paths × 2 schemes × 2 extensions × 2 accepts × 2 query shapes × 2 fragments). TLC explores it in under a second. Adding a new redirect rule adds one transition and marginally increases the state space. The cost of running the check is negligible compared to debugging a redirect storm in production.

7. The ouroboros canary

The .htaccess includes intentional non-terminating redirects under /ouroboros/. The bare path /ouroboros redirects here (to this page). The sub-paths are the test canaries:

Test URL Failure mode Expected
Swap /ouroboros/a Cycle (a → b → a → …) ERR_TOO_MANY_REDIRECTS after ~20 hops
Grow /ouroboros/c Acyclic divergence (c → cc → ccc → …) 414 Request-URI Too Long at ~8190 bytes

When you run the TLA+ model checker with /ouroboros/a in the initial state, TLC must report a liveness violation:

Error: Temporal properties were violated.
The following behavior constitutes a counterexample:
  State 1: request.path = "ouroboros/a", phase = "redirect", hops = 0
  State 2: request.path = "ouroboros/b", phase = "redirect", hops = 1
  State 3: request.path = "ouroboros/a", phase = "redirect", hops = 2
  ...
  Back to state 2.

If TLC does not flag this, the spec is wrong – the termination property is too weak or the ouroboros transitions are unreachable. The canary validates the validator.

7.1. Type 1: swap (cycle)

/ouroboros/a/ouroboros/b. The state graph has a cycle of length 2. TLC detects this as a liveness violation: the system never reaches done. Browsers give up after ~20 hops (ERR_TOO_MANY_REDIRECTS).

curl -sIL https://wal.sh/ouroboros/a 2>&1 | grep -E 'HTTP/|Location:'
# 20 × 301 alternating between /ouroboros/a and /ouroboros/b

7.2. Type 2: grow (unbounded state)

/ouroboros/c/ouroboros/cc/ouroboros/ccc → … The path grows by one character each hop. The state graph has no cycle – every state is new – but it never terminates either.

In practice, the chain hits three limits (binary search on path length):

Path length Status Enforcer
≤ 4058 chars 301 RewriteRule fires, redirect works
4059–8169 chars 403 Server-level path limit (DreamHost mod_security)
≥ 8170 chars 414 Apache LimitRequestLine (stock default ~8190 bytes)

A vanilla Apache would reach 414 at ~8190 bytes. This server's 403 at ~4058 chars is a hosting-provider restriction – the same class of "the model assumes one thing, production has another" gap that makes verification worth doing. TLC detects the grow canary as a NoRedirectStorm violation when hops > MaxHops, regardless of which server limit actually fires.

curl -sIL https://wal.sh/ouroboros/c 2>&1 | grep -E 'HTTP/|Location:'
# 301 /ouroboros/cc, 301 /ouroboros/ccc, ..., 414 Request-URI Too Long

This is a more subtle pathology than the swap loop. A naive "detect cycles" check would miss it entirely – the redirect chain is acyclic but infinite. The TLA+ model catches it because it bounds hops, not cycles.

8. Client redirect limits

The TLA+ model bounds hops at MaxHops. In production, the bound is imposed by the client. The core clients that matter for redirect validation:

Client Default max redirects On exceed
Browsers (Chrome/Firefox/Safari) 20 ERR_TOO_MANY_REDIRECTS
curl 50 error 47
Python requests 30 TooManyRedirects
Go net/http 10 *url.Error

The range is 10–50. Testing with curl (50 hops) exercises more of the chain than a browser (20 hops); Go at 10 is the most conservative.

Full empirical results across 13 clients (including Deno, Perl, Ruby, Crystal, Erlang, Elixir, Racket) with tangleable code blocks and server-side log verification: HTTP Redirect Limits: One Ouroboros, Six Languages.

For the wal.sh bot compliance spec (C6), redirect policy is deferred to v1.4.