Modeling Apache Redirect Rules in TLA+
Table of Contents
1. The problem: redirect rules compose into state machines
An .htaccess file is a program. Each RewriteRule and Redirect directive
is a transition in a state machine where the state is a URL and the transitions
are rewrites. When rules interact – one rule's output is another rule's input
– the composition can loop.
The wal.sh .htaccess has grown to include:
- HTTPS enforcement (http → https)
- Canonical URL enforcement (.html → extensionless, 301)
- Ghost page removal (README.html → 410)
- Content negotiation (extensionless → .md when
Accept: text/markdown) - Extensionless fallback (extensionless → .html, internal rewrite)
- Legacy path redirects (encode → transform, graphql/* → *, etc.)
- Explicit per-page redirects (flat file → directory, etc.)
The dangerous interaction is between rules 2 and 5:
Rule 2: /research/apl.html → /research/apl (301 external) Rule 5: /research/apl → /research/apl.html (internal rewrite)
On paper, this loops. In practice, Apache's THE_REQUEST variable – which
freezes the original client request and never changes on internal rewrites –
prevents rule 2 from re-firing. But "in practice" is not a proof. The question
is whether this holds across all URL permutations: with query params, with
trailing slashes, with README.html fragments, with the Accept header
varying, with the legacy redirects interleaving.
That question is a model-checking question.
2. Why TLA+ and not just testing
You could curl every URL. We have ~682 pages, 2 schemes, 2 extensions, an
Accept header toggle, and query params – call it 5,000 test cases. But:
- You only test the URLs that exist today. A future page at a new path might hit a rule interaction you never tested.
- Redirect chains are sequential – a 3-hop chain means 3 HTTP requests. At scale, this is slow and flaky.
- The interesting bugs are in the rule interaction, not the individual rules. Testing individual URLs does not prove the absence of loops in the composition.
TLA+ explores the state space exhaustively. The model has a finite set of URL shapes, a finite set of rules, and a bounded hop count. The model checker (TLC) enumerates every reachable state and verifies two properties:
- Safety:
hops <MaxHops= – no redirect storm. - Liveness:
<>(phase = "done")– every request eventually terminates.
If either property fails, TLC produces a counterexample: the exact sequence of rule firings that violates the property. That counterexample is the bug.
3. The model
The full spec is at docs/htaccess-redirect-spec.tla. The key design
decisions:
3.1. State
A request is a record:
request == [scheme, path, ext, query, fragment]
the_request is a separate variable that freezes at connection time and never
changes on internal rewrites. This models Apache's THE_REQUEST server
variable, which is the linchpin of loop prevention.
3.2. Phases
Apache processes rules in two phases. External redirects ([R=301]) send the
client a new URL; the client makes a new request and all rules re-evaluate.
Internal rewrites ([L]) change the served file without a new HTTP request;
rules in the current pass stop but do not restart from the top.
The model has three phases: redirect (external rules), rewrite (internal
rules), and done (terminal).
3.3. The critical invariant
Rule 2 (.html → extensionless 301) uses THE_REQUEST:
RewriteCond %{THE_REQUEST} \s/(.+)\.html[\s?] [NC]
After rule 2 fires and the client follows the 301 to the extensionless URL,
rule 5 internally rewrites it back to .html. But THE_REQUEST for the new
request is the extensionless URL – it does not contain .html – so rule 2's
condition fails. No loop.
In the TLA+ model, this is expressed as:
HtmlToExtensionless == /\ phase = "redirect" /\ request.ext = ".html" \* THE_REQUEST must contain .html — loop-prevention guard /\ the_request contains ".html" /\ request' = [request EXCEPT !.ext = ""] /\ the_request' = the_request \* frozen, unchanged ...
The internal rewrite does not update the_request, so the guard fails on
re-evaluation.
4. A real bug found
Before modeling, the .htaccess contained:
Redirect 301 /research/local-first/ /research/local-first.html
This redirects a trailing-slash request to the .html file. Combined with the
new global .html → extensionless rule, the chain becomes:
GET /research/local-first/ → 301 /research/local-first.html (explicit Redirect, line 142) → 301 /research/local-first (global .html→extensionless rule) → internal rewrite to local-first.html → 200
Three hops instead of one. Not a loop, but a wasteful chain. The TLA+ model
flagged this as a state with hops = 2 before reaching done. The fix:
redirect directly to the extensionless canonical:
Redirect 301 /research/local-first/ /research/local-first
Now: one hop, then internal rewrite, then 200.
4.1. Bug 2: prefix match shadows canary routes
The /ouroboros landing page initially used Apache's Redirect directive:
Redirect 301 /ouroboros /research/tla-plus-htaccess-redirects
This silently broke the canaries. Apache's Redirect is a prefix match:
/ouroboros matches /ouroboros/a, /ouroboros/b, and /ouroboros/c – all
three test paths were redirected to the research note instead of looping. The
swap canary became dead code.
The symptom in production:
GET /ouroboros/a → 301 /research/tla-plus-htaccess-redirects/a (prefix match, wrong!) → 404
The TLA+ model should have caught this, but the spec used exact string
equality (=) for path matching, which does not model Apache's prefix
semantics. A PrefixMatch operator would have surfaced the conflict: both
OuroborosLanding and OuroborosA enabled for the same input, with the
landing taking priority by rule order.
The fix: RedirectMatch with an anchored regex that matches only the bare
path:
RedirectMatch 301 ^/ouroboros/?$ /research/tla-plus-htaccess-redirects
This is the more interesting bug class. The local-first chain (Bug 1) is a
rule-interaction problem – two correct rules compose badly. The prefix shadow
is a semantic mismatch – the developer's mental model (Redirect is exact)
differs from Apache's actual behavior (Redirect is prefix). TLA+ catches
rule interactions naturally but catches semantic mismatches only if the model
faithfully represents the directive semantics. Model the wrong thing and
you verify the wrong thing.
5. The webring trap: intentional non-termination for bots
The ?w= and ?t= query parameters are Winding Web Ring (WWN) torus
coordinates. They generate an infinite URL space by design – a honeypot that
traps aggressive crawlers in an endless traversal. Amazonbot's 28K requests/day
on these coordinates is the trap working.
The TLA+ model explicitly excludes these parameters from redirect rules. The invariant is:
Query params ?w= and ?t= MUST pass through all redirect rules unchanged. No rule may strip, rewrite, or redirect based on w or t values.
This is a safety property of the redirect system, not a liveness property.
The webring's non-termination is in the crawl graph, not in the redirect
chain. Any single request with ?w=3&t=1 terminates in one hop (200). The
non-termination is that following the links on the served page leads to
?w=4&t=1, then ?w=5&t=1, forever. That is the webring, not a redirect
loop.
6. Practical workflow
The .htaccess header now mandates:
Any change to redirect or rewrite rules MUST be modeled in TLA+ before deployment to prove termination across all URL permutations.
In practice:
- Edit
docs/htaccess-redirect-spec.tlato add the new rule as a transition. - Run TLC:
tlc HtaccessRedirects -config HtaccessRedirects.cfg - If TLC finds a counterexample, the rule interaction is unsafe. Fix it.
- If TLC passes, the composition terminates. Deploy.
The initial state space is small (~5,000 states for 6 paths × 2 schemes × 2 extensions × 2 accepts × 2 query shapes × 2 fragments). TLC explores it in under a second. Adding a new redirect rule adds one transition and marginally increases the state space. The cost of running the check is negligible compared to debugging a redirect storm in production.
7. The ouroboros canary
The .htaccess includes intentional non-terminating redirects under
/ouroboros/. The bare path /ouroboros redirects here (to this page).
The sub-paths are the test canaries:
| Test | URL | Failure mode | Expected |
|---|---|---|---|
| Swap | /ouroboros/a | Cycle (a → b → a → …) | ERR_TOO_MANY_REDIRECTS after ~20 hops |
| Grow | /ouroboros/c | Acyclic divergence (c → cc → ccc → …) | 414 Request-URI Too Long at ~8190 bytes |
When you run the TLA+ model checker with /ouroboros/a in the initial state,
TLC must report a liveness violation:
Error: Temporal properties were violated. The following behavior constitutes a counterexample: State 1: request.path = "ouroboros/a", phase = "redirect", hops = 0 State 2: request.path = "ouroboros/b", phase = "redirect", hops = 1 State 3: request.path = "ouroboros/a", phase = "redirect", hops = 2 ... Back to state 2.
If TLC does not flag this, the spec is wrong – the termination property is too weak or the ouroboros transitions are unreachable. The canary validates the validator.
7.1. Type 1: swap (cycle)
/ouroboros/a ↔ /ouroboros/b. The state graph has a cycle of length 2.
TLC detects this as a liveness violation: the system never reaches done.
Browsers give up after ~20 hops (ERR_TOO_MANY_REDIRECTS).
curl -sIL https://wal.sh/ouroboros/a 2>&1 | grep -E 'HTTP/|Location:' # 20 × 301 alternating between /ouroboros/a and /ouroboros/b
7.2. Type 2: grow (unbounded state)
/ouroboros/c → /ouroboros/cc → /ouroboros/ccc → … The path grows by
one character each hop. The state graph has no cycle – every state is new –
but it never terminates either.
In practice, the chain hits three limits (binary search on path length):
| Path length | Status | Enforcer |
|---|---|---|
| ≤ 4058 chars | 301 | RewriteRule fires, redirect works |
| 4059–8169 chars | 403 | Server-level path limit (DreamHost mod_security) |
| ≥ 8170 chars | 414 | Apache LimitRequestLine (stock default ~8190 bytes) |
A vanilla Apache would reach 414 at ~8190 bytes. This server's 403 at ~4058
chars is a hosting-provider restriction – the same class of "the model
assumes one thing, production has another" gap that makes verification
worth doing. TLC detects the grow canary as a NoRedirectStorm violation
when hops > MaxHops, regardless of which server limit actually fires.
curl -sIL https://wal.sh/ouroboros/c 2>&1 | grep -E 'HTTP/|Location:' # 301 /ouroboros/cc, 301 /ouroboros/ccc, ..., 414 Request-URI Too Long
This is a more subtle pathology than the swap loop. A naive "detect cycles"
check would miss it entirely – the redirect chain is acyclic but infinite.
The TLA+ model catches it because it bounds hops, not cycles.
8. Client redirect limits
The TLA+ model bounds hops at MaxHops. In production, the bound is
imposed by the client. The core clients that matter for redirect validation:
| Client | Default max redirects | On exceed |
|---|---|---|
| Browsers (Chrome/Firefox/Safari) | 20 | ERR_TOO_MANY_REDIRECTS |
| curl | 50 | error 47 |
| Python requests | 30 | TooManyRedirects |
| Go net/http | 10 | *url.Error |
The range is 10–50. Testing with curl (50 hops) exercises more of the chain than a browser (20 hops); Go at 10 is the most conservative.
Full empirical results across 13 clients (including Deno, Perl, Ruby, Crystal, Erlang, Elixir, Racket) with tangleable code blocks and server-side log verification: HTTP Redirect Limits: One Ouroboros, Six Languages.
For the wal.sh bot compliance spec (C6), redirect policy is deferred to v1.4.