Dogfood: Walsh-Research Walsh-Only Test
This page is the bidirectional control for the Walsh-Research dogfood set. It
is Disallow-ed for * in robots.txt but Allow-ed for the named
Walsh-Research group. A generic crawler that matches only the * group must
refuse it; Walsh-Research must fetch it.
This proves the other direction of RFC 9309 group selection: a crawler that
matches a named group follows that group's rules instead of the * group, so
the named group can allow what * denies. (Its sibling dogfood-disallow proves
the deny direction; dogfood-allow proves we do not over-block.)
Because * is Disallow-ed here, the only legitimate fetchers are humans
(browsers ignore robots.txt) and our own Walsh-Research bot (explicitly
Allow-ed). So this page is also a honeypot: any access-log hit from an agent
that is neither a human browser nor Walsh-Research is a bot ignoring the *
Disallow, which is a robots.txt violation and therefore a bug in that crawler.
Conversely, if Walsh-Research/1.2 refuses this page, our own bot has failed to
let its named group override *.
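The honeypot check could be sketched roughly as below. This is only an illustration under stated assumptions: the Hit record, the BOT_HINTS substrings, the browser heuristic, and the sample user-agent strings (including ExampleBot) are all hypothetical, and a real access log would need its own parsing. User-agent strings can be spoofed, so a result from this kind of filter is a lead to investigate, not proof.

```python
from typing import Iterable, List, NamedTuple

HONEYPOT_PATH = "/research/bots/dogfood-walsh-only"
OWN_BOT_TOKEN = "walsh-research"  # product token of our own crawler, lowercased

# Assumption: crude substrings that suggest an automated client.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")


class Hit(NamedTuple):
    """One access-log entry, reduced to the two fields this check needs."""
    user_agent: str
    path: str


def looks_like_browser(user_agent: str) -> bool:
    # Heuristic only: browsers send "Mozilla/..." and no obvious bot marker.
    ua = user_agent.lower()
    return ua.startswith("mozilla/") and not any(h in ua for h in BOT_HINTS)


def find_violations(hits: Iterable[Hit]) -> List[Hit]:
    """Return honeypot hits that come from neither a browser nor our own bot."""
    violations = []
    for hit in hits:
        if hit.path != HONEYPOT_PATH:
            continue  # only the honeypot page is Disallow-ed for *
        if OWN_BOT_TOKEN in hit.user_agent.lower():
            continue  # explicitly Allow-ed by the named group
        if looks_like_browser(hit.user_agent):
            continue  # humans are not bound by robots.txt
        violations.append(hit)
    return violations


sample_hits = [
    Hit("Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0", HONEYPOT_PATH),
    Hit("Walsh-Research/1.2", HONEYPOT_PATH),
    Hit("ExampleBot/1.0", HONEYPOT_PATH),  # hypothetical generic crawler
    Hit("ExampleBot/1.0", "/research/bots/dogfood-allow"),
]
violations = find_violations(sample_hits)
for v in violations:
    print("robots violation:", v.user_agent, v.path)
```

With the sample data, only the ExampleBot request for the honeypot path is reported; its request for dogfood-allow is ignored because that page is not Disallow-ed for *.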
User-agent: *
Disallow: /research/bots/dogfood-walsh-only

User-agent: Walsh-Research
Disallow: /research/bots/dogfood-disallow
Allow: /research/bots/dogfood-allow
Allow: /research/bots/dogfood-walsh-only
Crawl-delay: 2
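
The expected outcome of these rules can be checked with Python's standard urllib.robotparser, as in the minimal sketch below. It is not how Walsh-Research itself is implemented. The ExampleBot user agent is a hypothetical stand-in for a generic crawler. One caveat: urllib.robotparser applies the first matching rule within a group, whereas RFC 9309 specifies longest-match precedence. None of the paths here is a prefix of another, so the two approaches agree in this case.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this page, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /research/bots/dogfood-walsh-only

User-agent: Walsh-Research
Disallow: /research/bots/dogfood-disallow
Allow: /research/bots/dogfood-allow
Allow: /research/bots/dogfood-walsh-only
Crawl-delay: 2
"""

PAGE = "/research/bots/dogfood-walsh-only"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named group is selected for our bot, so its Allow applies.
walsh_ok = parser.can_fetch("Walsh-Research/1.2", PAGE)

# A crawler with no named group falls back to *, which Disallows the page.
generic_ok = parser.can_fetch("ExampleBot/1.0", PAGE)  # hypothetical agent

print("Walsh-Research may fetch:", walsh_ok)
print("Generic crawler may fetch:", generic_ok)
print("Walsh-Research crawl delay:", parser.crawl_delay("Walsh-Research/1.2"))
```

The same parser can be pointed at the sibling pages: for Walsh-Research, dogfood-disallow should be refused and dogfood-allow permitted, which covers the deny and no-over-block directions described above.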