Org-Mode Feed Generation: Atom and RSS from Static Sites

1. Overview

An org-mode site has one corpus of .org files and one parser (wal-sh.site.org/read-all). Six tools project this corpus into six artifacts for six audiences. The parser extracts #+TITLE, #+DATE, #+AUTHOR, #+DESCRIPTION, #+KEYWORDS, headings, and property drawers. Each tool consumes the same parse output and discards what it doesn't need.

2. Six projections, one parser

| Tool | Output | Consumer | What it reads |
|------+--------+----------+---------------|
| pocket-es.indexer | search-index.json | Browser (BM25 search) | title, description, keywords, headings, body terms |
| wal-sh.site.feed | site.atom | Feed readers (elfeed, NetNewsWire) | title, date, description, path |
| scripts/generate_sitemap.py | sitemap.xml | Search engines (Googlebot) | path, date (lastmod) |
| wal-sh.site.check-headers | lint report | Developer (CI / gmake lint) | title, author, date, keywords presence |
| wal-sh.site.annotations | verification report | Auditor (daily-publish, REPL) | property drawers (:VERIFIED_AT:, :VERDICT:) |
| wal-sh.site.provenance | provenance mapping | Agent (ProvenanceGuard) | property drawers (:SOURCE:, :VERIFIED_BY:) |

The parser is the invariant. The projections vary. Adding a seventh tool (say, a citation graph extractor) requires only a new consumer of read-all, not a new parser.
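
The one-parser, many-projections idea can be sketched in Python, the language of scripts/generate_sitemap.py. This is an illustration under assumptions, not the site's code: RECORD is a hypothetical stand-in for one element of the read-all output, and project, run_all, PROJECTIONS, and the "citations" entry are names invented here.

```python
# Hypothetical stand-in for one record from the shared parser.
RECORD = {
    "path": "site/current/2026-06-19.org",
    "title": "Morning Brief: Thursday, June 19",
    "date": "2026-06-19",
    "description": "Research brief: ...",
    "keywords": ["fable-arc", "agent-trust"],
}

def project(record, fields):
    """Keep only the fields a tool needs; discard the rest."""
    return {k: record[k] for k in fields if k in record}

# Each projection is a field list over the same parse output.
# Field lists follow the "What it reads" column for feed and sitemap.
PROJECTIONS = {
    "feed": ("title", "date", "description", "path"),
    "sitemap": ("path", "date"),
}

def run_all(records):
    """Apply every registered projection to the same list of records."""
    return {name: [project(r, fields) for r in records]
            for name, fields in PROJECTIONS.items()}

# A seventh tool is one more consumer, not a new parser (hypothetical entry).
PROJECTIONS["citations"] = ("path", "keywords")
```

Calling run_all([RECORD]) returns one list of trimmed records per projection, all derived from the same input.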

3. Architecture

site/**/*.org
     │
     └── wal-sh.site.org/read-all ──┬──────────┬──────────┬──────────┬──────────┐
                                     │          │          │          │          │
                                     ▼          ▼          ▼          ▼          ▼
                                  indexer     feed     sitemap    headers   annotations
                                     │          │          │          │          │
                                     ▼          ▼          ▼          ▼          ▼
                               index.json  site.atom  sitemap.xml  lint     drawers

4. The three published feeds

| Feed | URL | What it contains | Generated by |
|------+-----+------------------+--------------|
| site.atom | https://wal.sh/site.atom (/feed redirects here) | Our published content (last 14 days) | wal-sh.site.feed |
| research.atom | https://wal.sh/current/research.atom | Outbound links the crawler curated | tech-crawler /www-sync |
| Pinboard RSS | https://feeds.pinboard.in/rss/u:jwalsh/ | Phone/browser bookmarks | Pinboard (external) |

Three feeds, three streams: what we publish, what we read, and what we bookmark.
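
The 14-day window on site.atom could be applied as a date filter over parsed records. A minimal sketch, assuming each record carries an ISO YYYY-MM-DD date string; the function name recent_entries, the inclusive cutoff, and the newest-first ordering are assumptions, not confirmed behavior of wal-sh.site.feed.

```python
from datetime import date, timedelta

def recent_entries(records, today, days=14):
    """Return records dated within the last `days` days, newest first."""
    cutoff = today - timedelta(days=days)
    kept = [r for r in records
            if cutoff <= date.fromisoformat(r["date"]) <= today]
    # ISO date strings sort chronologically, so a string sort is enough.
    return sorted(kept, key=lambda r: r["date"], reverse=True)
```

Passing today explicitly, rather than calling date.today() inside the function, keeps the filter deterministic in tests.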

5. Shared parser contract

Every tool depends on the same map shape from org/read-all:

{:path     "site/current/2026-06-19.org"
 :title    "Morning Brief: Thursday, June 19"
 :date     "2026-06-19"
 :author   "Jason Walsh"
 :keywords ["fable-arc" "agent-trust"]
 :description "Research brief: ..."
 :headings ["Top (5-7 min)" "Themes this week" ...]
 :content  "raw org text..."
 ;; property drawers (from annotations scanner):
 :drawers  [{:heading "Top" :verified_at "..." :verdict "correct"}]}

If this shape changes, all six tools break. The parser is the contract. Property-based tests in pocket-es.indexer and wal-sh.site.check-headers verify the shape holds across the full corpus.
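
A shape check of this kind might look like the following Python sketch. It mirrors the keys of the map above, but the names REQUIRED and shape_errors are hypothetical, and treating every key as mandatory is an assumption; the actual property-based tests may be looser.

```python
# Expected key -> type, mirroring the map shape shown above.
# Assumption: every key is mandatory for every file.
REQUIRED = {
    "path": str, "title": str, "date": str, "author": str,
    "keywords": list, "description": str, "headings": list,
    "content": str, "drawers": list,
}

def shape_errors(record):
    """Return a list of problems; an empty list means the shape holds."""
    errors = []
    for key, expected in REQUIRED.items():
        if key not in record:
            errors.append(f"missing {key}")
        elif not isinstance(record[key], expected):
            errors.append(f"{key}: expected {expected.__name__}")
    return errors
```

Running shape_errors over every parsed record and failing on the first non-empty result would surface a contract break before any downstream tool consumes the output.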

6. Feed validation

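Both site-generated feeds target RFC 4287 (Atom 1.0). The check below confirms only that each file is well-formed XML; xmllint --noout does not validate against the Atom schema:

xmllint --noout site/site.atom && echo "site.atom: well-formed"
xmllint --noout site/current/research.atom && echo "research.atom: well-formed"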

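Feed-level elements (RFC 4287 Section 4.1.1): <feed> as the root, with <title>, <id>, and <updated> required; <author> required unless every entry carries its own; and <link rel="self">, which the RFC recommends rather than requires.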

The <generator> tag carries the version: <generator uri="https://wal.sh/research/pocket-es/" version="1.0.0">wal-sh.site.feed</generator>
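
A structural check along the lines of RFC 4287 Section 4.1.1 can be scripted with Python's standard library. This is a minimal sketch, not the site's validator: the function name missing_feed_elements is hypothetical, it treats <author> as satisfied when either the feed or every entry carries one, and it reports a missing <link rel="self"> even though the RFC only recommends that element.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace

def missing_feed_elements(xml_text):
    """List feed-level elements from RFC 4287 Section 4.1.1 that are absent."""
    root = ET.fromstring(xml_text)
    if root.tag != ATOM + "feed":
        return ["feed"]
    missing = [name for name in ("title", "id", "updated")
               if root.find(ATOM + name) is None]
    # A feed-level author may be omitted only if every entry has one.
    entries = root.findall(ATOM + "entry")
    feed_author = root.find(ATOM + "author") is not None
    if not feed_author and not all(
            e.find(ATOM + "author") is not None for e in entries):
        missing.append("author")
    # The self link is a SHOULD in the RFC; flagged here as a house rule.
    if not any(link.get("rel") == "self"
               for link in root.findall(ATOM + "link")):
        missing.append('link rel="self"')
    return missing
```

An empty list means the feed-level elements are present; it says nothing about entry-level requirements or date formats.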

7. Staleness

The feed, index, and sitemap all go stale when an org file changes. gmake publish-daily regenerates all three:

gmake publish-daily
# Runs: publish-current → publish-today → index → deploy-index → sitemap

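The Makefile relies on gmake's own timestamp comparison for the org publish step, using the stamp file .published/current/research.org. The index and sitemap are still phony targets, so they always rebuild. Making them non-phony with dependencies on every .org file, e.g. $(wildcard site/**/*.org), is tracked in a separate bead. Note that GNU make's wildcard function does not expand ** recursively, so the dependency list may need $(shell find site -name '*.org') or similar.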
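
The timestamp comparison a non-phony target would perform can be sketched in Python. This is an illustration under assumptions rather than a replacement for the Makefile rule: is_stale is a hypothetical helper, and the artifact and source paths in the usage note are examples.

```python
from pathlib import Path

def is_stale(artifact, source_root, pattern="*.org"):
    """True if the artifact is missing or older than any matching source."""
    artifact = Path(artifact)
    if not artifact.exists():
        return True
    built = artifact.stat().st_mtime
    # rglob walks subdirectories, so nested .org files are included.
    return any(src.stat().st_mtime > built
               for src in Path(source_root).rglob(pattern))
```

For example, is_stale("site/sitemap.xml", "site") would report whether any .org file under site/ is newer than the sitemap.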

8. Related