Streamlining Metadata Generation and Website Publishing in Org-mode with AI Assistance

Overview

This document describes an automated workflow for generating and updating metadata in Org-mode files with AI assistance. It has three parts: a Python script (org_title_lint.py) that generates the metadata, an Emacs Lisp configuration that publishes the website, and a Makefile that ties the steps together.

Goals

  • Automate metadata generation for Org-mode files
  • Maintain consistent metadata across a large number of files
  • Improve organization and searchability of Org-mode content
  • Streamline the publishing process for a website built with Org-mode
  • Provide flexibility in AI provider choice and execution options

Workflow Components

graph TD
    A[Start] --> B{Check API keys}
    B -->|Invalid| C[Error: API key not set]
    B -->|Valid| D{Check NPM_AUTHOR_INFO}
    D -->|Invalid| E[Error: NPM_AUTHOR_INFO not set or invalid]
    D -->|Valid| F[Set up AI client]
    F --> G[Walk through directory]
    G --> H{Is file .org?}
    H -->|No| G
    H -->|Yes| I{Dry run?}
    I -->|Yes| J[Echo would-be changes]
    I -->|No| K{Force update?}
    K -->|No| L{Existing title?}
    L -->|Yes| M[Skip file]
    L -->|No| N[Generate metadata]
    K -->|Yes| N
    N --> O[Update file with new metadata]
    O --> P[Echo update confirmation]
    J --> G
    M --> G
    P --> G
    G -->|All files processed| Q[End]
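The per-file decisions in the flowchart can be sketched in a few lines of Python. This is a minimal illustration, not the code of org_title_lint.py: the names `Action`, `plan_action`, and `parse_author_info` are hypothetical, and the `Name <email> (url)` layout assumed for NPM_AUTHOR_INFO is a guess based on npm's author string convention.

```python
import re
from enum import Enum

# Assumed layout of NPM_AUTHOR_INFO: "Name <email> (url)", email and url optional.
AUTHOR_RE = re.compile(
    r"^(?P<name>[^<(]+?)\s*(?:<(?P<email>[^>]+)>)?\s*(?:\((?P<url>[^)]+)\))?$"
)


class Action(Enum):
    PREVIEW = "preview"  # dry run: echo would-be changes
    SKIP = "skip"        # title exists and no force flag
    UPDATE = "update"    # generate and write new metadata


def parse_author_info(raw):
    """Split an npm-style author string into name, email, and url."""
    match = AUTHOR_RE.match(raw.strip())
    if not match:
        raise ValueError("NPM_AUTHOR_INFO not set or invalid")
    return match.groupdict()


def plan_action(has_title, dry_run=False, force=False):
    """Mirror the flowchart: dry run first, then force, then existing title."""
    if dry_run:
        return Action.PREVIEW
    if force or not has_title:
        return Action.UPDATE
    return Action.SKIP
```

Keeping the decision in a pure function like this makes the dry-run and force paths easy to unit test without touching any files or calling an AI provider.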

Python Script (org_title_lint.py)

The Python script org_title_lint.py is the core of the metadata generation process. Its main features include:

  • Recursive scanning of Org-mode files
  • AI-powered metadata generation (title and keywords)
  • Flexible AI provider choice (OpenAI or Anthropic's Claude)
  • Metadata updating with various fields (TITLE, AUTHOR, EMAIL, URL, KEYWORDS, REVIEWER)
  • Dry run option for previewing changes
  • Force update option to overwrite existing metadata
  • Error handling and encoding detection
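As a rough sketch of the metadata update step, the function below sets a `#+KEY: value` line in Org text and respects the force option. It assumes the metadata lives in file-level keywords such as `#+TITLE:`; the name `upsert_keyword` is hypothetical, and the real script may place or format the fields differently.

```python
import re


def upsert_keyword(text, key, value, force=False):
    """Set `#+KEY: value` in Org text; keep an existing value unless force is True."""
    pattern = re.compile(rf"^#\+{re.escape(key)}:.*$", re.IGNORECASE | re.MULTILINE)
    line = f"#+{key.upper()}: {value}"
    if pattern.search(text):
        if not force:
            return text  # existing metadata wins without the force option
        # Replace only the first occurrence; the lambda avoids backslash
        # interpretation in the replacement string.
        return pattern.sub(lambda _match: line, text, count=1)
    # No existing keyword: prepend it to the file header.
    return f"{line}\n{text}"
```

For example, `upsert_keyword("* Heading\n", "KEYWORDS", "org, emacs")` prepends a `#+KEYWORDS:` line, while a file that already has a title is returned unchanged unless `force=True` is passed.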

Python Setup Example

graph TD
    A[Edit .py files] -->|Save changes| B{Run Black}
    B -->|Errors found| C[Fix formatting issues]
    C --> B
    B -->|No errors| D[Run Pylint]
    D -->|Errors found| E[Fix linting issues]
    E --> D
    D -->|No errors| F[Run unit tests]
    F -->|Tests fail| G[Fix failing tests]
    G --> F
    F -->|Tests pass| H[Check for existing headers]
    H -->|Headers exist| I[Manually review headers]
    I --> J{Need update?}
    J -->|Yes| K[Remove old header]
    K --> L[Generate new header]
    J -->|No| M[Skip file]
    H -->|No headers| L
    L --> N[Apply AI-generated header]
    N --> O{Validate header}
    O -->|Invalid| P[Manual header adjustment]
    P --> O
    O -->|Valid| Q[Check for duplicates]
    Q -->|Duplicates found| R[Resolve duplicates]
    R --> Q
    Q -->|No duplicates| S[Update all files]
    S --> T[Run integration tests]
    T -->|Tests fail| U[Fix integration issues]
    U --> T
    T -->|Tests pass| V[Commit changes]
    V --> W[Run CI/CD pipeline]
    W --> X[Deploy to staging]
    X --> Y[Run acceptance tests]
    Y -->|Tests fail| Z[Fix acceptance issues]
    Z --> Y
    Y -->|Tests pass| AA[Deploy to production]
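    AA --> AB[Monitor for issues]

import os

from orgparse import load


def process_org_files(directory):
    """Report every .org file under directory that has no TITLE property."""
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if not name.endswith(".org"):
                continue
            file_path = os.path.join(root, name)
            org = load(file_path)  # root node of the parsed file
            # get_property() reads the property drawer; a file-level
            # "#+TITLE:" keyword may need orgparse's get_file_property().
            title = org.get_property("TITLE")
            if not title:
                print(f"Missing title in {file_path}")
            # Additional processing...


if __name__ == "__main__":
    process_org_files("/Users/username/sandbox/org-example/research")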
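The scan above assumes every file decodes cleanly. The feature list mentions encoding detection, so here is one possible fallback strategy using only the standard library. The candidate list and the function names are illustrative assumptions, not what org_title_lint.py actually does.

```python
# Encodings tried in order. This list is an illustrative assumption,
# not the one org_title_lint.py uses.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_org_bytes(data):
    """Return (text, encoding) using the first candidate that decodes cleanly."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Unreachable in practice: latin-1 maps every byte to a character.
    raise ValueError("could not decode data")


def read_org_file(path):
    """Read a file as bytes and decode it with the fallback chain."""
    with open(path, "rb") as handle:
        return decode_org_bytes(handle.read())
```

Returning the encoding alongside the text lets the caller write the file back in the same encoding after updating its metadata.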

Emacs Lisp Configuration

The Emacs Lisp configuration sets up the publishing environment for the website. Key aspects include:

  • Project structure definition for static and Org files
  • Custom HTML export settings
  • Breadcrumb generation for navigation
  • Property drawer handling
  • Babel language support for code execution
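The breadcrumb logic itself lives in the Emacs Lisp configuration, which is not reproduced here. To show the idea, the following Python sketch derives a breadcrumb trail from a page path relative to the site root. The function name, the "Home" label, and the URL layout are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def breadcrumbs(relative_path, home_label="Home"):
    """Return (label, href) pairs for a page path relative to the site root."""
    parts = PurePosixPath(relative_path).parts
    crumbs = [(home_label, "/")]
    if not parts:
        return crumbs
    # One linked crumb per parent directory.
    for depth, part in enumerate(parts[:-1], start=1):
        crumbs.append((part, "/" + "/".join(parts[:depth]) + "/"))
    # The current page is the last crumb and carries no link.
    crumbs.append((PurePosixPath(parts[-1]).stem, None))
    return crumbs
```

A page at `research/ai/notes.org` would yield crumbs for Home, research, ai, and finally the unlinked page name notes.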

Emacs Lisp Setup Example

(require 'ox-publish)

(setq org-publish-project-alist
      '(("my-org-site"
         :base-directory "~/sandbox/org-example"
         :publishing-directory "/ssh:user@example.com:~/public_html/"
         :recursive t
         :publishing-function org-html-publish-to-html
         :with-author nil
         :with-creator nil
         :with-toc nil
         :section-numbers nil
         :time-stamp-file nil)))

(defun my-publish-project ()
  "Publish the \"my-org-site\" project, forcing republication of every file."
  (interactive)
  (org-publish-project "my-org-site" t))

(global-set-key (kbd "C-c p") 'my-publish-project)

Makefile

The Makefile provides a set of commands to manage the workflow:

  • Environment setup
  • Website publishing
  • Cleaning output directory
  • Updating titles (with dry run and force options)
  • Deployment to server
  • Link checking
  • Local server for preview
  • Sitemap generation
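The sitemap command is listed but its implementation is not shown. One way it could work, assuming the published site is a local directory of HTML files, is sketched below with the standard library. The function names and the base URL are hypothetical.

```python
from pathlib import Path
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def published_pages(output_dir):
    """List HTML files under output_dir as sorted, root-relative POSIX paths."""
    root = Path(output_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.html"))


def build_sitemap(pages, base_url):
    """Return sitemap XML with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    base = base_url.rstrip("/")
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base}/{page}"
    return ET.tostring(urlset, encoding="unicode")
```

A Makefile target could then call something like `build_sitemap(published_pages("public_html"), "https://example.com")` and write the result to `sitemap.xml`; both arguments are placeholders.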

Makefile Example

.PHONY: all publish lint clean

EMACS = emacs
PYTHON = python3

publish:
	$(EMACS) --batch --eval "(require 'ox-publish)" \
		--eval "(setq org-publish-project-alist '((\"my-org-site\" :base-directory \"~/sandbox/org-example\" :publishing-directory \"/ssh:user@example.com:~/public_html/\" :recursive t :publishing-function org-html-publish-to-html)))" \
		--eval "(org-publish-all t)"

lint:
	$(PYTHON) org_title_lint.py --directory research

# The publishing directory is a TRAMP path that only Emacs understands,
# so the remote files are removed over ssh rather than with a local rm.
clean:
	ssh user@example.com 'rm -rf ~/public_html/*'

all: lint publish

Update Process

graph TD
    A[Edit .org files] -->|Save changes| B{Run linter}
    B -->|Errors found| C[Fix issues]
    C --> B
    B -->|No errors| D[Commit changes]
    D --> E[Push to repository]
    E --> F[CI/CD pipeline]
    F --> G[Run tests]
    G -->|Tests pass| H[Deploy to server]
    G -->|Tests fail| I[Fix failing tests]
    I --> F
    H --> J[Update live site]
    J --> K[Verify changes]
    K -->|Issues found| L[Create bug report]
    L --> A
    K -->|No issues| M[End]

  1. Environment Setup: Run make env to set up the Python environment and install dependencies.
  2. Metadata Generation:
    • For a dry run: make title-dry-run
    • To update titles: make title
    • To force update titles: make title-force
  3. Publishing: Run make publish to generate the website using Emacs and the defined Org-mode project.
  4. Preview: Use make server to start a local server and preview the generated website.
  5. Deployment: Execute make deploy to publish and deploy the website to the server.
  6. Maintenance:
    • make clean to clear the output directory
    • make check-links to verify links in the published files
    • make sitemap to generate a sitemap for the website
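The link check in the maintenance step could be approximated as follows. This is a sketch under the assumption that the published files are available locally and that only relative links need verifying; the helper names are hypothetical and the actual make check-links target may use a different tool.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def local_target(href):
    """Return the relative file part of href, or None for any other kind of link."""
    target, _fragment = urldefrag(href)
    parsed = urlparse(target)
    # Skip pure fragments, external URLs, mailto: links, and site-absolute paths.
    if not target or parsed.scheme or parsed.netloc or target.startswith("/"):
        return None
    return target


def broken_local_links(html_file):
    """Return relative link targets in html_file that do not exist on disk."""
    html_file = Path(html_file)
    links = extract_links(html_file.read_text(encoding="utf-8"))
    targets = (local_target(href) for href in links)
    return [t for t in targets if t and not (html_file.parent / t).exists()]
```

External URLs are deliberately skipped here, so the check stays fast and works offline; verifying them would need network requests and is left out of this sketch.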

Conclusion

This workflow keeps metadata consistent across an Org-mode website without hand-editing each file. The org_title_lint.py script generates titles and keywords with the chosen AI provider, the Emacs Lisp configuration publishes the site, and the Makefile exposes each step as a single command.

Author: Jason Walsh

j@wal.sh

Last Updated: 2025-07-30 13:45:28

build: 2025-12-29 20:02 | sha: 3c17632