Graphviz: The DOT Language and Rendering Pipeline
Overview
Graphviz is the diagram substrate for every research page on this site.
Eighty-five .dot files render to PNG via the dot layout engine. This
page documents the language, the rendering pipeline, and the integration
with Emacs org-babel for literate programming.
Installed version: 14.1.1 (FreeBSD pkg, Dec 2025). Current upstream: 14.1.4 (graphviz.org/download/source).
The DOT Language
Formal grammar
The DOT language grammar from graphviz.org/doc/info/lang.html:
graph : [ 'strict' ] ('graph' | 'digraph') [ ID ] '{' stmt_list '}'
stmt_list : [ stmt [ ';' ] stmt_list ]
stmt : node_stmt | edge_stmt | attr_stmt | ID '=' ID | subgraph
attr_stmt : ('graph' | 'node' | 'edge') attr_list
attr_list : '[' [ a_list ] ']' [ attr_list ]
a_list : ID '=' ID [ (';' | ',') ] [ a_list ]
edge_stmt : (node_id | subgraph) edgeRHS [ attr_list ]
edgeRHS : edgeop (node_id | subgraph) [ edgeRHS ]
node_stmt : node_id [ attr_list ]
node_id : ID [ port ]
port : ':' ID [ ':' compass_pt ] | ':' compass_pt
subgraph : [ 'subgraph' [ ID ] ] '{' stmt_list '}'
compass_pt: 'n' | 'ne' | 'e' | 'se' | 's' | 'sw' | 'w' | 'nw' | 'c' | '_'
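As a rough illustration of how the productions above map onto a real file, the following sketch annotates each statement with the rule it exercises. The graph and node names (pipeline, source, parse, lint and so on) are made up for this example and do not come from the site's diagrams.

```dot
// Each statement is annotated with the grammar production it exercises.
// All identifiers here are hypothetical.
strict digraph pipeline {                  // graph: 'strict' 'digraph' ID '{' stmt_list '}'
    rankdir = LR;                          // stmt: ID '=' ID
    node [shape=box, style=filled];        // attr_stmt, a_list entries separated by ','
    edge [color=gray40];                   // attr_stmt for edges

    source [label="org file"];             // node_stmt: node_id attr_list
    parse -> layout -> render;             // edge_stmt with a chained edgeRHS
    source -> { parse; lint };             // edgeRHS whose target is an anonymous subgraph
    render:s -> output:n [style=dashed];   // node_id with a port (compass_pt only)
}
```

The semicolons are optional under stmt_list, so the same graph parses with or without them.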
ID types
Four forms of identifier:
| Form | Syntax | Example |
|---|---|---|
| Alphanumeric | [a-zA-Z_][a-zA-Z_0-9]* | my_node |
| Numeral | [-]?(\.[0-9]+\|[0-9]+(\.[0-9]*)?) | 3.14 |
| Quoted string | "..." with \" escape | "Node A" |
| HTML string | <...> with matched angle brackets | <<B>bold</B>> |
Quoted strings support concatenation with + and backslash-continued
newlines. HTML strings do not.
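A minimal sketch of the four ID forms in use, including quoted-string concatenation. The node names are invented for illustration, and the HTML label assumes a Graphviz build with HTML-like label support (the usual case).

```dot
digraph ids {
    my_node;                                        // alphanumeric ID
    3.14;                                           // numeral ID: a node literally named 3.14
    "Node A";                                       // quoted ID, required because of the space
    joined [label="first half, " + "second half"];  // quoted strings concatenated with +
    html   [label=<<B>bold</B> and plain>];         // HTML string: outer < > delimit, inner tags are markup
    my_node -> 3.14 -> "Node A";
}
```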
Keywords
Case-insensitive: node, edge, graph, digraph, subgraph, strict.
Edge operators
| Graph type | Operator |
|---|---|
| Directed | -> |
| Undirected | -- |
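The two rules above (case-insensitive keywords, one edge operator per graph type) can be seen together in a small undirected graph. This is only a sketch with hypothetical names.

```dot
// Keywords are matched case-insensitively, so GRAPH and Node
// parse exactly like graph and node.
GRAPH ring {
    Node [shape=circle];
    a -- b -- c -- a;   // undirected graphs use --; writing a -> b here is a syntax error
}
```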
Comment syntax
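Three forms: C-style /* ... */ block comments, C++-style // line comments, and lines beginning with #, which the lexer treats as C preprocessor output and discards.
The // line comment is the form org-babel uses for tangle breadcrumbs, for example // [[file:...]].
These comments do not affect the graph: dot -Tpng ignores them, and dot -Tdot
omits them from the canonical output. The breadcrumbs are invisible to
the renderer but enable org-babel-detangle to map tangled code back to
its source block.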
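The sketch below puts the three comment forms in one file. The breadcrumb lines follow the general shape org-babel writes with :comments link, but the file name notes.org, the heading Pipeline, and the generated.dot marker are placeholders assumed for this example, not taken from the site.

```dot
/* Block comment: may span
   several lines. */
# 1 "generated.dot"
// [[file:notes.org::*Pipeline][Pipeline:1]]
digraph G {
    a -> b;  // trailing line comment
}
// Pipeline:1 ends here
```

The # line must start in the first column to be recognized as a preprocessor line.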
Attribute inheritance
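Default attributes propagate forward in the file: a node [...] or edge [...] statement applies to objects defined after it, not to those defined before it.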
This is why the wal.sh style guide places node [...] and edge [...]
defaults at the top of every graph — they establish the palette for all
nodes that follow.
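A small sketch of forward propagation, with invented node names and colors chosen only for illustration:

```dot
digraph defaults {
    early;                                   // defined before any default: keeps the built-in ellipse
    node [shape=box, style=filled, fillcolor=lightblue];
    a; b;                                    // inherit box + lightblue
    node [fillcolor=lightyellow];            // replaces one default from here on
    c;                                       // box (still inherited) + lightyellow
    edge [color=gray40];
    early -> a -> b -> c;                    // edges defined after the edge default pick it up
}
```

Because early is created before the first node statement, it is unaffected by it, which is the practical reason for putting defaults first.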
Clusters
A subgraph whose name starts with cluster is laid out as a visual
container by the dot engine. This is a convention, not a language rule —
other layout engines may ignore it.
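To make the naming convention concrete, here is a hedged sketch contrasting a cluster with an ordinary subgraph. The subgraph names are hypothetical.

```dot
digraph clusters {
    subgraph cluster_parse {      // name begins with "cluster": dot draws a labeled box around a and b
        label = "parse";
        a -> b;
    }
    subgraph group_render {       // ordinary subgraph: groups c and d, but no box is drawn
        label = "render";
        c -> d;
    }
    b -> c;
}
```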
Rendering pipeline
The pipeline: source text → libcgraph parser (builds an in-memory graph
AST) → layout engine (assigns coordinates to every node and edge) →
renderer (serializes to the target format).
dot -Tdot swaps the image renderer for a DOT serializer and emits the
positioned graph as canonical DOT with coordinates, which is useful for
debugging layout and for verifying that a source file parses correctly.
Layout engines
| Engine | Algorithm | Use case |
|---|---|---|
| dot | Hierarchical (Sugiyama) | Directed graphs, layered diagrams; default for wal.sh |
| neato | Spring model (Kamada-Kawai) | Undirected graphs, network topologies |
| fdp | Force-directed (Fruchterman-Reingold) | Large undirected graphs |
| sfdp | Scalable force-directed | Very large graphs (10K+ nodes) |
| circo | Circular layout | Ring topologies, cyclic structures |
| twopi | Radial layout | Trees with a central root |
| osage | Clustered rectangles | Treemaps, space-filling |
| patchwork | Squarified treemap | Area-proportional visualization |
All 85 wal.sh diagrams use dot (hierarchical). The rankdir attribute
controls the primary axis: TB (top-to-bottom) for layer stacks, LR
(left-to-right) for pipelines and state machines.
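As a sketch of the rankdir choice, the pipeline below lays out left to right. The node names are illustrative only.

```dot
digraph pipeline {
    rankdir = LR;            // ranks run left to right; omit it (or set TB) for a layer stack
    node [shape=box];
    tangle -> edit -> compile -> detangle -> verify;
}
```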
Org-babel integration
ob-dot
Emacs ob-dot enables dot src blocks in org files:
#+begin_src dot :file diagram.png :exports results :cmdline -Tpng
digraph G { a -> b; }
#+end_src
:file is required — ob-dot writes to this path. :cmdline -Tpng passes
flags to the dot command. :exports results shows only the image in
HTML output, not the source.
Tangle/detangle workflow
The round-trip workflow for editing diagrams:
- Tangle: gmake tangle exports #+begin_src dot :tangle file.dot blocks from .org to standalone .dot files. With :comments link, the tangled file includes // [[file:...]] breadcrumbs.
- Edit: modify the .dot file directly (agents or manual).
- Compile: dot -Tpng file.dot -o file.png to verify it renders.
- Detangle: gmake detangle reads the breadcrumbs and pulls changes back into the org src blocks.
- Verify: re-tangle to confirm the round-trip is stable.
For this to work, graphviz-dot-mode must set comment-start to
"// " so org-babel knows how to write the breadcrumb comments. This is
configured in project-config.el:
(add-to-list 'org-babel-tangle-lang-exts '("dot" . "dot"))
(add-hook 'graphviz-dot-mode-hook
          (lambda ()
            (setq-local comment-start "// ")
            (setq-local comment-end "")))
Canonical validation
dot -Tdot parses a source file and emits the canonical form with
computed positions. Comments are stripped. This is useful for:
- Verifying a file parses without errors
- Comparing two files for structural equivalence (ignore formatting)
- Debugging layout issues (the pos attributes show exact coordinates)
Batch validation of all diagrams:
find site/research -name '*.dot' | while read f; do
  dot -Tdot "$f" > /dev/null 2>&1 || echo "FAIL: $f"
done
wal.sh diagram inventory
As of 2026-05-20: 85 dot files across site/research/, all rendering
without errors. See diagram-style-guide for the canonical palette and
the six diagram archetypes (pipeline, layer stack, comparison, state
machine, process loop, ecosystem map).
Emacs built-in support (31.0.50)
Emacs ships four relevant subsystems for DOT. Only graphviz-dot-mode, covered last, is an external package.
ob-dot (org-babel)
Built-in since Emacs 24. Evaluates #+begin_src dot blocks.
| Header arg | Default | Purpose |
|---|---|---|
| :results | file | Output is always a file (image) |
| :exports | results | Show image, not source |
| :file | (required) | Output path, extension sets format |
| :cmdline | -T<ext> | Passed to dot command |
| :cmd | dot | Layout engine (neato, fdp, etc.) |
The execute function writes the body to a temp file, calls
dot <tmpfile> <cmdline> -o <outfile>, and returns the file path.
Variables ($name) are interpolated from :var headers.
No session support — each evaluation is a fresh process.
ob-tangle + comment-start
Tangle writes DOT src blocks to standalone .dot files. With
:comments link, breadcrumb comments enable detangle (round-trip).
Requires comment-start set to "// " in the target buffer.
Our project-config.el registers this via graphviz-dot-mode-hook
and adds ("dot" . "dot") to org-babel-tangle-lang-exts.
wisent (LALR parser generator)
Built-in since Emacs 23 (part of CEDET/Semantic). Accepts BNF grammars
in .wy files, generates Emacs Lisp LALR(1) parsers.
The DOT grammar translates mechanically from the spec's EBNF:
| EBNF | wisent BNF |
|---|---|
| [ X ] | opt_X : /* empty */ \| X ; |
| ( A \| B ) | Separate alternatives |
| : production op | Same syntax |
Compile path: M-x semantic-grammar-batch-build-packages on a .wy
file produces a *-wy.el parser table.
semantic-lex (lexer framework)
Built-in companion to wisent. Defines lexer rules as define-lex-regex-analyzer
forms. The DOT lexer needs:
- // line comments (trivial)
- # preprocessor lines (trivial)
- /* */ block comments (via syntax table)
- Quoted strings with \" escape and backslash-NL continuation
- HTML strings with balanced <...> nesting
- Numeric IDs (-?(\.[0-9]+|[0-9]+(\.[0-9]*)?))
- Case-insensitive keywords (graph = Graph = GRAPH)
The first three are trivial; the last four are the real work.
graphviz-dot-mode (external package)
Not built-in. Installed from MELPA via project-config.el. Provides:
- Syntax highlighting for .dot files
- comment-start / comment-end (needed for tangle breadcrumbs)
- Indentation
- compile integration (C-c C-c runs dot)
Wisent grammar for DOT
The full wisent grammar is at ../../research/graphviz/dot.wy (if tangled
from the companion note). Production-by-production match with the spec
at graphviz.org/doc/info/lang.html, verified against 85 .dot files on
this site.
Key design choice: compass_pt ({n, ne, e, …}) is folded into id
rather than a separate token class. The closed set is enforced in a
semantic pass, not the grammar. This avoids a shift-reduce conflict
where a:n could be port-name or compass-point.
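The ambiguity is easiest to see with a record node that has a field named n. The sketch below uses hypothetical node and field names and does not claim how Graphviz itself resolves the last edge. It only shows why the grammar alone cannot decide.

```dot
digraph ports {
    node [shape=record];
    a [label="<n> north field|<out> out"];   // a record field deliberately named n
    b [label="<in> in"];
    a:out:e -> b:in:w;    // port name, then compass point: unambiguous
    a:n -> b;             // a single ID after ':' fits both readings: field n or compass point n
}
```

With compass_pt folded into id, both edges parse through the same port rule, and the choice between field and compass point is left to the semantic pass.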
References
- DOT Language specification
- Graphviz attribute reference
- Node shapes
- Color names and schemes
- Source releases (current upstream: 14.1.4)
- ob-dot documentation
- wal.sh diagram style guide
- wal.sh design system
