Bug Archaeology: Unix V4 Games Through a 2025 Lens

Introduction

What would modern static analysis, fuzzing, and security review reveal about programs written in 1973? This document examines the Unix V4 games with the benefit of 50 years of software engineering knowledge. The goal is not to exploit these programs (hardly anyone runs them today) but to understand how far we've come.

The 1973 C Programming Landscape

Pre-ANSI C: A Different Language

The C used in Unix V4 predates the ANSI standard (1989) by 16 years. Key differences:

No Function Prototypes

/* 1973 style - no argument types */
main()
{
    printf("Hello");
}

foo(a, b)
int a;
char *b;
{
    /* body */
}

/* Modern style (post-1989) */
int main(void) {
    printf("Hello\n");
    return 0;
}

int foo(int a, char *b) {
    /* body */
}

Without prototypes, the compiler cannot verify:

  • Correct number of arguments
  • Correct types of arguments
  • Return type expectations

Implicit Int

Everything defaults to int if not specified:

/* 1973: register variables default to int */
register i, j, k;

/* 1973: functions return int by default */
getchar()
{
    /* returns int implicitly */
}

K&R Declaration Style

/* Function parameters declared after signature */
copy(s1, s2)
char *s1, *s2;
{
    while(*s1++ = *s2++);
}

PDP-11 Specific Quirks

16-bit Word Size

  • int is 16 bits (range: -32768 to 32767)
  • Pointers are 16 bits
  • Maximum address space per process: 64KB

Octal Notation

PDP-11 hardware addresses are traditionally written in octal:

/* From Unix V4 kernel main.c */
int lksp[]
{
    0177546,   /* Clock address (octal!) */
    0172540,
    0
};

Memory Protection Often Ignored

The PDP-11/45 had a memory management unit, but many programs of the era were written as if they had:

  • Direct memory access
  • No address space separation
  • Single-user environment

Observed Behaviors in Testing

Games That Validate Input

Game   Input tested           Response
moo    aaaa (letters)         "bad guess"
moo    99999999 (too long)    "bad guess"
ttt    10 (out of range)      "Illegal move"
ttt    abc (letters)          "Illegal move"
chess  z9z9 (invalid)         "eh?"
wump   999 (invalid room)     "You hit wall"

Interesting Edge Cases

Wump with Negative Room Numbers

Move or shoot (m-s) s
Give list of rooms terminated by 0
-1
You are in room 1
There are tunnel0

The output garbling suggests possible signed/unsigned confusion when processing negative input.

TTT Learning System

Accumulated knowledge? n
0 'bits' of knowledge
...
134 'bits' returned

After playing, TTT wrote 134 "bits" to /usr/games/ttt.k. This file persists—a primitive form of machine learning that could be:

  • Corrupted by concurrent access (no file locking)
  • Filled with adversarial data by a malicious player

Likely Bug Classes (Based on Era Practices)

1. Buffer Overflows

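The bounded fgets() arrived later with the standard I/O library. Early C programs commonly read input into fixed-size buffers with no length check. Written with later library calls, the pattern looks like this:

/* Dangerous pattern: unbounded reads into a fixed buffer */
char buf[20];
scanf("%s", buf);    /* No length limit */
gets(buf);           /* Even worse: removed from the C standard in C11 (2011) */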

Modern equivalent with protection:

char buf[20];
if (fgets(buf, sizeof(buf), stdin) == NULL) {
    /* handle error */
}

The chess program accepted our 80+ character input without crashing, likely because:

  • The terminal driver truncates lines
  • Unix V4's canonical mode buffers only ~256 chars

2. Integer Overflow

With 16-bit int, overflow is easy:

/* Hypothetical: bj (blackjack) tracking money in a 16-bit int */
int money = 32000;
money = money + 1000;  /* 33000 does not fit in 16 bits; wraps to -32536 */

The blackjack game's "Action $" prompt suggests currency tracking, so a run of large bets could push the balance past 32767 and wrap it negative.

3. Uninitialized Variables

C has never required variables to be initialized, and compilers of the era gave no warning when one was read before being set:

int score;          /* Indeterminate if this is a local variable */
printf("%d", score);

The tools of 1973 would not flag this. The value printed depends on whatever happened to be in that memory location.

4. Format String Issues

The games use printf extensively. Format string attacks were not publicly described until the late 1990s, but the risky pattern was already possible:

/* Potential issue if name came from user input */
printf(name);  /* Should be printf("%s", name); */

5. File Descriptor Leaks

If close() is not called consistently, descriptors leak:

fd = open("/usr/lib/book", 0);
/* read chess opening book */
/* forgot to close(fd) */

Unix V4 allowed only about 15 open file descriptors per process, so a long chess session that reopened the book without closing it could exhaust them.

6. Signal Handling

Unix V4's signal handling was primitive:

signal(SIGINT, handler);
/* In handler: */
handler()
{
    /* Can be called again before returning! */
    /* No sigprocmask() to block re-entry */
}

Pressing Ctrl-C during a game write could corrupt state.

7. Race Conditions in TTT Learning

The ttt game reads/writes /usr/games/ttt.k:

/* Probable pattern (no source available) */
fd = open("/usr/games/ttt.k", 2);
read(fd, knowledge, sizeof(knowledge));
/* ... play game ... */
lseek(fd, 0, 0);
write(fd, knowledge, sizeof(knowledge));

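Two simultaneous players could corrupt the knowledge base: each reads the file at startup, and whichever writes last silently overwrites what the other learned. There's no file locking (flock didn't exist yet).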

What 1973 Programmers Got Right

Despite the limitations, these games show good practices for the era:

1. Input Validation Exists

All games reject clearly invalid input. This wasn't universal—many programs crashed on bad input.

2. Graceful Error Messages

"bad guess", "Illegal move", "eh?" are user-friendly compared to segmentation faults or silent corruption.

3. Compact Code

The entire games collection fits in 26KB. Efficiency wasn't optional when each process had a 64KB address space.

4. Separation of Data and Code

The chess opening book (/usr/lib/book) and ttt knowledge file show data-driven design—easy to update without recompiling.

Modern Analysis Techniques (Not Available in 1973)

Technique        Invented  Would find
Fuzz testing     1988      Buffer overflows, crashes
Static analysis  1970s+    Uninitialized vars, null derefs
Valgrind/ASan    2000+     Memory leaks, buffer overruns
Code coverage    1963*     Untested paths (*primitive forms)
Formal methods   1970s+    Logic errors, race conditions

If we could run these binaries through modern tools:

# Hypothetical (would need PDP-11 Valgrind!)
valgrind --leak-check=full /usr/games/chess

The Security Non-Context

It's important to note: defending against hostile users was not a primary concern for these programs in 1973.

  • Unix V4 ran on a single PDP-11
  • Users were trusted Bell Labs employees
  • No network connectivity
  • Physical access = full access
  • The threat model was "prevent accidents" not "prevent attacks"

Buffer overflow exploits did not reach wide public attention until the Morris Worm (1988), 15 years later.

Conclusion

These games represent excellent engineering for their era. The bugs that likely exist (buffer overflows, integer issues, race conditions) reflect the tools and knowledge available in 1973, not programmer incompetence.

Thompson and Ritchie were inventing the discipline of systems programming as they went. That we can examine their programs 52 years later, find them mostly robust to our casual fuzzing, and still play the games is the remarkable thing.

The bugs tell us less about 1973 programmers than about how far defensive programming has advanced. Every "obvious" practice we take for granted—bounds checking, fuzzing, static analysis, memory safety—had to be invented, often after painful lessons.

Author: Jason Walsh

jwalsh@nexus

Last Updated: 2025-12-24 23:57:09
