Forth Teaches You That Readability Is a Choice, Not a Requirement

Origins: One Programmer's Refusal to Accept Overhead

In the late 1960s, Charles Moore was refining a personal programming system that he would soon use to control radio telescopes at the National Radio Astronomy Observatory. The machines he worked with were slow, memory was scarce, and the existing tools (assembler, early FORTRAN) felt like fighting the hardware rather than directing it. So he built his own system.

According to Wikipedia's history of programming languages, FORTH is the earliest concatenative programming language, designed by Moore as a personal development system. The name itself was a truncation: Moore wanted to call it FOURTH, as in software for the fourth generation of computers, but the IBM 1130 operating system he was using limited file names to five characters.

That constraint is almost too on-the-nose. Forth was born under pressure, and pressure shaped everything about it.

Moore's core insight was radical: a programming language doesn't need to do much. It needs a stack, a dictionary of named operations, and the ability to define new operations in terms of old ones. That's it. Everything else — type systems, memory management, syntax rules, standard libraries — is overhead that someone decided to add. Forth simply refused to add it.

Design Philosophy: The Stack Is the Only Data Structure You Need

Here's a small Forth program that adds two numbers and prints the result:

5 3 + .

That's it. You push 5 onto the stack, push 3, call + (which pops both and pushes their sum), then call . (which pops and prints). No variables declared. No function signatures. No parentheses negotiating operator precedence. The stack is implicit, always present, always the medium through which everything flows.

This is called Reverse Polish Notation, and it's either immediately obvious or permanently baffling depending on how your brain is wired. The point isn't that RPN is more elegant — it's that it eliminates an entire category of parsing complexity. The interpreter never has to figure out what you meant. You said exactly what you meant, in execution order.
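To make the "no parsing" point concrete, here is a minimal sketch in Python of how postfix input can be evaluated. It is an illustration, not how any particular Forth system is implemented; the names `eval_rpn` and `BINARY_OPS` are invented for this example, and it handles only integers, three arithmetic words, and `.`.

```python
import operator

# Hypothetical table for this sketch: each binary word maps to a Python function.
BINARY_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_rpn(source):
    """Process whitespace-separated tokens strictly left to right on one stack."""
    stack, printed = [], []
    for token in source.split():
        if token in BINARY_OPS:
            right = stack.pop()          # top of stack is the right operand
            left = stack.pop()
            stack.append(BINARY_OPS[token](left, right))
        elif token == ".":
            printed.append(stack.pop())  # "." pops the top value and prints it
        else:
            stack.append(int(token))     # anything else is treated as an integer
    return stack, printed

_, printed = eval_rpn("5 3 + .")
print(*printed)  # prints 8

_, printed = eval_rpn("2 3 4 * + .")
print(*printed)  # prints 14
```

The loop has no precedence table and no lookahead. The second call gives the result that infix `2 + 3 * 4` would, but the ordering is spelled out by the programmer rather than inferred by a parser.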

Defining new words (Forth's term for functions) looks like this:

: SQUARE DUP * ;

You've just defined SQUARE as "duplicate the top of the stack, then multiply." From now on SQUARE sits in the same dictionary as the built-in words and is invoked exactly the same way; nothing at the call site distinguishes it from them. This is what Wikipedia's compiler history article means when it notes that Forth is an extensible programming language: the boundary between "language" and "program" is deliberately blurred. You're not just writing in Forth; you're continuously extending it.

The practical consequence is that Forth programs tend to read like a domain-specific language someone built for exactly this problem. A program controlling a telescope doesn't look like generic software — it looks like a vocabulary for telescopes. That's a genuine design achievement. It's also where the readability tradeoff bites hard.

The Tradeoff Nobody Warns You About

Here's the honest version: Forth code written by someone else is often nearly unreadable. Not because the language is obscure, but because its minimalism puts almost all the burden of clarity on the programmer's naming choices. When every operation is a one-word definition stacked on other one-word definitions, the semantic distance between the code and its intent can be enormous.

Consider this: SQUARE is a good name. But Forth programmers under hardware pressure often write things like 2DUP OVER SWAP ROT in rapid succession, and the reader has to mentally simulate the stack, word by word, to follow what's happening. There's no syntax to lean on. There are no type annotations to hint at intent. The language trusts you completely, which means it also abandons you completely.
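The mental simulation can be made explicit. The sketch below models four standard stack-shuffling words in Python and records the stack after each one. The stack effects in the comments follow the conventional Forth notation; the helper names (`two_dup`, `trace`, `WORDS`) and the starting values are assumptions made up for this illustration.

```python
def two_dup(s):   # 2DUP ( a b -- a b a b )
    s.extend(s[-2:])

def over(s):      # OVER ( a b -- a b a )
    s.append(s[-2])

def swap(s):      # SWAP ( a b -- b a )
    s[-1], s[-2] = s[-2], s[-1]

def rot(s):       # ROT ( a b c -- b c a )
    s.append(s.pop(-3))

# Hypothetical lookup table for this sketch only.
WORDS = {"2DUP": two_dup, "OVER": over, "SWAP": swap, "ROT": rot}

def trace(source, stack):
    """Run each word in order and record a snapshot of the stack after it."""
    stack = list(stack)  # copy so the caller's list is left untouched
    steps = []
    for word in source.split():
        WORDS[word](stack)
        steps.append((word, list(stack)))
    return steps

# The top of the stack is the right-hand end of each printed list.
for word, snapshot in trace("2DUP OVER SWAP ROT", [10, 20]):
    print(f"{word:<5} {snapshot}")
```

Starting from `[10, 20]`, the four words leave `[10, 20, 10, 20, 10]`. Nothing in the source text hints at that; the reader has to run the same bookkeeping in their head.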

I'd argue this is the central lesson Forth teaches: readability is not a property of code, it's a property of the relationship between code and reader. Most modern languages make readability easier by constraining what you can express and how you can express it. Forth makes no such bargain. It gives you maximum power — Rosetta Code notes that Forth can function as both an interactive shell and a compiler in the same session, which is genuinely unusual — and leaves the legibility problem entirely to you.

This is why Forth survives in embedded systems, firmware, and hardware control code, where the programmer and the reader are often the same person, working on a codebase small enough to hold in one head. In those contexts, the tradeoff is worth it. The language gets out of the way, the binary is tiny, and the system runs on hardware with far too little memory to host a Python runtime.

Why It Matters: What Forth Reveals About Every Language You Use

Every language you use daily has made choices that Forth refused to make. Rust's borrow checker is overhead — overhead that prevents entire classes of bugs. Python's readable syntax is overhead — overhead that makes code accessible to more people. TypeScript's type system is overhead — overhead that catches errors before runtime.

None of these are free. They cost compilation time, runtime performance, cognitive load, or all three. Most of the time, the tradeoff is obviously worth it. But Forth forces you to ask the question explicitly: what is this feature actually buying me, and what am I paying for it?

That question is worth asking about every abstraction you reach for. When you add a framework, a type annotation, a design pattern — you're making a Forth-style tradeoff in reverse. You're accepting overhead in exchange for something: safety, clarity, maintainability, team velocity. Forth just makes the cost of not accepting that overhead visible in a way that comfortable, high-level languages never do.

Moore built a language that fits in a few kilobytes and can bootstrap itself. That's not a curiosity. That's a proof of concept about what a language actually requires to exist.

Your Next Step

Write a Forth interpreter. Not a complete one — just enough to handle integer literals, +, -, *, ., and word definition with : name ... ;. It will take you a few hours in whatever language you're comfortable with, and by the end you'll understand more about how language runtimes work than most tutorials will teach you. The Rosetta Code Forth category has dozens of small programs you can use as test cases once your interpreter runs. The constraint is the lesson: when you have to implement the whole thing yourself, you stop taking any of it for granted.
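As a possible starting point, here is a rough sketch of such an interpreter in Python. It is one way to approach the exercise, not a reference design. The class name `MiniForth` and its methods are invented for this example, and `DUP` is added beyond the list above so that a definition like SQUARE has something to work with. It also takes a shortcut that a real Forth does not: a definition is stored as raw tokens and looked up again each time it runs, rather than being compiled when it is defined.

```python
import operator

class MiniForth:
    """Toy interpreter: integers, + - * . DUP, and ': name ... ;' definitions."""

    def __init__(self):
        self.stack = []
        self.output = []
        self.user_words = {}  # name -> list of tokens (assumption: stored uncompiled)
        self.builtins = {
            "+": lambda: self._binary(operator.add),
            "-": lambda: self._binary(operator.sub),
            "*": lambda: self._binary(operator.mul),
            ".": lambda: self.output.append(str(self.stack.pop())),
            "DUP": lambda: self.stack.append(self.stack[-1]),
        }

    def _binary(self, fn):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(fn(left, right))

    def _execute(self, token):
        word = token.upper()
        if word in self.user_words:       # user definitions shadow built-ins
            for inner in self.user_words[word]:
                self._execute(inner)
        elif word in self.builtins:
            self.builtins[word]()
        else:
            self.stack.append(int(token))  # ValueError here means an unknown word

    def run(self, source):
        """Interpret source and return whatever '.' printed during this call."""
        start = len(self.output)
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                name = next(tokens).upper()
                body = []
                for inner in tokens:       # consume tokens up to the closing ';'
                    if inner == ";":
                        break
                    body.append(inner)
                self.user_words[name] = body
            else:
                self._execute(token)
        return self.output[start:]

vm = MiniForth()
print(vm.run(": SQUARE DUP * ; 7 SQUARE ."))  # prints ['49']
print(vm.run("5 3 + ."))                      # prints ['8']
```

The sketch leaves plenty to fix as exercises: stack underflow raises a bare `IndexError`, a definition missing its `;` silently swallows the rest of the input, and a word that refers to itself recurses until Python gives up.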