Editorial illustration for "Erlang's Process Model Solves a Problem Most Languages Are Still Pretending Doesn't Exist"

Erlang's Process Model Solves a Problem Most Languages Are Still Pretending Doesn't Exist


Lesson 7 of 12: Concurrency as Architecture

Here's a number that should bother you: the Ericsson AXD301 telephone switch, built on Erlang, was reported to achieve nine nines of availability — 99.9999999% uptime. That's roughly 31 milliseconds of downtime per year. Not per day. Per year.

Before you dismiss that as telecom trivia, ask yourself: what does your current language give you when a thread crashes? In most cases, the answer is "it depends on whether you remembered to write the right try/catch." That gap — between what Erlang was designed to guarantee and what most languages leave to programmer discipline — is exactly what this lesson is about.


The Problem Erlang Was Built to Solve

In the mid-1980s, Ericsson engineers were building telephone switching systems. The requirements were brutal: handle hundreds of thousands of simultaneous calls, never go down, and recover from hardware failures without dropping active connections. The existing tools — C, mostly — required shared memory and locks to manage concurrency. Shared memory means any component can corrupt any other component's state. Locks mean deadlocks. Deadlocks mean downtime. Downtime in a telephone exchange means someone's emergency call doesn't go through.

Joe Armstrong, Mike Williams, and Robert Virding built Erlang specifically to make that class of failure structurally impossible. The language shipped internally at Ericsson around 1986 and was open-sourced in 1998. The design decisions they made weren't aesthetic preferences — they were engineering constraints imposed by the problem domain.

Understanding those constraints is the lesson.


Isolated Processes Are the Entire Point

Most concurrency models give you threads that share memory. Erlang gives you processes that share nothing.

An Erlang process is not an OS thread. It's a lightweight unit managed by the BEAM virtual machine — you can spawn hundreds of thousands of them on a single machine without meaningful overhead. Each process has its own heap. It cannot read or write another process's memory. The only way two processes communicate is by sending messages, which are copied between heaps.

This is the concrete idea to hold onto: isolation by default, communication by copying.

The theoretical foundation for this model traces back to Tony Hoare's Communicating Sequential Processes (CSP), a formal mathematical framework for describing concurrent systems through message-passing channels. Erlang didn't implement CSP directly, but the core intuition — that concurrent entities should communicate through explicit messages rather than shared state — runs through both.

Why does isolation matter so much? Because when a process crashes in Erlang, it crashes alone. Its heap disappears. No other process's memory is corrupted. The failure is contained by construction, not by convention.

Compare that to a multithreaded Java or C++ program. A bug in one thread can scribble over memory that another thread is reading. The crash, when it comes, may be far removed from the cause. Debugging it is archaeology. In Erlang, the process that failed is the process that failed. The stack trace points directly at the problem.


Supervision Trees Turn Failure Into Policy

Isolation would be merely interesting if Erlang stopped there. What makes it powerful is what happens after a process crashes.

Erlang has a built-in concept of process linking and supervision. You can link processes so that when one dies, others are notified. More importantly, you can build supervision trees: hierarchical structures where supervisor processes monitor worker processes and restart them according to configurable strategies when they fail.

This is the "let it crash" philosophy that Erlang programmers talk about. It sounds reckless until you understand the model. Because processes are isolated, crashing a broken process and restarting it fresh is often the correct recovery strategy — the same way rebooting a microservice is often faster than trying to diagnose corrupted in-memory state. The difference is that in Erlang, this recovery is designed in from the start, not bolted on with a process manager and a prayer.

Concurrent Programming in Erlang — the foundational text by Armstrong and colleagues — frames the entire design around building "real-time and distributed fault-tolerant systems" where the goal is modular systems that are easy to specify, design, and test. The supervision tree is what makes that modularity operational at runtime, not just at design time.

I'd argue this is the insight most modern distributed systems are still rediscovering. Kubernetes, circuit breakers, health checks, pod restarts — these are all external infrastructure approximating what Erlang baked into the language runtime in the 1980s. The difference is that Erlang's version is composable at the code level, not the deployment level.


What This Model Makes Possible That Others Can't Match Cleanly

Three capabilities fall out of this design that are genuinely hard to replicate in shared-memory concurrency models:

Hot code loading. Because processes communicate only through messages and hold no shared state, you can upgrade a running system by swapping in new code for a module while old processes finish their current work. Telephone switches cannot be taken offline for deployments. Erlang solved this at the language level.

Distribution as a first-class concern. Sending a message to a process on a remote node looks identical to sending one locally. The network is not an afterthought — it's modeled the same way as local concurrency. This is why Elixir (which runs on the BEAM) became popular for distributed systems work; it inherited this property directly.

Predictable failure semantics. When you design a system in Erlang, you're forced to think about what happens when each component fails, because the supervision tree requires you to declare a restart strategy. That's not a burden — it's a design discipline that most other languages let you skip until production teaches you not to.


Why This Matters Even If You Never Write Erlang

The process model is a design idea, not just a language feature. Go's goroutines and channels are a lighter-weight cousin of the same intuition. Rust's ownership model attacks a related problem — preventing memory corruption — from a different angle. Actor frameworks in Scala and Java (Akka, most visibly) are explicit attempts to bring Erlang-style isolation to the JVM.

None of them are Erlang. The BEAM's scheduler, the supervision tree primitives, the hot code loading — these remain distinctive. But the underlying principle travels: explicit message passing between isolated units produces systems that fail predictably and recover gracefully.

That principle is worth internalizing regardless of your daily language.


Your Next Action

Take one component in a system you're currently working on — a service, a module, a background job — and write down what happens when it crashes. Not what should happen. What actually happens, given your current code. Then ask: is that behavior specified in the code, or is it implicit in the infrastructure around it?

If the answer is "mostly infrastructure," you've just found the gap that Erlang was designed to close. Whether you close it with Erlang, Elixir, a supervision library, or just better-structured error handling is secondary. Seeing the gap clearly is the first step.