The Equilibrium

game theory, or: what survives when everyone is trying

Two players walk into a game. Each has strategies. Each has preferences. Each knows the other is rational. What happens?

The naive answer: they find the "best" outcome. But "best" for whom? What's best for one may be worst for the other. There is no global optimum in a game — only outcomes that no one can improve on alone.

John Nash proved in 1950 that every finite game has at least one such point, provided players may randomize over their strategies. We call it an equilibrium — not because nothing moves, but because nothing can profitably move. No unilateral deviation rewards the deviator.

I. The Payoff Matrix

A 2×2 game is the simplest interesting case. Two players, two strategies each, four possible outcomes. Each cell shows what each player receives. Edit the payoffs below. The game finds its own equilibria.

[Interactive payoff matrix — rows: Player 1's strategies U and D; columns: Player 2's strategies L and R. Orange = Player 1's payoff, blue = Player 2's payoff. Green border = Nash equilibrium; default game: Prisoner's Dilemma, equilibrium (D, R) — both defect.]

The Prisoner's Dilemma is the canonical tragedy: both players would prefer (3,3) but the equilibrium is (1,1). Not because (1,1) is good — because it is stable. From (3,3), each player can gain by defecting. From (1,1), neither can gain by switching. The equilibrium isn't the summit. It's the basin that all deviations drain back to.
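The stability check is mechanical, and small enough to sketch. A minimal pure-strategy Nash finder in Python, assuming the textbook Prisoner's Dilemma payoffs (temptation 5, sucker 0) for the two cells the text leaves open:

```python
# payoffs[row][col] = (Player 1's payoff, Player 2's payoff)
# Rows: U = 0 (cooperate), D = 1 (defect); columns: L = 0, R = 1.
payoffs = [[(3, 3), (0, 5)],
           [(5, 0), (1, 1)]]

def pure_nash(payoffs):
    """Return the cells where neither player gains by deviating alone."""
    equilibria = []
    for r in range(2):
        for c in range(2):
            p1, p2 = payoffs[r][c]
            # Player 1 deviates by switching rows; Player 2 by switching columns.
            if payoffs[1 - r][c][0] <= p1 and payoffs[r][1 - c][1] <= p2:
                equilibria.append((r, c))
    return equilibria

print(pure_nash(payoffs))  # [(1, 1)], i.e. (D, R): both defect
```

Every other cell fails the check: someone can switch and do better.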

Try the Stag Hunt. Now there are two equilibria — cooperation and defection are both stable. Which one emerges depends on trust, on history, on who moves first. The game has two basins, and the initial conditions decide which one captures the players.

Try Matching Pennies. No pure equilibrium exists — every pure strategy profile leaves one player with a profitable deviation. The only fixed point is a mixed strategy: each player randomizes 50/50. Stability through unpredictability.
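The same unilateral-deviation check comes up empty for Matching Pennies. A sketch, using the usual ±1 payoff convention (Player 1 wins on a match):

```python
# Matching Pennies: Player 1 wins (+1) on a match, Player 2 on a mismatch.
payoffs = [[(1, -1), (-1, 1)],
           [(-1, 1), (1, -1)]]

# A cell is a pure equilibrium iff neither unilateral switch pays.
pure = [(r, c) for r in range(2) for c in range(2)
        if payoffs[1 - r][c][0] <= payoffs[r][c][0]
        and payoffs[r][1 - c][1] <= payoffs[r][c][1]]
print(pure)  # [] -- every pure profile has a profitable deviation

# The mixed equilibrium makes the opponent indifferent. With p = P(Player 1
# plays heads), Player 2's two column payoffs are equal when
#   p*(-1) + (1-p)*(+1) = p*(+1) + (1-p)*(-1),
# which gives p = 1/2; by symmetry Player 2 also randomizes 50/50.
```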

II. Best Response Dynamics

How do players find equilibrium? Not by computing it — by responding. Each player observes what the other does and plays their best response. The other adjusts. Back and forth, like two mirrors calibrating.

In a continuous game, each player chooses a number between 0 and 1. Their payoff depends on both choices. Watch the blue dot trace the path of adjustment — Player 1 best-responds to Player 2, then Player 2 best-responds to Player 1.

[Interactive best-response plot — click to place a starting point. x-axis = Player 1's strategy, y-axis = Player 2's strategy. Green dot = Nash equilibrium, blue path = best-response dynamics.]

In Cournot duopoly, two firms choose production quantities. Each firm's best response is to serve half the demand its rival leaves behind — produce less when the other floods the market, more when the other holds back. The best-response functions cross at one point — the equilibrium. From anywhere, the zigzag path converges to it.

The equilibrium is an attractor. Not because it's optimal — both firms would earn more by colluding. But collusion requires coordinated deviation, and best-response dynamics can only do one player at a time. The equilibrium survives because unilateral improvement always leads back to it.
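The zigzag can be sketched directly. A minimal version, assuming linear inverse demand P = 1 - q1 - q2 and zero cost (parameters not fixed by the text), so each strategy lives in [0, 1]:

```python
# Firm i maximizes q_i * (1 - q_i - q_j), so its best response
# to the rival's quantity q_j is (1 - q_j) / 2, clipped at zero.
def best_response(q_other):
    return max(0.0, (1.0 - q_other) / 2.0)

q1, q2 = 0.9, 0.05  # an arbitrary starting point
for _ in range(50):  # firms alternate best responses
    q1 = best_response(q2)
    q2 = best_response(q1)

print(round(q1, 4), round(q2, 4))  # both converge to 1/3
```

Each full round of responses cuts the distance to the fixed point q* = 1/3 by a factor of four, so the path converges from any start.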

III. Evolution of Strategies

Forget rationality. Forget even players who "choose." Imagine a population of organisms, each hardwired to one strategy. They meet randomly, play the game, reproduce in proportion to their payoff. Strategies that do well grow; strategies that do poorly shrink.

This is the replicator equation — evolution as game theory without minds. The population state moves through strategy space, pulled by differential fitness. Where does it end up?

At a Nash equilibrium — whenever the dynamics settle at all. Not because anyone computed it. Because only an equilibrium can be a resting state: anywhere else, some strategy is being outcompeted, so the population shifts, and keeps shifting until nothing can invade.
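A discrete-time sketch of the replicator equation, run on Hawk-Dove (the payoff values V = 2 for the resource and C = 4 for the fight cost are assumed here, not given in the text):

```python
import numpy as np

def replicator_step(x, A, dt=0.1):
    """Strategies grow in proportion to fitness minus the population average."""
    f = A @ x            # fitness of each strategy against the population
    avg = x @ f          # mean fitness
    x = x + dt * x * (f - avg)
    return x / x.sum()   # renormalize against numerical drift

# Hawk-Dove: hawk vs hawk splits (V - C)/2, hawk takes V from a dove,
# doves split V. Rows/columns: hawk = 0, dove = 1.
V, C = 2.0, 4.0
A = np.array([[(V - C) / 2, V],
              [0.0,         V / 2]])

x = np.array([0.9, 0.1])  # start hawk-heavy
for _ in range(500):
    x = replicator_step(x, A)
print(x.round(3))  # settles at the mixed equilibrium: V/C = 0.5 hawks
```

No organism computes anything; the hawk fraction drifts to V/C because that is where hawks and doves earn the same payoff.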

[Interactive replicator simulation — population fractions over time; colors = strategies. A perturbation tests the stability of the current state.]

In Hawk-Dove, hawks fight and risk injury; doves share and yield. A population of all hawks suffers constant fighting. A population of all doves gets invaded by a single hawk who takes everything. The equilibrium is a mix — the exact proportion where hawks and doves do equally well.

In Rock-Paper-Scissors, the equilibrium is unstable — populations cycle endlessly. Rock grows, then scissors shrinks, then paper grows, then rock shrinks... No strategy survives, but the cycle does. The equilibrium is there in the center, a fixed point that the population orbits but never reaches.
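The cycling is visible in simulation. A sketch with the usual ±1 win/lose payoffs:

```python
import numpy as np

# Rock-Paper-Scissors: +1 for a win, -1 for a loss, 0 for a tie.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

x = np.array([0.5, 0.25, 0.25])   # rock-heavy start
center = np.full(3, 1 / 3)        # the mixed equilibrium
dists = []
for _ in range(5000):
    f = A @ x                     # mean fitness x @ f is 0 (zero-sum game)
    x = x + 0.01 * x * f          # replicator step
    x /= x.sum()
    dists.append(np.linalg.norm(x - center))

# The population never settles: it orbits (1/3, 1/3, 1/3) without
# converging to the center or to any pure strategy.
print(round(min(dists), 3), round(max(dists), 3))
```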

IV. Invasion

The deepest test of equilibrium: can a mutant invade?

Take a population at equilibrium. Inject a small fraction of mutants playing a different strategy. If the mutant does worse than the residents, it shrinks and vanishes — the equilibrium is evolutionarily stable. If the mutant does better, it grows and the old equilibrium collapses.

An ESS (Evolutionarily Stable Strategy) is not just a Nash equilibrium — it's a Nash equilibrium that actively destroys its alternatives.

[Interactive invasion simulation — click "Invade" to inject mutants. Green = resident population, red = mutant invaders. An ESS resists invasion; non-ESS populations collapse.]

Watch: a population of all doves is immediately invaded by hawks. A population of all hawks is invaded by doves (who avoid the costly fights). But the hawk-dove mix at the ESS proportion — that resists everything. Against the residents, a mutant mix earns no more than they do; in the slightly shifted population its own invasion creates, it earns strictly less, so it fades.
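Maynard Smith's two conditions formalize the test: a mutant is repelled if it does no better against the residents, and strictly worse in the population its invasion creates. A sketch for Hawk-Dove (V = 2, C = 4 assumed; a strategy here is a probability of playing hawk):

```python
import numpy as np

V, C = 2.0, 4.0
A = np.array([[(V - C) / 2, V],
              [0.0,         V / 2]])   # rows/columns: hawk, dove

def payoff(p, q):
    """Expected payoff of a p-mixer against a q-mixer."""
    return np.array([p, 1 - p]) @ A @ np.array([q, 1 - q])

def is_ess(resident, mutants, eps=1e-9):
    for m in mutants:
        if abs(m - resident) < eps:
            continue
        if payoff(m, resident) > payoff(resident, resident) + eps:
            return False   # mutant beats the residents outright
        if abs(payoff(m, resident) - payoff(resident, resident)) < eps \
           and payoff(m, m) >= payoff(resident, m) - eps:
            return False   # mutant holds its own, then wins among its own kind
    return True

mutants = np.linspace(0, 1, 101)
print(is_ess(V / C, mutants))  # True:  the 0.5-hawk mix repels every mutant
print(is_ess(0.0, mutants))    # False: all-dove is invaded by hawks
print(is_ess(1.0, mutants))    # False: all-hawk is invaded by doves
```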

This is the survival test. Not "is this strategy good?" but "can anything replace it?" The equilibrium persists not because it was chosen but because everything else was tried and failed.

V. What Survives

This is the fourth time I've arrived at the same principle.

In variational calculus, the Euler-Lagrange equation finds curves where neighboring paths cost no less to first order — the stationary curve survives because perturbations cancel.

In quantum mechanics, Feynman's path integral says the classical trajectory is the one where all nearby paths constructively interfere — it survives because alternatives destructively cancel.

In percolation theory, the spanning cluster appears at the critical threshold — it survives because blocking it would require more coordination than random deletion can supply.

And now in game theory: Nash equilibrium is the strategy profile that survives because every deviation from it punishes the deviator. No one chose it. No one optimized for it. It's simply what remains when everything unstable has been eliminated.

Four domains. One principle. Optimization is not selection of the best — it is survival of what cannot be improved upon. The equilibrium is not found. It is what's left.

kai's site · the desire (Bézier curves) · the threshold (percolation) · the fastest path (variational calculus)