Chapter 05.01 · Phase v

The superposition builder

Stack eigenstates with complex coefficients

Superposition & Time · 1 of 4 in phase

A stylised pair of hydrogen orbital shapes blending into a hybrid lobe, suggesting two pure states summing into a single superposition. — Image: Wikimedia Commons · CC BY-SA 3.0 · Jubobroff J.Bobroff and full list in credits

In the winter of 1926, a young physicist in Cambridge noticed that the new quantum mechanics had an algebraic structure nobody had quite spoken out loud: the states of a system live in a vector space, and any two valid states can be added together to produce a third. He turned that observation into a notation, and the notation turned quantum mechanics into a language.

By the autumn of 1926, the new quantum mechanics had two rival formalisms and a deepening sense of unease. Heisenberg’s matrix mechanics, born the previous summer in Göttingen, computed transition probabilities by multiplying square arrays of numbers indexed by quantum states. Schrödinger’s wave mechanics, written in Zürich over the snowy December of 1925, propagated a continuous complex field through space and time. The two formulations gave the same answers to every experiment anyone could think of. They looked nothing alike. The young physicists who used them daily had a private suspicion that the same theory was being written in two completely different alphabets.

Paul Dirac, a twenty-four-year-old electrical engineer turned theorist at St John’s College, Cambridge, did not have a strong opinion about which alphabet was preferable. He had spent the previous year quietly absorbing both. What interested him was a structural feature both schemes shared but neither emphasised: the equations were linear. If you had two solutions, you could add them, and the sum was also a solution. Multiply a solution by a complex number, and that was a solution too. The set of valid quantum states was closed under addition and scalar multiplication. It was, in the language of nineteenth-century mathematicians, a vector space.

That sounds like a technicality. It is not. It is the rule from which almost every strange feature of quantum mechanics descends. A particle that could be in box A or in box B can also be in a state that is part box A and part box B, and the parts can have complex weights, and those weights interfere with each other when you try to measure the particle. The Schrödinger cat that is part alive and part dead is not a paradox; it is what linearity guarantees, applied to an absurdly macroscopic example. The qubit of a quantum computer that is part zero and part one is not science fiction; it is what linearity allows whenever you have a two-state system. Dirac saw that the principle was the same in all these cases, and he later wrote the rule down in a notation so compact and so beautiful that it became the universal handwriting of the field.

Start with something simpler than an electron. Imagine a coin that has been flipped but not yet landed. While it is tumbling, you might say it is in a fifty-fifty state: half heads, half tails. In classical probability that means you do not know which face is up, but one of them definitely is. The randomness lives in your ignorance, not in the coin. Quantum superposition is different. A quantum coin in a fifty-fifty superposition does not have a definite face that you happen to be ignorant of. It is, in a sense the rest of this chapter will try to make precise, genuinely both, and the two parts can cancel or reinforce each other in ways no classical mixture ever could.

The mathematical machinery starts from a single rule. Call the two pure faces of the coin |heads⟩ and |tails⟩. The vertical bar and angle bracket are Dirac’s symbols; together they form a ket, which denotes a quantum state. The rule says: for any two complex numbers c₁ and c₂, not both zero, the linear combination

|ψ⟩ = c₁ |heads⟩ + c₂ |tails⟩

is also a valid state of the coin. Pick c₁ = 1 and c₂ = 0 and you get back pure heads. Pick c₁ = 0 and c₂ = 1 and you get pure tails. Pick c₁ = c₂ = 1/√2 and you get the symmetric superposition the popular press loves to call “both at once.” The point Dirac wanted to drive home is that every choice of two complex numbers (other than both zero) gives a perfectly legal quantum state. The state space of a quantum coin is not a set of two options. It is a two-dimensional complex vector space, a continuous infinity of distinct rays, each of which represents something a quantum coin can do.

What anchors all of this physics is a second rule, this one due to Max Born in the summer of 1926: the squared modulus of each coefficient is a probability. If you measure which face is up, the chance of getting heads is |c₁|², and the chance of getting tails is |c₂|². For the symmetric superposition above, |c₁|² = |c₂|² = 1/2, exactly as you would naively guess. For a more lopsided state like (3/5)|heads⟩ + (4/5)|tails⟩, the chances are 9/25 and 16/25. Notice that these two numbers add up to 1. They always must, because something definite has to happen when you look. That requirement is called normalisation: |c₁|² + |c₂|² = 1 for any legitimate state.
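The Born rule and the normalisation condition are short enough to check by hand, but a few lines of Python make the bookkeeping explicit. This is a minimal sketch using only the standard library; the helper names `normalise` and `born_probabilities` are illustrative choices, not anything defined elsewhere in this book.

```python
import math


def normalise(c1: complex, c2: complex) -> tuple[complex, complex]:
    """Rescale (c1, c2) so that |c1|^2 + |c2|^2 = 1."""
    norm = math.sqrt(abs(c1) ** 2 + abs(c2) ** 2)
    if norm == 0:
        raise ValueError("the zero vector is not a physical state")
    return c1 / norm, c2 / norm


def born_probabilities(c1: complex, c2: complex) -> tuple[float, float]:
    """Return (P(heads), P(tails)) for an already normalised state."""
    return abs(c1) ** 2, abs(c2) ** 2


# The lopsided example from the text: (3/5)|heads> + (4/5)|tails>.
p_heads, p_tails = born_probabilities(3 / 5, 4 / 5)
print(p_heads, p_tails)  # close to 9/25 and 16/25, up to float rounding

# An unnormalised pair with a complex weight is rescaled before use.
c1, c2 = normalise(1, 2j)
print(born_probabilities(c1, c2))  # close to (1/5, 4/5)
```

Note that the second example has an imaginary coefficient, and the probabilities do not care: only the moduli enter the Born rule.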

If probabilities only depend on |c₁|² and |c₂|², a sceptical reader will ask the obvious question: why bother carrying the complex numbers at all? Why not just store the two probabilities and be done with it? The answer is the single most important fact in elementary quantum mechanics, and it is the reason superposition is more than fancy probability. The relative phase between c₁ and c₂ is observable. Not in any single coin flip, no, but in the patterns you see when you ask the right questions, or when you let the superposition sit and evolve, or when you recombine it with a copy of itself.

The classic example, the one Richard Feynman said contains “the heart of quantum mechanics,” is the double slit. A single particle passes through two openings on its way to a screen. If you describe the particle by an amplitude through slit A and an amplitude through slit B and add them, the bright and dark fringes on the screen are exactly what the formula predicts when the two amplitudes are complex numbers with a definite relative phase. Cover one slit and the fringes vanish; the surviving amplitude gives just a smooth bump. Cover the other slit and you get a different bump in a slightly different place. With both slits open, writing c₁ and c₂ for the two amplitudes to reach a given point on the screen, the sum c₁ + c₂ has a squared modulus |c₁ + c₂|² = |c₁|² + |c₂|² + 2 Re(c₁* c₂), and that last cross term, the interference term, is what produces the stripes. Strip the relative phase out of the formalism and the cross term dies. The stripes are the phase made visible.
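The identity |c₁ + c₂|² = |c₁|² + |c₂|² + 2 Re(c₁* c₂) can be checked numerically by holding the two magnitudes fixed and sweeping the relative phase. The sketch below is an illustration only: the magnitudes are arbitrary values standing in for the amplitudes at one point on the screen, and `intensity_terms` is a hypothetical helper name.

```python
import cmath
import math


def intensity_terms(c1: complex, c2: complex) -> tuple[float, float]:
    """Split |c1 + c2|^2 into (classical part, interference part)."""
    classical = abs(c1) ** 2 + abs(c2) ** 2
    interference = 2 * (complex(c1).conjugate() * c2).real
    return classical, interference


magnitude = 0.5  # arbitrary illustrative magnitude for both amplitudes
for phase in (0.0, math.pi / 2, math.pi):
    c1 = complex(magnitude)
    c2 = magnitude * cmath.exp(1j * phase)
    classical, interference = intensity_terms(c1, c2)
    total = abs(c1 + c2) ** 2  # direct evaluation, for comparison
    print(f"phase={phase:.3f}  classical={classical:.3f}  "
          f"interference={interference:+.3f}  total={total:.3f}")
```

The classical part is the same on every row. Only the interference column changes as the phase sweeps from 0 to π, running from fully constructive to fully destructive.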

Quantum superposition is a fundamental principle of quantum mechanics that states that linear combinations of solutions to the Schrödinger equation are also solutions of the Schrödinger equation. This follows from the fact that the Schrödinger equation is a linear differential equation in time and position. More precisely, the state of a system is given by a linear combination of all the eigenfunctions of the Schrödinger equation governing that system.

From Wikipedia, “Quantum superposition” · https://en.wikipedia.org/wiki/Quantum_superposition · CC BY-SA 4.0

This is the place where Dirac’s notation pays for itself. Once you have grown comfortable with kets and the rule that any complex combination of states is again a state, you can also write down their duals, the bras ⟨φ|, and you can pair a bra with a ket to form an inner product ⟨φ|ψ⟩. The inner product is itself a complex number. Its squared modulus is the probability of finding the system in state |φ⟩ if it was prepared in state |ψ⟩. The whole language of probability amplitudes, eigenstate decompositions, measurement statistics, even Heisenberg’s matrix products, snaps into a single shorthand. ⟨ψ|ψ⟩ = 1 is the normalisation condition. ⟨heads|tails⟩ = 0 says the two faces of the coin are distinguishable, which is to say orthogonal. ⟨heads|ψ⟩ is the amplitude to find the system in heads, given that it is in ψ. Square it and you have Born’s probability.
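In a finite basis, a ket is just a list of complex components and the bra is the same list with every entry conjugated. A rough sketch of the inner product under that assumption follows; the function name `inner` and the sample state are illustrative, not taken from the text.

```python
def inner(phi: list[complex], psi: list[complex]) -> complex:
    """<phi|psi>: conjugate the bra's components, multiply, and sum."""
    if len(phi) != len(psi):
        raise ValueError("bra and ket must live in the same space")
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


heads = [1 + 0j, 0j]
tails = [0j, 1 + 0j]
psi = [0.6 + 0j, 0.8j]  # (3/5)|heads> + (4i/5)|tails>, a made-up example

norm_sq = inner(psi, psi)        # normalisation: should be 1
overlap = inner(heads, tails)    # orthogonality: should be 0
amp_heads = inner(heads, psi)    # amplitude to find heads
prob_heads = abs(amp_heads) ** 2 # Born probability for heads

print(norm_sq, overlap, amp_heads, prob_heads)
```

The conjugation on the bra side is what makes ⟨ψ|ψ⟩ real and non-negative even when the components are complex.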

Dirac himself, in his 1930 textbook, refused to give long verbal explanations of any of this. He just defined the symbols and let the algebra do the talking. There is a famous story, told in several versions, that after one of his lectures a listener said he did not understand an equation on the blackboard. Dirac stayed silent for a long time, and when prompted to answer he said, “That is not a question, that is a statement.” The first chapter of his textbook is about superposition.

Gotcha: a state and a unit vector are not quite the same

Two ket vectors that differ only by an overall complex phase, like |ψ⟩ and e^{iθ}|ψ⟩, represent the same physical state. All measurable quantities, every Born probability, every expectation value, depend only on the ratios of coefficients in a superposition, not on a global phase out front. Students sometimes lose hours wondering whether a calculation that produced an extra factor of i is wrong; usually it is not. What matters is the relative phase between basis states inside a superposition. The overall phase of the whole vector is unmeasurable. The space of physical states is technically not a vector space but a projective space, called the projective Hilbert space. In practice the distinction shows up rarely and you can mostly ignore it until you study geometric phases.
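The difference between a global and a relative phase is easy to see numerically. The sketch below measures the coin in a second basis, asking for the probability of finding it in (|heads⟩ + |tails⟩)/√2. That measurement basis, the helper `prob_plus`, and the angle 1.234 are all assumptions introduced purely for illustration.

```python
import cmath
import math


def prob_plus(c1: complex, c2: complex) -> float:
    """Probability of finding c1|heads> + c2|tails> in (|heads> + |tails>)/sqrt(2)."""
    amplitude = (c1 + c2) / math.sqrt(2)
    return abs(amplitude) ** 2


c1 = c2 = 1 / math.sqrt(2)
theta = 1.234  # arbitrary angle
phase = cmath.exp(1j * theta)

baseline = prob_plus(c1, c2)
global_shift = prob_plus(phase * c1, phase * c2)  # phase on the whole vector
relative_shift = prob_plus(c1, phase * c2)        # phase on one branch only

print(baseline, global_shift, relative_shift)
```

The first two numbers agree to rounding error, because the common factor e^{iθ} has unit modulus and drops out. The third does not: it comes out as cos²(θ/2), which is the relative phase showing up in a measurement.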

Coins are toys. What does this look like for a real quantum system, say, the hydrogen atom whose orbitals fill most of this book? Recall that the eigenstates of hydrogen are the familiar labels (n, ℓ, m). The 1s state has (n, ℓ, m) = (1, 0, 0) and a spherical density. The 2pz state has (n, ℓ, m) = (2, 1, 0) and the famous dumbbell along the vertical axis. Each of these is an eigenstate of the Hamiltonian, meaning it is a state of definite energy. Schrödinger told you how to find them by solving his equation. The linearity rule of quantum mechanics now tells you that any complex combination

|ψ⟩ = c₁ |1s⟩ + c₂ |2pz⟩

is also a perfectly legal state of the hydrogen atom, even though it is not itself an energy eigenstate. The electron is partially in the ground state and partially in the first excited p state. If you measure the energy, you will get the 1s energy with probability |c₁|² and the 2p energy with probability |c₂|². But the probability density |ψ|² in space is not the average of the two density plots. Cross terms involving Re(c₁* c₂) shift density from one side of the nucleus to the other.

The first figure below makes this concrete. Add a 1s orbital, which is round, to a 2pz orbital, which has a positive lobe above the nucleus and a negative lobe below. With a relative phase of 0, the positive 2pz lobe constructively interferes with the 1s near +z, while the negative lobe destructively interferes near −z. The total density is lopsided: heavy on top, light on bottom. This is the same mechanism that shapes the sp hybrid orbital chemists draw when they explain why molecules like beryllium hydride are linear, although the chemists’ hybrid mixes 2s with 2pz rather than 1s. Flip the relative phase to π and the asymmetry inverts: now the density piles up below the nucleus. Nothing has changed about the amounts of 1s and 2pz; only the relative phase has rotated. The geometry of the cloud responds.

Two equal-amplitude superpositions of the 1s and 2pz orbitals, sketched as density plots. Both panels have identical |c₁|² and |c₂|², so a measurement of the energy is statistically the same in both. Only the relative phase φ between the two coefficients differs, and that difference is enough to swing the cloud from above the nucleus to below it. The phase is the lever; the geometry is what moves.
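The lopsidedness can be reproduced with the standard textbook hydrogen wavefunctions evaluated along the z axis. The sketch below assumes atomic units (lengths in Bohr radii), where ψ₁ₛ = e^{−r}/√π and ψ₂ₚz = z e^{−r/2}/(4√(2π)). It is an independent illustration, not the code behind the figure or the viewer, and the function names are hypothetical.

```python
import cmath
import math


def psi_1s(z: float) -> float:
    """Hydrogen 1s wavefunction on the z axis, atomic units."""
    return math.exp(-abs(z)) / math.sqrt(math.pi)


def psi_2pz(z: float) -> float:
    """Hydrogen 2pz wavefunction on the z axis, atomic units."""
    return z * math.exp(-abs(z) / 2) / (4 * math.sqrt(2 * math.pi))


def density_on_axis(z: float, phi: float) -> float:
    """|psi|^2 on the z axis for (|1s> + e^{i phi}|2pz>)/sqrt(2)."""
    psi = (psi_1s(z) + cmath.exp(1j * phi) * psi_2pz(z)) / math.sqrt(2)
    return abs(psi) ** 2


for phi in (0.0, math.pi):
    above = density_on_axis(+1.0, phi)  # one Bohr radius above the nucleus
    below = density_on_axis(-1.0, phi)  # one Bohr radius below
    print(f"phi={phi:.3f}  above={above:.4f}  below={below:.4f}")
```

With φ = 0 the density above the nucleus is the larger of the two; with φ = π the two values swap, while the weights |c₁|² = |c₂|² = 1/2 never change.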

There is a clean way to picture each coefficient on its own. A complex number c can be written in polar form as |c| e^{iφ}, where |c| is its magnitude and φ is its phase angle. Plot c as an arrow in the plane, starting at the origin. The length of the arrow is |c|, the angle from the real axis is φ, and the squared length |c|² is the Born probability for that branch of the superposition. As you twirl the arrow around the origin, keeping its length fixed, the probability does not change but the phase does, and that phase is what gets multiplied through the rest of the formalism. In the figure below, the unit circle marks coefficients of magnitude one; a coefficient with magnitude less than one sits inside, and the shaded radial bar shows its squared length, the probability share.

Each complex coefficient c in a superposition lives in this picture. Its length |c| sets the probability share |c|² that the corresponding eigenstate will be the measurement outcome. Its angle φ sets the phase the eigenstate carries into any interference term. Rotating the arrow around the origin leaves the probability unchanged but shifts every interference pattern the state participates in.
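Python's standard `cmath` module speaks this polar language directly: `cmath.polar` splits a complex number into magnitude and phase, and multiplying by e^{iα} rotates the arrow by α. A small sketch, with an arbitrary coefficient chosen for illustration:

```python
import cmath
import math

# An illustrative coefficient with magnitude 0.6 and phase pi/3.
c = 0.6 * cmath.exp(1j * math.pi / 3)

magnitude, phase = cmath.polar(c)  # (|c|, phi)
probability = magnitude ** 2       # Born probability share for this branch

# Rotate the arrow by half a radian: same length, new angle.
rotated = c * cmath.exp(1j * 0.5)
rot_magnitude, rot_phase = cmath.polar(rotated)

print(magnitude, phase, probability)
print(rot_magnitude, rot_phase)
```

One caveat worth knowing: `cmath.polar` reports the phase in the interval from −π to π, so an arrow rotated past π wraps around to a negative angle.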

So far we have considered just two basis states. Real systems have many. The hydrogen atom alone has an infinite tower of bound states labelled by (n, ℓ, m), plus a continuum of unbound scattering states above the ionisation threshold. Taken together, these eigenstates form a complete basis: every legitimate wavefunction can be expanded as a sum (or, for the continuum part, an integral) over them, with each term carrying its own complex coefficient. In Dirac’s shorthand,

|ψ⟩ = Σₙ cₙ |n⟩

where the index n runs over every eigenstate of the Hamiltonian and cₙ = ⟨n|ψ⟩ is the inner product of the eigenstate with the full wavefunction. The Born rule, applied to this sum, says that if you measure the energy of the system, you will get the eigenvalue Eₙ with probability |cₙ|². If you measure anything else (position, momentum, angular momentum, the projection of spin along an axis you cooked up this morning), you change the question, and you would expand |ψ⟩ in the eigenbasis of whatever operator that observable corresponds to. Eigenstates of different observables are generally different vectors in the same Hilbert space; expanding the same |ψ⟩ in different bases gives different sets of coefficients but the same physical state.
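To see that a change of basis changes the coefficients but not the state, go back to the coin. The sketch below expands one state in the heads/tails basis and then in a second orthonormal basis, (|heads⟩ ± |tails⟩)/√2. That second basis and all the names in the code are assumptions made for the example.

```python
import math


def inner(phi: list[complex], psi: list[complex]) -> complex:
    """<phi|psi> for component lists in the heads/tails basis."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


s = 1 / math.sqrt(2)
plus = [complex(s), complex(s)]    # (|heads> + |tails>)/sqrt(2)
minus = [complex(s), complex(-s)]  # (|heads> - |tails>)/sqrt(2)

psi = [0.6 + 0j, 0.8 + 0j]         # components in the heads/tails basis

# Coefficients in the new basis are inner products, c_n = <n|psi>.
c_plus = inner(plus, psi)
c_minus = inner(minus, psi)

probs_old = [abs(c) ** 2 for c in psi]
probs_new = [abs(c_plus) ** 2, abs(c_minus) ** 2]

# Rebuilding the vector from the new coefficients recovers the original.
rebuilt = [c_plus * plus[k] + c_minus * minus[k] for k in range(2)]

print(probs_old, probs_new, rebuilt)
```

The two probability lists differ, because they answer different measurement questions, but each sums to 1 and the rebuilt vector matches the original components.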

Derive: why interference comes from a single cross term

Suppose a state is the equal-amplitude superposition of two orthonormal basis kets:

|ψ⟩ = (1/√2)( |A⟩ + e^{iφ} |B⟩ )

The probability of finding the system at some position x is the squared modulus of the position-space wavefunction ψ(x) = ⟨x|ψ⟩. Writing ψ_A(x) = ⟨x|A⟩ and ψ_B(x) = ⟨x|B⟩, we get

ψ(x) = (1/√2)( ψ_A(x) + e^{iφ} ψ_B(x) )

so that

|ψ(x)|² = (1/2)( |ψ_A(x)|² + |ψ_B(x)|² + 2 Re[ e^{iφ} ψ_A*(x) ψ_B(x) ] )

The first two terms are exactly what you would get from a classical mixture: half a chance of being in the A density, half a chance of being in the B density, weighted equally. The third term, with the explicit factor 2 Re[…], is the interference. It depends on the relative phase φ and it can be positive or negative. Where it is positive, the density at x is larger than the classical mixture would predict; where it is negative, it is smaller. That sign-changing cross term, and nothing else, is the difference between a quantum superposition and a classical mixture. Set φ = 0 and you favour one geometry; set φ = π and you flip the sign of the cross term, producing the opposite asymmetry. The interference term is built into the algebra of complex-amplitude addition.

One immediate consequence: if you take the spatial integral ∫ |ψ(x)|² dx over all of space, the cross term integrates to zero whenever |A⟩ and |B⟩ are orthogonal. The total probability stays normalised at 1, no matter what φ is, even though the local density sloshes around. Phase shifts redistribute probability without creating or destroying it. This is what unitarity looks like up close.
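A numerical check makes the sloshing visible. The sketch below takes |A⟩ and |B⟩ to be the two lowest eigenstates of a particle in a one-dimensional box of unit width, √2 sin(nπx). They are chosen only because they are simple and orthonormal; the derivation above does not specify them. The integral is a plain midpoint rule.

```python
import cmath
import math


def box_state(n: int, x: float) -> float:
    """n-th eigenfunction of a particle in a box on [0, 1]."""
    return math.sqrt(2) * math.sin(n * math.pi * x)


def density(x: float, phi: float) -> float:
    """|psi(x)|^2 for (|A> + e^{i phi}|B>)/sqrt(2), with A, B = box states 1, 2."""
    psi = (box_state(1, x) + cmath.exp(1j * phi) * box_state(2, x)) / math.sqrt(2)
    return abs(psi) ** 2


def integrate(phi: float, a: float, b: float, steps: int = 2000) -> float:
    """Midpoint-rule integral of the density over [a, b]."""
    h = (b - a) / steps
    return h * sum(density(a + (k + 0.5) * h, phi) for k in range(steps))


for phi in (0.0, math.pi / 2, math.pi):
    total = integrate(phi, 0.0, 1.0)  # probability of being anywhere
    left = integrate(phi, 0.0, 0.5)   # probability of being in the left half
    print(f"phi={phi:.3f}  total={total:.6f}  left half={left:.6f}")
```

The total stays at 1 for every φ, while the share in the left half of the box swings from well above one half at φ = 0 to well below it at φ = π.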

Open the viewer's wave mode. Add 1s + 2pz at equal amplitude. Rotate φ on the 2pz and watch the lobe direction swing.

The last step, the one that turns this static algebra into a dynamic theory, is to ask how the coefficients change with time. Schrödinger’s equation gives the answer in a sentence: each energy eigenstate evolves by acquiring a phase factor e^{-iEₙ t/ℏ}, with Eₙ its own energy. If the wavefunction at t = 0 is the superposition

|ψ(0)⟩ = Σₙ cₙ |n⟩

then at time t it is

|ψ(t)⟩ = Σₙ cₙ e^{-iEₙ t/ℏ} |n⟩

Each coefficient stays the same in magnitude (the probabilities never drift) but its phase rotates around the origin at a rate set by the eigenstate’s energy, Eₙ/ℏ. A different energy means a different rotation rate. The whole formalism reduces to a stack of metronomes, each ticking at its own frequency, and the only reason the densities change shape over time is that the relative phases between different metronomes drift. The lopsided 1s + 2pz hybrid you saw in the first figure does not stay lopsided forever; the two coefficients rotate at different rates because the 2p state has a higher energy than the 1s state, so the relative phase advances steadily at the rate (E₂ − E₁)/ℏ. After a quarter of that cycle the phase difference has slid by π/2 and the density is momentarily symmetric about the nucleus; after half a cycle it has reached π and the lobe points the other way.
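The metronome picture translates almost word for word into code. The sketch below assumes atomic units (ℏ = 1, energies in hartree) and the textbook hydrogen levels Eₙ = −1/(2n²). The coefficients are keyed by the principal quantum number n alone, which is enough for the 1s + 2pz pair; all names are illustrative.

```python
import cmath
import math

HBAR = 1.0  # atomic units (assumption for this sketch)


def hydrogen_energy(n: int) -> float:
    """Bound-state energy of hydrogen in hartree, E_n = -1/(2 n^2)."""
    return -0.5 / n ** 2


def evolve(coeffs: dict[int, complex], t: float) -> dict[int, complex]:
    """Multiply each coefficient c_n by its phase factor exp(-i E_n t / hbar)."""
    return {n: c * cmath.exp(-1j * hydrogen_energy(n) * t / HBAR)
            for n, c in coeffs.items()}


def relative_phase(coeffs: dict[int, complex]) -> float:
    """Phase of the n = 2 coefficient relative to the n = 1 coefficient."""
    return cmath.phase(coeffs[2] / coeffs[1])


s = 1 / math.sqrt(2)
c0 = {1: complex(s), 2: complex(s)}  # equal-amplitude 1s + 2pz at t = 0

# Time for the relative phase to complete one full turn.
beat_period = 2 * math.pi * HBAR / (hydrogen_energy(2) - hydrogen_energy(1))

for fraction in (0.0, 0.25, 0.5):
    ct = evolve(c0, fraction * beat_period)
    print(f"t = {fraction:.2f} periods  |c1|={abs(ct[1]):.4f}  "
          f"|c2|={abs(ct[2]):.4f}  relative phase={relative_phase(ct):+.4f}")
```

The magnitudes never move. The relative phase has magnitude π/2 after a quarter period and π after half a period, which is the moment the lobe has flipped to the other side of the nucleus.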

Story: Dirac's notation almost didn't catch on

When Dirac introduced the bra-ket notation in a 1939 paper called “A new notation for quantum mechanics,” it was met with mild bemusement. John von Neumann’s measure-theoretic textbook had appeared in 1932 and was the canonical reference; physicists wrote inner products as integrals or as matrix elements, not as ⟨φ|ψ⟩. The notation was thought of as a curiosity, a Dirac-ism, like his refusal to use contractions or his insistence on writing every paper standing up. The textbook that finally popularised the bra-ket was Feynman’s Lectures on Physics, volume three, published in 1965. By the 1970s every graduate student had drunk the Kool-Aid, and the integral notation looked old-fashioned. Today an undergraduate physics major learns ⟨ψ|φ⟩ before they learn what a Hilbert space is.

That last observation, the one about phases rotating at energy-dependent rates, is the entire content of the next chapter. It is what we call time evolution, and it is the engine that drives every quantum process from spectral emission to chemical bonding to the gentle tick of a caesium clock. For now, the point is the simpler one: the state of a quantum system is a complex vector, and its components in any chosen basis are the amplitudes you square to get probabilities. The basis you pick is up to you; the physics does not change. The probabilities only see the lengths of the coefficients, but the phases between them control how the densities are arranged in space, how the wavefunction interferes with itself, and (as we will soon see) how the whole thing changes in time.

Hold this picture in your head as we move through the rest of the phase. Every wavefunction in this book is a sum over eigenstates. Every shape you see on the orbital viewer is the modulus squared of such a sum. Every animation of an electron moving, every bond-forming linear combination, every Bloch sphere rotation, is the same algebra applied in a slightly different setting. The superposition builder is not a single piece of the theory; it is the floor everything else stands on, the linearity rule from which interference, measurement statistics, and time evolution all flow. Dirac put the rule down in 1930 and nobody has had to revise it since.

Now that we know how to stack eigenstates, the next question is what happens when we let the clock run. Each metronome ticks at its own rate set by its eigenstate’s energy, and the relative phases between them quietly rewrite the geometry of the wavefunction frame by frame. That dance is the next chapter.
