Chapter 15 of 57

Pauli matrices

Three 2×2 matrices that generate every spin rotation

Wolfgang Pauli at a blackboard in Zurich, chalk in hand, writing 2×2 matrices. The Pauli matrices were the first piece of mathematics built to describe a spin-½ electron.
Image: Wikimedia Commons · CC BY 4.0 · Unknown author

In the spring of 1927, a 26-year-old Wolfgang Pauli sat down to do something Schrödinger had refused to attempt: write a wave equation that knew about spin. The answer was not a single scalar wavefunction but a pair of them, stacked, with three little 2×2 matrices steering the rotation between them. Those three matrices have since spread across physics like ink in water. They show up in NMR scanners, qubits, magnetic memory, and the very first line of every undergraduate quantum mechanics homework.

Pauli around the time he introduced the 2×2 spin matrices that now bear his name. He was 26 years old and already feared as the most caustic critic in European physics. Image: Wikimedia Commons · CC BY 4.0 · Unknown author

In the autumn of 1926, Erwin Schrödinger had a wave equation for the electron and was, in a quiet, principled way, refusing to put spin into it. Otto Stern and Walther Gerlach had already shown, four years earlier, that a beam of silver atoms passing through a magnetic field gradient split into exactly two spots, top and bottom. Samuel Goudsmit and George Uhlenbeck had floated the idea, in late 1925, that the electron carried its own internal angular momentum: half a unit of h-bar, no more, no less. The picture was uncomfortable. A point particle that “spins” has no surface to spin with. Worse, an ordinary vector returns to its original state after a rotation by 360 degrees, and Goudsmit and Uhlenbeck’s spin did not. To recover the same state you had to rotate it twice, by 720 degrees. The thing was not a vector. It was something stranger.

Onto this scene walked Wolfgang Pauli, the wunderkind of Vienna who had written a 237-page review of general relativity at age 21 that Einstein himself praised in print. By 1927 Pauli was 26, holding a lectureship in Hamburg, and known across European physics as its sharpest tongue. He had told colleagues their work was wrong, premature, sloppy, or trivial, and he had usually been right. He had also, two years earlier, introduced what we now call the exclusion principle, the rule that forbids two electrons from sharing a quantum state. The exclusion principle had required a fourth quantum number with two values, but Pauli had refused to interpret it as a physical rotation. He had described it only as a “classically non-describable two-valuedness,” and he regarded Goudsmit and Uhlenbeck’s spinning electron as a contamination of quantum theory by classical pictures. By 1927 he had quietly conceded that the data forced the issue. The electron really did carry an internal degree of freedom with two states. Pauli’s job, as he saw it, was now to write the smallest, cleanest piece of mathematics that could describe it.

His starting point was the observation that a spin-½ system has, by experiment, exactly two states. Stern-Gerlach said so. Atomic spectra said so. So whatever wavefunction described the electron’s spin had to live in a 2-dimensional complex vector space. Schrödinger’s ψ(x) was a single complex number at each point. Pauli’s ψ was instead a column of two complex numbers:

ψ = ( ψ↑ )
    ( ψ↓ )

Objects of this kind are now called “spinors,” a name Paul Ehrenfest coined a couple of years later for mathematics that Élie Cartan had worked out in 1913. The two entries are the amplitudes for finding the electron with spin up and spin down along some chosen axis. Their squared magnitudes sum to one: |ψ↑|² + |ψ↓|² = 1. This much was forced by the data. The harder question was how this two-component object should rotate.
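The normalization condition can be made concrete with a few lines of NumPy. This is only an illustrative sketch: the helper names `normalize` and `probabilities` and the sample amplitudes are assumptions chosen for the example, not anything from Pauli's paper.

```python
import numpy as np

def normalize(spinor):
    """Rescale a two-component spinor so |psi_up|^2 + |psi_down|^2 = 1."""
    spinor = np.asarray(spinor, dtype=complex)
    return spinor / np.linalg.norm(spinor)

def probabilities(spinor):
    """Return (P_up, P_down) along the chosen axis for a given spinor."""
    psi = normalize(spinor)
    return abs(psi[0]) ** 2, abs(psi[1]) ** 2

# Hypothetical unnormalized amplitudes: psi_up = 1 + i, psi_down = 2.
p_up, p_down = probabilities([1 + 1j, 2])
print(p_up, p_down, p_up + p_down)
```

Here |1 + i|² = 2 and |2|² = 4, so the two probabilities come out as one third and two thirds.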

Rotation is the heart of physics. The world looks the same whether you face north or east, and any sensible piece of physics has to respect that symmetry. For Schrödinger’s scalar ψ the answer was almost trivial: scalars do not change under rotation. For a classical vector like a velocity, the answer was old: rotate the three components into each other using a 3×3 orthogonal matrix. But Pauli’s spinor had two complex components, not three real ones. What is the 2×2 matrix that turns a spin-½ state by a chosen angle around a chosen axis?

In mathematical physics and mathematics, the Pauli matrices are a set of three 2×2 complex matrices that are traceless, Hermitian, involutory and unitary. They are usually denoted by the Greek letter σ (sigma), and occasionally by τ (tau) when used in connection with isospin symmetries.

σ_1 = σ_x = ( 0  1 )      σ_2 = σ_y = ( 0  -i )      σ_3 = σ_z = ( 1   0 )
            ( 1  0 )                  ( i   0 )                  ( 0  -1 )
…

From Wikipedia, “Pauli matrices,” https://en.wikipedia.org/wiki/Pauli_matrices (CC BY-SA 4.0)

The three matrices Pauli wrote down are stamped on every quantum mechanics textbook printed since. Here they are, with the same names he used:

σ_x = ( 0  1 )      σ_y = ( 0  -i )      σ_z = ( 1   0 )
      ( 1  0 )            ( i   0 )            ( 0  -1 )

Three Hermitian, traceless, unitary 2×2 matrices. Each squares to the identity. Each has eigenvalues +1 and -1, which is just the statement that a measurement of spin along any axis returns ±½ in units of h-bar (the factor of 2 lives in how we relate σ to the spin operator S = h-bar σ / 2). The off-diagonal entries of σ_x flip spin-up to spin-down and back. σ_y also swaps up and down, but its factors of i attach a phase on the way: spin-up becomes i times spin-down, and spin-down becomes -i times spin-up. The diagonal σ_z just measures: it reports +1 on a spin-up state, -1 on a spin-down state, and leaves them otherwise unchanged.

[Figure: the three Pauli matrices side by side. σ_x (σ_x² = I) flips ↑ ↔ ↓; σ_y (σ_y² = I) is a phase-twist flip; σ_z (σ_z² = I) reads ↑ as +1, ↓ as -1. Hermitian: σ† = σ (complex-conjugate the transpose, get back the same matrix). Eigenvalues ±1, trace 0, determinant -1.]
The three Pauli matrices, with their squaring rule beneath each. Hermiticity (the complex-conjugate of the transpose equals the original matrix) is the reason their eigenvalues are real, which is the reason a spin measurement always returns a real number. The same three matrices reappear in every qubit gate, every NMR pulse, every spin chain.
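The properties listed in the figure are easy to confirm numerically. The sketch below assumes NumPy; the helper `pauli_properties` is a name invented for this example.

```python
import numpy as np

# The three Pauli matrices as complex 2x2 arrays.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2, dtype=complex)

def pauli_properties(sigma):
    """Collect the properties the text lists for a single 2x2 matrix."""
    return {
        "hermitian": bool(np.allclose(sigma, sigma.conj().T)),
        "unitary": bool(np.allclose(sigma.conj().T @ sigma, identity)),
        "involutory": bool(np.allclose(sigma @ sigma, identity)),
        "trace": complex(np.trace(sigma)),
        "determinant": complex(np.linalg.det(sigma)),
        # eigvalsh is valid here because each matrix is Hermitian.
        "eigenvalues": np.linalg.eigvalsh(sigma).round(12).tolist(),
    }

for name, sigma in [("sigma_x", sigma_x), ("sigma_y", sigma_y), ("sigma_z", sigma_z)]:
    print(name, pauli_properties(sigma))
```

Each matrix should report Hermitian, unitary and involutory as true, a trace of zero, a determinant of -1, and eigenvalues -1 and +1.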

Three algebraic patterns lock the matrices into place. The first is that each one squares to the identity. Multiply σ_x by itself and you recover the 2×2 unit matrix. That single fact has a beautiful consequence: applying σ_x to any state, then applying σ_x again, returns the original state exactly. The flip is its own inverse. The second pattern is that no two of them commute with each other. Apply σ_x and then σ_y, and you do not get the same result as applying σ_y and then σ_x. Specifically, [σ_x, σ_y] = σ_x σ_y - σ_y σ_x = 2i σ_z, and the same holds under cyclic permutations of x, y, z. The commutator of two Pauli matrices is i times the third, doubled. This is the commutation relation of angular momentum, the same algebra that Heisenberg had been wrestling with in his matrix mechanics two years earlier. The third pattern is that the matrices anticommute pairwise: σ_x σ_y + σ_y σ_x = 0, and the same for any two distinct indices. They flip sign when you swap their order. Adding the commutator to the anticommutator and halving the sum gives the elegant product rule σ_j σ_k = δ_jk I + i ε_jkl σ_l (summed over l), which encodes the whole algebra in one line.
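All three patterns, and the product rule that combines them, can be checked by brute force over every index pair. This is a minimal sketch assuming NumPy; `levi_civita`, `product_rule_rhs` and `check_algebra` are hypothetical helper names, and indices 0, 1, 2 stand for x, y, z.

```python
import itertools
import numpy as np

identity = np.eye(2, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),     # sigma_x
    np.array([[0, -1j], [1j, 0]], dtype=complex),  # sigma_y
    np.array([[1, 0], [0, -1]], dtype=complex),    # sigma_z
]

def levi_civita(j, k, l):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (j - k) * (k - l) * (l - j) / 2

def product_rule_rhs(j, k):
    """Right-hand side delta_jk I + i eps_jkl sigma_l, summed over l."""
    rhs = (1.0 if j == k else 0.0) * identity
    for l in range(3):
        rhs = rhs + 1j * levi_civita(j, k, l) * sigma[l]
    return rhs

def check_algebra():
    """Verify commutators, anticommutators and the product rule for all pairs."""
    for j, k in itertools.product(range(3), repeat=2):
        commutator = sigma[j] @ sigma[k] - sigma[k] @ sigma[j]
        anticommutator = sigma[j] @ sigma[k] + sigma[k] @ sigma[j]
        expected_comm = sum(2 * 1j * levi_civita(j, k, l) * sigma[l] for l in range(3))
        expected_anti = 2 * (1.0 if j == k else 0.0) * identity
        assert np.allclose(commutator, expected_comm)
        assert np.allclose(anticommutator, expected_anti)
        assert np.allclose(sigma[j] @ sigma[k], product_rule_rhs(j, k))
    return True

print(check_algebra())
```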

Now for the magic trick that makes spinors weird. The rotation of a spin-½ state by an angle θ around an axis n is given by the matrix R(θ, n) = exp(-i θ n·σ / 2). The factor of 1/2 in the exponent is doing the real work. For a vector you would expect R(2π, n) = I; rotate any arrow by 360 degrees and you get the original arrow back. Plug θ = 2π into Pauli’s formula and you get something else. Using the identity that (n·σ)² = I, the exponential collapses to cos(θ/2) I - i sin(θ/2) (n·σ). At θ = 2π that is cos(π) I = -I. A full 360-degree turn maps the spinor to its negative. You must rotate twice, all the way to 720 degrees, before you arrive back at the original state. The wavefunction picks up a minus sign on the way around. This is not a calculational artifact. It is the deepest geometric fact about spinors, and it has been measured: neutron interferometry in the 1970s passed beams of neutrons through magnetic fields that rotated their spin by exactly 2π, and the resulting interference pattern shifted in precisely the way Pauli’s formula predicted. Spinors really are square roots of vectors. They live in a covering space that requires two trips around the circle to close.
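The sign flip at 360 degrees can be seen directly by building the rotation matrix from the closed form quoted above. The sketch assumes NumPy; the function name `rotation` and the sample axis are choices made for this example.

```python
import numpy as np

identity = np.eye(2, dtype=complex)
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def rotation(theta, axis):
    """R(theta, n) = cos(theta/2) I - i sin(theta/2) (n . sigma)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)                     # make sure n is a unit vector
    n_dot_sigma = np.einsum("i,ijk->jk", n, sigma)
    return np.cos(theta / 2) * identity - 1j * np.sin(theta / 2) * n_dot_sigma

axis = [1.0, 2.0, 2.0]                            # arbitrary axis, normalized inside
spin_up = np.array([1, 0], dtype=complex)

full_turn = rotation(2 * np.pi, axis)
double_turn = rotation(4 * np.pi, axis)

print(np.allclose(full_turn, -identity))          # 360 degrees gives -I
print(np.allclose(double_turn, identity))         # 720 degrees gives +I
print(np.allclose(full_turn @ spin_up, -spin_up)) # the state picks up a minus sign
```

The result does not depend on the axis chosen, because sin(π) = 0 removes the n·σ term entirely at θ = 2π.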

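[Figure: three Bloch-sphere panels, labelled “σ_x : flip about x” (poles |↑⟩, |↓⟩), “σ_z : flip about z” (equator points |+x⟩, |-x⟩) and “σ_y : flip about y” (points |+y⟩, |-y⟩).]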
Each Pauli matrix acts as a 180-degree rotation about its own axis on the Bloch sphere. σ_x sends spin-up to spin-down. σ_z sends a state at +x on the equator to the opposite point, -x. σ_y does the same job about the y-axis. Exponentiated, the three matrices together generate every possible rotation of a spin-½ state.
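One way to check these geometric statements is to compute the Bloch vector, the triple of expectation values (⟨σ_x⟩, ⟨σ_y⟩, ⟨σ_z⟩), before and after each matrix acts. This is a sketch assuming NumPy; `bloch_vector` is a helper name introduced here.

```python
import numpy as np

paulis = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def bloch_vector(state):
    """Expectation values (<sigma_x>, <sigma_y>, <sigma_z>) of a spin-1/2 state."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)
    # np.vdot conjugates its first argument, giving <state| sigma |state>.
    return np.array([np.vdot(state, paulis[a] @ state).real for a in "xyz"])

spin_up = np.array([1, 0])
plus_x = np.array([1, 1]) / np.sqrt(2)

print(bloch_vector(spin_up))                 # arrow along +z
print(bloch_vector(paulis["x"] @ spin_up))   # sigma_x turns +z into -z
print(bloch_vector(paulis["z"] @ plus_x))    # sigma_z turns +x into -x
print(bloch_vector(paulis["y"] @ plus_x))    # sigma_y also turns +x into -x
```

A global phase such as the factor of -i that σ_y attaches to the state drops out of the expectation values, so the Bloch arrow only records the physical direction.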
Derive the rotation formula exp(-i θ n·σ / 2)

A rotation in three dimensions by angle θ about a unit axis n is generated by the matrix exponential exp(-i θ n·J / h-bar), where J = (J_x, J_y, J_z) are the angular-momentum operators. For ordinary spatial vectors (spin 1) the J’s are 3×3. For spin-½, the generators are J = h-bar σ / 2. The factors of h-bar cancel, leaving R(θ, n) = exp(-i θ n·σ / 2), and the question becomes how to evaluate that matrix exponential.

Start from the algebraic miracle (n·σ)² = I, which follows because (n·σ)² = (n_x σ_x + n_y σ_y + n_z σ_z)² and all the cross terms cancel using anticommutation while each diagonal term gives n_j². Since n is a unit vector, the result is n·n times the identity, which is just I.
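Written out term by term, the cancellation described above looks like this (a worked expansion in LaTeX, using only the squaring and anticommutation rules already stated):

```latex
\begin{align}
(\mathbf{n}\cdot\boldsymbol{\sigma})^2
  &= \sum_{j,k} n_j n_k \,\sigma_j \sigma_k \\
  &= \sum_{j} n_j^2 \,\sigma_j^2
     + \sum_{j<k} n_j n_k \,(\sigma_j \sigma_k + \sigma_k \sigma_j) \\
  &= (n_x^2 + n_y^2 + n_z^2)\, I + 0 \\
  &= I .
\end{align}
```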

Now expand the exponential as a Taylor series:

exp(-i θ n·σ / 2) = Σ_k (-i θ / 2)^k (n·σ)^k / k!

Split into even and odd terms. Because (n·σ)² = I, every even power of n·σ is just I and every odd power is just n·σ; the alternating signs come from the powers of -i. Group them:

= [Σ_k (-1)^k (θ/2)^(2k) / (2k)!] I  -  i [Σ_k (-1)^k (θ/2)^(2k+1) / (2k+1)!] (n·σ)
= cos(θ/2) I  -  i sin(θ/2) (n·σ)

That is Pauli’s rotation formula. Now check the famous 2π puzzle. At θ = 2π we have cos(π) = -1 and sin(π) = 0, giving R(2π, n) = -I. A 360-degree rotation maps every spin-½ state ψ to -ψ. At θ = 4π we have cos(2π) = +1, sin(2π) = 0, giving R(4π, n) = +I. Spinors really do need two trips around the circle to come back to themselves. This is the fact that distinguishes the group SU(2), in which Pauli’s rotations live, from the group SO(3) of ordinary rotations: SU(2) is a double cover of SO(3). Every spatial rotation corresponds to two distinct SU(2) elements, related by a sign.

The Pauli matrices are not just a clever rewriting of spin. They are the smallest example of a much wider algebraic pattern called a Clifford algebra, the same structure Paul Dirac was about to generalize when he wrote down his relativistic electron equation in 1928. Dirac needed four 4×4 matrices, the gamma matrices, that anticommuted pairwise the way Pauli’s anticommuted. They can be built as tensor products of Pauli matrices, which is how one of the most celebrated equations in physics, the Dirac equation, hides three little 2×2 blocks inside its machinery. Werner Heisenberg, watching Pauli and Dirac from a desk in Leipzig, took the same algebra and applied it to isospin, the symmetry that makes protons and neutrons interchangeable as far as the strong force can tell. Murray Gell-Mann would later generalize the idea to eight 3×3 matrices for the eightfold way. Every modern qubit gate set includes the bit-flip X, the phase-flip Z, and their combination Y, which are literally the Pauli matrices under other names. NMR and MRI machines drive transitions between the two σ_z eigenstates of a nuclear spin in a magnetic field. Magnetic-memory bits store, in effect, one classical up-or-down outcome per cell. The same three matrices, written down by a 26-year-old in two months in Hamburg, run through almost every technology that uses spin.
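The claim that gamma matrices hide Pauli blocks can be made concrete with Kronecker products. The sketch below assumes NumPy and uses one common textbook convention, the Dirac representation with metric signature (+, -, -, -). It is not meant as a reconstruction of Dirac's own 1928 construction, and `satisfies_clifford` is a name invented for the example.

```python
import numpy as np

identity2 = np.eye(2, dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac representation (an assumed convention):
#   gamma^0 = sigma_z (x) I,   gamma^k = (i sigma_y) (x) sigma_k
gamma = [np.kron(sigma[2], identity2)]
gamma += [np.kron(1j * sigma[1], s) for s in sigma]

metric = np.diag([1.0, -1.0, -1.0, -1.0])   # signature (+, -, -, -)

def satisfies_clifford(gammas, eta):
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} I for all index pairs."""
    identity4 = np.eye(4, dtype=complex)
    for mu in range(4):
        for nu in range(4):
            anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
            if not np.allclose(anti, 2 * eta[mu, nu] * identity4):
                return False
    return True

print(satisfies_clifford(gamma, metric))
```

In this convention each spatial gamma matrix has σ_k in its upper-right block and -σ_k in its lower-left block, so the Pauli anticommutation rule is what makes the 4×4 rule work.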

There is one final aesthetic point worth pressing. The Pauli matrices are not merely a bookkeeping device for a two-state system. Together with the identity, they are a complete basis for everything that can act on a single qubit. Any 2×2 Hermitian matrix M can be written as a real linear combination M = a I + b σ_x + c σ_y + d σ_z, with the four coefficients (a, b, c, d) read off by tracing M against the identity and each Pauli matrix in turn: a = ½ Tr(M), b = ½ Tr(M σ_x), and so on. The same expansion works for any density matrix, any observable, any Hamiltonian on a single qubit. The four basis elements (the identity plus the three Paulis) span the entire vector space of 2×2 Hermitian operators, and they are mutually orthogonal under the trace inner product ⟨A, B⟩ = Tr(A†B). That is why every textbook starts here. Once you can decompose any single-qubit operator into a Pauli expansion, you can expand any two-qubit operator in tensor products of Pauli matrices, and the same pattern extends to any number of qubits.
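The trace recipe translates almost word for word into code. This is a sketch assuming NumPy; `pauli_coefficients`, `reconstruct` and the sample matrix are illustrative choices, not part of any library.

```python
import numpy as np

basis = {
    "I": np.eye(2, dtype=complex),
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_coefficients(m):
    """Coefficients of M = a I + b sigma_x + c sigma_y + d sigma_z via (1/2) Tr(M P)."""
    m = np.asarray(m, dtype=complex)
    return {name: 0.5 * np.trace(m @ p) for name, p in basis.items()}

def reconstruct(coeffs):
    """Rebuild the matrix from its Pauli coefficients."""
    return sum(coeffs[name] * basis[name] for name in basis)

# A hypothetical Hermitian matrix chosen for the example.
hermitian = np.array([[2.0, 1 - 3j], [1 + 3j, -4.0]])
coeffs = pauli_coefficients(hermitian)

print({name: c.real for name, c in coeffs.items()})
print(np.allclose(reconstruct(coeffs), hermitian))
```

For this matrix the coefficients work out to a = -1, b = 1, c = 3, d = 3, all real, as they must be for a Hermitian input.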

Pauli won the 1945 Nobel Prize, not for the matrices, but for the exclusion principle he had introduced two decades earlier. Einstein had nominated him. By then Pauli was in exile at the Institute for Advanced Study in Princeton. The German annexation of Austria in 1938 had made him a German citizen, and he left Zurich for the United States in 1940. He returned to ETH in 1946 and continued working until his death from pancreatic cancer in 1958, in Zurich’s Rotkreuz hospital, in room 137, a number he had spent half his life trying to derive from first principles (the fine-structure constant has the value α ≈ 1/137, and Pauli was convinced that one day a theory would explain why). He is also responsible for the spin-statistics theorem (the deep result, derived in 1940, that particles with half-integer spin must be fermions and those with integer spin must be bosons), for predicting the neutrino in 1930 in a letter that began “Dear radioactive ladies and gentlemen,” and for the running joke that any experiment near him would mysteriously fail, the so-called “Pauli effect.” The matrices, however, remained his quiet, perfect contribution: three Hermitian 2×2 grids that taught physics how to rotate things that are not arrows.

Pauli gave us three matrices. The next chapter gives us a place to draw them. Felix Bloch, working at Stanford in the 1940s on nuclear magnetic resonance, the physics that would later underpin magnetic resonance imaging, found that every spin-½ state could be plotted as a point on the unit sphere, with Pauli’s σ_x, σ_y, σ_z marking the three orthogonal axes. The sphere has been the standard picture in quantum-information textbooks ever since.
