Topological qubits: Arriving in 2018?

Editor’s note: This post was prepared jointly by Ryan Mishmash and Jason Alicea.

Physicists appear to be on the verge of demonstrating proof-of-principle “usefulness” of small quantum computers.  Preskill’s notion of quantum supremacy spotlights a particularly enticing goal: use a quantum device to perform some computation—any computation in fact—that falls beyond the reach of the world’s best classical computers.  Efforts along these lines are being vigorously pursued along many fronts, from academia to large corporations to startups.  IBM’s publicly accessible 16-qubit superconducting device, Google’s pursuit of a 7×7 superconducting qubit array, and the recent synthesis of a 51-qubit quantum simulator using rubidium atoms are a few of many notable highlights.  While the number of qubits obtainable within such “conventional” approaches has steadily risen, synthesizing the first “topological qubit” remains an outstanding goal.  That ceiling may soon crumble however—vaulting topological qubits into a fascinating new chapter in the quest for scalable quantum hardware.

Why topological quantum computing?

As quantum computing progresses from minimalist quantum supremacy demonstrations to attacking real-world problems, hardware demands will naturally steepen.  In, say, a superconducting-qubit architecture, a major source of overhead arises from quantum error correction needed to combat decoherence.  Quantum-error-correction schemes such as the popular surface-code approach encode a single fault-tolerant logical qubit in many physical qubits, perhaps thousands.  The number of physical qubits required for practical applications can thus rapidly balloon.

The dream of topological quantum computing (introduced by Kitaev) is to construct hardware inherently immune to decoherence, thereby mitigating the need for active error correction.  In essence, one seeks physical qubits that by themselves function as good logical qubits.  This lofty objective requires stabilizing exotic phases of matter that harbor emergent particles known as “non-Abelian anyons”.  Crucially, nucleating non-Abelian anyons generates an exponentially large set of ground states that cannot be distinguished from each other by any local measurement.  Topological qubits encode information in those ground states, yielding two key virtues:

(1) Insensitivity to local noise.  For reference, consider a conventional qubit encoded in some two-level system, with the 0 and 1 states split by an energy \hbar \omega.  Local noise sources—e.g., random electric and magnetic fields—cause that splitting to fluctuate stochastically in time, dephasing the qubit.  In practice one can engender immunity against certain environmental perturbations.  One famous example is the transmon qubit (see “Charge-insensitive qubit design derived from the Cooper pair box” by Koch et al.) used extensively at IBM, Google, and elsewhere.  The transmon is a superconducting qubit that cleverly suppresses the effects of charge noise by operating in a regime where Josephson couplings are sizable compared to charging energies.  Transmons remain susceptible, however, to other sources of randomness such as flux noise and critical-current noise.  By contrast, topological qubits embed quantum information in global properties of the system, building in immunity against all local noise sources.  Topological qubits thus realize “perfect” quantum memory.

(2) Perfect gates via braiding.  By exploiting the remarkable phenomenon of non-Abelian statistics, topological qubits further enjoy “perfect” quantum gates: Moving non-Abelian anyons around one another reshuffles the system among the ground states—thereby processing the qubits—in exquisitely precise ways that depend only on coarse properties of the exchange.

Disclaimer: Adjectives like “perfect” should come with the qualifier “up to exponentially small corrections”, a point that we revisit below.
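The dephasing mechanism described in item (1) can be illustrated with a toy calculation. The sketch below is not taken from the works cited here; it assumes quasi-static Gaussian noise (each experimental run sees a constant random offset in \omega with standard deviation sigma), and the function names are hypothetical. Under that assumption the averaged qubit coherence decays as exp(-(sigma t)^2 / 2), so the dephasing time scales as 1/sigma.

```python
import cmath
import math
import random


def ramsey_coherence(t, sigma, samples=20000, seed=0):
    """Estimate |<exp(-i * delta * t)>| over random frequency offsets delta.

    Assumption: quasi-static noise, so each run has a constant offset delta
    drawn from a normal distribution with standard deviation sigma.
    """
    rng = random.Random(seed)
    total = 0j
    for _ in range(samples):
        delta = rng.gauss(0.0, sigma)
        total += cmath.exp(-1j * delta * t)
    return abs(total / samples)


def analytic_coherence(t, sigma):
    """Closed-form average for the same Gaussian model."""
    return math.exp(-0.5 * (sigma * t) ** 2)


if __name__ == "__main__":
    # Smaller fluctuations of the splitting give slower loss of coherence.
    for sigma in (1.0, 0.1, 0.01):
        print(sigma, ramsey_coherence(5.0, sigma), analytic_coherence(5.0, sigma))
```

In this toy model, making the splitting insensitive to noise (smaller sigma) is the only way to lengthen the coherence time, which is the sense in which exponentially suppressed sensitivity translates into exponentially long-lived memory.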

Experimental status

The catch is that systems supporting non-Abelian anyons are not easily found in nature.  One promising topological-qubit implementation exploits exotic 1D superconductors whose ends host “Majorana modes”—novel zero-energy degrees of freedom that underlie non-Abelian-anyon physics.  In 2010, two groups (Lutchyn et al. and Oreg et al.) proposed a laboratory realization that combines semiconducting nanowires, conventional superconductors, and modest magnetic fields.

Since then, the materials-science progress on nanowire-superconductor hybrids has been remarkable.  Researchers can now grow extremely clean, versatile devices featuring various manipulation and readout bells and whistles.  These fabrication advances paved the way for experiments that have reported increasingly detailed Majorana characteristics: tunneling signatures including recent reports of long-sought quantized response, evolution of Majorana modes with system size, mapping out of the phase diagram as a function of external parameters, etc.  Alternative explanations are still being debated, though.  Perhaps the most likely culprits are conventional localized fermionic levels (“Andreev bound states”) that can imitate Majorana signatures under certain conditions; see in particular Liu et al.  Still, the collective experimental effort on this problem over the last 5+ years has provided mounting evidence for the existence of Majorana modes.  Revealing their prized quantum-information properties poses a logical next step.

Validating a topological qubit

Ideally one would like to verify both hallmarks of topological qubits noted above—“perfect” insensitivity to local noise and “perfect” gates via braiding.  We will focus on the former property, which can be probed in simpler device architectures.  Intuitively, noise insensitivity should imply long qubit coherence times.  But how do you pinpoint the topological origin of long coherence times, and in any case what exactly qualifies as “long”?

Here is one way to sharply address these questions (for more details, see our work in Aasen et al.).  As alluded to in our disclaimer above, logical 0 and 1 topological-qubit states aren’t exactly degenerate.  In nanowire devices they’re split by an energy \hbar \omega that is exponentially small in the separation distance L between Majorana modes divided by the superconducting coherence length \xi.  Correspondingly, the qubit states are not quite locally indistinguishable either, and hence not perfectly immune to local noise.  Now imagine pulling apart Majorana modes to go from a relatively poor to a perfect topological qubit.  During this process two things transpire in tandem: The topological qubit’s oscillation frequency, \omega, vanishes exponentially while the dephasing time T_2 becomes exponentially long.  That is,
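(The equation image is not reproduced here. A schematic form consistent with the sentence above would be the following, where the proportionality constants and the positive exponent c are assumptions rather than values taken from the source.)

```latex
\omega \propto e^{-L/\xi} \to 0 ,
\qquad
T_2 \propto e^{c L/\xi} \to \infty
\qquad
\text{as } L/\xi \to \infty , \quad c > 0 .
```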


This scaling relation could in fact be used as a practical definition of a topologically protected quantum memory.  Importantly, mimicking this property in any non-topological qubit would require some form of divine intervention.  For example, even if one fine-tuned conventional 0 and 1 qubit states (e.g., resulting from the Andreev bound states mentioned above) to be exactly degenerate, local noise could still readily produce dephasing.

As discussed in Aasen et al., this topological-qubit scaling relation can be tested experimentally via Ramsey-like protocols in a setup that might look something like the following:


This device contains two adjacent Majorana wires (orange rectangles) with couplings controlled by local gates (“valves” represented by black switches).  Incidentally, the design was inspired by a gate-controlled variation of the transmon pioneered in Larsen et al. and de Lange et al.  In fact, if only charge noise were present, we wouldn’t stand to gain much in the way of coherence times: both the transmon and topological qubit would yield exponentially long T_2 times.  But once again, other noise sources can efficiently dephase the transmon, whereas a topological qubit enjoys exponential protection from all sources of local noise.  Mathematically, this distinction occurs because the splitting for transmon qubit states is exponentially flat only with respect to variations in a “gate offset” n_g.  For the topological qubit, the splitting is exponentially flat with respect to variations in all external parameters (e.g., magnetic field, chemical potential, etc.), so long as Majorana modes still survive.  (By “exponentially flat” we mean constant up to exponentially small deviations.)  Plotting the energies of the qubit states in the two respective cases versus external parameters, the situation can be summarized as follows:


Outlook: Toward “topological quantum ascendancy”

These qubit-validation experiments constitute a small stepping stone toward building a universal topological quantum computer.  Explicitly demonstrating exponentially protected quantum information as discussed above would, nevertheless, go a long way toward establishing practical utility of Majorana-based topological qubits.  One might even view this goal as single-qubit-level “topological quantum ascendancy”.  Completion of this milestone would further set the stage for implementing “perfect” quantum gates, which requires similar capabilities albeit in more complex devices.  Researchers at Microsoft and elsewhere have their sights set on bringing a prototype topological qubit to life in the very near future.  It is not unreasonable to anticipate that 2018 will mark the debut of the topological qubit.  We could of course be off target.  There is, after all, still plenty of time in 2017 to prove us wrong.

Taming wave functions with neural networks

Note from Nicole Yunger Halpern: One sunny Saturday this spring, I heard Sam Greydanus present about his undergraduate thesis. Sam was about to graduate from Dartmouth with a major in physics. He had worked with quantum-computation theorist Professor James Whitfield. The presentation — about applying neural networks to quantum computation — so intrigued me that I asked him to share his research on Quantum Frontiers. Sam generously agreed; this is his story.

Wave functions in the wild


The wave function, \psi , is a mixed blessing. At first, it causes unsuspecting undergrads (me) some angst via the Schrödinger’s cat paradox. This angst morphs into full-fledged panic when they encounter concepts such as nonlocality and Bell’s theorem (which, by the way, is surprisingly hard to verify experimentally). The real trouble with \psi , though, is that the amount of information needed to describe it grows exponentially with the number of entangled particles in a system. We couldn’t even hope to write the wave function of 100 entangled particles, much less perform computations on it…but there’s a lot to gain from doing just that.

The thing is, we (a couple of luckless physicists) love \psi . Manipulating wave functions can give us ultra-precise timekeeping, secure encryption, and polynomial-time factoring of integers (read: break RSA). Harnessing quantum effects can also produce better machine learning, better physics simulations, and even quantum teleportation.

Taming the beast

Though \psi grows exponentially with the number of particles in a system, most physical wave functions can be described with a lot less information. Two algorithms for doing this are the Density Matrix Renormalization Group (DMRG) and Quantum Monte Carlo (QMC).


Density Matrix Renormalization Group (DMRG). Imagine we want to learn about trees, but studying a full-grown, 50-foot tall tree in the lab is too unwieldy. One idea is to keep the tree small, like a bonsai tree. DMRG is an algorithm which, like a bonsai gardener, prunes the wave function while preserving its most important components. It produces a compressed version of the wave function called a Matrix Product State (MPS). One issue with DMRG is that it doesn’t extend particularly well to 2D and 3D systems.


Quantum Monte Carlo (QMC). Another way to study the concept of “tree” in a lab (bear with me on this metaphor) would be to study a bunch of leaf, seed, and bark samples. Quantum Monte Carlo algorithms do this with wave functions, taking “samples” of a wave function (pure states) and using the properties and frequencies of these samples to build a picture of the wave function as a whole. The difficulty with QMC is that it treats the wave function as a black box. We might ask, “how does flipping the spin of the third electron affect the total energy?” and QMC wouldn’t have much of a physical answer.

Brains \gg Brawn

Neural Quantum States (NQS). Some state spaces are far too large for even Monte Carlo to sample adequately. Suppose now we’re studying a forest full of different species of trees. If one type of tree vastly outnumbers the others, choosing samples from random trees isn’t an efficient way to map biodiversity. Somehow, we need to make the sampling process “smarter”. Last year, Google DeepMind used a technique called deep reinforcement learning to do just that – and achieved fame for defeating the world champion human Go player. A recent Science paper by Carleo and Troyer (2017) used the same technique to make QMC “smarter” and effectively compress wave functions with neural networks. This approach, called “Neural Quantum States (NQS)”, produced several state-of-the-art results.


The general idea of my thesis.

My thesis. My undergraduate thesis centered upon much the same idea. In fact, I had to abandon some of my initial work after reading the NQS paper. I then focused on using machine learning techniques to obtain MPS coefficients. Like Carleo and Troyer, I used neural networks to approximate  \psi . Unlike Carleo and Troyer, I trained my model to output a set of Matrix Product State coefficients which have physical meaning (MPS coefficients always correspond to a certain state and site, e.g. “spin up, electron number 3”).

Cool – but does it work?

Yes – for small systems. In my thesis, I considered a toy system of 4 spin-\frac{1}{2} particles interacting via the Heisenberg Hamiltonian. Solving this system is not difficult so I was able to focus on fitting the two disparate parts – machine learning and Matrix Product States – together.
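To make "solving this system is not difficult" concrete, here is a minimal sketch that finds the ground-state energy of a 4-site spin-1/2 Heisenberg chain by brute force. It is not the thesis code: the open boundary conditions, the coupling J = 1, the shifted power iteration, and all function names are assumptions chosen for illustration.

```python
import math

N = 4  # number of spin-1/2 sites (open chain assumed)


def apply_h(vec):
    """Apply H = sum_i S_i . S_{i+1} (J = 1, open chain) to a state vector.

    Basis states are integers whose bits encode spin up (1) or down (0).
    """
    out = [0.0] * len(vec)
    for state, amp in enumerate(vec):
        if amp == 0.0:
            continue
        for i in range(N - 1):
            bit_i = (state >> i) & 1
            bit_j = (state >> (i + 1)) & 1
            if bit_i == bit_j:
                out[state] += 0.25 * amp   # aligned neighbours: Sz Sz = +1/4
            else:
                out[state] -= 0.25 * amp   # anti-aligned: Sz Sz = -1/4
                flipped = state ^ ((1 << i) | (1 << (i + 1)))
                out[flipped] += 0.5 * amp  # spin-exchange term
    return out


def ground_energy(iterations=400, shift=1.0):
    """Power iteration on (shift - H), whose largest eigenvalue is shift - E0."""
    vec = [0.0] * (2 ** N)
    vec[0b0101] = 1.0  # Neel state, which overlaps with the ground state
    for _ in range(iterations):
        h_vec = apply_h(vec)
        vec = [shift * v - h for v, h in zip(vec, h_vec)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return sum(v * h for v, h in zip(vec, apply_h(vec)))


if __name__ == "__main__":
    print(ground_energy())
```

For this open chain the result can be checked against the closed form -(3 + 2\sqrt{3})/4, obtained by diagonalizing H in the two-dimensional total-spin-zero subspace. The state vector has only 2^4 = 16 entries here, which is why the toy problem is easy and why the same brute-force approach fails for large systems.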

Success! My model solved for ground states with arbitrary precision. Even more interestingly, I used it to automatically obtain MPS coefficients. Shown below, for example, is a visualization of my model’s coefficients for the GHZ state, compared with coefficients taken from the literature.


A visual comparison of a 4-site Matrix Product State for the GHZ state a) listed in the literature b) obtained from my neural network model. Colored squares correspond to real-valued elements of 2×2 matrices.
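For readers who have not met Matrix Product States before, the sketch below writes down one common textbook MPS for the 4-site GHZ state and contracts it back into amplitudes. The particular 2×2 matrices, the trace (periodic) form, and the function names are assumptions for illustration; they need not coincide with the matrices shown in the figure or produced by the neural network.

```python
import itertools
import math

# One 2x2 matrix per local state; the same pair is used on every site.
A = {
    0: [[1.0, 0.0], [0.0, 0.0]],  # matrix attached to "spin down" (0)
    1: [[0.0, 0.0], [0.0, 1.0]],  # matrix attached to "spin up" (1)
}


def matmul(a, b):
    """Multiply two small matrices stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def amplitude(bits):
    """Amplitude of a basis state: trace of the product of site matrices."""
    product = [[1.0, 0.0], [0.0, 1.0]]
    for bit in bits:
        product = matmul(product, A[bit])
    return (product[0][0] + product[1][1]) / math.sqrt(2)


def wave_function(sites=4):
    """Expand the MPS into the full list of 2**sites amplitudes."""
    return {bits: amplitude(bits) for bits in itertools.product((0, 1), repeat=sites)}


if __name__ == "__main__":
    for bits, amp in wave_function().items():
        if abs(amp) > 1e-12:
            print(bits, amp)
```

Each matrix is labeled by a site and a local state, which is the sense in which MPS coefficients carry physical meaning. The description needs only a handful of numbers per site, while the expanded wave function has 2^4 entries.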

Limitations. The careful reader might point out that, according to the schema of my model (above), I still have to write out the full wave function. To scale my model up, I instead trained it variationally over a subspace of the Hamiltonian (just as the authors of the NQS paper did). Results are decent for larger (10-20 particle) systems, but the training itself is still unstable. I’ll finish ironing out the details soon, so keep an eye on arXiv* :).

Outside the ivory tower


A quantum computer developed by Joint Quantum Institute, U. Maryland.

Quantum computing is a field that’s poised to take on commercial relevance. Taming the wave function is one of the big hurdles we need to clear before this happens. Hopefully my findings will have a small role to play in making this happen.

On a more personal note, thank you for reading about my work. As a recent undergrad, I’m still new to research and I’d love to hear constructive comments or criticisms. If you found this post interesting, check out my research blog.

*arXiv is an online library for electronic preprints of scientific papers

The sign problem(s)

The thirteen-month-old had mastered the word “dada” by the time I met her. Her parents were teaching her to communicate other concepts through sign language. Picture her, dark-haired and bibbed, in a high chair. Banana and mango slices litter the tray in front of her. More fruit litters the floor in front of the tray. The baby lifts her arms and flaps her hands.

Dada looks up from scrubbing the floor.

“Look,” he calls to Mummy, “she’s using sign language! All done.” He performs the gesture that his daughter seems to have aped: He raises his hands and rotates his forearms about his ulnas, axes perpendicular to the floor. “All done!”

The baby looks down, seizes another morsel, and stuffs it into her mouth.

“Never mind,” Dada amends. “You’re not done, are you?”

His daughter had a sign(-language) problem.


So does Dada, MIT professor Aram Harrow. Aram studies quantum information theory. His interests range from complexity to matrices, from resource theories to entropies. He’s blogged for The Quantum Pontiff, and he studies (including with IQIM postdoc Elizabeth Crosson) the quantum sign problem.

Imagine calculating properties of a chunk of fermionic quantum matter. The chunk consists of sites, each inhabited by one particle or by none. Translate as “no site can house more than one particle” the jargon “the particles are fermions.”

The chunk can have certain amounts of energy. Each amount E_j corresponds to some particle configuration indexed by j: If the system has some amount E_1 of energy, particles occupy certain sites and might not occupy others. If the system has a different amount E_2 \neq E_1 of energy, particles occupy different sites. A Hamiltonian, a mathematical object denoted by H, encodes the energies E_j and the configurations. We represent H with a matrix, a square grid of numbers.

Suppose that the chunk has a temperature T = \frac{ 1 }{ k_{\rm B} \beta }. We could calculate the system’s heat capacity, the energy required to raise the chunk’s temperature by one Kelvin. We could calculate the free energy, how much work the chunk could perform in powering a motor or lifting a weight. To calculate those properties, we calculate the system’s partition function, Z.

How? We would list the configurations j. With each configuration, we would associate the weight e^{ - \beta E_j }. We would sum the weights: Z = e^{ - \beta E_1 }  +  e^{ - \beta E_2}  +  \ldots  =  \sum_j e^{ - \beta E_j}.
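For a system small enough to list every configuration, the sum above is a few lines of code. The sketch below is illustrative only: it works in units where k_B = 1, uses the standard relations F = -(1/\beta) ln Z and C = \beta^2 (<E^2> - <E>^2), and the function names are hypothetical.

```python
import math


def partition_function(energies, beta):
    """Z = sum over configurations j of exp(-beta * E_j)."""
    return sum(math.exp(-beta * e) for e in energies)


def free_energy(energies, beta):
    """F = -(1 / beta) * ln Z, in units where k_B = 1."""
    return -math.log(partition_function(energies, beta)) / beta


def heat_capacity(energies, beta):
    """C = beta^2 * (<E^2> - <E>^2), in units where k_B = 1."""
    z = partition_function(energies, beta)
    weights = [math.exp(-beta * e) / z for e in energies]
    mean_e = sum(w * e for w, e in zip(weights, energies))
    mean_e2 = sum(w * e * e for w, e in zip(weights, energies))
    return beta ** 2 * (mean_e2 - mean_e ** 2)


if __name__ == "__main__":
    levels = [0.0, 1.0, 1.0, 2.0]  # made-up energies for four configurations
    print(partition_function(levels, beta=1.0))
    print(free_energy(levels, beta=1.0))
    print(heat_capacity(levels, beta=1.0))
```

The list of energies is the bottleneck: its length grows exponentially with the size of the chunk, so this direct approach stops being feasible long before the system becomes large.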

Easier—like feeding a 13-month-old—said than done. Let N denote the number of qubits in the chunk. If N is large, the number of configurations is gigantic. Our computers can’t process so many configurations. This inability underlies quantum computing’s promise of speeding up certain calculations.

We don’t have quantum computers, and we can’t calculate Z. Can we approximate Z?

Yes, if H “lacks the sign problem.” The math that models our system models also a classical system. If our system has D dimensions, the classical system has D+1 dimensions. Suppose, for example, that our sites form a line. The classical system forms a square.

We replace the weights e^{ - \beta E_j } with different weights—numbers formed from a matrix that represents H. If H lacks the sign problem, the new weights are nonnegative and behave like probabilities. Many mathematical tools suit probabilities. Aram and Elizabeth apply such tools to Z, here and here, as do many other researchers.

We call Hamiltonians that lack the sign problem “stoquastic,” which I think fanquastic.1 Stay tuned for a blog post about stoquasticity by Elizabeth.

What if H has the sign problem? The new weights can assume negative and nonreal values. The weights behave unlike probabilities; we can’t apply those tools. We find ourselves knee-deep in banana and mango chunks.
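The criterion in the footnote for experts (off-diagonal entries real and nonpositive in the computational basis) is easy to test on a small matrix. The sketch below is a simplification and an assumption on my part: it checks one matrix as a whole rather than each local term of a local Hamiltonian, and the function name is hypothetical.

```python
def is_stoquastic(matrix, tol=1e-12):
    """Return True if every off-diagonal entry is real and nonpositive."""
    size = len(matrix)
    for row in range(size):
        for col in range(size):
            if row == col:
                continue  # diagonal entries are unconstrained
            entry = complex(matrix[row][col])
            if abs(entry.imag) > tol or entry.real > tol:
                return False
    return True


if __name__ == "__main__":
    minus_x = [[0, -1], [-1, 0]]    # off-diagonal entries are negative
    plus_x = [[0, 1], [1, 0]]       # off-diagonal entries are positive
    pauli_y = [[0, -1j], [1j, 0]]   # off-diagonal entries are imaginary
    print(is_stoquastic(minus_x), is_stoquastic(plus_x), is_stoquastic(pauli_y))
```

The property depends on the basis: a matrix that fails the check in one basis may pass it in another, which is why the definition names the computational basis explicitly.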

Mango chunks

Solutions to the sign problem remain elusive. Theorists keep trying to mitigate the problem, though. Aram, Elizabeth, and others are improving calculations of properties of sign-problem-free systems. One scientist-in-the-making has achieved a breakthrough: Aram’s daughter now rotates her hands upon finishing meals and when she wants to leave her car seat or stroller.

One sign problem down; one to go.


With gratitude to Aram’s family for its hospitality and to Elizabeth Crosson for sharing her expertise.

1 For experts: A local Hamiltonian is stoquastic relative to the computational basis if each local term is represented, relative to the computational basis, by a matrix whose off-diagonal entries are real and nonpositive.

Entropy Avengers

As you already know if you read my rare (but highly refined!) blog samples, I have spent a big chunk of my professorial career teaching statistical mechanics. And if you teach statistical mechanics, there is pretty much one thing you obsess about: entropy.

So you can imagine my joy at finally seeing a fully anti-entropic superhero appear on my Facebook account (physics enthusiasts out there – the project is seeking support on Kickstarter):

Apart from the plug for Assa Auerbach’s project (which, for full disclosure, I have just supported), I would like to use this as an excuse to share my lessons about entropy. With the same level of seriousness. Here they are, in order of increasing entropy.

1. Cost of entropy. Entropy is always marketed as a very palpable thing. Disorder. In class, however, it is calculated via an enumeration of the ‘microscopic states of the system’. For an atomic gas I know how to calculate the entropy (throw me at the blackboard in the middle of the night, no problem. Bosons or Fermions – anytime!) But how can the concept be applied to our practical existence? I have a proposal:

Quantify entropy by the cost (in $’s) of cleaning up the mess!

Examples can be found at all scales. For anything household-related, we should use the H_k constant. H_k=$25/hour for my housekeeper. You break a glass – it takes about 10 minutes to clean. That puts the entropy of the wreckage at $4.17. Having a birthday party takes about 2 hours to clean up: $50 entropy.

Another insight which my combined experience as professor and parent has produced:

2. Conjecture: Babies are maximally efficient topological entropy machines. If you raised a 1 year-old you know exactly what I mean. You can at least guess why maximum efficiency. But why topological? A baby sauntering through the house leaves a string of destruction behind itself. The baby is a mess-creation string-operator! If you start lagging behind, doom will emerge – hence the maximum efficiency. By the way, the only strategy viable is to undo the damage as it happens. But this blog post is about entropy, not about parenting.

In fact, this allows us to establish a conversion of entropy measured in k_B units, to its, clearly more natural, measure in dollar units. A baby eats about 1000kCal/day=4200kJ/day. To fully deal with the consequences, we need a housekeeper to visit about once a week. 4200kJ/day times 7 days=29400 kJoules. These are consumed at T=300K. So an entropy of S=Q/T~100J/K, which is also S~6 \times 10^{24} (Q/k_B T) in dimensionless units, converts to S~$120, which is the cost of our weekly housekeeper visit. This gives a value of $ 10^{-23} per entropy of a two-level system. Quite a reasonable bang for the buck, don’t you think?

3. My conjecture (2) fails. The second law of thermodynamics is an inequality. Entropy \geq Q/T. Why does the conjecture fail? Babies are not ‘maximal’. Consider presidents. Consider the mess that the government can make. It is at the scale of trillions per year. $ 10^{12}. Using the rigorous conversion rule established above, this corresponds to 10^{35} two-level systems. Which happens to quite precisely match the combined number of electrons present in the human bodies of all our military personnel. The mess, however, is created by very few individuals.

Given the large amounts of taxpayer money we dish out to deal with entropy in the world, Auerbach’s book is bound to make a big impact. In fact, maybe Max the demon would one day be nominated for the national medal of freedom, or at least be inducted into the National Academy of Sciences.

The world of hackers and secrets

I’m Evgeny Mozgunov, and some of you may remember my earlier posts on Quantum Frontiers. I’ve recently graduated with a PhD after 6 years in the quantum information group at Caltech. As I’m navigating the job market in quantum physics, it was only a matter of time before I got dragged into a race between startups. Those who can promise impressive quantum speedups for practical tasks get a lot of money from venture capitalists. Maybe there’s something about my mind and getting paid: when I’m paid to do something, I suddenly start coming up with solutions that never occurred to me while I was wandering around as a student. And this time, I’ve noticed a possibility of impressing the public with quantum speedups that nobody has ever used before.

Three former members of John Preskill’s group, Gorjan Alagic, Stacey Jeffery and Stephen Jordan, have already proposed this idea (Circuit Obfuscation Using Braids, p.10), but none of the startups seems to have picked it up. You only need a small quantum computer. Imagine you are in the audience. I ask you to come up with a number. Don’t tell it out loud: instead, write it on a secret piece of paper, and take a little time to do a few mathematical operations based on the number. Then announce the result of those operations. Once you are done, people will automatically be split into two categories. Those with access to a small quantum computer (like the one at IBM) will be able to put on a magic hat (the computer…) and recover your number. But the rest of the audience will be left in awe, with no clue as to how this is even possible. There’s nothing they could do to guess your number based only on the result you announced, unless you’re willing to wait for a few days and they have access to the world’s supercomputing powers.

So far I’ve described the general setting of encryption – a cipher is announced, the key to the cipher is destroyed, and only those who can break the code can decipher.  For instance, if RSA encryption is used for the magic show above, indeed only people with a big quantum computer will be able to recover the secret number. To complete my story, I need to describe what the result that you announce (the cipher) looks like:

A sequence of instructions for a small quantum computer that is equivalent to a simple instruction for spitting out your number. However, the announced sequence of instructions is obfuscated, such that you can’t just read off the number from it.

You really need to feed the sequence into a quantum computer, and see what it outputs. Obfuscation is more general than encryption, but here we’re going to use it as a method of encryption.

Alagic et al. taught us how to do something called obfuscation by compiling for a quantum computer: much like when you compile a .c file in your CS class, you can’t really understand the .out file. Of course you can just execute the .out file, but not if it describes a quantum circuit, unless you have access to a quantum computer. The proposed classical compiler turns either a classical or a quantum algorithm into a hard-to-read quantum circuit that looks like braids. Unfortunately, any obfuscation-by-compiling scheme has the problem that whoever understands the compiler well enough will be able to actually read the .out file (or notice a pattern in braids reduced to a compact “normal” form), and guess your number without resorting to a quantum computer. Surprisingly, even though Alagic et al.’s scheme doesn’t claim any protection under this attack, it still satisfies one of the theoretical definitions of obfuscation: if two people write two different sets of instructions to perform the same operation, and then each obfuscate their own set of instructions by a restricted set of tricks, then it should be impossible to tell from the end result which program was obtained by whom.


Theoretical obfuscation can be illustrated by these video game Nier cosplayers: when they put on their wig and blindfold, they look like the same person. The character named 2B is an android, whose body is disposable, and whose mind is a set of instructions stored on a server. Other characters try to hack her mind as the story progresses.

Quantum scientists can have their own little world of hackers and secrets, organized in the following way: some researchers present their obfuscated code outputting a secret message, and other researchers become hackers trying to break it. Thanks to another result by Alagic et al., we know that hard-to-break obfuscated circuits secure against classical computers exist. But we don’t know what an obfuscator that reliably produces those worst-case instances looks like, so a bit of crowdsourcing to find it is in order. It’s a free-for-all, where all tools and tricks are allowed. In fact, even you can enter! All you need to know is a universal gate set H, T = R(π/4), CNOT and good old matrix multiplication. Come up with a product of these matrices that multiplies to a bunch of X’s (X=HT⁴H), but such that only you know on which qubits the X’s are applied. This code will spit out your secret bitstring on an input of all 0’s. Publish it and wait until some hacker breaks it!
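The identity X=HT⁴H quoted above can be checked by hand or with a few lines of code. The sketch below assumes the phase-gate convention T = diag(1, e^{iπ/4}); with a different convention for R(π/4) the product can differ from X by a global phase. The helper names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two small matrices stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def product(*gates):
    """Multiply the given 2x2 gates from left to right."""
    out = [[1, 0], [0, 1]]
    for gate in gates:
        out = matmul(out, gate)
    return out


S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]                            # Hadamard gate
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]   # assumed convention for R(pi/4)
X = [[0, 1], [1, 0]]                             # bit flip

X_FROM_GATES = product(H, T, T, T, T, H)

if __name__ == "__main__":
    for row in X_FROM_GATES:
        print([complex(round(v.real, 12), round(v.imag, 12)) for v in row])
```

The reason it works is that four T gates multiply to diag(1, -1), and conjugating that matrix by H swaps its role from a phase flip to a bit flip.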

Here’s mine. Can anyone see what my secret bitstring is?

[Figure: obfuscated circuit]

One can run it on a 5 qubit quantum computer in less than 1ms. But if you try to multiply the corresponding 32×32 matrices on your laptop, it takes more than 1ms. Quantum speedup right there. Of course I didn’t prove that there’s no better way of finding out my secret than multiplying matrices. In fact, had I used only even powers of the matrix T in the picture above, then there is a classical algorithm available in open source (Aaronson, Gottesman) that recovers the number without having to multiply large matrices.
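The brute-force classical attack mentioned here, multiplying 32×32 matrices, can be sketched as follows. This is an illustration under stated assumptions, not the circuit in the figure: the circuit below is deliberately unobfuscated (it just applies H T⁴ H on the secret qubits), qubit 0 is taken to be the leftmost bit, T = diag(1, e^{iπ/4}) is assumed, and all function names are hypothetical.

```python
import cmath
import math

S = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[S, S], [S, -S]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]  # assumed convention for R(pi/4)


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def on_qubit(gate, target, n):
    """Embed a one-qubit gate acting on qubit `target` into a 2**n matrix."""
    out = [[1]]
    for qubit in range(n):
        out = kron(out, gate if qubit == target else I2)
    return out


def decode(secret_bits):
    """Build the circuit for `secret_bits`, multiply it out, read the output."""
    n = len(secret_bits)
    total = [[1 if r == c else 0 for c in range(2 ** n)] for r in range(2 ** n)]
    for qubit, bit in enumerate(secret_bits):
        if bit == "1":
            for gate in (H, T, T, T, T, H):
                total = matmul(on_qubit(gate, qubit, n), total)
    amplitudes = [row[0] for row in total]  # first column = action on all zeros
    index = max(range(len(amplitudes)), key=lambda k: abs(amplitudes[k]))
    return format(index, "0{}b".format(n))


if __name__ == "__main__":
    print(decode("10110"))
```

A real challenge circuit would scramble the gate sequence so the secret cannot be read off directly, but the attacker's fallback is the same: multiply everything out and look at the first column. The matrix dimension doubles with every added qubit, which is where the classical cost comes from.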

I’m in luck: startups and venture capitalists never cared about theoretical proofs; a scheme only has to work until it fails. I think they should give millions to me instead of D-Wave. Seriously, there are plenty of applications for practical obfuscation besides magic shows. One can set up a social network where posts are gibberish except for those who have a quantum computer (that would be a good conspiracy theory some years from now). One can also check the claim when a private company says it sells a small quantum computer.

I’d like to end on a more general note: small quantum computers are already faster than classical hardware at multiplying certain kinds of matrices. This has already been proven for a restricted class of quantum computers and a task called boson sampling. If there’s a competition in matrix multiplication somewhere in the world, we can already win.

Time capsule at the Dibner Library

The first time I met Lilla Vekerdy, she was holding a book.

“What’s that?” I asked.

“A second edition of Galileo’s Sidereus nuncius. Here,” she added, thrusting the book into my hands. “Take it.”

So began my internship at the Smithsonian Institution’s Dibner Library for the History of Science and Technology.

Many people know the Smithsonian for its museums. The Smithsonian, they know, houses the ruby slippers worn by Dorothy in The Wizard of Oz. The Smithsonian houses planes constructed by Orville and Wilbur Wright, the dresses worn by First Ladies on presidential inauguration evenings, a space shuttle, and a Tyrannosaurus Rex skeleton. Smithsonian museums line the National Mall in Washington, D.C.—the United States’ front lawn—and march beyond.

Most people don’t know that the Smithsonian has 21 libraries.

Lilla heads the Smithsonian Libraries’ Special-Collections Department. She also directs a library tucked into a corner of the National Museum of American History. I interned at that library—the Dibner—in college. Images of Benjamin Franklin, of inventor Eli Whitney, and of astronomical instruments line the walls. The reading room contains styrofoam cushions on which scholars lay crumbling rare books. Lilla and the library’s technician, Morgan Aronson, find references for researchers, curate exhibitions, and introduce students to science history. They also care for the vault.

The vault. How I’d missed the vault.


A corner of the Dibner’s reading room and part of the vault

The vault contains manuscripts and books from the past ten centuries. We handle the items without gloves, which reduce our fingers’ sensitivities: Interpose gloves between yourself and a book, and you’ll raise your likelihood of ripping a page. A temperature of 65°F inhibits mold from growing. Red rot mars some leather bindings, though, and many crowns—tops of books’ spines—have collapsed. Aging carries hazards.

But what the ages have carried to the Dibner! We1 have a survey filled out by Einstein and a first edition of Newton’s Principia mathematica. We have Euclid’s Geometry in Latin, Arabic, and English, from between 1482 and 1847. We have a note, handwritten by quantum physicist Erwin Schrödinger, about why students shouldn’t fear exams.

I returned to the Dibner one day this spring. Lilla and I fetched out manuscripts and books related to quantum physics and thermodynamics. “Hermann Weyl” labeled one folder.

Weyl contributed to physics and mathematics during the early 1900s. I first encountered his name when studying particle physics. The Dibner, we discovered, owns a proof for part of his 1928 book Gruppentheorie und Quantenmechanik. Weyl appears to have corrected a typed proof by hand. He’d also handwritten spin matrices.

Electrons have a property called “spin.” Spin resembles a property of yours, your position relative to the Earth’s center. We represent your position with three numbers: your latitude, your longitude, and your distance above the Earth’s surface. We represent electron spin with three blocks of numbers, three 2 \times 2 matrices. Today’s physicists write the matrices as2

S_x  = \begin{bmatrix}  0  &  1  \\  1  &  0  \end{bmatrix}  \, , \quad  S_y  = \begin{bmatrix}  0  &  -i  \\  i  &  0  \end{bmatrix}  \, , \quad \text{and} \quad  S_z  = \begin{bmatrix}  1  &  0  \\  0  &  -1  \end{bmatrix} \, .

We needn’t write these matrices. We could represent electron spin with different 2 \times 2 matrices, so long as the matrices obey certain properties. But most physicists choose the above matrices, in my experience. We call our choice “a convention.”

Weyl chose a different convention:

S_x  = \begin{bmatrix}  1  &  0  \\  0  &  -1  \end{bmatrix}  \, , \quad  S_y  = \begin{bmatrix}  0  &  1  \\  1  &  0  \end{bmatrix}  \, , \quad \text{and} \quad  S_z  = \begin{bmatrix}  0  &  i  \\  -i  &  0  \end{bmatrix} \, .

The difference surprised me. Perhaps it shouldn’t have: Conventions change. Approaches to quantum physics change. Weyl’s matrices differ little from ours: Permute our matrices and negate one matrix, and you recover Weyl’s.
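For readers who want to check the “permute and negate” claim, here is a minimal sketch in plain Python. It assumes today’s matrices are the standard Pauli matrices and that Weyl’s are exactly as transcribed above; the helper names (`matmul`, `from_modern`, `shared_properties_hold`) are my own, invented for illustration. The two properties tested (each matrix squares to the identity, and distinct matrices anticommute) are one plausible reading of “certain properties,” not a complete list.

```python
# Today's convention, with factors of hbar/2 omitted as in the text.
PAULI = {
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
}

# The matrices as transcribed above from the Dibner proof.
WEYL = {
    "x": ((1, 0), (0, -1)),
    "y": ((0, 1), (1, 0)),
    "z": ((0, 1j), (-1j, 0)),
}

IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(a, b):
    """Add two 2x2 matrices entry by entry."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def scale(c, a):
    """Multiply every entry of a 2x2 matrix by the number c."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def from_modern(pauli):
    """Permute the modern matrices (z -> x, x -> y, y -> z) and negate the last."""
    return {"x": pauli["z"], "y": pauli["x"], "z": scale(-1, pauli["y"])}


def shared_properties_hold(m):
    """Check that each matrix squares to the identity and distinct ones anticommute."""
    squares = all(matmul(m[k], m[k]) == IDENTITY for k in "xyz")
    anticommute = all(
        add(matmul(m[a], m[b]), matmul(m[b], m[a])) == ZERO
        for a, b in (("x", "y"), ("y", "z"), ("z", "x"))
    )
    return squares and anticommute


print(from_modern(PAULI) == WEYL)
print(shared_properties_hold(PAULI), shared_properties_hold(WEYL))
```

The permutation sends today’s S_z to Weyl’s S_x, today’s S_x to Weyl’s S_y, and the negative of today’s S_y to Weyl’s S_z.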

But the electron-spin matrices play a role, in quantum physics, like the role played by T. Rex in paleontology exhibits: All quantum scientists recognize electron spin. We illustrate with electron spin in examples. Students memorize spin matrices in undergrad classes. Homework problems feature electron spin. Physicists have known of electron spin’s importance for decades. I didn’t expect such a bedrock to have changed its trappings.

How did scientists’ convention change? When did it? Why? Or did the convention not change—did Weyl’s contemporaries use today’s convention, and did Weyl stand out?

A partially typed, partially handwritten proof of a book by Hermann Weyl.

I intended to end this article with these questions. I sent a draft to John Preskill, proposing to post soon. But he took up the questions like a knight taking up arms.

Wolfgang Pauli, John emailed, appears to have written the matrices first. (Physicists call the matrices “Pauli matrices.”) A 1927 paper by Pauli contains the notation used today. Paul Dirac copied the notation in a 1928 paper, acknowledging Pauli. Weyl’s book appeared the same year. The following year, Weyl used Pauli’s notation in a paper.

No document we know of, apart from the Dibner proof, contains the Dibner-proof notation. Did the notation change between the proof-writing and publication? Does the Dibner hold the only anomalous electron-spin matrices? What accounts for the anomaly?

If you know, feel free to share. If you visit DC, drop Lilla and Morgan a line. Bring a research project. Bring a class. Bring zeal for the past. You might find yourself holding a time capsule by Galileo.

Dibner librarian Lilla Vekerdy and a former intern

With thanks to Lilla and Morgan for their hospitality, time, curiosity, and expertise. With thanks to John for burrowing into the Pauli matrices’ history.

1I continue to count myself as part of the Dibner community. Part of me refuses to leave.

2I’ll omit factors of \hbar / 2 \, .

The power of information

Sara Imari Walker studies ants. Her entomologist colleague Gabriele Valentini cultivates ant swarms. Gabriele coaxes a swarm from its nest, hides the nest, and offers two alternative nests. Gabriele observes the ants’ responses, then analyzes their data with Sara.

Sara doesn’t usually study ants. She trained in physics, information theory, and astrobiology. (Astrobiology is the study of life; life’s origins; and conditions amenable to life, on Earth and anywhere else life may exist.) Sara analyzes how information reaches, propagates through, and manifests in the swarm.

Some ants inspect one nest; some, the other. Few ants encounter both choices. Yet most of the ants choose simultaneously. (How does Gabriele know when an ant chooses? Decided ants carry other ants toward the chosen nest. Undecided ants don’t.)

Gabriele and Sara plotted each ant’s status (decided or undecided) at each instant. All the ants’ lines start in the “undecided” region, high up in the graph. Most lines drop to the “decided” region together. Physicists call such dramatic, large-scale changes in many-particle systems “phase transitions.” The swarm transitions from the “undecided” phase to the “decided,” as moisture transitions from vapor to downpour.

Sara versus the ants

Look from afar, and you’ll see evidence of a hive mind: The lines clump and slump together. Look more closely, and you’ll find lags between ants’ decisions. Gabriele and Sara grouped the ants according to their behaviors. Sara explained the grouping at a workshop this spring.

The green lines, she said, are undecided ants.

My stomach dropped like Gabriele and Sara’s ant lines.

People call data “cold” and “hard.” Critics lambast scientists for not appealing to emotions. Politicians weave anecdotes into their numbers, to convince audiences to care.

But when Sara spoke, I looked at her green lines and thought, “That’s me.”

I’ve blogged about my indecisiveness. Postdoc Ning Bao and I formulated a quantum voting scheme in which voters can superpose—form quantum combinations of—options. Usually, when John Preskill polls our research group, I abstain from voting. Politics, and questions like “Does building a quantum computer require only engineering or also science?”,1 have many facets. I want to view such questions from many angles, to pace around the questions as around a sculpture, to hear other onlookers, to test my impressions on them, and to cogitate before choosing.2 However many perspectives I’ve gathered, I’m missing others worth seeing. I commiserated with the green-line ants.


I first met Sara in the building behind the statue. Sara earned her PhD in Dartmouth College’s physics department, with Professor Marcelo Gleiser.

Sara presented about ants at a workshop hosted by the Beyond Center for Fundamental Concepts in Science at Arizona State University (ASU). The organizers, Paul Davies of Beyond and Andrew Briggs of Oxford, entitled the workshop “The Power of Information.” Participants represented information theory, thermodynamics and statistical mechanics, biology, and philosophy.

Paul and Andrew posed questions to guide us: What status does information have? Is information “a real thing” “out there in the world”? Or is information only a mental construct? What roles can information play in causation?

We paced around these questions as around a Chinese viewing stone. We sat on a bench in front of those questions, stared, debated, and cogitated. We taught each other about ants, artificial atoms, nanoscale machines, and models for information processing.


Chinese viewing stone in Yuyuan Garden in Shanghai

I wonder if I’ll acquire opinions about Paul and Andrew’s questions. Maybe I’ll meander from “undecided” to “decided” over a career. Maybe I’ll phase-transition like Sara’s ants. Maybe I’ll remain near the top of her diagram, a green holdout.

I know little about information’s power. But Sara’s plot revealed one power of information: Information can move us—from homeless to belonging, from ambivalent to decided, from a plot’s top to its bottom, from passive listener to finding yourself in a green curve.


With thanks to Sara Imari Walker, Paul Davies, Andrew Briggs, Katherine Smith, and the Beyond Center for their hospitality and thoughts.


1By “only engineering,” I mean not “merely engineering” pejoratively, but “engineering and no other discipline.”

2I feel compelled to perform these activities before choosing. I try to. Psychological experiments, however, suggest that I might decide before realizing that I’ve decided.