Peeking into the world of quantum intelligence

Intelligent beings have the ability to receive, process, and store information and, based on the processed information, to predict what will happen in the future and act accordingly.

An illustration of receiving, processing, and storing information. Based on the processed information, one can make predictions about the future.
[Credit: Claudia Cheng]

We, as intelligent beings, receive, process, and store classical information. The information comes from vision, hearing, smell, and tactile sensing. The data is encoded as analog classical information through the electrical pulses sent along our nerve fibers. Our brain processes this information classically through neural circuits (at least that is our current understanding, but one should check out this blogpost). We then store this processed classical information in our hippocampus, which allows us to retrieve it later and combine it with future information that we obtain. Finally, we use the stored classical information to make predictions about the future (imagine/predict the future outcomes if we perform certain actions) and choose the action that would most likely be in our favor.

Such abilities have enabled us to make remarkable accomplishments: soaring in the sky by constructing accurate models of how air flows around objects, or building weak forms of intelligent beings capable of holding basic conversations and playing different board games. Instead of receiving/processing/storing classical information, one could imagine some form of quantum intelligence that deals with quantum information instead of classical information. These quantum beings could receive quantum information through quantum sensors built from tiny photons and atoms. They would then process this quantum information with quantum mechanical evolutions (such as quantum computers), and store the processed qubits in a quantum memory (protected with a surface code or toric code).

A caricature of human intelligence dating long before 1950, artificial intelligence that began in the 50’s, and the emergence of quantum intelligence.
[Credit: Claudia Cheng]

It is natural to wonder what a world of quantum intelligence would be like. While we have never encountered such a strange creature in the real world (yet), the mathematics of quantum mechanics, machine learning, and information theory allows us to peek into what such a fantastic world would be like. The physical world we live in is intrinsically quantum, so one may imagine that a quantum being is capable of making more powerful predictions than a classical being. Maybe he/she/they could better predict events that happen far away, such as telling us how a distant black hole is engulfing another? Or perhaps he/she/they could improve our lives, for example by presenting us with an entirely new approach for capturing energy from sunlight?

One may be skeptical about finding quantum intelligent beings in nature (and rightfully so). But it may not be so absurd to synthesize a weak form of quantum (artificial) intelligence in an experimental lab, or to enhance our classical human intelligence with quantum devices to approximate a quantum-mechanical being. Many famous companies, like Google, IBM, Microsoft, and Amazon, as well as many academic labs and startups, have been building better quantum machines/computers day by day. By combining the concepts of machine learning on classical computers with these quantum machines, the future of us interacting with some form of quantum (artificial) intelligence may not be so distant.

Before that day comes, could we peek into the world of quantum intelligence? And could one better understand how much more powerful quantum intelligence could be than classical intelligence?

A cartoon depiction of me (Left), Richard Kueng (Middle), and John Preskill (Right).
[Credit: Claudia Cheng]

In a recent publication [1], my advisor John Preskill, my good friend Richard Kueng, and I made some progress toward these questions. We consider a quantum mechanical world where classical beings can obtain classical information by measuring the world (performing POVM measurements). In contrast, quantum beings can retrieve quantum information through quantum sensors and store the data in a quantum memory. We study how much better quantum beings can learn from the physical world than classical beings, so as to accurately predict the outcomes of unseen events (with the focus on the number of interactions with the physical world rather than computation time). We cast these problems in a rigorous mathematical framework and utilize high-dimensional probability and quantum information theory to understand their respective prediction power. Rigorously, one refers to a classical/quantum being as a classical/quantum model, algorithm, protocol, or procedure. This is because the actions of these classical/quantum beings are the center of the mathematical analysis.

Formally, we consider the task of learning an unknown physical evolution described by a CPTP map \mathcal{E} that takes in an n-qubit state and outputs an m-qubit state. The classical model can select an arbitrary classical input to the CPTP map and measure the output state of the CPTP map with some POVM measurement. The quantum model can access the CPTP map coherently and obtain quantum data from each access, which is equivalent to composing multiple CPTP maps with quantum computations to learn about the CPTP map. The task is to predict a property of the output state \mathcal{E}(\lvert x \rangle\!\langle x \rvert), given by \mathrm{Tr}(O \mathcal{E}(\lvert x \rangle\!\langle x \rvert)), for a new classical input x \in \{0, 1\}^n. The goal is to achieve the task while accessing \mathcal{E} as few times as possible (i.e., with as few interactions or experiments in the physical world as possible). We denote the numbers of interactions needed by classical and quantum models as N_{\mathrm{C}} and N_{\mathrm{Q}}.
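To make the task concrete, here is a minimal numerical sketch (my own toy illustration, not the construction in [1]). The unknown evolution \mathcal{E} is taken, for simplicity, to be a random unitary channel on two qubits; the observable O is a Pauli Z on the first output qubit; and a classical-style procedure estimates \mathrm{Tr}(O \mathcal{E}(\lvert x \rangle\!\langle x \rvert)) by repeatedly preparing \lvert x \rangle, applying the channel, and measuring O.

import numpy as np

rng = np.random.default_rng(0)
n = 2                                    # input and output registers both have two qubits here
dim = 2 ** n

# Hypothetical "unknown" evolution: a random unitary channel (an assumption for this sketch).
G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(G)                   # E(rho) = U rho U^dagger

# Observable O: Pauli Z on the first output qubit (diagonal in the computational basis).
O = np.kron(np.diag([1.0, -1.0]), np.eye(2))

def output_state(x):
    """The pure output state E(|x><x|) for a classical bit string x."""
    psi = np.zeros(dim, dtype=complex)
    psi[int(x, 2)] = 1.0
    return U @ psi

def true_value(x):
    """Exact Tr(O E(|x><x|))."""
    out = output_state(x)
    return float(np.real(out.conj() @ O @ out))

def estimate_value(x, shots=2000):
    """Estimate Tr(O E(|x><x|)) by repeatedly running the experiment and measuring O."""
    out = output_state(x)
    probs = np.abs(out) ** 2             # Born-rule probabilities in the eigenbasis of O
    samples = rng.choice(np.diag(O), size=shots, p=probs)
    return float(samples.mean())

x = "01"
print(true_value(x), estimate_value(x))

Each call to estimate_value spends a fixed number of interactions with \mathcal{E} (the shots parameter); the question studied in [1] is how the number of such interactions must scale, for classical and for quantum strategies, to make the predictions accurate on unseen inputs.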

In general, quantum models could learn from fewer interactions with the physical world (or experiments in the physical world) than classical models. This is because quantum memory and coherent quantum processing allow a quantum model to synthesize the information obtained from previous experiments more effectively. Nevertheless, in [1], we show that there is a fundamental limit to how much more efficient quantum models can be. In order to achieve a prediction error

\mathbb{E}_{x \sim \mathcal{D}} |h(x) -  \mathrm{Tr}(O \mathcal{E}(\lvert x \rangle\!\langle x \rvert))| \leq \mathcal{O}(\epsilon),

where h(x) is the hypothesis learned from the classical/quantum model and \mathcal{D} is an arbitrary distribution over the input space \{0, 1\}^n, we found that the speed-up N_{\mathrm{C}} / N_{\mathrm{Q}} is upper bounded by m / \epsilon, where m > 0 is the number of qubits each experiment provides (the output number of qubits in the CPTP map \mathcal{E}), and \epsilon > 0 is the desired prediction error (smaller \epsilon means we want to predict more accurately).

In contrast, when we want to accurately predict all unseen events, we prove that quantum models could use exponentially fewer experiments than classical models. We give a construction for predicting properties of quantum systems showing that quantum models could substantially outperform classical models. These rigorous results show that quantum intelligence shines when we seek stronger prediction performance.

We have only scratched the surface of what is possible with quantum intelligence. As the future unfolds, I am hopeful that we will discover more that can be done only by quantum intelligence, through mathematical analysis, rigorous numerical studies, and physical experiments.

Further information:

  • A classical model that can be used to accurately predict properties of quantum systems is the classical shadow formalism [2] that we proposed a year ago. In many tasks, this model can be shown to be one of the strongest rivals that quantum models have to surpass.
  • Even if a quantum model only receives and stores classical data, the ability to process the data using a quantum-mechanical evolution can still be advantageous [3]. However, obtaining a large advantage is harder in this case, as the computational power of data can slightly boost classical machines/intelligence [3].
  • Another nice paper by Dorit Aharonov, Jordan Cotler, and Xiao-Liang Qi [4] also proved advantages of quantum models over classical ones in some classification tasks.

References:

[1] Huang, Hsin-Yuan, Richard Kueng, and John Preskill. “Information-Theoretic Bounds on Quantum Advantage in Machine Learning.” Physical Review Letters 126: 190505 (2021). https://doi.org/10.1103/PhysRevLett.126.190505

[2] Huang, Hsin-Yuan, Richard Kueng, and John Preskill. “Predicting many properties of a quantum system from very few measurements.” Nature Physics 16: 1050-1057 (2020). https://doi.org/10.1038/s41567-020-0932-7

[3] Huang, Hsin-Yuan, et al. “Power of data in quantum machine learning.” Nature communications 12.1 (2021): 1-9. https://doi.org/10.1038/s41467-021-22539-9

[4] Aharonov, Dorit, Jordan Cotler, and Xiao-Liang Qi. “Quantum Algorithmic Measurement.” arXiv preprint arXiv:2101.04634 (2021).

Learning about learning

The autumn of my sophomore year of college was mildly hellish. I took the equivalent of three semester-long computer-science and physics courses, atop other classwork; co-led a public-speaking self-help group; and coordinated a celebrity visit to campus. I lived at my desk and in office hours, always declining my flatmates’ invitations to watch The West Wing.

Hard as I studied, my classmates enjoyed greater facility with the computer-science curriculum. They saw immediately how long an algorithm would run, while I hesitated and then computed the run time step by step. I felt behind. So I protested when my professor said, “You’re good at this.” 

I now see that we were focusing on different facets of learning. I rued my lack of intuition. My classmates had gained intuition by exploring computer science in high school, then slow-cooking their experiences on a mental back burner. Their long-term exposure to the material provided familiarity—the ability to recognize a new problem as belonging to a class they’d seen examples of. I was cooking course material in a mental microwave set on “high,” as a semester’s worth of material was crammed into ten weeks at my college.

My professor wasn’t measuring my intuition. He only saw that I knew how to compute an algorithm’s run time. I’d learned the material required of me—more than I realized, being distracted by what I hadn’t learned that difficult autumn.

We can learn a staggering amount when pushed far from our comfort zones—and not only we humans can. So can simple collections of particles.

One example is a classical spin glass. A spin glass is a collection of particles that shares some properties with a magnet. Both a magnet and a spin glass consist of tiny mini-magnets called spins. Although I’ve blogged about quantum spins before, I’ll focus on classical spins here. We can imagine a classical spin as a little arrow that points upward or downward. A bunch of spins can form a material. If the spins tend to point in the same direction, the material may be a magnet of the sort that’s sticking the faded photo of Fluffy to your fridge.

The spins may interact with each other, similarly to how electrons interact with each other. Not entirely similarly, though—electrons push each other away. In contrast, a spin may coax its neighbors into aligning or anti-aligning with it. Suppose that the interactions are random: Any given spin may force one neighbor into alignment, gently ask another neighbor to align, entreat a third neighbor to anti-align, and have nothing to say to neighbors four and five.

The spin glass can interact with the external world in two ways. First, we can stick the spins in a magnetic field, as by placing magnets above and below the glass. If aligned with the field, a spin has negative energy; and, if antialigned, positive energy. We can sculpt the field so that it varies across the spin glass. For instance, spin 1 can experience a strong upward-pointing field, while spin 2 experiences a weak downward-pointing field.

Second, say that the spins occupy a fixed-temperature environment, as I occupy a 74-degree-Fahrenheit living room. The spins can exchange heat with the environment. If releasing heat to the environment, a spin flips from having positive energy to having negative—from antialigning with the field to aligning.

Let’s perform an experiment on the spins. First, we design a magnetic field using random numbers. Whether the field points upward or downward at any given spin is random, as is the strength of the field experienced by each spin. We sculpt three of these random fields and call the trio a drive.

Let’s randomly select a field from the drive and apply it to the spin glass for a while; again, randomly select a field from the drive and apply it; and continue many times. The energy absorbed by the spins from the fields spikes, then declines.

Now, let’s create another drive of three random fields. We’ll randomly pick a field from this drive and apply it; again, randomly pick a field from this drive and apply it; and so on. Again, the energy absorbed by the spins spikes, then tails off.

Here comes the punchline. Let’s return to applying the initial fields. The energy absorbed by the glass will spike—but not as high as before. The glass responds differently to a familiar drive than to a new drive. The spin glass recognizes the original drive—has learned the first fields’ “fingerprint.” This learning happens when the fields push the glass far from equilibrium,1 as I learned when pushed during my mildly hellish autumn.
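For readers who like to tinker, here is a schematic Monte Carlo sketch of the protocol above. It is a toy of my own devising, not the model used in the experiments and simulations cited below: the couplings, field strengths, temperature, and durations are arbitrary choices, and how cleanly the memory effect shows up depends on those choices.

import numpy as np

rng = np.random.default_rng(1)
N = 50                                   # number of classical spins s_i = +1 or -1
J = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))
J = np.triu(J, 1); J = J + J.T           # random symmetric couplings, zero diagonal
beta = 2.0                               # inverse temperature of the fixed-temperature environment

def energy(s, h):
    return -0.5 * s @ J @ s - h @ s      # spin-glass energy in an applied field h

def relax(s, h, sweeps=100):
    """Single-spin-flip Metropolis dynamics while the field h is held fixed."""
    for _ in range(sweeps * N):
        i = rng.integers(N)
        dE = 2.0 * s[i] * (J[i] @ s + h[i])          # energy cost of flipping spin i
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            s[i] = -s[i]

def apply_drive(s, fields, steps=20):
    """Randomly pick fields from the given trio, apply each for a while, record the energy jump at each switch."""
    work, h_old = [], np.zeros(N)
    for _ in range(steps):
        h_new = fields[rng.integers(len(fields))]
        work.append(energy(s, h_new) - energy(s, h_old))   # crude proxy for the energy absorbed at the switch
        relax(s, h_new)
        h_old = h_new
    return np.array(work)

s = rng.choice([-1.0, 1.0], size=N)
drive_A = [rng.normal(size=N) for _ in range(3)]     # first trio of random fields
drive_B = [rng.normal(size=N) for _ in range(3)]     # second trio of random fields

w_first = apply_drive(s, drive_A)    # first exposure: absorption spikes, then declines
w_new = apply_drive(s, drive_B)      # unfamiliar drive: absorption spikes again
w_again = apply_drive(s, drive_A)    # familiar drive: ideally a tamer response than the first encounter
print(w_first[:3].mean(), w_new[:3].mean(), w_again[:3].mean())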

So spin glasses learn drives that push them far from equilibrium. So do many other simple, classical, many-particle systems: polymers, viscous liquids, crumpled sheets of Mylar, and more. Researchers have predicted such learning and observed it experimentally. 

Scientists have detected many-particle learning by measuring thermodynamic observables. Examples include the energy absorbed by the spin glass—what thermodynamicists call work. But thermodynamics developed during the 1800s, to describe equilibrium systems, not to study learning. 

One study of learning—the study of machine learning—has boomed over the past two decades. As described by the MIT Technology Review, “[m]achine-learning algorithms use statistics to find patterns in massive amounts of data.” Users don’t tell the algorithms how to find those patterns.

xkcd.com/1838

It seems natural and fitting to use machine learning to learn about the learning by many-particle systems. That’s what I did with collaborators from the group of Jeremy England, a GlaxoSmithKline physicist who studies complex behaviors of many-particle systems. Weishun Zhong, Jacob Gold, Sarah Marzen, Jeremy, and I published our paper last month.

Using machine learning, we detected and measured many-particle learning more reliably and precisely than thermodynamic measures seem able to. Our technique works on multiple facets of learning, analogous to the intuition and the computational ability I encountered in my computer-science course. We illustrated our technique on a spin glass, but one can apply our approach to other systems, too. I’m exploring such applications with collaborators at the University of Maryland.

The project pushed me far from my equilibrium: I’d never worked with machine learning or many-body learning. But it’s amazing, what we can learn when pushed far from equilibrium. I first encountered this insight sophomore fall of college—and now, we can quantify it better than ever.

1Equilibrium is a quiet, restful state in which the glass’s large-scale properties change little. No net flow of anything—such as heat or particles—enters or leaves the system.

One if by land minus two if by sea, over the square-root of two

Happy National Poetry Month! The United States salutes word and whimsy in April, and Quantum Frontiers is continuing its tradition of celebrating. As a resident of Cambridge, Massachusetts and as a quantum information scientist, I have trouble avoiding the poem “Paul Revere’s Ride.” 

Henry Wadsworth Longfellow wrote the poem, as well as others in the American canon, during the 1800s. Longfellow taught at Harvard in Cambridge, and he lived a few blocks away from the university, in what’s now a national historic site. Across the street from the house, a bust of the poet gazes downward, as though lost in thought, in Longfellow Park. Longfellow wrote one of his most famous poems about an event staged a short drive from—and, arguably, partially in—Cambridge.

Longfellow Park

The event took place “on the eighteenth of April, in [Seventeen] Seventy-Five,” as related by the narrator of “Paul Revere’s Ride.” Revere was a Boston silversmith and a supporter of the American colonies’ independence from Britain. Revolutionaries anticipated that British troops would set out from Boston sometime during the spring. The British planned to seize revolutionaries’ weapons in the nearby town of Concord and to jail revolutionary leaders in Lexington. The troops departed Boston during the night of April 18th. 

Upon learning of their movements, sexton Robert Newman sent a signal from Boston’s old North Church to Charlestown. Revere and the physician William Dawes rode out from Charlestown to warn the people of Lexington and the surrounding areas. A line of artificial hoof prints, pressed into a sidewalk a few minutes from the Longfellow house, marks part of Dawes’s trail through Cambridge. The initial riders galvanized more riders, who stirred up colonial militias that resisted the troops’ advance. The Battles of Lexington and Concord ensued, initiating the Revolutionary War.

Longfellow took liberties with the facts he purported to relate. But “Paul Revere’s Ride” has blown the dust off history books for generations of schoolchildren. The reader shares Revere’s nervous excitement as he fidgets, awaiting Newman’s signal: 

Now he patted his horse’s side, 
Now gazed on the landscape far and near, 
Then impetuous stamped the earth, 
And turned and tightened his saddle-girth;
But mostly he watched with eager search 
The belfry-tower of the old North Church.

The moment the signal arrives, that excitement bursts its seams, and Revere leaps astride his horse. The reader comes to gallop through the night with the silversmith, the poem’s clip-clop-clip-clop rhythm evoking a horse’s hooves on cobblestones.

The author, outside Longfellow House, on the eighteenth of April in…Twenty Twenty.

Not only does “Paul Revere’s Ride” revitalize history, but it also offers a lesson in information theory. While laying plans, Revere instructs Newman: 

He said to his friend, “If the British march
By land or sea from the town to-night,
Hang a lantern aloft in the belfry-arch
Of the North-Church-tower, as a signal light.

Then comes one of the poem’s most famous lines: “One if by land, and two if by sea.” The British could have left Boston by foot or by boat, and Newman had to communicate which. Specifying one of two options, he related one bit, or one basic unit of information. Newman thereby exemplifies a cornerstone of information theory: the encoding of a bit of information—an abstraction—in a physical system that can be in one of two possible states—a light that shines from one or two lanterns.
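For the quantitatively inclined, the claim that Newman related one bit can be checked in a couple of lines (a small aside of mine, assuming the two routes looked equally likely from the receiver's vantage point):

import math

p = {"land": 0.5, "sea": 0.5}            # assume the receiver finds the two routes equally likely
entropy_bits = -sum(q * math.log2(q) for q in p.values())
print(entropy_bits)                       # 1.0: one lantern or two conveys exactly one bit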

Benjamin Schumacher and Michael Westmoreland point out the information-theoretic interpretation of Newman’s action in their quantum-information textbook. I used their textbook in my first quantum-information course, as a senior in college. Before reading the book, I’d never felt that I could explain what information is or how it can be quantified. Information is an abstraction and a Big Idea, like consciousness, life, and piety. But, Schumacher and Westmoreland demonstrated, most readers already grasp the basics of information theory; some readers even studied the basics while memorizing a poem in elementary school. So I doff my hat—or, since we’re discussing the 1700s, my mobcap—to the authors.

Reading poetry enriches us more than we realize. So read a poem this April. You can find Longfellow’s poem here or ride off wherever your fancy takes you.  

Project Ant-Man

The craziest challenge I’ve undertaken hasn’t been skydiving; sailing the Amazon on a homemade raft; scaling Mt. Everest; or digging for artifacts atop a hill in a Middle Eastern desert, near midday, during high summer.1 The craziest challenge has been to study the possibility that quantum phenomena affect cognition significantly. 

Most physicists agree that quantum phenomena probably don’t affect cognition significantly. Cognition occurs in biological systems, which have high temperatures, many particles, and watery components. Such conditions quash entanglement (a relationship that quantum particles can share and that can produce correlations stronger than any producible by classical particles).

Yet Matthew Fisher, a condensed-matter physicist, proposed a mechanism by which entanglement might enhance coordinated neuron firing. Phosphorus nuclei have spins (quantum properties similar to angular momentum) that might store quantum information for long times when in Posner molecules. These molecules may protect the information from decoherence (leaking quantum information to the environment), via mechanisms that Fisher described.

I can’t check how correct Fisher’s proposal is; I’m not a biochemist. But I’m a quantum information theorist. So I can identify how Posners could process quantum information if Fisher were correct. I undertook this task with my colleague Elizabeth Crosson, during my PhD.

Experimentalists have begun testing elements of Fisher’s proposal. What if, years down the road, they find that Posners exist in biofluids and protect quantum information for long times? We’ll need to test whether Posners can share entanglement. But detecting entanglement tends to require control finer than you can exert with a stirring rod. How could you check whether a beakerful of particles contains entanglement?

I asked that question of Adam Bene Watts, a PhD student at MIT, and John Wright, then an MIT postdoc and now an assistant professor in Texas. John gave our project its codename. At a meeting one day, he reported that he’d watched the film Avengers: Endgame. Had I seen it? he asked.

No, I replied. The only superhero movie I’d seen recently had been Ant-Man and the Wasp—and that because, according to the film’s scientific advisor, the movie riffed on research of mine. 

Go on, said John.

Spiros Michalakis, the Caltech mathematician in charge of this blog, served as the advisor. The film came out during my PhD; during a meeting of our research group, Spiros advised me to watch the movie. There was something in it “for you,” he said. “And you,” he added, turning to Elizabeth. I obeyed, to hear Laurence Fishburne’s character tell Ant-Man that another character had entangled with the Posner molecules in Ant-Man’s brain.2 

John insisted on calling our research Project Ant-Man.

John and Adam study Bell tests. Bell test sounds like a means of checking whether the collar worn by your cat still jingles. But the test owes its name to John Stewart Bell, a Northern Irish physicist who wrote a groundbreaking paper in 1964.

Say you’d like to check whether two particles share entanglement. You can run an experiment, described by Bell, on them. The experiment ends with a measurement of the particles. You repeat this experiment in many trials, using identical copies of the particles in subsequent trials. You accumulate many measurement outcomes, whose statistics you calculate. You plug those statistics into a formula concocted by Bell. If the result exceeds some number that Bell calculated, the particles shared entanglement.

We needed a variation on Bell’s test. In our experiment, every trial would involve hordes of particles. The experimentalists—large, clumsy, classical beings that they are—couldn’t measure the particles individually. The experimentalists could record only aggregate properties, such as the intensity of the phosphorescence emitted by a test tube.

Adam, MIT physicist Aram Harrow, and I concocted such a Bell test, with help from John. Physical Review A published our paper this month—as a Letter and an Editor’s Suggestion, I’m delighted to report.

For experts: The trick was to make the Bell correlation function nonlinear in the state. We assumed that the particles shared mostly pairwise correlations, though our Bell inequality can accommodate small aberrations. Alas, no one can guarantee that particles share only mostly pairwise correlations. Violating our Bell inequality therefore doesn’t rule out hidden-variables theories. Under reasonable assumptions, though, a not-completely-paranoid experimentalist can check for entanglement using our test. 

One can run our macroscopic Bell test on photons, using present-day technology. But we’re more eager to use the test to characterize lesser-known entities. For instance, we sketched an application to Posner molecules. Detecting entanglement in chemical systems will require more thought, as well as many headaches for experimentalists. But our paper broaches the cask—which I hope to see flow in the next Ant-Man film. Due to debut in 2022, the movie has the subtitle Quantumania. Sounds almost as crazy as studying the possibility that quantum phenomena affect cognition.

1Of those options, I’ve undertaken only the last.

2In case of any confusion: We don’t know that anyone’s brain contains Posner molecules. The movie features speculative fiction.

Love in the time of thermo

An 81-year-old medical doctor has fallen off a ladder in his house. His pet bird hopped out of his reach, from branch to branch of a tree on the patio. The doctor followed via ladder and slipped. His servants cluster around him, the clamor grows, and he longs for his wife to join him before he dies. She arrives at last. He gazes at her face; utters, “Only God knows how much I loved you”; and expires.

I set the book down on my lap and looked up. I was nestled in a wicker chair outside the Huntington Art Gallery in San Marino, California. Busts of long-dead Romans kept me company. The lawn in front of me unfurled below a sky that—unusually for San Marino—was partially obscured by clouds. My final summer at Caltech was unfurling. I’d walked to the Huntington, one weekend afternoon, with a novel from Caltech’s English library.1

What a novel.

You may have encountered the phrase “love in the time of corona.” Several times. Per week. Throughout the past six months. Love in the Time of Cholera predates the meme by 35 years. Nobel laureate Gabriel García Márquez captured the inhabitants, beliefs, architecture, mores, and spirit of a Colombian city around the turn of the 20th century. His work transcends its setting, spanning love, death, life, obsession, integrity, redemption, and eternity. A thermodynamicist couldn’t ask for more-fitting reading.

Love in the Time of Cholera centers on a love triangle. Fermina Daza, the only child of a wealthy man, excels in her studies. She holds herself with poise and self-assurance, and she spits fire whenever others try to control her. The girl dazzles Florentino Ariza, a poet, who restructures his life around his desire for her. Fermina Daza’s pride impresses Dr. Juvenal Urbino, a doctor renowned for exterminating a cholera epidemic. After rejecting both men, Fermina Daza marries Dr. Juvenal Urbino. The two personalities clash, and one betrays the other, but they cling together across the decades. Florentino Ariza retains his obsession with Fermina Daza, despite having countless affairs. Dr. Juvenal Urbino dies by ladder, whereupon Florentino Ariza swoops in to win Fermina Daza over. Throughout the book, characters mistake symptoms of love for symptoms of cholera; and lovers block out the world by claiming to have cholera and self-quarantining.

As a thermodynamicist, I see the second law of thermodynamics in every chapter. The second law implies that time marches only forward, order decays, and randomness scatters information to the wind. García Márquez depicts his characters aging, aging more, and aging more. Many characters die. Florentino Ariza’s mother loses her memory to dementia or Alzheimer’s disease. A pawnbroker, she buys jewels from the elite whose fortunes have eroded. Forgetting the jewels’ value one day, she mistakes them for candies and distributes them to children.

The second law bites most, to me, in the doctor’s final words, “Only God knows how much I loved you.” Later, the widow Fermina Daza sighs, “It is incredible how one can be happy for so many years in the midst of so many squabbles, so many problems, damn it, and not really know if it was love or not.” She doesn’t know how much her husband loved her, especially in light of the betrayal that rocked the couple and a rumor of another betrayal. Her husband could have affirmed his love with his dying breath, but he refused: He might have loved her with all his heart, and he might not have loved her; he kept the truth a secret to all but God. No one can retrieve the information after he dies.2 

Love in the Time of Cholera—and thermodynamics—must sound like a mouthful of horseradish. But each offers nourishment, an appetizer and an entrée. According to the first law of thermodynamics, the amount of energy in every closed, isolated system remains constant: Physics preserves something. Florentino Ariza preserves his love for decades, despite Fermina Daza’s marrying another man, despite her aging.

The latter preservation can last only so long in the story: Florentino Ariza, being mortal, will die. He claims that his love will last “forever,” but he won’t last forever. At the end of the novel, he sails between two harbors—back and forth, back and forth—refusing to finish crossing a River Styx. I see this sailing as prethermalization: A few quantum systems resist thermalizing, or flowing to the physics analogue of death, for a while. But they succumb later. Florentino Ariza can’t evade the far bank forever, just as the second law of thermodynamics forbids his boat from functioning as a perpetuum mobile.

Though mortal within his story, Florentino Ariza survives as a book character. The book survives. García Márquez wrote about a country I’d never visited, and an era decades before my birth, 33 years before I checked his book out of the library. But the book dazzled me. It pulsed with the vibrancy, color, emotion, and intellect—with the fullness—of life. The book gained another life when the coronavirus hit. Thermodynamics dictates that people age and die, but the laws of thermodynamics remain.3 I hope and trust—with the caveat about humanity’s not destroying itself—that Love in the Time of Cholera will pulse in 350 years.

What’s not to love?

1Yes, Caltech has an English library. I found gems in it, and the librarians ordered more when I inquired about books they didn’t have. I commend it to everyone who has access.

2I googled “Only God knows how much I loved you” and was startled to see the line depicted as a hallmark of romance. Please tell your romantic partners how much you love them; don’t make them guess till the ends of their lives.

3Lee Smolin has proposed that the laws of physics change. If they do, the change seems to have to obey metalaws that remain constant.

If the (quantum-metrology) key fits…

My maternal grandfather gave me an antique key when I was in middle school. I loved the workmanship: The handle consisted of intertwined loops. I loved the key’s gold color and how the key weighed on my palm. Even more, I loved the thought that the key opened something. I accompanied my mother to antique shops, where I tried unlocking chests, boxes, and drawers.


My grandfather’s antique key

I found myself holding another such key, metaphorically, during the autumn of 2018. MIT’s string theorists had requested a seminar, so I presented about quasiprobabilities. Quasiprobabilities represent quantum states similarly to how probabilities represent a swarm of classical particles. Consider the steam rising from asphalt on a summer day. Calculating every steam particle’s position and momentum would require too much computation for you or me to perform. But we can predict the probability that, if we measure every particle’s position and momentum, we’ll obtain such-and-such outcomes. Probabilities are real numbers between zero and one. Quasiprobabilities can assume negative and nonreal values. We call these values “nonclassical,” because they’re verboten to the probabilities that describe classical systems, such as steam. I’d defined a quasiprobability, with collaborators, to describe quantum chaos. 
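To see how a quasiprobability can misbehave, here is a minimal sketch of a standard example, the Kirkwood-Dirac distribution of a qubit state over two measurement bases. (The quasiprobability in the chaos work is an extended Kirkwood-Dirac distribution; the state and bases below are arbitrary choices of mine, picked so that the misbehavior shows up.)

import numpy as np

# Two measurement bases: eigenstates of Pauli Z (labeled a) and of Pauli X (labeled b).
z_states = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
x_states = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]

# A pure qubit state that is an eigenstate of neither basis.
psi = np.array([np.cos(np.pi / 8), -np.sin(np.pi / 8)])
rho = np.outer(psi, psi)

for a, za in enumerate(z_states):
    for b, xb in enumerate(x_states):
        Pa, Pb = np.outer(za, za), np.outer(xb, xb)
        Q = np.trace(Pb @ Pa @ rho)       # Kirkwood-Dirac quasiprobability Q(a, b)
        print(f"Q(a={a}, b={b}) = {Q:+.3f}")

The four entries sum to one, as respectable probabilities would, yet one of them dips below zero; with complex states and bases, the entries can even turn nonreal.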


David Arvidsson-Shukur was sitting in the audience. David is a postdoctoral fellow at the University of Cambridge and a visiting scholar in the other Cambridge (at MIT). He has a Swedish-and-southern-English accent that I’ve heard only once before and, I learned over the next two years, an academic intensity matched by his kindliness.1 Also, David has a name even longer than mine: David Roland Miran Arvidsson-Shukur. We didn’t know then, but we were destined to journey together, as postdoctoral knights-errant, on a quest for quantum truth.

David studies the foundations of quantum theory: What distinguishes quantum theory from classical? David suspected that a variation on my quasiprobability could unlock a problem in metrology, the study of measurements.


Suppose that you’ve built a quantum computer. It consists of gates—uses of, e.g., magnets or lasers to implement logical operations. A classical gate implements operations such as “add 11.” A quantum gate can implement an operation that involves some number \theta more general than 11. You can try to build your gate correctly, but it might effect the wrong \theta value. You need to measure \theta.

How? You prepare some quantum state | \psi \rangle and operate on it with the gate. \theta imprints itself on the state, which becomes | \psi (\theta) \rangle. Measure some observable \hat{O}. You repeat this protocol in each of many trials. The measurement yields different outcomes in different trials, according to quantum theory. The average amount of information that you learn about \theta per trial is called the Fisher information.
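As a sanity check, one can compute the Fisher information for the simplest example I know (not the postselected protocol of our paper, just standard single-qubit phase estimation): prepare \lvert + \rangle, imprint \theta with the gate e^{-i\theta Z/2}, and measure the Pauli X observable.

import numpy as np

def outcome_probs(theta):
    """Probabilities of the two X-measurement outcomes on exp(-i theta Z / 2)|+>."""
    plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)
    minus = np.array([1.0, -1.0], dtype=complex) / np.sqrt(2)
    psi = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)]) @ plus
    return np.array([abs(plus.conj() @ psi) ** 2, abs(minus.conj() @ psi) ** 2])

def fisher_information(theta, d=1e-5):
    """Classical Fisher information, the sum over outcomes of (dp/dtheta)^2 / p, via finite differences."""
    p = outcome_probs(theta)
    dp = (outcome_probs(theta + d) - outcome_probs(theta - d)) / (2 * d)
    return float(np.sum(dp ** 2 / p))

print(fisher_information(0.7))    # about 1.0 for this gate and measurement, whatever theta is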


Let’s change this protocol. After operating with the gate, measure another observable, \hat{F}, and postselect: If the \hat{F} measurement yields a desirable outcome f, measure \hat{O}. If the \hat{F}-measurement doesn’t yield the desirable outcome, abort the trial, and begin again. If you choose \hat{F} and f adroitly, you’ll measure \hat{O} only when the trial will provide oodles of information about \theta. You’ll save yourself many \hat{O} measurements that would have benefited you little.2


Why does postselection help us? We could understand easily if the system were classical: The postselection would effectively improve the input state. To illustrate, let’s suppose that (i) a magnetic field implemented the gate and (ii) the input were metal or rubber. The magnetic field wouldn’t affect the rubber; measuring \hat{O} in rubber trials would provide no information about the field. So you could spare yourself \hat{O} measurements by postselecting on the system’s consisting of metal.

Magnet

Postselection on a quantum system can defy this explanation. Consider optimizing your input state | \psi \rangle, beginning each trial with the quantum equivalent of metal. Postselection could still increase the average amount of information provided about \theta per trial. Postselection can enhance quantum metrology even when postselection can’t enhance the classical analogue.

David suspected that he could prove this result, using, as a mathematical tool, the quasiprobability that collaborators and I had defined. We fulfilled his prediction, with Hugo Lepage, Aleks Lasek, Seth Lloyd, and Crispin Barnes. Nature Communications published our paper last month. The work bridges the foundations of quantum theory with quantum metrology and quantum information theory—and, through that quasiprobability, string theory. David’s and my quantum quest continues, so keep an eye out for more theory from us, as well as a photonic experiment based on our first paper.


I still have my grandfather’s antique key. I never found a drawer, chest, or box that it opened. But I don’t mind. I have other mysteries to help unlock.

 

1The morning after my wedding this June, my husband and I found a bouquet ordered by David on our doorstep.

2Postselection has a catch: The \hat{F} measurement has a tiny probability of yielding the desirable outcome. But, sometimes, measuring \hat{O} costs more than preparing | \psi \rangle, performing the gate, and postselecting. For example, suppose that the system is a photon. A photodetector will measure \hat{O}. Some photodetectors have a dead time: After firing, they take a while to reset, to be able to fire again. The dead time can outweigh the cost of the beginning of the experiment.

A quantum walk down memory lane

In elementary and middle school, I felt an affinity for the class three years above mine. Five of my peers had siblings in that year. I carpooled with a student in that class, which partnered with mine in holiday activities and Grandparents’ Day revues. Two students in that class stood out. They won academic-achievement awards, represented our school in science fairs and speech competitions, and enrolled in rigorous high-school programs.

Those students came to mind as I grew to know David Limmer. David is an assistant professor of chemistry at the University of California, Berkeley. He studies statistical mechanics far from equilibrium, using information theory. Though a theorist ardent about mathematics, he partners with experimentalists. He can pass as a physicist and keeps an eye on topics as far afield as black holes. According to his faculty page, I discovered while writing this article, he’s even three years older than I. 

I met David in the final year of my PhD. I was looking ahead to postdocking, as his postdoc fellowship was fading into memory. The more we talked, the more I thought, I’d like to be like him.

Playground

I had the good fortune to collaborate with David on a paper published by Physical Review A this spring (as an Editors’ Suggestion!). The project has featured in Quantum Frontiers as the inspiration for a rewriting of “I’m a little teapot.” 

We studied a molecule prevalent across nature and technologies. Such molecules feature in your eyes, solar-fuel-storage devices, and more. The molecule has two clumps of atoms. One clump may rotate relative to the other if the molecule absorbs light. The rotation switches the molecule from a “closed” configuration to an “open” configuration.

Molecular switch

These molecular switches are small, quantum, and far from equilibrium; so modeling them is difficult. Making assumptions offers traction, but many of the assumptions disagreed with David. He wanted general, thermodynamic-style bounds on the probability that one of these molecular switches would switch. Then, he ran into me.

I traffic in mathematical models, developed in quantum information theory, called resource theories. We use resource theories to calculate which states can transform into which in thermodynamics, as a dime can transform into ten pennies at a bank. David and I modeled his molecule in a resource theory, then bounded the molecule’s probability of switching from “closed” to “open.” I accidentally composed a theme song for the molecule; you can sing along with this post.

That post didn’t mention what David and I discovered about quantum clocks. But what better backdrop for a mental trip to elementary school or to three years into the future?

I’ve blogged about autonomous quantum clocks (and ancient Assyria) before. Autonomous quantum clocks differ from quantum clocks of another type—the most precise clocks in the world. Scientists operate the latter clocks with lasers; autonomous quantum clocks need no operators. Autonomy benefits you if you want a machine, such as a computer or a drone, to operate independently. An autonomous clock in the machine ensures that, say, the computer applies the right logical gate at the right time.

What’s an autonomous quantum clock? First, what’s a clock? A clock has a degree of freedom (e.g., a pair of hands) that represents the time and that moves steadily. When the clock’s hands point to 12 PM, you’re preparing lunch; when the clock’s hands point to 6 PM, you’re reading Quantum Frontiers. An autonomous quantum clock has a degree of freedom that represents the time fairly accurately and moves fairly steadily. (The quantum uncertainty principle prevents a perfect quantum clock from existing.)

Suppose that the autonomous quantum clock constitutes one part of a machine, such as a quantum computer, that the clock guides. When the clock is in one quantum state, the rest of the machine undergoes one operation, such as one quantum logical gate. (Experts: The rest of the machine evolves under one Hamiltonian.) When the clock is in another state, the rest of the machine undergoes another operation (evolves under another Hamiltonian).
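Here is a toy of my own (not the construction in our paper) showing concretely what it means for the clock state to select the machine's evolution, for one clock qubit and one machine qubit:

import numpy as np
from scipy.linalg import expm

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1.0, 0.0]).astype(complex)         # projector onto the clock reading "12 PM"
P1 = np.diag([0.0, 1.0]).astype(complex)         # projector onto the clock reading "6 PM"

H_A, H_B = X, Z                                  # two different operations on the machine qubit
H = np.kron(P0, H_A) + np.kron(P1, H_B)          # the clock state selects which one acts

def evolve(clock_state, machine_state, t=1.0):
    psi = np.kron(clock_state, machine_state)
    return expm(-1j * H * t) @ psi

zero = np.array([1, 0], dtype=complex)
one = np.array([0, 1], dtype=complex)
print(np.round(evolve(zero, zero), 3))   # clock at "12 PM": the machine qubit rotates under H_A = X
print(np.round(evolve(one, zero), 3))    # clock at "6 PM": the machine qubit evolves under H_B = Z (just a phase here)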

Clock 2

Physicists have been modeling quantum clocks using the resource theory with which David and I modeled our molecule. The math with which we represented our molecule, I realized, coincided with the math that represents an autonomous quantum clock.

Think of the molecular switch as a machine that operates (mostly) independently and that contains an autonomous quantum clock. The rotating clump of atoms constitutes the clock hand. As a hand rotates down a clock face, so do the nuclei rotate downward. The hand effectively points to 12 PM when the switch occupies its “closed” position. The hand effectively points to 6 PM when the switch occupies its “open” position.

The nuclei account for most of the molecule’s weight; electrons account for little. The electrons flit about the landscape shaped by the atomic clumps’ positions. The landscape governs the electrons’ behavior. So the electrons form the rest of the quantum machine controlled by the nuclear clock.

Clock 1

Experimentalists can create and manipulate these molecular switches easily. For instance, experimentalists can set the atomic clump moving—can “wind up” the clock—with ultrafast lasers. In contrast, the only other autonomous quantum clocks that I’d read about live in theory land. Can these molecules bridge theory to experiment? Reach out if you have ideas!

And check out David’s theory lab on Berkeley’s website and on Twitter. We all need older siblings to look up to.

Up we go! or From abstract theory to experimental proposal

Mr. Mole is trapped indoors, alone. Spring is awakening outside, but he’s confined to his burrow. Birds are twittering, and rabbits are chattering, but he has only himself for company.

Sound familiar? 

Spring—crocuses, daffodils, and hyacinths budding; leaves unfurling; and birds warbling—burst upon Cambridge, Massachusetts last month. The city’s shutdown vied with the season’s vivaciousness. I relieved the tension by rereading The Wind in the Willows, which I’ve read every spring since 2017. 

Project Gutenberg offers free access to Kenneth Grahame’s 1908 novel. He wrote the book for children, but never mind that. Many masterpieces of literature happen to have been written for children.

Book cover

One line in the novel demanded, last year, that I memorize it. On page one, Mole is cleaning his house beneath the Earth’s surface. He’s been dusting and whitewashing for hours when the spring calls to him. Life is pulsating on the ground and in the air above him, and he can’t resist joining the party. Mole throws down his cleaning supplies and tunnels upward through the soil: “he scraped and scratched and scrabbled and scrooged, and then he scrooged again and scrabbled and scratched and scraped.”

The quotation appealed to me not only because of its alliteration and chiasmus. Mole’s journey reminded me of research. 

Take a paper that I published last month with Michael Beverland of Microsoft Research and Amir Kalev of the Joint Center for Quantum Information and Computer Science (now of the Information Sciences Institute at the University of Southern California). We translated a discovery from the abstract, mathematical language of quantum-information-theoretic thermodynamics into an experimental proposal. We had to scrabble, but we kept on scrooging.

Mole 1

Over four years ago, other collaborators and I uncovered a thermodynamics problem, as did two other groups at the same time. Thermodynamicists often consider small systems that interact with large environments, like a magnolia flower releasing its perfume into the air. The two systems—magnolia flower and air—exchange things, such as energy and scent particles. The total amount of energy in the flower and the air remains constant, as does the total number of perfume particles. So we call the energy and the perfume-particle number conserved quantities. 

We represent quantum conserved quantities with matrices Q_1 and Q_2. We nearly always assume that, in this thermodynamic problem, those matrices commute with each other: Q_1 Q_2 = Q_2 Q_1. Almost no one mentions this assumption; we make it without realizing. Eliminating this assumption invalidates a derivation of the state reached by the small system after a long time. But why assume that the matrices commute? Noncommutation typifies quantum physics and underlies quantum error correction and quantum cryptography.

What if the little system exchanges with the large system thermodynamic quantities represented by matrices that don’t commute with each other?

Magnolia

Colleagues and I began answering this question, four years ago. The small system, we argued, thermalizes to near a quantum state that contains noncommuting matrices. We termed that state, e^{ - \sum_\alpha \beta_\alpha Q_\alpha } / Z, the non-Abelian thermal state. The Q_\alpha’s represent conserved quantities, and the \beta_\alpha’s resemble temperatures. The real number Z ensures that, if you measure any property of the state, you’ll obtain some outcome. Our arguments relied on abstract mathematics, resource theories, and more quantum information theory.
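For concreteness, here is what the non-Abelian thermal state looks like for a single qubit whose conserved quantities are the three spin components. The \beta_\alpha values below are arbitrary choices of mine, for illustration only.

import numpy as np
from scipy.linalg import expm

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

betas = [0.3, 0.7, 1.1]                          # one temperature-like parameter per conserved quantity
G = sum(b * Q for b, Q in zip(betas, (Sx, Sy, Sz)))
rho = expm(-G)
rho = rho / np.trace(rho)                        # dividing by Z normalizes the state

print(np.round(rho, 3))                          # the single-qubit non-Abelian thermal state
print(np.allclose(Sx @ Sy, Sy @ Sx))             # False: the charges really don't commute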

Over the past four years, noncommuting conserved quantities have propagated across quantum-information-theoretic thermodynamics.1 Watching the idea take root has been exhilarating, but the quantum information theory didn’t satisfy me. I wanted to see a real physical system thermalize to near the non-Abelian thermal state.

Michael and Amir joined the mission to propose an experiment. We kept nosing toward a solution, then dislodging a rock that would shower dirt on us and block our path. But we scrabbled onward.

Toad

Imagine a line of ions trapped by lasers. Each ion contains the physical manifestation of a qubit—a quantum two-level system, the basic unit of quantum information. You can think of a qubit as having a quantum analogue of angular momentum, called spin. The spin has three components, one per direction of space. These spin components are represented by matrices Q_x = S_x, Q_y = S_y, and Q_z = S_z that don’t commute with each other. 

A couple of qubits can form the small system, analogous to the magnolia flower. The rest of the qubits form the large system, analogous to the air. I constructed a Hamiltonian—a matrix that dictates how the qubits evolve—that transfers quanta of all the spin’s components between the small system and the large. (Experts: The Heisenberg Hamiltonian transfers quanta of all the spin components between two qubits while conserving S_{x, y, z}^{\rm tot}.)
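Readers who want to check the parenthetical claim can do so numerically. The sketch below (a small verification of mine, not the paper's full setup) builds the nearest-neighbor Heisenberg Hamiltonian on a short chain and confirms that it commutes with each component of the total spin, even though those components refuse to commute with one another.

import numpy as np
from functools import reduce

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def embed(op, site, n):
    """Place a single-qubit operator on one site of an n-qubit chain."""
    factors = [I2] * n
    factors[site] = op
    return reduce(np.kron, factors)

n = 4
H = sum(embed(S, i, n) @ embed(S, i + 1, n)
        for i in range(n - 1) for S in (Sx, Sy, Sz))     # nearest-neighbor Heisenberg coupling

for S in (Sx, Sy, Sz):
    S_tot = sum(embed(S, i, n) for i in range(n))        # total spin component
    print(np.allclose(H @ S_tot, S_tot @ H))             # True, True, True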

The Hamiltonian led to our first scrape: I constructed an integrable Hamiltonian, by accident. Integrable Hamiltonians can’t thermalize systems. A system thermalizes by losing information about its initial conditions, evolving to a state with an exponential form, such as e^{ - \sum_\alpha \beta_\alpha Q_\alpha } / Z. We clawed at the dirt and uncovered a solution: My Hamiltonian coupled together nearest-neighbor qubits. If the Hamiltonian coupled also next-nearest-neighbor qubits, or if the ions formed a 2D or 3D array, the Hamiltonian would be nonintegrable.

Oars

We had to scratch at every stage—while formulating the setup, preparation procedure, evolution, measurement, and prediction. But we managed; Physical Review E published our paper last month. We showed how a quantum system can evolve to the non-Abelian thermal state. Trapped ions, ultracold atoms, and quantum dots can realize our experimental proposal. We imported noncommuting conserved quantities in thermodynamics from quantum information theory to condensed matter and atomic, molecular, and optical physics.

As Grahame wrote, the Mole kept “working busily with his little paws and muttering to himself, ‘Up we go! Up we go!’ till at last, pop! his snout came out into the sunlight and he found himself rolling in the warm grass of a great meadow.”

Mole 2

1See our latest paper’s introduction for references. https://journals.aps.org/pre/abstract/10.1103/PhysRevE.101.042117

In the hour of darkness and peril and need

I recited the poem “Paul Revere’s Ride” to myself while walking across campus last week. 

A few hours earlier, I’d cancelled the seminar that I’d been slated to cohost two days later. In a few hours, I’d cancel the rest of the seminars in the series. Undergraduates would begin vacating their dorms within a day. Labs would shut down, and postdocs would receive instructions to work from home.

I memorized “Paul Revere’s Ride” after moving to Cambridge, following tradition: As a research assistant at Lancaster University in the UK, I memorized e. e. cummings’s “anyone lived in a pretty how town.” At Caltech, I memorized “Kubla Khan.” Another home called for another poem. “Paul Revere’s Ride” brooked no competition: Campus’s red bricks run into Boston, where Revere’s story began during the 1700s. 

Henry Wadsworth Longfellow, who lived a few blocks from Harvard, composed the poem. It centers on the British assault against the American colonies, at Lexington and Concord, on the eve of the Revolutionary War. A patriot learned of the British troops’ movements one night. He communicated the information to faraway colleagues by hanging lamps in a church’s belfry. His colleagues rode throughout the night, to “spread the alarm / through every Middlesex village and farm.” The riders included Paul Revere, a Boston silversmith.

The Boston-area bricks share their color with Harvard’s crest, crimson. So do the protrusions on the coronavirus’s surface in colored pictures. 


I couldn’t have designed a virus to suit Harvard’s website better.

The yard that I was crossing was about to “de-densify,” the red-brick buildings were about to empty, and my home was about to lock its doors. I’d watch regulations multiply, emails keep pace, and masks appear. Revere’s messenger friend, too, stood back and observed his home:

he climbed to the tower of the church,
Up the wooden stairs, with stealthy tread,
To the belfry-chamber overhead, [ . . . ]
By the trembling ladder, steep and tall,
To the highest window in the wall,
Where he paused to listen and look down
A moment on the roofs of the town,
And the moonlight flowing over all.

I commiserated also with Revere, waiting on tenterhooks for his message:

Meanwhile, impatient to mount and ride,
Booted and spurred, with a heavy stride,
On the opposite shore walked Paul Revere.
Now he patted his horse’s side,
Now gazed on the landscape far and near,
Then impetuous stamped the earth,
And turned and tightened his saddle-girth…

The lamps ended the wait, and Revere rode off. His mission carried a sense of urgency, yet led him to serenity that I hadn’t expected:

He has left the village and mounted the steep,
And beneath him, tranquil and broad and deep,
Is the Mystic, meeting the ocean tides…

The poem’s final stanza kicks. Its message carries as much relevance to the 21st century as Longfellow, writing about the 1700s during the 1800s, could have dreamed:

So through the night rode Paul Revere;
And so through the night went his cry of alarm
To every Middlesex village and farm,—
A cry of defiance, and not of fear,
A voice in the darkness, a knock at the door,
And a word that shall echo forevermore!
For, borne on the night-wind of the Past,
Through all our history, to the last,
In the hour of darkness and peril and need,
The people will waken and listen to hear
The hurrying hoof-beats of that steed,
And the midnight message of Paul Revere.

Reciting poetry clears my head. I can recite on autopilot, while processing other information or admiring my surroundings. But the poem usually wins my attention at last. The rhythm and rhyme sweep me along, narrowing my focus. Reciting “Paul Revere’s Ride” takes me 5-10 minutes. After finishing that morning, I repeated the poem, and began repeating it again, until arriving at my institute on the edge of Harvard’s campus.

Isolation can benefit theorists. Many of us need quiet to study, capture proofs, and disentangle ideas. Many of us need collaboration; but email, Skype, Google Hangouts, and Zoom connect us. Many of us share and gain ideas through travel; but I can forfeit a little car sickness, air turbulence, and waiting in lines. Many of us need results from experimentalist collaborators, but experimental results often take a long time to gather even in the absence of pandemics. Many of us are introverts who enjoy a little self-isolation.

 

April is National Poetry Month in the United States. I often celebrate by intertwining physics with poetry in my April blog post. Next month, though, I’ll have other news to report. Besides, as my walk demonstrated, we need poetry now.

Paul Revere found tranquility on the eve of a storm. Maybe, when the night clears and doors reopen, science born of the quiet will flood journals. Aren’t we fortunate, as physicists, to lead lives steeped in a kind of poetry?

The shape of MIP* = RE

There’s a famous parable about a group of blind men encountering an elephant for the very first time. The first blind man, who had his hand on the elephant’s side, said that it was like an enormous wall. The second blind man, wrapping his arms around the elephant’s leg, exclaimed that surely it was a gigantic tree trunk. The third, feeling the elephant’s tail, declared that it must be a thick rope. Vehement disagreement ensues, but after a while the blind men eventually come to realize that, while each person was partially correct, there is much more to the elephant than initially thought.

An illustration of the blind men and the elephant.

Last month, Zhengfeng, Anand, Thomas, John and I posted MIP* = RE to arXiv. The paper feels very much like the elephant of the fable — and not just because of the number of pages! To a computer scientist, the paper is ostensibly about the complexity of interactive proofs. To a quantum physicist, it is talking about mathematical models of quantum entanglement. To the mathematician, there is a claimed resolution to a long-standing problem in operator algebras. Like the blind men of the parable, each is feeling a small part of a new phenomenon. How do the wall, the tree trunk, and the rope all fit together?

I’ll try to trace the outline of the elephant: it starts with a mystery in quantum complexity theory, curves through the mathematical foundations of quantum mechanics, and arrives at a deep question about operator algebras.

The rope: The complexity of nonlocal games

In 2004, computer scientists Cleve, Hoyer, Toner, and Watrous were thinking about a funny thing called nonlocal games. A nonlocal game G involves three parties: two cooperating players named Alice and Bob, and someone called the verifier. The verifier samples a pair of random questions (x,y) and sends x to Alice (who responds with answer a), and y to Bob (who responds with answer b). The verifier then uses some function D(x,y,a,b) that tells her whether the players win, based on their questions and answers.

All three parties know the rules of the game before it starts, and Alice and Bob’s goal is to maximize their probability of winning the game. The players aren’t allowed to communicate with each other during the game, so it’s a nontrivial task for them to coordinate an optimal strategy (i.e., how they should individually respond to the verifier’s questions) before the game starts.

The most famous example of a nonlocal game is the CHSH game (which has made several appearances on this blog already): in this game, the verifier sends a uniformly random bit x to Alice (who responds with a bit a) and a uniformly random bit y to Bob (who responds with a bit b). The players win if a \oplus b = x \wedge y (in other words, the sum of their answer bits is equal to the product of the input bits modulo 2).

What is Alice’s and Bob’s maximum winning probability? Well, it depends on what type of strategy they use. If they use a strategy that can be modeled by classical physics, then their winning probability cannot exceed 75\% (we call this the classical value of CHSH). On the other hand, if they use a strategy based on quantum physics, Alice and Bob can do better by sharing two quantum bits (qubits) that are entangled. During the game each player measures their own qubit (where the measurement depends on their received question) to obtain answers that win the CHSH game with probability \cos^2(\pi/8) \approx .854\ldots (we call this the quantum value of CHSH). So even though the entangled qubits don’t allow Alice and Bob to communicate with each other, entanglement gives them a way to win with higher probability! In technical terms, their responses are more correlated than what is possible classically.
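If you want to see where \cos^2(\pi/8) comes from, here is a short numpy calculation (my own sketch, not taken from any of the papers discussed below) of the winning probability of the textbook optimal strategy: a shared entangled pair, with Alice measuring the Pauli Z or X observable and Bob measuring rotated combinations of the two.

```python
import numpy as np

# Shared entangled state |phi+> = (|00> + |11>)/sqrt(2)
phi = np.array([1, 0, 0, 1]) / np.sqrt(2)

Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])

# +/-1-valued observables: Alice uses Z or X, Bob uses rotated combinations
A = [Z, X]
B = [(Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)]

def projector(observable, outcome):
    """Projector onto the (-1)**outcome eigenspace of a +/-1-valued observable."""
    return (np.eye(2) + (-1) ** outcome * observable) / 2

win = 0.0
for x in (0, 1):
    for y in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                if a ^ b == x & y:  # CHSH winning condition
                    M = np.kron(projector(A[x], a), projector(B[y], b))
                    win += phi @ M @ phi
win /= 4  # the verifier picks (x, y) uniformly at random

print(win, np.cos(np.pi / 8) ** 2)  # both ~0.8536
```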

The CHSH game comes from physics, and was originally formulated not as a game involving Alice and Bob, but rather as an experiment involving two spatially separated devices to test whether stronger-than-classical correlations exist in nature. These experiments are known as Bell tests, named after John Bell. In 1964, he proved that correlations from quantum entanglement cannot be explained by any “local hidden variable theory” — in other words, by a classical theory of physics.[1] He then showed that a Bell test, like the CHSH game, gives a simple statistical check for the presence of nonlocal correlations between separated systems. Since the 1960s, numerous Bell tests have been conducted experimentally, and the verdict is clear: nature does not behave classically.

Cleve, Hoyer, Toner and Watrous noticed that nonlocal games/Bell tests can be viewed as a kind of multiprover interactive proof. In complexity theory, interactive proofs are protocols where some provers are trying to convince a verifier of a solution to a long, difficult computation, and the verifier is trying to efficiently determine if the solution is correct. In a Bell test, one can think of the provers as instead trying to convince the verifier of a physical statement: that they possess quantum entanglement.

With the computational lens trained firmly on nonlocal games, it then becomes natural to ask about their complexity. Specifically, what is the complexity of approximating the optimal winning probability in a given nonlocal game G? In complexity-speak, this is phrased as a question about characterizing the class MIP* (pronounced “M-I-P star”). This is also a well-motivated question for an experimentalist conducting Bell tests: at the very least, they’d want to determine (a) whether quantum players can do better than classical players, and (b) what the best possible quantum strategy can achieve.

Studying this question in the case of classical players led to some of the most important results in complexity theory, such as MIP = NEXP and the PCP Theorem. Indeed, the PCP Theorem says that it is NP-hard to approximate the classical value of a nonlocal game (i.e. the maximum winning probability of classical players) to within constant additive accuracy (say \pm \frac{1}{10}). Thus, assuming that P is not equal to NP, we shouldn’t expect a polynomial-time algorithm for this. However, it is easy to see that there is a “brute force” algorithm for this problem: by taking exponential time to enumerate over all possible deterministic player strategies, one can exactly compute the classical value of nonlocal games.
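For CHSH the brute-force computation is tiny: each player’s deterministic strategy is just a table of two answer bits, so there are only 16 strategy pairs to check. The sketch below (mine, not from the papers) confirms the classical value of 3/4; for a general game with Q questions and A answers per player, the same enumeration would take roughly A^{2Q} steps, which is where the exponential time comes from.

```python
from itertools import product

def classical_value_chsh():
    """Exact classical value of CHSH by enumerating deterministic strategies.

    A deterministic strategy for each player is a table of two answer bits:
    (answer to question 0, answer to question 1).
    """
    best = 0.0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        alice, bob = (a0, a1), (b0, b1)
        wins = sum((alice[x] ^ bob[y]) == (x & y)
                   for x in (0, 1) for y in (0, 1))
        best = max(best, wins / 4)  # questions are uniformly distributed
    return best

print(classical_value_chsh())  # 0.75
```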

When considering games with entangled players, however, it’s not even clear if there’s a similar “brute force” algorithm that solves this in any amount of time — forget polynomial time; even if we allow ourselves exponential, doubly exponential, or Ackermann-function amounts of time, we still don’t know how to solve this quantum value approximation problem. The problem is that there is no known upper bound on the amount of entanglement that is needed for players to play a nonlocal game optimally. For example, for a given game G, does an optimal quantum strategy require one qubit, ten qubits, or 10^{10^{10}} qubits of entanglement? Without any upper bound, a “brute force” algorithm wouldn’t know how big of a quantum strategy to search for — it would keep enumerating over bigger and bigger strategies in hopes of finding a better one.

Thus approximating the quantum value may not even be solvable in principle! But could it really be uncomputable? Perhaps we just haven’t found the right mathematical tool to give an upper bound on the dimension — maybe we just need to come up with some clever variant of, say, Johnson-Lindenstrauss or some other dimension reduction technique.[2]

In 2008, there was promising progress towards an algorithmic solution to this problem. Two papers [DLTW, NPA] (appearing on arXiv on the same day!) showed that an algorithm based on semidefinite programming can produce a sequence of numbers that converge to something called the commuting operator value of a nonlocal game.[3] If one could show that the commuting operator value and the quantum value of a nonlocal game coincide, then this would yield an algorithm for solving this approximation problem!
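The hierarchy in those papers is far more general than anything I can fit here, but the following sketch (assuming the cvxpy package is installed; the variable names are mine) shows the lowest level of the idea for CHSH: a 5 \times 5 “moment matrix” over the operators I, A_0, A_1, B_0, B_1, constrained only to be positive semidefinite with ones on the diagonal. Maximizing the CHSH expression over such matrices is a semidefinite program, and any commuting operator strategy gives rise to such a matrix, so the optimum is an upper bound on what those strategies can achieve — and for CHSH it already reproduces Tsirelson’s bound of \approx 0.854.

```python
import cvxpy as cp
import numpy as np

# Moment matrix indexed by {I, A0, A1, B0, B1}; entry (i, j) stands for
# <psi| O_i O_j |psi> for +/-1-valued observables O_i.
G = cp.Variable((5, 5), symmetric=True)
constraints = [G >> 0]                            # moment matrices are PSD
constraints += [G[i, i] == 1 for i in range(5)]   # each observable squares to I

# CHSH correlation expression <A0 B0> + <A0 B1> + <A1 B0> - <A1 B1>
bell = G[1, 3] + G[1, 4] + G[2, 3] - G[2, 4]
cp.Problem(cp.Maximize(bell), constraints).solve()

print(0.5 + bell.value / 8)     # ~0.8536: an upper bound on the winning probability
print(np.cos(np.pi / 8) ** 2)   # the quantum value of CHSH, for comparison
```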

Asking whether the commuting operator and quantum values are the same, however, immediately brings us to the precipice of some deep mysteries in mathematical physics and operator algebras, far removed from computer science and complexity theory. This takes us to the next part of the elephant.

The tree: Mathematical foundations of locality

The mystery about the quantum value versus the commuting operator value of nonlocal games has to do with two different ways of modeling Alice and Bob in quantum mechanics. As I mentioned earlier, quantum physics predicts that the maximum winning probability in, say, the CHSH game when Alice and Bob share entanglement is approximately 85%. As with any physical theory, these predictions are made using some mathematical framework — formal rules for modeling physical experiments like the CHSH game.

In a typical quantum information theory textbook, players in the CHSH game are usually modelled in the following way: Alice’s device is described by a state space \mathcal{H}_A (all the possible states the device could be in), a particular state |\psi_A\rangle from \mathcal{H}_A, and a set of measurement operators \mathcal{M}_A (operations that can be performed by the device). It’s not necessary to know what these things are formally; the important feature is that these three things are enough to make any prediction about Alice’s device — when treated in isolation, at least. Similarly, Bob’s device can be described using its own state space \mathcal{H}_B, state |\psi_B\rangle, and measurement operators \mathcal{M}_B.

In the CHSH game, though, one wants to make predictions about Alice’s and Bob’s devices together. Here the textbooks say that Alice and Bob are jointly described by the tensor product formalism, which is a natural mathematical way of “putting separate spaces together”. Their state space is denoted by \mathcal{H}_A \otimes \mathcal{H}_B. The joint state |\psi_{AB}\rangle describing the devices comes from this tensor product space. When Alice and Bob independently make their local measurements, this is described by a measurement operator from the tensor product of operators from \mathcal{M}_A and \mathcal{M}_B. The strange correlations of quantum mechanics arise when their joint state |\psi_{AB}\rangle is entangled, i.e. it cannot be written as a well-defined state on Alice’s side combined with a well-defined state on Bob’s side (even though the state space itself is two independent spaces combined together!).
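To see concretely what “cannot be written as a well-defined state on Alice’s side” means, here is a small numpy check (again my own illustration, not anything from the papers): trace out Bob’s qubit and look at what is left on Alice’s side. For a product state Alice is left in a pure state, whereas for the entangled state used in CHSH she is left in the maximally mixed state — no single well-defined state of her device alone reproduces the statistics.

```python
import numpy as np

def alices_reduced_state(psi):
    """Partial trace over Bob's qubit for a two-qubit state vector psi."""
    M = psi.reshape(2, 2)   # rows: Alice's index, columns: Bob's index
    return M @ M.conj().T   # rho_A = Tr_B |psi><psi|

product_state = np.kron([1.0, 0.0], [1.0, 0.0])        # |0>|0>, unentangled
entangled_state = np.array([1, 0, 0, 1]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)

for name, psi in [("product", product_state), ("entangled", entangled_state)]:
    rho_A = alices_reduced_state(psi)
    purity = np.trace(rho_A @ rho_A).real  # 1 for a pure state, 1/2 if maximally mixed
    print(name, purity)
```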

The tensor product model works well; it satisfies natural properties you’d want from the CHSH experiment, such as the constraint that Alice and Bob can’t instantaneously signal to each other. Furthermore, predictions made in this model match up very accurately with experimental results!

This is not the whole story, though. The tensor product formalism works very well in non-relativistic quantum mechanics, where things move slowly and energies are low. To describe more extreme physical scenarios — like when particles are being smashed together at near-light speeds in the Large Hadron Collider — physicists turn to the more powerful quantum field theory. However, the notion of spatiotemporal separation in relativistic settings gets especially tricky. In particular, when trying to describe quantum mechanical systems, it is no longer evident how to assign Alice and Bob their own independent state spaces, and thus it’s not clear how to put relativistic Alice and Bob in the tensor product framework!

In quantum field theory, locality is instead described using the commuting operator model. Instead of assigning Alice and Bob their own individual state spaces and then tensoring them together to get a combined space, the commuting operator model stipulates that there is just a single monolithic space \mathcal{H} for both Alice and Bob. Their joint state is described using a vector |\psi\rangle from \mathcal{H}, and Alice and Bob’s measurement operators both act on \mathcal{H}. The constraint that they can’t communicate is captured by the fact that Alice’s measurement operators commute with Bob’s operators. In other words, the order in which the players perform their measurements on the system does not matter: Alice measuring before Bob, or Bob measuring before Alice, both yield the same statistical outcomes. Locality is enforced through commutativity.

The commuting operator framework contains the tensor product framework as a special case[4], so it’s more general. Could the commuting operator model allow for correlations that can’t be captured by the tensor product model, even approximately?[5][6] This question is known as Tsirelson’s problem, named after the late mathematician Boris Tsirelson.
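Here is a quick numerical sanity check of that containment (my own illustration): operators of the form A \otimes I and I \otimes B act on different tensor factors, so they commute, which is why every tensor product strategy is automatically a commuting operator strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(d):
    """A random d x d Hermitian matrix, standing in for a measurement operator."""
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (M + M.conj().T) / 2

A = np.kron(random_hermitian(2), np.eye(3))  # acts only on Alice's factor
B = np.kron(np.eye(2), random_hermitian(3))  # acts only on Bob's factor

print(np.allclose(A @ B, B @ A))  # True: operators on disjoint factors commute
```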

There is a simple but useful way to phrase this question using nonlocal games. What we call the “quantum value” of a nonlocal game G (denoted by \omega^* (G)) really refers to the supremum of success probabilities over tensor product strategies for Alice and Bob. If they use strategies from the more general commuting operator model, then we call their maximum success probability the commuting operator value of G (denoted by \omega^{co}(G)). Since tensor product strategies are a special case of commuting operator strategies, we have the relation \omega^* (G) \leq \omega^{co}(G) for all nonlocal games G.

Could there be a nonlocal game G whose tensor product value is different from its commuting operator value? With tongue in cheek: is there a game G at which Alice and Bob could do better if they were using quantum entanglement at near-light speeds? It is difficult to find even a plausible candidate game for which the quantum and commuting operator values may differ. The CHSH game, for example, has the same quantum and commuting operator value; this was proved by Tsirelson.

If the tensor product and the commuting operator models are the same (i.e., the “positive” resolution of Tsirelson’s problem), then as I mentioned earlier, this has unexpected ramifications: there would be an algorithm for approximating the quantum value of nonlocal games.

How does this algorithm work? It comes in two parts: a procedure to search from below, and one to search from above. The “search from below” algorithm computes a sequence of numbers \alpha_1,\alpha_2,\alpha_3,\ldots where \alpha_d is (approximately) the best winning probability when Alice and Bob use a d-qubit tensor product strategy. For fixed d, the number \alpha_d can be computed by enumerating over (a discretization of) the space of all possible d-qubit strategies. This takes a doubly-exponential amount of time in d — but at least this is still a finite time! This naive “brute force” algorithm will slowly plod along, computing a sequence of better and better winning probabilities. We’re guaranteed that in the limit as d goes to infinity, the sequence \{ \alpha_d\} converges to the quantum value \omega^* (G). Of course the issue is that the “search from below” procedure never knows how close it is to the true quantum value.

This is where the “search from above” comes in. This is an algorithm that computes a different sequence of numbers \beta_1,\beta_2,\beta_3,\ldots where each \beta_d is an upper bound on the commuting operator value \omega^{co}(G), and, as d goes to infinity, \beta_d converges to \omega^{co}(G). Furthermore, each \beta_d can be computed by a technique known as semidefinite optimization; this was shown by the two papers I mentioned.

Let’s put the pieces together. If the quantum and commuting operator values of a game G coincide (i.e. \omega^* (G) = \omega^{co}(G)), then we can run the “search from below” and “search from above” procedures in parallel, interleaving the computation of the \{\alpha_d\} and \{ \beta_d\}. Since both are guaranteed to converge to the quantum value, at some point the upper bound \beta_d will come within some \epsilon of the lower bound \alpha_d, and thus we would have homed in on (an approximation of) \omega^* (G). There we have it: an algorithm to approximate the quantum value of games.
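In (hypothetical) pseudocode, the combined procedure looks like the sketch below; search_from_below and search_from_above are stand-ins for the two subroutines just described, and the loop is only guaranteed to terminate when \omega^* (G) = \omega^{co}(G).

```python
def approximate_game_value(G, epsilon, search_from_below, search_from_above):
    """Interleave the two searches (hypothetical subroutines passed as arguments).

    search_from_below(G, d) returns alpha_d <= omega*(G)    (best d-qubit strategy)
    search_from_above(G, d) returns beta_d  >= omega_co(G)  (level-d SDP bound)
    """
    d = 1
    while True:
        alpha_d = search_from_below(G, d)
        beta_d = search_from_above(G, d)
        if beta_d - alpha_d <= epsilon:
            # alpha_d <= omega*(G) <= omega_co(G) <= beta_d, so anything in
            # between is an epsilon-approximation of omega*(G)
            return (alpha_d + beta_d) / 2
        d += 1
```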

All that remains to do, surely, is to solve Tsirelson’s problem in the affirmative (that commuting operator correlations can be approximated by tensor product correlations), and then we could put this pesky question about the quantum value to rest. Right?

The wall: Connes’ embedding problem

At the end of the 1920s, polymath extraordinaire John von Neumann formulated the first rigorous mathematical framework for the recently developed quantum mechanics. This framework, now familiar to physicists and quantum information theorists everywhere, posits that quantum states are vectors in a Hilbert space, and measurements are linear operators acting on that space. It didn’t take long for von Neumann to realize that there was a much deeper theory of operators on Hilbert spaces waiting to be discovered. With Francis Murray, in the 1930s he started to develop a theory of “rings of operators” — today these are called von Neumann algebras.

The theory of operator algebras has since flourished into a rich and beautiful area of mathematics. It remains inseparable from mathematical physics, but has established deep connections with subjects such as knot theory and group theory. One of the most important goals in operator algebras has been to provide a classification of von Neumann algebras. In their series of papers on the subject, Murray and von Neumann first showed that classifying von Neumann algebras reduces to understanding their factors, the atoms out of which all von Neumann algebras are built. Then, they showed that factors of von Neumann algebras come in one of three species: type I, type II, and type III. Type I factors were completely classified by Murray and von Neumann, and they made much progress on characterizing certain type II factors. However, progress stalled until the 1970s, when Alain Connes provided a classification of type III factors (work for which he would later receive the Fields Medal). In the same 1976 classification paper, Connes makes a casual remark about something called type II_1 factors[7]:

We now construct an embedding of N into \mathcal{R}. Apparently such an embedding ought to exist for all II_1 factors.

This line, written in almost a throwaway manner, eventually came to be called “Connes’ embedding problem”: does every separable II_1 factor embed into an ultrapower of the hyperfinite II_1 factor? It seems that Connes surmised that it does (and thus it is also called “Connes’ embedding conjecture”). Since 1976, this problem has grown into a central question of operator algebras, with numerous equivalent formulations and consequences across mathematics.

In 2010, two papers (again appearing on the arXiv on the same day!) showed that the reach of Connes’ embedding conjecture extends back to the foundations of quantum mechanics. If Connes’ embedding problem has a positive answer (i.e. such an embedding always exists), then Tsirelson’s problem (i.e. whether commuting operator correlations can be approximated by tensor product correlations) also has a positive answer! Later it was shown by Ozawa that Connes’ embedding problem is in fact equivalent to Tsirelson’s problem.

Remember that our approach to compute the value of nonlocal games hinged on obtaining a positive answer to Tsirelson’s problem. The sequence of papers [NPA, DLTW, Fritz, JNPPSW] together show that resolving — one way or another — whether this search-from-below, search-from-above algorithm works would essentially settle Connes’ embedding conjecture. What started as a funny question at the periphery of computer science and quantum information theory has morphed into an attack on one of the central problems in operator algebras.

MIP* = RE

We’ve now ended up back where we started: the complexity of nonlocal games. Let’s take a step back and try to make sense of the elephant.

Even to a complexity theorist, “MIP* = RE” may appear esoteric. The complexity classes MIP* and RE refer to a bewildering grab bag of concepts: there’s Alice, Bob, Turing machines, verifiers, interactive proofs, quantum entanglement. What is the meaning of the equality of these two classes?

First, it says that the Halting problem has an interactive proof involving quantum entangled provers. In the Halting problem, you want to decide whether a Turing machine M, if you started running it, would eventually terminate with a well-defined answer, or if it would get stuck in an infinite loop. Alan Turing showed that this problem is undecidable: there is no algorithm that can solve this problem in general. Loosely speaking, the best thing you can do is to just flick on the power switch to M, and wait to see if it eventually stops. If M gets stuck in an infinite loop — well, you’re going to be waiting forever.

MIP* = RE shows that, with the help of all-powerful Alice and Bob, a time-limited verifier can run an interactive proof to “shortcut” the waiting. Given the Turing machine M’s description (its “source code”), the verifier can efficiently compute a description of a nonlocal game G_M whose behavior reflects that of M. If M does eventually halt (which could happen after a million years), then there is a strategy for Alice and Bob that causes the verifier to accept with probability 1. In other words, \omega^* (G_M) = 1. If M gets stuck in an infinite loop, then no matter what strategy Alice and Bob use, the verifier rejects with high probability, so \omega^* (G_M) is close to 0.

By playing this nonlocal game, the verifier can obtain statistical evidence that M is a Turing machine that eventually terminates. If the verifier plays G_M and the provers win, then the verifier should believe that it is likely that M halts. If they lose, then the verifier concludes there isn’t enough evidence that M halts.[8] The verifier never actually runs M in this game; she has offloaded the task to Alice and Bob, who we can assume are computational gods capable of performing million-year-long computations instantly. For them, the challenge is instead to convince the verifier that if she were to wait millions of years, she would witness the termination of M. Incredibly, the amount of work put in by the verifier in the interactive proof is independent of the time it takes for M to halt!

The fact that the Halting problem has an interactive proof seems borderline absurd: if the Halting problem is unsolvable, why should we expect it to be verifiable? Although complexity theory has taught us that there can be a large gap between the complexity of verification and the complexity of search, it has always been a difference of efficiency: if solutions to a problem can be efficiently verified, then solutions can also be found (albeit at drastically higher computational cost). MIP* = RE shows that, with quantum entanglement, there can be a chasm of computability between verifying solutions and finding them.

Now let’s turn to the non-complexity consequences of MIP* = RE. The fact that we can encode the Halting problem into nonlocal games also immediately tells us that there is no algorithm whatsoever to approximate the quantum value. Suppose there were an algorithm that could approximate \omega^* (G). Then, using the transformation from Turing machines to nonlocal games mentioned above, we could use this algorithm to solve the Halting problem, which is impossible.
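Spelled out as (hypothetical) pseudocode, the contradiction looks like this; game_from_turing_machine stands for the computable transformation M \mapsto G_M from MIP* = RE, and approximate_value for the algorithm we just supposed exists, say accurate to within \pm 1/4:

```python
def halting_solver(M, game_from_turing_machine, approximate_value):
    """A 'solution' to the Halting problem -- impossible, by Turing's theorem.

    Both arguments are hypothetical stand-ins: the transformation from
    MIP* = RE, and an assumed algorithm approximating omega*(G) to within 1/4.
    """
    G_M = game_from_turing_machine(M)
    estimate = approximate_value(G_M)
    # If M halts, omega*(G_M) = 1, so estimate >= 3/4.
    # If M runs forever, omega*(G_M) is close to 0, so estimate falls below 3/4.
    return estimate >= 3/4
```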

Now the dominoes start to fall. This means that, in particular, the proposed “search-from-below”/“search-from-above” algorithm cannot succeed in approximating \omega^* (G). There must be a game G, then, for which the quantum value is different from the commuting operator value — otherwise the two searches would always converge to the same number and the algorithm would work. But this implies Tsirelson’s problem has a negative answer, and therefore Connes’ embedding conjecture is false.

We’ve only sketched the barest of outlines of this elephant, and yet it is quite challenging to hold it in the mind’s eye all at once.[9] This story is intertwined with some of the most fundamental developments in the past century: modern quantum mechanics, operator algebras, and computability theory were birthed in the 1930s. Einstein, Podolsky and Rosen wrote their landmark paper questioning the nature of quantum entanglement in 1935, and John Bell discovered his famous test and inequality in 1964. Connes formulated his conjecture in the ’70s, Tsirelson made his contributions to the foundations of quantum mechanics in the ’80s, and about the same time computer scientists were inventing the theory of interactive proofs and probabilistically checkable proofs (PCPs).

We haven’t said anything about the proof of MIP* = RE yet (this may be the subject of future blog posts), but it is undeniably a product of complexity theory. The language of interactive proofs and Turing machines is not just convenient but necessary: at its heart MIP* = RE is the classical PCP Theorem, with the help of quantum entanglement, recursed to infinity.

What is going on in this proof? What parts of it are fundamental, and which parts are unnecessary? What is the core of it that relates to Connes’ embedding conjecture? Are there other consequences of this uncomputability result? These are questions to be explored in the coming days and months, and the answers we find will be fascinating.

Acknowledgments. Thanks to William Slofstra and Thomas Vidick for helpful feedback on this post.


  1. This is why quantum correlations are called “nonlocal”, and why we call the CHSH game a “nonlocal game”: it is a test for nonlocal behavior. 
  2. A reasonable hope would be that, for every nonlocal game G, there is a generic upper bound on the number of qubits needed to approximate the optimal quantum strategy (e.g., a game G with Q possible questions and A possible answers would require at most, say, 2^{O(Q \cdot A)} qubits to play optimally). 
  3. In those papers, they called it the field theoretic value. 
  4. The space \mathcal{H} can be broken down into the tensor product \mathcal{H}_A \otimes \mathcal{H}_B, and Alice’s measurements only act on the \mathcal{H}_A space and Bob’s measurements only act on the \mathcal{H}_B space. In this case, Alice’s measurements clearly commute with Bob’s. 
  5. In a breakthrough work in 2017, Slofstra showed that the tensor product framework is not exactly the same as the commuting operator framework; he showed that there is a nonlocal game G where players using commuting operator strategies can win with probability 1, but when they use a tensor-product strategy they can only win with probability strictly less than 1. However, the perfect commuting operator strategy can be approximated by tensor-product strategies arbitrarily well, so the quantum value and the commuting operator value of G are the same. 
  6. The commuting operator model is motivated by attempts to develop a rigorous mathematical framework for quantum field theory from first principles (see, for example, algebraic quantum field theory (AQFT)). In the “vanilla” version of AQFT, tensor product decompositions between causally independent systems do not exist a priori, but mathematical physicists often consider AQFTs augmented with an additional “split property”, which does imply tensor product decompositions. Thus in such AQFTs, Tsirelson’s problem has an affirmative answer. 
  7. Type II_1 is pronounced “type two one”. 
  8. This is not the same as evidence that M loops forever! 
  9. At least, speaking for myself.