Every spring, a portal opens between Waltham, Massachusetts and another universe.
The other universe has a Watch City dual to Waltham, known for its watch factories. The cities throw a festival to which explorers, inventors, and tourists flock. Top hats, goggles, leather vests, bustles, and lace-up boots dot the crowds. You can find pet octopodes, human-machine hybrids, and devices for bending space and time. Steam powers everything.
Watch City Steampunk Festival
So I learned thanks to Maxim Olshanyi, a professor of physics at the University of Massachusetts Boston. He hosted my colloquium, “Quantumsteampunk: Quantum information meets thermodynamics,” earlier this month. Maxim, I discovered, has more steampunk experience than I. He digs up century-old designs for radios, builds the radios, and improves upon the designs. He exhibits his creations at the Watch City Steampunk Festival.
I never would have guessed that Maxim moonlights with steampunkers. But his hobby makes sense: Maxim has transformed our understanding of quantum integrability.
Integrability is to thermalization as Watch City is to Waltham. A bowl of baked beans thermalizes when taken outside in Boston in October: Heat dissipates into the air. After half-an-hour, large-scale properties bear little imprint of their initial conditions: The beans could have begun at 112ºF or 99º or 120º. Either way, the beans have cooled.
Integrable systems avoid thermalizing; more of their late-time properties reflect early times. Why? We can understand through an example, an integrable system whose particles don’t interact with each other (whose particles are noninteracting fermions). The dynamics conserve the particles’ momenta. Consider growing the system by adding particles. The number of conserved quantities grows as the system size. The conserved quantities retain memories of the initial conditions.
Imagine preparing an integrable system, analogously to preparing a bowl of baked beans, and letting it sit for a long time. Will the system equilibrate, or settle down to, a state predictable with a simple rule? We might expect not. Obeying the same simple rule would cause different integrable systems to come to resemble each other. Integrable systems seem unlikely to homogenize, since each system retains much information about its initial conditions.
Maxim and collaborators exploded this expectation. Integrable systems do relax to simple equilibrium states, which the physicists called the generalized Gibbs ensemble (GGE). Josiah Willard Gibbs cofounded statistical mechanics during the 1800s. He predicted the state to which nonintegrable systems, like baked beans in autumnal Boston, equilibrate. Gibbs’s theory governs classical systems, like baked beans, as does the GGE theory. But also quantum systems equilibrate to the GGE, and Gibbs’s conclusions translate into quantum theory with few adjustments. So I’ll explain in quantum terms.
Consider quantum baked beans that exchange heat with a temperature-$T$ environment. Let $H$ denote the system’s Hamiltonian, which basically represents the beans’ energy. The beans equilibrate to a quantum Gibbs state, $e^{-H / (k_{\rm B} T)} / Z$. The $k_{\rm B}$ denotes Boltzmann’s constant, a fundamental constant of nature. The partition function $Z$ enables the quantum state to obey probability theory (normalizes the state).
Maxim and friends modeled their generalized Gibbs ensemble on the Gibbs state. Let $Q_a$ denote a quantum integrable system’s $a^{\rm th}$ conserved quantity. This system equilibrates to $e^{-\sum_a \mu_a Q_a} / Z_{\rm GGE}$. The $Z_{\rm GGE}$ normalizes the state. The intensive parameters $\mu_a$ serve analogously to temperature and depend on the conserved quantities’ values. Maxim and friends predicted this state using information theory formalized by Ed Jaynes. Inventing the GGE, they unlocked a slew of predictions about integrable quantum systems.
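To see how the GGE's extra parameters can store memories of initial conditions, here is a minimal sketch for the noninteracting-fermion example above. It assumes that the conserved quantities are the occupation numbers of the individual modes, and it sets Boltzmann's constant to 1. The function names, mode energies, and initial occupations are illustrative choices, not taken from Maxim's papers.

```python
import math

def gge_multipliers(occupations):
    """Multiplier mu_k for each fermionic mode, chosen so that the GGE
    exp(-sum_k mu_k n_k) / Z reproduces the conserved occupation <n_k>."""
    return [math.log((1.0 - n) / n) for n in occupations]

def gge_occupations(multipliers):
    """Occupations predicted by the GGE: <n_k> = 1 / (exp(mu_k) + 1)."""
    return [1.0 / (math.exp(mu) + 1.0) for mu in multipliers]

def gibbs_occupations(energies, beta):
    """Occupations predicted by an ordinary Gibbs state with a single
    inverse temperature beta (chemical potential set to zero)."""
    return [1.0 / (math.exp(beta * e) + 1.0) for e in energies]

# Hypothetical mode energies and initial occupations (illustrative numbers).
energies = [0.5, 1.0, 1.5]
initial = [0.9, 0.2, 0.6]

mus = gge_multipliers(initial)
print("GGE multipliers:", mus)
print("GGE occupations:", gge_occupations(mus))
print("Gibbs occupations (beta = 1):", gibbs_occupations(energies, 1.0))
```

The GGE has one multiplier per conserved quantity, so it can reproduce every initial occupation. A single temperature yields occupations that fall monotonically with energy, so it cannot match the non-monotonic initial occupations chosen here.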
A radio built by Maxim. According to him, “The invention was to replace a diode with a diode bridge, in a crystal radio, thus gaining a factor of two in the output power.”
I define quantum steampunk as the intersection of quantum theory, especially quantum information theory, with thermodynamics, and the application of this intersection across science. Maxim has used information theory to cofound a branch of quantum statistical mechanics. Little wonder that he exhibits homemade radios at the Watch City Steampunk Festival. He also holds a license to drive steam engines and used to have my postdoc position. I appreciate having older cousins to look up to. Here’s hoping that I become half the quantum steampunker that I found by Massachusetts Bay.
With thanks to Maxim and the rest of the University of Massachusetts Boston Department of Physics for their hospitality.
The next Watch City Steampunk Festival takes place on May 9, 2020. Contact me if you’d attend a quantum-steampunk meetup!
I was working with Tony Bartolotta, a PhD student in theoretical physics at Caltech, and Jason Pollack, a postdoc in cosmology at the University of British Columbia. They acted as the souls of consideration. We missed out on dozens of opportunities to bicker—about the paper’s focus, who undertook which tasks, which journal to submit to, and more. Bickering would have spiced up the story behind our paper, because the paper concerns disagreement.
Quantum observables can disagree. Observables are measurable properties, such as position and momentum. Suppose that you’ve measured a quantum particle’s position and obtained an outcome $x$. If you measure the position immediately afterward, you’ll obtain $x$ again. Suppose that, instead of measuring the position again, you measure the momentum. All the possible outcomes have equal probabilities of occurring. You can’t predict the outcome.
The particle’s position can have a well-defined value, or the momentum can have a well-defined value, but the observables can’t have well-defined values simultaneously. Furthermore, if you measure the position, you randomize the outcome of a momentum measurement. Position and momentum disagree.
How should we quantify the disagreement of two quantum observables, $\hat{A}$ and $\hat{B}$? The question splits physicists into two camps. Pure quantum information (QI) theorists use uncertainty relations, whereas condensed-matter and high-energy physicists prefer out-of-time-ordered correlators. Let’s meet the camps in turn.
Heisenberg intuited an uncertainty relation that Robertson formalized during the 1920s,
Imagine preparing a quantum state $| \psi \rangle$ and measuring $\hat{A}$, then repeating this protocol in many trials. Each trial has some probability $p_a$ of yielding the outcome $a$. Different trials will yield different $a$’s. We quantify the spread in $a$ values with the standard deviation $\Delta \hat{A} = \sqrt{ \langle \psi | \hat{A}^2 | \psi \rangle - \langle \psi | \hat{A} | \psi \rangle^2 }$. We define $\Delta \hat{B}$ analogously. $\hbar$ denotes Planck’s constant, a number that characterizes our universe as the electron’s mass does.
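$[\hat{A}, \hat{B}] := \hat{A} \hat{B} - \hat{B} \hat{A}$ denotes the observables’ commutator. The numbers that we use in daily life commute: $7 \times 5 = 5 \times 7$. Quantum numbers, or operators, represent $\hat{A}$ and $\hat{B}$. Operators don’t necessarily commute. The commutator represents how little $\hat{A}$ and $\hat{B}$ resemble 7 and 5.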
Robertson’s uncertainty relation means, “If you can predict an $\hat{A}$ measurement’s outcome precisely, you can’t predict a $\hat{B}$ measurement’s outcome precisely, and vice versa. The uncertainties must multiply to at least some number. The number depends on how much $\hat{A}$ fails to commute with $\hat{B}$.” The higher an uncertainty bound (the greater the inequality’s right-hand side), the more the operators disagree.
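A short numerical check can make the uncertainty bound concrete. The sketch below assumes the standard textbook form of Robertson's relation for dimensionless operators, $\Delta \hat{A} \, \Delta \hat{B} \geq \frac{1}{2} | \langle [\hat{A}, \hat{B}] \rangle |$, and takes the Pauli operators X and Y of a single qubit as the disagreeing observables. The helper names and the chosen states are illustrative.

```python
import cmath
import math

# Pauli matrices as nested lists of complex numbers.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(op, psi):
    """<psi|op|psi> for a normalized two-component state vector."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def std_dev(op, psi):
    """Standard deviation of the outcomes of measuring op in state psi."""
    mean = expectation(op, psi).real
    mean_sq = expectation(matmul(op, op), psi).real
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def robertson_sides(a, b, psi):
    """Return (product of standard deviations, uncertainty bound)."""
    lhs = std_dev(a, psi) * std_dev(b, psi)
    rhs = 0.5 * abs(expectation(commutator(a, b), psi))
    return lhs, rhs

def qubit_state(theta, phi):
    """Qubit state with Bloch-sphere angles theta and phi."""
    return [complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2)]

for theta, phi in [(0.0, 0.0), (1.0, 0.3), (math.pi / 2, 0.0)]:
    lhs, rhs = robertson_sides(X, Y, qubit_state(theta, phi))
    print(f"theta={theta:.2f}, phi={phi:.2f}: lhs={lhs:.4f}, bound={rhs:.4f}")
```

The left-hand side should never drop below the bound, up to floating-point rounding. The bound is largest for the first state, where the commutator's expectation value is largest.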
Heisenberg and Robertson explored operator disagreement during the 1920s. They wouldn’t have seen eye to eye with today’s QI theorists. For instance, QI theorists consider how we can apply quantum phenomena, such as operator disagreement, to information processing. Information processing includes cryptography. Quantum cryptography benefits from operator disagreement: An eavesdropper must observe, or measure, a message. The eavesdropper’s measurement of one observable can “disturb” a disagreeing observable. The message’s sender and intended recipient can detect the disturbance and so detect the eavesdropper.
How efficiently can one perform an information-processing task? The answer usually depends on an entropy $H$, a property of quantum states and of probability distributions. Uncertainty relations cry out for recasting in terms of entropies. So QI theorists have devised entropic uncertainty relations, such as
$$H (\hat{A}) + H (\hat{B}) \geq - \log c .$$
The entropy $H (\hat{A})$ quantifies the difficulty of predicting the outcome of an $\hat{A}$ measurement. $H (\hat{B})$ is defined analogously. $c$ is called the overlap. It quantifies your ability to predict what happens if you prepare your system with a well-defined $\hat{A}$ value, then measure $\hat{B}$. For further analysis, check out this paper. Entropic uncertainty relations have blossomed within QI theory over the past few years.
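The entropic relation can be checked numerically for one qubit. The sketch below assumes the Maassen-Uffink form, in which the two Shannon entropies sum to at least minus the logarithm of the overlap, with the overlap taken as the largest squared inner product between eigenvectors of the two observables. The observables are the Pauli Z and X operators; the basis names and test state are illustrative.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def measurement_probs(basis, psi):
    """Born-rule probabilities |<b|psi>|^2 for each vector b in the basis."""
    return [abs(sum(b[i].conjugate() * psi[i] for i in range(len(psi)))) ** 2
            for b in basis]

def overlap(basis_a, basis_b):
    """Overlap c: the largest |<a|b>|^2 over pairs of basis vectors."""
    return max(abs(sum(a[i].conjugate() * b[i] for i in range(len(a)))) ** 2
               for a in basis_a for b in basis_b)

s = 1 / math.sqrt(2)
Z_BASIS = [[1 + 0j, 0j], [0j, 1 + 0j]]            # eigenvectors of Pauli Z
X_BASIS = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # eigenvectors of Pauli X

def entropic_sides(psi):
    """Return (sum of the two entropies, entropic uncertainty bound)."""
    lhs = (shannon_entropy(measurement_probs(Z_BASIS, psi))
           + shannon_entropy(measurement_probs(X_BASIS, psi)))
    rhs = -math.log2(overlap(Z_BASIS, X_BASIS))
    return lhs, rhs

psi = [math.cos(0.4), math.sin(0.4)]   # an arbitrary normalized qubit state
print(entropic_sides(psi))
```

For these two bases the overlap is one half, so the bound is one bit: learning the Z outcome with certainty leaves the X outcome as unpredictable as a fair coin.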
Pure QI theorists, we’ve seen, quantify operator disagreement with entropic uncertainty relations. Physicists at the intersection of condensed matter and high-energy physics prefer out-of-time-ordered correlators (OTOCs). I’ve blogged about OTOCs so many times, Quantum Frontiers regulars will be able to guess the next two paragraphs.
Consider a quantum many-body system, such as a chain of qubits. Imagine poking one end of the system, such as by flipping the first qubit upside-down. Let the operator $\hat{W}$ represent the poke. Suppose that the system evolves chaotically for a time $t$ afterward, the qubits interacting. Information about the poke spreads through many-body entanglement, or scrambles.
Imagine measuring an observable $\hat{V}$ of a few qubits far from the $\hat{W}$ qubits. A little information about $\hat{W}$ migrates into the $\hat{V}$ qubits. But measuring $\hat{V}$ reveals almost nothing about $\hat{W}$, because most of the information about $\hat{W}$ has spread across the system. $\hat{V}$ disagrees with $\hat{W}$, in a sense. Actually, $\hat{V}$ disagrees with $\hat{W}(t)$. The $(t)$ represents the time evolution.
The OTOC’s smallness reflects how much $\hat{W}(t)$ disagrees with $\hat{V}$ at any instant $t$. At early times $t \gtrsim 0$, the operators agree, and the OTOC $\approx 1$. At late times, the operators disagree loads, and the OTOC $\approx 0$.
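A toy calculation can show how an OTOC starts at 1 and shrinks as the evolved poke comes to disagree with a distant observable. The sketch below assumes the common definition in which the OTOC is the expectation value of W(t) V W(t) V for Hermitian W and V, evaluated in the infinite-temperature (maximally mixed) state. The two-qubit Hamiltonian, the choice of Pauli X operators, and the function names are illustrative assumptions. Two qubits with this simple coupling do not scramble, so the OTOC oscillates instead of staying near 0.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def otoc(t, coupling=1.0):
    """Infinite-temperature OTOC tr[W(t) V W(t) V] / 4 for two qubits."""
    # H = coupling * (Z on qubit 1)(Z on qubit 2) is diagonal in the
    # basis |00>, |01>, |10>, |11>, with entries +1, -1, -1, +1.
    zz = [1, -1, -1, 1]
    u = [[cmath.exp(-1j * coupling * t * zz[i]) if i == j else 0
          for j in range(4)] for i in range(4)]
    w = kron(X, I2)   # the poke, acting on qubit 1
    v = kron(I2, X)   # the observable, acting on qubit 2
    w_t = matmul(dagger(u), matmul(w, u))   # Heisenberg-picture W(t)
    product = matmul(matmul(w_t, v), matmul(w_t, v))
    return sum(product[i][i] for i in range(4)).real / 4

for t in [0.0, 0.1, 0.2, 0.3]:
    print(f"t = {t:.1f}: OTOC = {otoc(t):.4f}")
```

For this toy model, one can check by hand that the result equals the cosine of 4 × coupling × t. A chaotic many-qubit chain would need a numerical matrix exponential and more qubits, and its OTOC would decay and stay small.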
Different camps of physicists, we’ve seen, quantify operator disagreement with different measures: Today’s pure QI theorists use entropic uncertainty relations. Condensed-matter and high-energy physicists use OTOCs. Trust physicists to disagree about what “quantum operator disagreement” means.
I want peace on Earth. I conjectured, in 2016 or so, that one could reconcile the two notions of quantum operator disagreement. One must be able to prove an entropic uncertainty relation for scrambling, wouldn’t you think?
You might try substituting $\hat{W}(t)$ for the $\hat{A}$ in the entropic uncertainty relation above, and $\hat{V}$ for the $\hat{B}$. You’d expect the uncertainty bound to tighten (the inequality’s right-hand side to grow) when the system scrambles. Scrambling, the condensed-matter and high-energy-physics notion of disagreement, would coincide with a high uncertainty bound, the pure-QI-theory notion of disagreement. The two notions of operator disagreement would agree. But the bound I’ve described doesn’t reflect scrambling. Nor do similar bounds that I tried constructing. I banged my head against the problem for about a year.
The sky brightened when Jason and Tony developed an interest in the conjecture. Their energy and conversation enabled us to prove an entropic uncertainty relation for scrambling, published this month.1 We tested the relation in computer simulations of a qubit chain. Our bound tightens when the system scrambles, as expected: The uncertainty relation reflects the same operator disagreement as the OTOC. We reconciled two notions of quantum operator disagreement.
As Quantum Frontiers regulars will anticipate, our uncertainty relation involves weak measurements and quasiprobability distributions: I’ve been studying their roles in scrambling over the past three years, with colleagues for whose collaborations I have the utmost gratitude. I’m grateful to have collaborated with Tony and Jason. Harmony helps when you’re tackling (quantum operator) disagreement, even if squabbling would spice up your paper’s backstory.
Yoram Alhassid asked the question at the end of my Yale Quantum Institute colloquium last February. I knew two facts about Yoram: (1) He belongs to Yale’s theoretical-physics faculty. (2) His PhD thesis’s title—“On the Information Theoretic Approach to Nuclear Reactions”—ranks among my three favorites.1
Over the past few months, I’ve grown to know Yoram better. He had reason to ask about quantum statistical mechanics, because his research stands up to its ears in the field. If forced to synopsize quantum statistical mechanics in five words, I’d say, “study of many-particle quantum systems.” Examples include gases of ultracold atoms. If given another five words, I’d add, “Calculate and use partition functions.” A partition function is a measure of the number of states, or configurations, accessible to the system. Calculate a system’s partition function, and you can calculate the system’s average energy, the average number of particles in the system, how the system responds to magnetic fields, etc.
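As a small illustration of "calculate and use partition functions," the sketch below computes the partition function of a hypothetical three-level system and extracts the average energy in two ways: directly from the Gibbs probabilities, and as minus the derivative of the partition function's logarithm with respect to inverse temperature. The energy levels, the units (Boltzmann's constant set to 1), and the function names are illustrative assumptions.

```python
import math

def partition_function(energies, beta):
    """Z = sum over levels of exp(-beta * E)."""
    return sum(math.exp(-beta * e) for e in energies)

def average_energy(energies, beta):
    """<E> computed directly from the Gibbs probabilities exp(-beta E) / Z."""
    z = partition_function(energies, beta)
    return sum(e * math.exp(-beta * e) for e in energies) / z

def average_energy_from_log_z(energies, beta, step=1e-6):
    """<E> = -d(ln Z)/d(beta), estimated with a central finite difference."""
    upper = math.log(partition_function(energies, beta + step))
    lower = math.log(partition_function(energies, beta - step))
    return -(upper - lower) / (2 * step)

levels = [0.0, 1.0, 2.0]   # hypothetical energy levels
beta = 0.7                 # hypothetical inverse temperature

print(average_energy(levels, beta))
print(average_energy_from_log_z(levels, beta))
```

The two numbers should agree to within the finite-difference error, which shows how the partition function alone encodes the average energy.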
My colloquium concerned quantum thermodynamics, which I’ve blogged about many times. So I should have been able to distinguish quantum thermodynamics from its neighbors. But the answer I gave Yoram didn’t satisfy me. I mulled over the exchange for a few weeks, then emailed Yoram a 502-word essay. The exercise grew my appreciation for the question and my understanding of my field.
An adaptation of the email appears below. The adaptation should suit readers who’ve majored in physics, but don’t worry if you haven’t. Bits of what distinguishes quantum thermodynamics from quantum statistical mechanics should come across to everyone—as should, I hope, the value of question-and-answer sessions:
One distinction is a return to the operational approach of 19th-century thermodynamics. Thermodynamicists such as Sadi Carnot wanted to know how effectively engines could operate. Their practical questions led to fundamental insights, such as the Carnot bound on an engine’s efficiency. Similarly, quantum thermodynamicists often ask, “How can this state serve as a resource in thermodynamic tasks?” This approach helps us identify what distinguishes quantum theory from classical mechanics.
Asking, “How can this state serve as a resource?” leads quantum thermodynamicists to design quantum engines, ratchets, batteries, etc. We analyze how these devices can outperform classical analogues, identifying which aspects of quantum theory power the outperformance. This question and these tasks contrast with the questions and tasks of many non-quantum-thermodynamicists who use statistical mechanics. They often calculate response functions and (e.g., ground-state) properties of Hamiltonians.
These goals of characterizing what nonclassicality is and what it can achieve in thermodynamic contexts resemble upshots of quantum computing and cryptography. As a 21st-century quantum information scientist, I understand what makes quantum theory quantum partially by understanding which problems quantum computers can solve efficiently and classical computers can’t. Similarly, I understand what makes quantum theory quantum partially by understanding how much more work you can extract from a singlet (a maximally entangled state of two qubits) than from a product state in which the reduced states have the same forms as in the singlet, $\mathbb{1} / 2$.
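The singlet comparison can be made quantitative with a short worked estimate. It rests on assumptions that the email does not spell out: the two qubits have a fully degenerate Hamiltonian, the agent has access to a heat bath at temperature $T$, and the extractable work equals the free-energy difference between the state and the maximally mixed (thermal) state, as holds in the ideal many-copy limit. Here $d = 4$ is the two-qubit dimension and $S$ is the von Neumann entropy in nats.

```latex
\begin{align}
  W_{\max}(\rho) &= k_{\rm B} T \left[ \ln d - S(\rho) \right],
  \qquad d = 4 , \\
  \text{singlet (pure, } S = 0 \text{):} \quad
  W_{\max} &= k_{\rm B} T \ln 4 = 2 \, k_{\rm B} T \ln 2 , \\
  \text{product state } \tfrac{\mathbb{1}}{2} \otimes \tfrac{\mathbb{1}}{2}
  \ (S = \ln 4) \text{:} \quad
  W_{\max} &= 0 .
\end{align}
```

Under these assumptions, each qubit looks equally random in both cases, yet only the entangled pair serves as a thermodynamic resource when the agent can act on both qubits together.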
As quantum thermodynamics shares its operational approach with quantum information theory, quantum thermodynamicists use mathematical tools developed in quantum information theory. An example consists of generalized entropies. Entropies quantify the optimal efficiency with which we can perform information-processing and thermodynamic tasks, such as data compression and work extraction.
Most statistical-mechanics researchers use just the Shannon and von Neumann entropies, $H$ and $S$, and perhaps the occasional relative entropy. These entropies quantify optimal efficiencies in large-system limits, e.g., as the number of messages compressed approaches infinity and in the thermodynamic limit.
Other entropic quantities have been defined and explored over the past two decades, in quantum and classical information theory. These entropies quantify the optimal efficiencies with which tasks can be performed (i) if the number of systems processed or the number of trials is arbitrary, (ii) if the systems processed share correlations, (iii) in the presence of “quantum side information” (if the system being used as a resource is entangled with another system, to which an agent has access), or (iv) if you can tolerate some probability $\varepsilon$ that you fail to accomplish your task. Instead of limiting ourselves to $H$ and $S$, we use also “$\varepsilon$-smoothed entropies,” Rényi divergences, hypothesis-testing entropies, conditional entropies, etc.
Another hallmark of quantum thermodynamics is results’ generality and simplicity. Thermodynamics characterizes a system with a few macroscopic observables, such as temperature, volume, and particle number. The simplicity of some quantum thermodynamics served a chemist collaborator and me, as explained in the introduction of https://arxiv.org/abs/1811.06551.
Yoram’s question reminded me of one reason why, as an undergrad, I adored studying physics in a liberal-arts college. I ate dinner and took walks with students majoring in economics, German studies, and Middle Eastern languages. They described their challenges, which I analyzed with the physics mindset that I was acquiring. We then compared our approaches. Encountering other disciplines’ perspectives helped me recognize what tools I was developing as a budding physicist. How can we know our corner of the world without stepping outside it and viewing it as part of a landscape?
1The title epitomizes clarity and simplicity. And I have trouble resisting anything advertised as “the information-theoretic approach to such-and-such.”
Merchandise spilled outside shops onto the streets, restaurateurs parked diners under trees, and ice-cream cones begged to be eaten on park benches. People thronged the streets, markets filled public squares, and the scents of flowers wafted from vendors’ stalls. I couldn’t blame the city. Its sunshine could have drawn Merlin out of his crystal cave. Insofar as a city lives, Barcelona epitomized a quotation by thermodynamicist Ilya Prigogine: “The main character of any living system is openness.”
Prigogine (1917–2003), who won the Nobel Prize for chemistry, had brought me to Barcelona. I was honored to receive, at the Joint European Thermodynamics Conference (JETC) there, the Ilya Prigogine Prize for a thermodynamics PhD thesis. The JETC convenes and awards the prize biennially; the last conference had taken place in Budapest. Barcelona suited the legacy of a thermodynamicist who illuminated open systems.
The conference center. Not bad, eh?
Ilya Prigogine began his life in Russia, grew up partially in Germany, settled in Brussels, and worked at American universities. His nobelprize.org biography reveals a mind open to many influences and disciplines: Before entering university, his “interest was more focused on history and archaeology, not to mention music, especially piano.” Yet Prigogine pursued chemistry.
He helped extend thermodynamics outside equilibrium. Thermodynamics is the study of energy, order, and time’s arrow in terms of large-scale properties, such as temperature, pressure, and volume. Many physicists think that thermodynamics describes only equilibrium. Equilibrium is a state of matter in which (1) large-scale properties remain mostly constant and (2) stuff (matter, energy, electric charge, etc.) doesn’t flow in any particular direction much. Apple pies reach equilibrium upon cooling on a countertop. When I’ve described my research as involving nonequilibrium thermodynamics, some colleagues have asked whether I’ve used an oxymoron. But “nonequilibrium thermodynamics” appears in Prigogine’s Nobel Lecture.
Another Nobel laureate, Lars Onsager, helped extend thermodynamics a little outside equilibrium. He imagined poking a system gently, as by putting a pie on a lukewarm stovetop or a magnet in a weak magnetic field. (Experts: Onsager studied the linear-response regime.) You can read about his work in my blog post “Long live Yale’s cemetery.” Systems poked slightly out of equilibrium tend to return to equilibrium: Equilibrium is stable. Systems flung far from equilibrium, as Prigogine showed, can behave differently.
A system can stay far from equilibrium by interacting with other systems. Imagine placing an apple pie atop a blistering stove. Heat will flow from the stove through the pie into the air. The pie will stay out of equilibrium due to interactions with what we call a “hot reservoir” (the stove) and a “cold reservoir” (the air). Systems (like pies) that interact with other systems (like stoves and air), we call “open.”
You and I are open: We inhale air, ingest food and drink, expel waste, and radiate heat. Matter and energy flow through us; we remain far from equilibrium. A bumper sticker in my high-school chemistry classroom encapsulated our status: “Old chemists don’t die. They come to equilibrium.” We remain far from equilibrium—alive—because our environment provides food and absorbs heat. If I’m an apple pie, the yogurt that I ate at breakfast serves as my stovetop, and the living room in which I breakfasted serves as the air above the stove. We live because of our interactions with our environments, because we’re open. Hence Prigogine’s claim, “The main character of any living system is openness.”
JETC 2019 fostered openness. The conference sessions spanned length scales and mass scales, from quantum thermodynamics to biophysics to gravitation. One could arrive as an expert in cell membranes and learn about astrophysics.
I remain grateful for the prize-selection committee’s openness. The topics of earlier winning theses include desalination, colloidal suspensions, and falling liquid films. If you tipped those topics into a tube, swirled them around, and capped the tube with a kaleidoscope glass, you might glimpse my thesis’s topic, quantum steampunk. Also, of the nine foregoing Prigogine Prize winners, only one had earned his PhD in the US. I’m grateful for the JETC’s consideration of something completely different.
When Prigogine said, “openness,” he referred to exchanges of energy and mass. Humans can exhibit openness also to ideas. The JETC honored Prigogine’s legacy in more ways than one. Here’s hoping I live up to their example.
You would hardly think that a quantum channel could have any sort of thermodynamic behavior. We were surprised, too.
How do the laws of thermodynamics apply in the quantum regime? Thanks to novel ideas introduced in the context of quantum information, scientists have been able to develop new ways to characterize the thermodynamic behavior of quantum states. If you’re a Quantum Frontiers regular, you have certainly read about these advances in Nicole’s captivating posts on the subject.
Asking the same question for quantum channels, however, turned out to be more challenging than expected. A quantum channel is a way of representing how an input state can change into an output state according to the laws of quantum mechanics. Let’s picture it as a box with an input state and an output state, like so:
A computing gate, the building block of quantum computers, is described by a quantum channel. Or, if Alice sends a photon to Bob over an optical fiber, then the whole process is represented by a quantum channel. Thus, by studying quantum channels directly we can derive statements that are valid regardless of the physical platform used to store and process the quantum information—ion traps, superconducting qubits, photonic qubits, NV centers, etc.
We asked the following question: If I’m given a quantum channel, can I transform it into another, different channel by using something like a miniature heat engine? If so, how much work do I need to spend in order to accomplish this task? The answer is tricky because of a few aspects in which quantum channels are more complicated than quantum states.
First things first, let’s worry about how to study the thermodynamic behavior of miniature systems.
Thermodynamics of small stuff
One of the important ideas that quantum information brought to thermodynamics is the idea of a resource theory. In a resource theory, we declare that there are certain kinds of states that are available for free, and that there are a set of operations that can be carried out for free. In a resource theory of thermodynamics, when we say “for free,” we mean “without expending any thermodynamic work.”
Here, the free states are those in thermal equilibrium at a fixed given temperature, and the free operations are those quantum operations that preserve energy and that introduce no noise into the system (we call those unitary operations). Faced with a task such as transforming one quantum state into another, we may ask whether or not it is possible to do so using the freely available operations. If that is not possible, we may then ask how much thermodynamic work we need to invest, in the form of additional energy at the input, in order to make the transformation possible.
Interestingly, the amount of work needed to go from one state ρ to another state σ might be unrelated to the work required to go back from σ to ρ. Indeed, the freely allowed operations can’t always be reversed; the reverse process usually requires a different sequence of operations, incurring an overhead. There is a mathematical framework to understand these transformations and this reversibility gap, in which generalized entropy measures play a central role. To avoid going down that road, let’s instead consider the macroscopic case in which we have a large number n of independent particles that are all in the same state ρ, a state which we denote by $\rho^{\otimes n}$. Then something magical happens: This macroscopic state can be reversibly converted to and from another macroscopic state $\sigma^{\otimes n}$, where all particles are in some other state σ. That is, the work invested in the transformation from $\rho^{\otimes n}$ to $\sigma^{\otimes n}$ can be entirely recovered by performing the reverse transformation:
If this rings a bell, that is because this is precisely the kind of thermodynamics that you will find in your favorite textbook. There is an optimal, reversible way of transforming any two thermodynamic states into each other, and the optimal work cost of the transformation is the difference of a corresponding quantity known as the thermodynamic potential. Here, the thermodynamic potential is a quantity known as the free energy $F(\rho)$. Therefore, the optimal work cost per copy w of transforming $\rho^{\otimes n}$ into $\sigma^{\otimes n}$ is given by the difference in free energy $w = F(\sigma) - F(\rho)$.
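The free-energy rule can be tried out on a single two-level system. The sketch below assumes the standard nonequilibrium free energy, namely the average energy minus the temperature times the entropy, and restricts attention to states that are diagonal in the energy basis so that plain probability lists suffice. The energy levels, temperature, states, and function names are illustrative assumptions.

```python
import math

def entropy(probs):
    """Von Neumann entropy (in nats) of a state diagonal in the energy basis."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_energy(probs, energies, kT):
    """F(rho) = <H> - kT * S(rho) for a state diagonal in the energy basis."""
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    return mean_energy - kT * entropy(probs)

def work_per_copy(rho, sigma, energies, kT):
    """Work cost per copy of converting rho into sigma in the many-copy limit."""
    return free_energy(sigma, energies, kT) - free_energy(rho, energies, kT)

energies, kT = [0.0, 1.0], 1.0          # hypothetical levels and temperature
z = sum(math.exp(-e / kT) for e in energies)
thermal = [math.exp(-e / kT) / z for e in energies]
excited = [0.0, 1.0]

forward = work_per_copy(thermal, excited, energies, kT)
backward = work_per_copy(excited, thermal, energies, kT)
print(forward, backward, forward + backward)
```

The forward cost and the backward cost should cancel, which is the reversibility described above: the work invested in one direction is recovered in the other.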
From quantum states to quantum channels
Can we repeat the same story for quantum channels? Suppose that we’re given a channel $\mathcal{E}$, which we picture as above as a box that transforms an input state into an output state. Using the freely available thermodynamic operations, can we “transform” $\mathcal{E}$ into another channel $\mathcal{F}$? That is, can we wrap this box with some kind of procedure that uses free thermodynamic operations to pre-process the input and post-process the output, such that the overall new process corresponds (approximately) to the quantum channel $\mathcal{F}$? We might picture the situation like this:
Let us first simplify the question by supposing we don’t have a channel $\mathcal{E}$ to start off with. How can we implement the channel $\mathcal{F}$ from scratch, using only free thermodynamic operations and some invested work? That simple question led to pages and pages of calculations, lots of coffee, a few sleepless nights, and then more coffee. After finally overcoming several technical obstacles, we found that in the macroscopic limit of many copies of the channel, the corresponding amount of work per copy is given by the maximum difference of free energy F between the input and output of the channel. We decided to call this quantity the thermodynamic capacity of the channel:
$$T(\mathcal{F}) = \max_{\sigma} \left[ F \bigl( \mathcal{F}(\sigma) \bigr) - F(\sigma) \right] .$$
Intuitively, an implementation of $\mathcal{F}$ must be prepared to expend an amount of work corresponding to the worst possible transformation of an input state to its corresponding output state. It’s kind of obvious in retrospect. However, what is nontrivial is that one can find a single implementation that works for all input states.
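The "worst possible input" idea can be illustrated with a crude numerical sketch. It takes the thermodynamic capacity to be the maximum, over input states, of the free energy of the output minus the free energy of the input, as described above. To keep things simple it makes strong assumptions that go beyond the post: the channel is classical (a stochastic matrix acting on a two-level system), only diagonal input states are searched, the free energy is the average energy minus temperature times entropy, and the maximum is found by a grid search. All names and numbers are illustrative.

```python
import math

def entropy(probs):
    """Entropy in nats; zero-probability outcomes contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_energy(probs, energies, kT):
    """F = <H> - kT * S for a state diagonal in the energy basis."""
    return sum(p * e for p, e in zip(probs, energies)) - kT * entropy(probs)

def apply_channel(transition, probs):
    """Classical channel: transition[j][i] = Pr(output j | input i)."""
    return [sum(transition[j][i] * probs[i] for i in range(len(probs)))
            for j in range(len(transition))]

def thermodynamic_capacity(transition, energies, kT, grid=1000):
    """Grid-search estimate of max over inputs of F(output) - F(input)."""
    best = -math.inf
    for step in range(grid + 1):
        p = step / grid
        sigma = [p, 1.0 - p]
        gain = (free_energy(apply_channel(transition, sigma), energies, kT)
                - free_energy(sigma, energies, kT))
        best = max(best, gain)
    return best

RESET = [[1.0, 1.0], [0.0, 0.0]]      # every input is sent to level 0
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]   # the input is left untouched

energies, kT = [0.0, 0.0], 1.0        # degenerate levels, kT as the unit
print(thermodynamic_capacity(RESET, energies, kT))
print(thermodynamic_capacity(IDENTITY, energies, kT))
```

In this toy setting the reset channel's worst-case input is the maximally mixed state, and the estimate comes out as kT ln 2, which matches Landauer's bound for erasing one bit. The identity channel costs nothing.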
It turned out that this quantity had already been studied before. An earlier paper by Navascués and García-Pintos had shown that it was exactly this quantity that characterized the amount of work per copy that could be extracted by “consuming” many copies of a process provided as black boxes.
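To our surprise, we realized that Navascués and García-Pintos’s result implied that the transformation of $\mathcal{E}^{\otimes n}$ into $\mathcal{F}^{\otimes n}$ is reversible. There is a simple procedure to convert $\mathcal{E}^{\otimes n}$ into $\mathcal{F}^{\otimes n}$ at a cost per copy that equals $T(\mathcal{F}) - T(\mathcal{E})$. The procedure consists in first extracting $T(\mathcal{E})$ work per copy of the first set of channels, and then preparing $\mathcal{F}^{\otimes n}$ from scratch at a work cost of $T(\mathcal{F})$ per copy: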
Clearly, the reverse transformation yields back all the work invested in the forward transformation, making the transformation reversible. That’s because we could have started with $\mathcal{F}$’s and finished with $\mathcal{E}$’s instead of the opposite, and the associated work cost per copy would be $T(\mathcal{E}) - T(\mathcal{F})$. Thus the transformation is, indeed, reversible:
In turn, this implies that in the many-copy regime, quantum channels have a macroscopic thermodynamic behavior. That is, there is a thermodynamic potential—the thermodynamic capacity—that quantifies the minimal work required to transform one macroscopic set of channels into another.
Prospects for the thermodynamic capacity
Resource theories that are reversible are pretty rare. Reversibility is a coveted property because a reversible resource theory is one in which we can easily understand exactly which transformations are possible. Other than the thermodynamic resource theory of states mentioned above, most instances of a resource theory—especially resource theories of channels—typically produce the kind of overheads in the conversion cost that spoil reversibility. So it’s rather exciting when you do find a new reversible resource theory of channels.
Quantum information theorists, especially those working on the theory of quantum communication, care a lot about characterizing the capacity of a channel. This is the maximal amount of information that can be transmitted through a channel. Even though in our case we’re talking about a different kind of capacity—one where we transmit thermodynamic energy and entropy, rather than quantum bits of messages—there are some close parallels between the two settings from which both fields of quantum communication and quantum thermodynamics can profit. Our result draws deep inspiration from the so-called quantum reverse Shannon theorem, an important result in quantum communication that tells us how two parties can communicate using one kind of a channel if they have access to another kind of a channel. On the other hand, the thermodynamic capacity at zero energy is a quantity that was already studied in quantum communication, but it was not clear what that quantity represented concretely. This quantity gained even more importance as it was identified as the entropy of a channel. Now, we see that this quantity has a thermodynamic interpretation. Also, the thermodynamic capacity has a simple definition, it is relatively easy to compute and it is additive—all desirable properties that other measures of capacity of a quantum channel do not necessarily share.
We still have a few rough edges that I hope we can resolve sooner or later. In fact, there is an important caveat that I have avoided mentioning so far—our argument only holds for special kinds of channels, those that do the same thing regardless of when they are applied in time. (Those channels are called time-covariant.) A lot of channels that we’re used to studying have this property, but we think it should be possible to prove a version of our result for any general quantum channel. In fact, we do have another argument that works for all quantum channels, but it uses a slightly different thermodynamic framework which might not be physically well-grounded.
That’s all very nice, I can hear you think, but is this useful for any quantum computing applications? The truth is, we’re still pretty far from founding a new quantum start-up. The levels of heat dissipation in quantum logic elements are still orders of magnitude away from the fundamental limits that we study in the thermodynamic resource theory.
Rather, our result teaches us about the interplay of quantum channels and thermodynamic concepts. We not only have gained useful insight on the structure of quantum channels, but also developed new tools for how to analyze them. These will be useful to study more involved resource theories of channels. And still, in the future when quantum technologies will perhaps approach the thermodynamically reversible limit, it might be good to know how to implement a given quantum channel in such a way that good accuracy is guaranteed for any possible quantum input state, and without any inherent overhead due to the fact that we don’t know what the input state is.
Thermodynamics, a theory developed to study gases and steam engines, has turned out to be relevant from the most obvious to the most unexpected of situations—chemical reactions, electromagnetism, solid state physics, black holes, you name it. Trust the laws of thermodynamics to surprise you again by applying to a setting you’d never imagined them to, like quantum channels.
Some research topics, says conventional wisdom, a physics PhD student shouldn’t touch with an iron-tipped medieval lance: sinkholes in the foundations of quantum theory. Problems so hard, you’d have a snowball’s chance of achieving progress. Problems so obscure, you’d have a snowball’s chance of convincing anyone to care about progress. Whether quantum physics could influence cognition much.
Quantum physics influences cognition insofar as (i) quantum physics prevents atoms from imploding and (ii) implosion inhibits atoms from contributing to cognition. But most physicists believe that useful entanglement can’t survive in brains. Entanglement consists of correlations shareable by quantum systems and stronger than any achievable by classical systems. Useful entanglement dies quickly in hot, wet, random environments.
Brains form such environments. Imagine injecting entangled molecules A and B into someone’s brain. Water, ions, and other particles would bombard the molecules. The higher the temperature, the heavier the bombardment. The bombardiers would entangle with the molecules via electric and magnetic fields. Each molecule can share only so much entanglement. The more A entangled with the environment, the less A could remain entangled with B. A would come to share a tiny amount of entanglement with each of many particles. Such tiny amounts couldn’t accomplish much. So quantum physics seems unlikely to affect cognition significantly.
Do not touch.
Yet my PhD advisor, John Preskill, encouraged me to consider whether the possibility interested me.
Try some completely different research, he said. Take a risk. If it doesn’t pan out, fine. People don’t expect much of grad students, anyway. Have you seen Matthew Fisher’s paper about quantum cognition?
Matthew Fisher is a theoretical physicist at the University of California, Santa Barbara. He has plaudits out the wazoo, many for his work on superconductors. A few years ago, Matthew developed an interest in biochemistry. He knew that most physicists doubt whether quantum physics could affect cognition much. But suppose that it could, he thought. How could it? Matthew reverse-engineered a mechanism, in a paper published by Annals of Physics in 2015.
A PhD student shouldn’t touch such research with a ten-foot radio antenna, says conventional wisdom. But I trust John Preskill in a way in which I trust no one else on Earth.
I’ll look at the paper, I said.
Matthew proposed that quantum physics could influence cognition as follows. Experimentalists have performed quantum computation using one hot, wet, random system: that of nuclear magnetic resonance (NMR). NMR is the process that underlies magnetic resonance imaging (MRI), a technique used to image people’s brains. A common NMR system consists of high-temperature liquid molecules. The molecules consist of atoms whose nuclei have quantum properties called spin. The nuclear spins encode quantum information (QI).
Nuclear spins, Matthew reasoned, might store QI in our brains. He catalogued the threats that could damage the QI. Hydrogen ions, he concluded, would threaten the QI most. They could entangle with (decohere) the spins via dipole-dipole interactions.
How can a spin avoid the threats? First, by having a quantum number $s = 1/2$. Such a quantum number zeroes out the nuclei’s electric quadrupole moments. Electric-quadrupole interactions can’t decohere such spins. Which biologically prevalent atoms have $s = 1/2$ nuclear spins? Phosphorus and hydrogen. Hydrogen suffers from other vulnerabilities, so phosphorus nuclear spins store QI in Matthew’s story. The spins serve as qubits, or quantum bits.
How can a phosphorus spin avoid entangling with other spins via magnetic dipole-dipole interactions? Such interactions depend on the spins’ orientations relative to their positions. Suppose that the phosphorus occupies a small molecule that tumbles in biofluids. The nucleus’s position changes randomly. The interaction can average out over tumbles.
The molecule contains atoms other than phosphorus. Those atoms have nuclei whose spins can interact with the phosphorus spins, unless every threatening spin has a quantum number $s = 0$. Which biologically prevalent atoms have $s = 0$ nuclear spins? Oxygen and calcium. The phosphorus should therefore occupy a molecule with oxygen and calcium.
Matthew designed this molecule to block decoherence. Then, he found the molecule in the scientific literature. The structure, $\mathrm{Ca}_9(\mathrm{PO}_4)_6$, is called a Posner cluster or a Posner molecule. I’ll call it a Posner, for short. Posners appear to exist in simulated biofluids, fluids created to mimic the fluids in us. Posners are believed to exist in us and might participate in bone formation. According to Matthew’s estimates, Posners might protect phosphorus nuclear spins for up to 1-10 days.
How can Posners influence cognition? Matthew proposed the following story.
Adenosine triphosphate (ATP) is a molecule that fuels biochemical reactions. “Triphosphate” means “containing three phosphate ions.” Phosphate ($\mathrm{PO}_4^{3-}$) consists of one phosphorus atom and four oxygen atoms. Two of an ATP molecule’s phosphates can break off while remaining joined to each other.
The phosphate pair can drift until encountering an enzyme called pyrophosphatase. The enzyme can break the pair into independent phosphates. Matthew, with Leo Radzihovsky, conjectured that, as the pair breaks, the phosphorus nuclear spins are projected onto a singlet. This state, represented by $\frac{1}{\sqrt{2}} \left( |{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle \right)$, is maximally entangled.
Imagine many entangled phosphates in a biofluid. Six phosphates can join nine calcium ions to form a Posner molecule. The Posner can share up to six singlets with other Posners. Clouds of entangled Posners can form.
One clump of Posners can enter one neuron while another clump enters another neuron. The protein VGLUT, or BNPI, sits in cell membranes and has the potential to ferry Posners in. The neurons will share entanglement. Imagine two Posners, P and Q, approaching each other in a neuron N. Quantum-chemistry calculations suggest that the Posners can bind together. Suppose that P shares entanglement with a Posner P’ in a neuron N’, while Q shares entanglement with a Posner Q’ in N’. The entanglement, with the binding of P to Q, can raise the probability that P’ binds to Q’.
Bound-together Posners will move slowly, having to push much water out of the way. Hydrogen and magnesium ions can latch onto the slow molecules easily. The Posners’ negatively charged phosphates will attract the $\mathrm{H}^+$ and $\mathrm{Mg}^{2+}$ as the phosphates attract the Posner’s $\mathrm{Ca}^{2+}$. The hydrogen and magnesium can dislodge the calcium, breaking apart the Posners. Calcium will flood neurons N and N’. Calcium floods a neuron’s axon terminal (the end of the neuron) when an electrical signal reaches the axon. The flood induces the neuron to release neurotransmitters. Neurotransmitters are chemicals that travel to the next neuron, inducing it to fire. So entanglement between phosphorus nuclear spins in Posner molecules might stimulate coordinated neuron firing.
Does Matthew’s story play out in the body? We can’t know till running experiments and analyzing the results. Experiments have begun: Last year, the Heising-Simons Foundation granted Matthew and collaborators $1.2 million to test the proposal.
Suppose that Matthew conjectures correctly, John challenged me, or correctly enough. Posner molecules store QI. Quantum systems can process information in ways in which classical systems, like laptops, can’t. How adroitly can Posners process QI?
I threw away my iron-tipped medieval lance in year five of my PhD. I left Caltech for a five-month fellowship, bent on returning with a paper with which to answer John. I did, and Annals of Physics published the paper this month.
I had the fortune to interest Elizabeth Crosson in the project. Elizabeth, now an assistant professor at the University of New Mexico, was working as a postdoc in John’s group. Both of us are theorists who specialize in QI theory. But our backgrounds, skills, and specialties differ. We complemented each other while sharing a doggedness that kept us emailing, GChatting, and Google-hangout-ing at all hours.
Elizabeth and I translated Matthew’s biochemistry into the mathematical language of QI theory. We dissected Matthew’s narrative into a sequence of biochemical steps. We ascertained how each step would transform the QI encoded in the phosphorus nuclei. Each transformation, we represented with a piece of math and with a circuit-diagram element. (Circuit-diagram elements are pictures strung together to form circuits that run algorithms.) The set of transformations, we called Posner operations.
Imagine that you can perform Posner operations, by preparing molecules, trying to bind them together, etc. What QI-processing tasks can you perform? Elizabeth and I found applications to quantum communication, quantum error detection, and quantum computation. Our results rest on the assumption—possibly inaccurate—that Matthew conjectures correctly. Furthermore, we characterized what Posners could achieve if controlled. Randomness, rather than control, would direct Posners in biofluids. But what can happen in principle offers a starting point.
First, QI can be teleported from one Posner to another, while suffering noise.1 This noisy teleportation doubles as superdense coding: A trit is a random variable that assumes one of three possible values. A bit is a random variable that assumes one of two possible values. You can teleport a trit from one Posner to another effectively, while transmitting a bit directly, with help from entanglement.
Second, Matthew argued that Posners’ structures protect QI. Scientists have developed quantum error-correcting and -detecting codes to protect QI. Can Posners implement such codes, in our model? Yes: Elizabeth and I (with help from erstwhile Caltech postdoc Fernando Pastawski) developed a quantum error-detection code accessible to Posners. One Posner encodes a logical qutrit, the quantum version of a trit. The code detects any error that slams any of the Posner’s six qubits.
Third, how complicated an entangled state can Posner operations prepare? A powerful one, we found: Suppose that you can measure this state locally, such that earlier measurements’ outcomes affect which measurements you perform later. You can perform any quantum computation. That is, Posner operations can prepare a state that fuels universal measurement-based quantum computation.
Finally, Elizabeth and I quantified effects of entanglement on the rate at which Posners bind together. Imagine preparing two Posners, P and P’, that share entanglement only with other particles. If the Posners approach each other with the right orientation, they have a 33.6% chance of binding, in our model. Now, suppose that every qubit in P is maximally entangled with a qubit in P’. The binding probability can rise to 100%.
Elizabeth and I recast as a quantum circuit a biochemical process discussed in Matthew Fisher’s 2015 paper.
I feared that other scientists would pooh-pooh our work as crazy. To my surprise, enthusiasm flooded in. Colleagues cheered the risk on a challenge in an emerging field that perks up our ears. Besides, Elizabeth’s and my work is far from crazy. We don’t assert that quantum physics affects cognition. We imagine that Matthew conjectures correctly, acknowledging that he might not, and explore his proposal’s implications. Being neither biochemists nor experimentalists, we restrict our claims to QI theory.
Maybe Posners can’t protect coherence for long enough. Would inaccuracy of Matthew’s proposal beach our whale of research? No. Posners prompted us to propose ideas and questions within QI theory. For instance, our quantum circuits illustrate interactions (unitary gates, to experts) interspersed with measurements implemented by the binding of Posners. The circuits partially motivated a subfield that emerged last summer and is picking up speed: Consider interspersing random unitary gates with measurements. The unitaries tend to entangle qubits, whereas the measurements disentangle. Which influence wins? Does the system undergo a phase transition from “mostly entangled” to “mostly unentangled” at some measurement frequency? Researchers from Santa Barbara to Colorado; MIT; Oxford; Lancaster, UK; Berkeley; Stanford; and Princeton have taken up the challenge.
A physics PhD student, conventional wisdom says, shouldn’t touch quantum cognition with a Swiss guard’s halberd. I’m glad I reached out: I learned much, contributed to science, and had an adventure. Besides, if anyone disapproves of daring, I can blame John Preskill.
Annals of Physics published “Quantum information in the Posner model of quantum cognition” here. You can find the arXiv version here and can watch a talk about our paper here.
1Experts: The noise arises because, if two Posners bind, they effectively undergo a measurement. This measurement transforms a subspace of the two-Posner Hilbert space as a coarse-grained Bell measurement. A Bell measurement yields one of four possible outcomes, or two bits. Discarding one of the bits amounts to coarse-graining the outcome. Quantum teleportation involves a Bell measurement. Coarse-graining the measurement introduces noise into the teleportation.
In 2013, I was attending a workshop on noise, information and complexity at the Ettore Majorana Center in beautiful Erice, Sicily, a medieval town sitting on top of a steep hill overlooking the western part of the island. The town, a network of tiny, winding streets lined mostly with medieval buildings, was foggy most days. The Center I was visiting, apart from its awe-inspiring location, is said to have played an important role in fostering relationships between scientists of the West and the East during the Cold War. As a proof of its openness to hosting even the most unexpected of visitors, the Center proudly displays a picture of Pope John Paul II seated behind a version of Dirac’s equation missing an all-important $i$, the unit of imaginary numbers.
One afternoon, the hosts of the workshop drove us down to Palermo for sightseeing. We toured a number of churches, whose layered styles and decorations reflected the different cultures that flourished on the island over the centuries. The last stop on our tour was the Martorana Church, an Italo-Albanian church of the 12th century, where to this day Mass is held in ancient Greek (yes, it is a complicated history). And while everybody had their noses up in the air, admiring the golden mosaics on the ceilings and the late baroque decorations, I was mesmerized by what lay underneath my feet. I am not talking about some forgotten crypt or creepy burial vault: I was looking at triangles – colorful, 12th century triangles.
What I was looking at was a 12th century version of a fractal figure which is known today as the Sierpinski triangle, a geometric pattern named after Wacław Sierpiński, the Polish mathematician who studied it eight centuries later, in 1915.
You might think this famous tiling pattern was a fluke back then, a random pattern appearing only on the floor of this particular church. It turns out that this type of decoration existed all over the floors of Italy and Europe and was due to a family of Roman artists known as the Cosmati. If you find this fascinating (and you definitely should), I recommend reading “Sierpinski triangles in stone, on medieval floors in Rome”, by Conversano and Tedeschini Lalli, J. Appl. Math 4 (2011). Or you can simply browse through the pictures of these pavements on Wikipedia.
Tiling periods (and the lack thereof)
Ever since I was a little kid, I have been fascinated by tilings. I would spend hours looking at them (don’t all kids do?), trying to figure out which set of tiles was sufficient to reproduce the whole thing (which, to my great surprise, did not always coincide with the way the tiles were cut). I didn’t know at the time that what I was looking for was the period of the tiling, the minimum set of tiles needed to cover the whole space in a periodic fashion. To illustrate this concept, let’s have a look at these beautiful Ottoman tiles from the city of İznik, Turkey.
Here, we quickly realize that there are two different kinds of tiles: the top right and bottom left tiles are the same, whereas the ones on the diagonal are mirror reflections of the off-diagonal ones. The artist who made these had to actually paint two different kinds of tiles, preparing two separate stacks, one for each kind. If the tiles were made of thin, translucent glass, only one stack would have been necessary (why?)
While it is the drawings that make these tiles beautiful, if we wish to study how they can be composed, we might as well forget about the particular details of the drawings for a moment, and just focus on how each tile can be attached to its neighbor while preserving the continuity of the picture (this is something we do a lot in science, trying to focus on important features by filtering out unnecessary details). Since each square tile has four neighbors, we can think of these two different kinds of tiles in the following way:
From this new point of view, one kind of tile is just a square with four quadrants labeled 1, 2, 3, 4 in a clockwise fashion, and the other kind of tile (the reflection of the first kind) has four quadrants labeled -1, -2, -3, and -4, also in a clockwise fashion (as if looking at the first kind of tile from the other side). The tiling rule is such that neighboring tiles sum to zero across their common edge. Now it is easy to see that, if we were given only one type of tile, we could not do much with it, since the sum would always be positive (for the positive tiles), or negative (for the negative ones) across any edge, but never zero. But if we have access to both types, then we can cover an arbitrarily large surface.
But, how do we know that we can actually keep going and fill up any rectangular region, no matter how big it is? The trick is, there is a pattern which repeats: every second tile (both horizontally and vertically), the colors repeat, so we can keep making the same choice over and over again. There is a 2×2 square which is our period, and once we obtain it we can simply copy-and-paste this period as many times as we need. Notice that a period is the smallest tiling whose sum is zero along each of the two dimensions.
The Sierpinski tiling, on the other hand, does not have a period.
Try to focus on the pattern of the small dark green triangles. In the top row, they appear fairly often, but already in the second row they are spaced further apart, and then in the middle of the picture there is a big segment (the light green triangle) where they don’t appear. In other words, since we have larger and larger triangles appearing, there cannot be a period, since we would eventually find a triangle larger than the period itself! Tilings of this kind are called aperiodic.
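The growing gaps can be seen in a small numerical experiment. The sketch below does not reproduce the floor in the picture; it assumes the standard stand-in for a Sierpinski pattern, Pascal's triangle taken modulo 2, where a 1 plays the role of a small filled triangle and a run of 0s plays the role of an empty one. The function names are made up for this illustration.

```python
def sierpinski_rows(n_rows):
    """Yield the first n_rows rows of Pascal's triangle modulo 2."""
    row = [1]
    for _ in range(n_rows):
        yield row
        # Each inner entry is the sum (mod 2) of the two entries above it.
        row = [1] + [(a + b) % 2 for a, b in zip(row, row[1:])] + [1]


def longest_gap(n_rows):
    """Length of the longest run of 0s found in any of the first n_rows rows."""
    best = 0
    for row in sierpinski_rows(n_rows):
        run = 0
        for cell in row:
            run = run + 1 if cell == 0 else 0
            best = max(best, run)
    return best


for n in (4, 8, 16, 32):
    print(n, longest_gap(n))
```

Each time the number of rows doubles, the longest gap roughly doubles too, so no block of fixed size can repeat to produce the whole pattern.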
The quest for a truly aperiodic tiling
While the Sierpinski triangle does not have a period that could cover the whole plane as the triangle gets bigger and bigger, if we use a Sierpinski triangle of a fixed size, we can actually generate a simple periodic tiling of the plane, as follows: Attach upside-down versions of the original triangle to its left and right, repeating the process in both directions ad infinitum. Then, take this infinite row of triangles, flip it upside-down and glue it to the original row below, stacking copies of these two rows on top of each other to fill an infinite plane. The aperiodicity of the Sierpinski triangle was a choice of how the smaller triangles tiled the inside of the Sierpinski triangle as it got larger and larger. The same set of triangles would tile the plane periodically if we used the procedure outlined above. In other words, aperiodicity was by choice, not by necessity. But could there be a particular set of tiles for which no periodic tiling could ever exist?
In 1961, Hao Wang conjectured that, at least for the case of square tiles (which are now called Wang tiles), this is not the case: If a set of square tiles can cover an arbitrarily large rectangle, then there is a way to do so in a periodic fashion. Wang was not interested in floor tilings (at least, we don’t know of any floors decorated by him). Instead, he cared about the decidability of the tiling problem: given a set of tiles, is there an algorithm which can tell whether these tiles can be used to tile an infinitely large floor? If Wang’s conjecture about square tiles was true, we could set up a computer program that explored all the possible ways of covering a 1×1 square, then a 2×2 square, then a 3×3 square, and so on. The program would simply try every possible combination: while there are a lot of combinations, for any n-by-n square there is a finite number of tilings, so the computer could just check every single one of them. Specifically, at some point in the computation, one of two things would happen and the program would stop:
1. The computer would find a square which could not be covered with the given tiles, or
2. The computer would find a square which contained a period.
When either of the above happened, the program would stop. In the first case, finding a square which cannot be covered by our tiles implies that any larger square is also impossible to tile. In the second case, since we have found a period, just like in the case of the tiles from İznik, we can tile any rectangular region by repeating the period as needed. The computer might take a long time to decide whether 1. or 2. is the case for our set of tiles, but we know that we will always get an answer, with certainty, at some point. You may be thinking by now that there is a third possibility that I skipped over: The tiles could cover the whole space, but not in a periodic way. And you would be correct in thinking that.
If Wang’s conjecture were to be false, and there is a set of tiles which only generates aperiodic tilings, then our computer program would keep exploring larger and larger squares, without ever being able to give us a definitive answer whether we could tile the plane with this set of tiles. It would keep calculating, using more and more resources, until either it ran out of memory, or the heat generated by the computation boiled the oceans and the Earth and the tiles themselves.
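The search described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Wang's own formulation: a tile is written as a hypothetical tuple of edge colors (north, east, south, west), two tiles may touch only if the colors on their shared edge agree, and a square counts as "containing a period" when its right side matches its left side and its bottom matches its top. The `max_n` cap is an addition of this sketch; without it the loop is exactly the program that may never stop.

```python
def tilings(tiles, n):
    """Yield every valid n-by-n arrangement (rows listed top to bottom)."""
    grid = [[None] * n for _ in range(n)]

    def place(k):
        if k == n * n:
            yield [row[:] for row in grid]
            return
        r, c = divmod(k, n)
        for t in tiles:
            # West edge must match the east edge of the left neighbor.
            if c > 0 and grid[r][c - 1][1] != t[3]:
                continue
            # North edge must match the south edge of the tile above.
            if r > 0 and grid[r - 1][c][2] != t[0]:
                continue
            grid[r][c] = t
            yield from place(k + 1)

    yield from place(0)


def wraps_around(grid):
    """True if copies of the square can be glued side by side and top to bottom."""
    n = len(grid)
    rows_ok = all(grid[r][n - 1][1] == grid[r][0][3] for r in range(n))
    cols_ok = all(grid[n - 1][c][2] == grid[0][c][0] for c in range(n))
    return rows_ok and cols_ok


def search(tiles, max_n):
    """Brute-force search over 1x1, 2x2, ... squares, up to max_n."""
    for n in range(1, max_n + 1):
        found_any = False
        for grid in tilings(tiles, n):
            found_any = True
            if wraps_around(grid):
                return ("periodic", n)
        if not found_any:
            return ("cannot tile", n)
    return ("undetermined", max_n)


# Two hypothetical tiles that must alternate along each row.
A = ("r", "g", "r", "b")
B = ("r", "b", "r", "g")
print(search([A, B], max_n=4))                 # ('periodic', 2)
print(search([("a", "b", "a", "c")], max_n=4)) # ('cannot tile', 2)
```

The number of candidate arrangements grows exponentially with the size of the square, so this is only practical for tiny examples. For a tile set that tiles the plane only aperiodically, every call would come back "undetermined", however large the cap.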
So is Wang’s conjecture true? In 1964, a student of Wang, Robert Berger, showed in his PhD thesis that this conjecture is false: he constructed a set of 20,426 tiles which cover the plane, but can only do so aperiodically! Even worse than that, he actually managed to show that the tiling problem was undecidable: no computer ever built could predict with certainty whether a given set of tiles covered the plane or not!
Before I explain how Berger’s proof works, let me digress a bit and focus on his aperiodic tiling. Clearly, 20,426 tiles are too many to be shown in a blog post, but since his result first appeared, other examples of smaller sets of aperiodic tiles have been found. Berger himself lowered the number to 104, Donald Knuth (of Computer Science fame) to 92, Hans Läuchli to 40, and finally, Raphael Robinson in 1971 produced a set of 6 tiles with the same property! Robinson tiles look like this (they are not depicted as exactly square tiles here, but they can be made into squares easily).
The pattern they create looks like this.
So, here we do not have triangles but squares, but apart from this it looks very similar to the Sierpinski triangle. Focus on the orange squares: there are some smaller ones, and they are sitting at the corners of slightly larger squares, which are in turn at the corners of even larger squares, and so on. While at first glance it might seem like a periodic pattern, it is not, since larger and larger squares keep appearing. We will come back to these orange squares in a while, so keep them in mind.
In 1974, Roger Penrose found a set of just 2 aperiodic tiles, although they are not squares.
Penrose also had this cute idea that one could make a puzzle game out of these shapes, and he even got a patent for that! (“The tiles of the invention may be used to form an instructive game or as a visually attractive floor or wall-covering or the like”). At some point such a puzzle game was actually produced, but it is unfortunately out of production now. If you ever stop by the Newton Institute in Cambridge, UK, they own a copy (and they let you play with it!)
One of the characteristics of Penrose’s tiling is that with it one can obtain patterns with a 5-fold rotational symmetry, which means that you can rotate the tiling by 72°, which is 1/5th of 360°. This is interesting because a beautiful, and elementary, argument from Linear Algebra shows that in periodic tilings you can only get 2-, 3-, 4- or 6-fold symmetries (which corresponds to all n-fold symmetries for which $2\cos(2\pi/n)$ is an integer), so having a 5-fold symmetry is a very special thing! And just like the case of Sierpinski, there are traces of Penrose’s tiling in art, for example in the Darb-e Imam shrine in Isfahan, Iran.
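The list of allowed symmetries is easy to check numerically. The sketch below assumes the usual form of the criterion: a rotation by 360°/n can be a symmetry of a periodic tiling only if the trace of the corresponding 2×2 rotation matrix, $2\cos(2\pi/n)$, is an integer. The function name and the floating-point tolerance are choices made for this illustration.

```python
import math


def allowed_in_periodic_tiling(n, tol=1e-9):
    """True if 2*cos(2*pi/n) is an integer, up to floating-point error."""
    trace = 2 * math.cos(2 * math.pi / n)
    return abs(trace - round(trace)) < tol


allowed = [n for n in range(2, 13) if allowed_in_periodic_tiling(n)]
print(allowed)  # [2, 3, 4, 6]
```

For n = 5 the trace is about 0.618, the reciprocal of the golden ratio, which is why the 5-fold symmetry of Penrose's patterns cannot coexist with a period.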
Aperiodicity and Undecidability
Going back to Wang’s problem of whether the tiling problem is decidable: how did Berger prove his undecidability result? There are a lot of technical details he had to take care of, but the essence of his proof was to map each step of adding tiles to an ever-growing tiling, to the steps taken by a computer when running an algorithm (also known as a computer program). Each step of running the algorithm would correspond to instructions on which tile to add next and where. Specifically, Berger was interested in simulating the behavior of a very simple, yet very general computer – a Turing machine.
A Turing machine is basically a model for a machine that can run a particular computer algorithm, reduced to the bare minimum. It comprises four main ingredients:
A tape of arbitrary length on which the machine can write (and overwrite) symbols,
A “head” which
can read/write one symbol at a time (like a scanner/printer combo)
can move the tape left/right one position at a time
can store a finite amount of information (in internal memory)
A program (table of instructions), which tells the “head” what to do next given the symbol it reads on the tape and the current internal memory state.
An initial internal state (which tells the “head” how to start moving), as well as a final (halting) internal state (which tells the “head” when to stop).
While being a really simple object, Turing machines are capable of running any computer algorithm, no matter how complex, so they can come in handy when you need something simple and extremely versatile at the same time!
For example, we could have a Turing machine which can only read/write the symbols 0 and 1, has 6 internal states labeled with letters A, B, C, D, E, F, and has the following program:
Here is how to read this table: Assume the initial state of the machine is A and the tape is filled with the symbol 0. The head of the machine will check the entry in the table corresponding to (0,A) and find the instruction “1RB”, which instructs it to write the symbol 1 (flipping the 0 that was already there to a 1), move the tape to the right, and change the internal state of the head to B. The head will now look up the new instructions for (0,B) (since, after moving the tape to the right, the new symbol under the head will be a 0 again), find “1RC” on the table of instructions, change the 0 into a 1, move the tape to the right once again, and change the internal state to C. It will repeat this process, reading one symbol at a time, checking its table of instructions to decide what to do next, until it reads a 0 while being in state F. If that happens, the special instruction “H” tells the machine to stop its execution: it has reached the “halting” state.
You can try to simulate the execution of this machine on a piece of paper, at least for the first few steps (you might need quite a lot of paper if you want to keep going). Or you could use a computer to simulate it. But you may find that after ten, or a thousand, or a million steps the machine has not halted yet. What if we kept going for another million steps? What about a billion? Can we be sure that the machine will halt eventually?
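A computer simulation of this kind takes only a few lines. The sketch below is a generic simulator rather than a transcription of the table above: its example program is a hypothetical two-state machine (the well-known two-state "busy beaver"), chosen because it is short and is known to halt. Instructions are stored as (symbol to write, direction, next state), the head moves instead of the tape (the two pictures are mirror images of each other), and `max_steps` is an arbitrary cap so that the simulation itself always returns.

```python
from collections import defaultdict


def run(program, start="A", halt="H", max_steps=1000):
    """Run the machine on a blank tape; return (halted, steps, number of 1s)."""
    tape = defaultdict(int)  # blank tape: every cell reads 0
    pos, state, steps = 0, start, 0
    while state != halt and steps < max_steps:
        write, move, state = program[(state, tape[pos])]
        tape[pos] = write
        pos += 1 if move == "R" else -1
        steps += 1
    return state == halt, steps, sum(tape.values())


# Hypothetical example program, keyed by (state, symbol read).
busy_beaver_2 = {
    ("A", 0): (1, "R", "B"),
    ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"),
    ("B", 1): (1, "R", "H"),
}
print(run(busy_beaver_2))  # (True, 6, 4)

# A machine that walks right forever, writing 1s: the cap is all that stops it.
runaway = {("A", 0): (1, "R", "A")}
print(run(runaway, max_steps=50))  # (False, 50, 50)
```

When the first value returned is False, the simulation has learned nothing definite: the machine might halt one step after the cap, or never.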
In his landmark work of 1936, Alan Turing showed that analyzing the behavior of this type of machine is outside the reach of any algorithmic computation: there cannot exist any algorithm which, given the description of a Turing machine’s program, can decide whether the machine will eventually halt, or if the machine will keep running forever! This is known as the halting problem.
Berger’s idea was to simulate a Turing machine using a set of tiles. For each possible symbol the machine could read or write on the tape, he associated a corresponding color for each edge on the tile borders, as well as one color for each of the possible internal states of the machine. As you can probably guess, for two tiles to be neighbors, their common borders had to have the same color. Then he defined a set of tiles which “implemented” the transitions of the machine’s program, in such a way that each horizontal line was one “time step” of the tape during the execution of the machine. The resulting tiles looked like this, and the rule for the arrow is: two tiles can be next to each other only if the head of each arrow matches with the tail of another arrow.
Imagine we start our tiling with a row describing the initial state of the machine, which means having a “blank tape” (for example, a tape filled with the symbol 0), and one tile where the head of the machine is. It would look like this.
Then there is only one way we can extend this tiling further: for each of the tiles we have put down, there is only one tile that can go on top of that (try to check it yourself!). This is because the Turing Machine only has one possible transition, starting from the symbol 0 and state A. So after we add an extra layer, the pattern looks like this.
And then we repeat. Each time we put down a new tile, there is only one choice possible: we have to respect the transition rules of the Turing Machine, and our tiling will describe the state of the tape at the various steps of the execution.
If the machine halts at some point because it has completed its task, then there will be no way to add new tiles. In order to be sure that we could tile an arbitrarily large area, we would need to know in advance that the Turing machine defined by these tiles (converted into a set of fixed Turing instructions via Berger’s or Robinson’s mapping) never halted. But, as I mentioned earlier, Turing showed that no algorithm can ever tell us such a thing. Which means you might regret having chosen these tiles for your new bathroom floor (you definitely should have chosen the ones with the flowers instead).
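The row-by-row picture can be mimicked without drawing any tiles. The sketch below is a loose illustration, not Berger's or Robinson's actual encoding: each printed row stands for one row of tiles, a cell such as `.0` stands for a plain tape tile, and a cell such as `A0` stands for the tile that carries the head in state A. The example program is a hypothetical two-state machine that halts after six steps, and the fixed window width is an assumption made so the rows can be printed.

```python
def spacetime_rows(program, width=7, max_rows=20, start="A", halt="H"):
    """Return one text row per time step until the machine halts."""
    tape = [0] * width
    pos, state = width // 2, start
    rows = []
    # Stop on halting, on reaching max_rows, or if the head leaves the window.
    while state != halt and len(rows) < max_rows and 0 <= pos < width:
        rows.append(" ".join(
            (state if i == pos else ".") + str(sym)
            for i, sym in enumerate(tape)
        ))
        write, move, state = program[(state, tape[pos])]
        tape[pos] = write
        pos += 1 if move == "R" else -1
    return rows


# Hypothetical program, keyed by (state, symbol read).
program = {
    ("A", 0): (1, "R", "B"),
    ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"),
    ("B", 1): (1, "R", "H"),
}
rows = spacetime_rows(program)
print("\n".join(rows))
print(len(rows), "rows, then nothing more can be stacked")
```

Because this machine halts, the stack stops at six rows. A machine that never halts would fill as many rows as we cared to print, which is the situation in which the tiles cover an arbitrarily large area.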
So, why is the aperiodic tiling so important for Berger’s and Robinson’s proofs? We assumed that we started the tiling with a special line, representing the tape in the “blank” state, and this has forced every other choice in the tiling. But using only the alphabet tiles with a single symbol, we get a periodic tiling which can always fill any region! In order to really force our tiling to have a description of the execution of a Turing Machine, we need to guarantee that the tiling is started with that special initialization line. In Robinson’s construction, this is possible using the orange squares as guides (go back and look at the picture of Robinson’s aperiodic tiling if you don’t visualize them), forcing the initialization to happen along the lower edge of each orange square which appears in the pattern. But remember, the Turing machine needs to have access to arbitrarily long segments of tape (we cannot predict how much it will need in case it halts), so we need to have arbitrarily large squares in our tiling. And this means, we really need an aperiodic tiling in order to have all possible tape lengths at our disposition! Any periodic tiling would have restricted the maximum amount of tape the machine could have used before repeating itself.
Tiling a quantum system
You might be wondering: what does all of this have to do with physics? (You are, after all, reading the Quantum Frontiers blog and not The IKEA Catalogue 2019.) The answer is: tiling problems can be converted into Hamiltonian groundstate energy problems. Think of a square lattice, where to each edge we can assign one of the possible edge configurations of your set of tiles. We can force the edges of a square to come from one of the valid tiles by defining a plaquette interaction which gives an energy penalty to non-valid configurations. In this way, we can tile a region of the plane with our set of tiles if and only if this Hamiltonian has a frustration-free groundstate: a groundstate which simultaneously satisfies all the local plaquette constraints, or in other words, one that has zero energy. Deciding whether or not this special kind of groundstate exists is undecidable!
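The plaquette penalty can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: a tile is written as a (top, right, bottom, left) tuple of edge colors, the energy is the number of squares whose four edges do not form a valid tile, and the brute-force search only covers a small finite region (the undecidable question is about arbitrarily large ones).

```python
from itertools import product


def energy(h, v, tiles):
    """Count violated plaquettes.

    h[r][c] is the color of the horizontal edge above square (r, c),
    v[r][c] the color of the vertical edge to its left (hypothetical layout).
    """
    rows, cols = len(v), len(h[0])
    penalty = 0
    for r in range(rows):
        for c in range(cols):
            plaquette = (h[r][c], v[r][c + 1], h[r + 1][c], v[r][c])
            if plaquette not in tiles:
                penalty += 1  # energy penalty for a non-valid configuration
    return penalty


def has_frustration_free_state(rows, cols, colors, tiles):
    """Brute force: is there a zero-energy edge coloring of a rows x cols region?"""
    n_h = (rows + 1) * cols
    n_v = rows * (cols + 1)
    for assignment in product(colors, repeat=n_h + n_v):
        h = [assignment[r * cols:(r + 1) * cols] for r in range(rows + 1)]
        flat_v = assignment[n_h:]
        v = [flat_v[r * (cols + 1):(r + 1) * (cols + 1)] for r in range(rows)]
        if energy(h, v, tiles) == 0:
            return True
    return False


# A single tile whose opposite edges match tiles any region...
print(has_frustration_free_state(2, 2, "ab", {("a", "b", "a", "b")}))
# ...while a tile with top "a" and bottom "b" cannot be stacked on itself.
print(has_frustration_free_state(2, 1, "ab", {("a", "b", "b", "b")}))
```

The search space grows exponentially with the number of edges, so this check is only usable for tiny regions; it is meant to show what “zero energy” means, not to be a practical solver.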
You do not need quantum mechanics for this, as this is a completely classical problem, but you soon realize that the number of possible configurations of the edges in the lattice is arbitrarily large! If you want to write down the matrix which represents this Hamiltonian interaction, you have to resort to larger and larger matrices.
Here is where quantum mechanics comes to the rescue! In a celebrated result, Toby Cubitt, David Pérez-García and Michael Wolf proved that you can have a similar result, this time for the spectral gap of a local Hamiltonian (the problem of deciding whether the spectrum of the Hamiltonian has a constant gap above the groundstate energy), using only a fixed number of local degrees of freedom. Their result is definitely not easy to explain: the first version of the paper was 146 pages long – luckily they managed to simplify it down to 127 pages… But I can try to give a very minimal explanation of how they managed to do this. The key part of their construction is to encode the rules of the Turing machine not directly in the tiling, but in a complex phase (complex number of unit length) which multiplies a certain fixed set of local Hamiltonian terms. They then use the quantum phase estimation algorithm to read off this phase, feeding this input into a Universal Turing Machine (a programmable Turing machine which can simulate any algorithm). In this way, the number of degrees of freedom needed is fixed, and by varying the complex phase mentioned above, they are able to simulate all possible classical Turing machines!
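To get a feeling for how a program can hide inside a phase, here is a small Python sketch. It assumes (my choice, for illustration) that a bit string b1 b2 … bn is stored as the binary fraction 0.b1b2…bn, and it computes classically the textbook outcome distribution of quantum phase estimation with n counting qubits. It is not the Hamiltonian construction of the paper, only the arithmetic behind “reading off” a phase.

```python
import cmath


def encode_phase(bits: str) -> float:
    """Store a bit string as the binary fraction 0.b1b2...bn (hypothetical encoding)."""
    return int(bits, 2) / 2 ** len(bits)


def qpe_distribution(phi: float, t: int):
    """Outcome probabilities of ideal phase estimation with t counting qubits."""
    size = 2 ** t
    probs = []
    for k in range(size):
        # Amplitude of outcome k: a normalized sum of phases, peaked where k / size = phi.
        amp = sum(
            cmath.exp(2j * cmath.pi * j * (phi - k / size)) for j in range(size)
        ) / size
        probs.append(abs(amp) ** 2)
    return probs


def decode(phi: float, n_bits: int) -> str:
    """Return the most likely outcome, written as an n_bits-long bit string."""
    probs = qpe_distribution(phi, n_bits)
    k = max(range(len(probs)), key=probs.__getitem__)
    return format(k, f"0{n_bits}b")


phi = encode_phase("10110")
print(cmath.exp(2j * cmath.pi * phi))  # the unit-length complex phase
print(decode(phi, 5))
```

When the phase has exactly n binary digits, the ideal distribution is concentrated on a single outcome, so the bit string is recovered with certainty.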
Quantum tiles on a line
Now that we have entered the realm of local Hamiltonian problems, one might wonder if what is going on here is specific to two-dimensional systems. Clearly, the same phenomena can happen in three or more dimensions, since we can simply take multiple slices of 2D systems and stack them on top of each other. But what about one-dimensional systems? Can we make this construction work on a line?
Interestingly, Wang’s conjecture in 1D is true: any set of tiles that can tile a line can also tile it periodically. Since we are tiling a line, we can think of each tile as essentially a connection between its left-edge color and its right-edge color. Any set of tiles (and associated edge colors) then defines an oriented graph whose vertices are the colors and whose edges are given by the tiles. The rule is again that tiles can be neighbors if their corresponding edges are the same color. The length of the longest (oriented) path we can find in the graph is then the length of the longest segment which can be tiled. It turns out that this length will be infinite if and only if there is a cycle in the graph. In other words, if there is a period.
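The graph argument translates directly into a short program. In this Python sketch (the function name and the (left, right) tile format are my own choices), a depth-first search either finds a cycle, in which case the whole line can be tiled, or returns the length of the longest tileable segment.

```python
import math


def longest_segment(tiles):
    """Longest tileable segment, in tiles; math.inf if the graph has a cycle.

    tiles is a list of (left_color, right_color) pairs.
    """
    graph = {}
    for left, right in tiles:
        graph.setdefault(left, []).append(right)
        graph.setdefault(right, [])

    best = {}            # color -> longest path starting from that color
    in_progress = set()  # colors on the path currently being explored

    def visit(color):
        if color in in_progress:
            return math.inf  # we came back to a color on the current path: a cycle
        if color in best:
            return best[color]
        in_progress.add(color)
        length = 0
        for nxt in graph[color]:
            length = max(length, 1 + visit(nxt))
        in_progress.discard(color)
        best[color] = length
        return length

    return max((visit(color) for color in graph), default=0)


print(longest_segment([("red", "green"), ("green", "blue")]))  # no cycle
print(longest_segment([("red", "green"), ("green", "red")]))   # a period of two tiles
```

The search visits each color and each tile a bounded number of times, which is one way to see that the 1D tiling problem is decidable.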
So we can’t construct aperiodic tilings in 1D, and the tiling problem is decidable. One might be tempted to guess that the same should happen with the spectral gap of local Hamiltonians: We can look at the terms defining the Hamiltonian and decide if a uniform spectral gap exists, as the size of our quantum system increases. After all, in many cases, 1D systems behave “nicely”: we have the DMRG algorithm, polynomial time algorithms for computing groundstates of gapped Hamiltonians, area laws and matrix product state approximations, no thermal phase transitions or topological order, and so on.
The key idea was to construct a Hamiltonian whose groundstate would be periodic in the (state of the) spins of an arbitrarily long spin chain, but with a period depending on the halting time of an algorithm (modeled as a Turing machine) encoded (in binary) in the complex phase multiplying each Hamiltonian term. Roughly speaking, this is how we set this up: We partitioned the set of spins into segments. On each segment, we introduced a special Hamiltonian, known as the Feynman-Kitaev history state Hamiltonian, which made sure that the groundstate on that segment was a transcription of the tape during the execution of the classical Turing machine defined by the complex phase (as discussed above).
If at some point the machine has not halted and is running out of tape, so that the segment is not large enough to contain the complete transcription of its execution, then the machine can “push” the delimiter a bit further away, “stealing” some tape space from its neighbor (more technically: the resulting configuration with a larger tape segment is more energetically favorable than the previous one). But once the machine halts, the tape segment shrinks exactly to the minimal size required for the machine to reach its halting state. So, in case the machine halts, the line is divided up into periodic segments, whose length is exactly the optimal length for the machine to halt. If on the other hand the machine does not halt, then the best configuration is the one where there is a unique tape segment, and only one machine running on it.
To recap, the groundstate of this Hamiltonian looks very different depending on whether the Turing machine (encoded in the phase parameter) eventually halts or not. If it does, the groundstate will look periodic, with the period being determined by the halting time. It is therefore a product state, if we think of each segment as a single, huge particle. If instead the machine never halts, then the groundstate will have a single, very long segment, with a big Feynman-Kitaev history state, which is a highly entangled state.
Even more interestingly, we can set up the different energy scales in the system to behave as follows: for system sizes where the machine has not halted (because it still does not have enough tape to do so, or because it never will), the single tape segment groundstate has vanishing (but positive) energy, while after it halts, each segment has a small, negative energy. These negative energies in the halting case keep accumulating, so that the thermodynamic groundstate has strictly negative energy density. We can use this difference in energy density between the two cases to construct a “switch”: we introduce two other Hamiltonians to the system (introducing extra local degrees of freedom), one gapped and one gapless. We couple them to everything else we had already set up (the tape segment and the Feynman-Kitaev history state Hamiltonians), in such a way that only one of them controls the low-energy properties of our system. We can set up the switch based on the difference in the energy density in such a way that, before halting, the system is gapped, and it becomes gapless only after the Turing machine has halted (and we cannot predict whether this will ever happen!). Hence, the spectral gap is undecidable!
As is the case for 2D systems, we need a very large local Hilbert space dimension to make this construction work (so large that we did not even bother to compute an exact number, but we know it is finite!). At the other extreme, we know that if the local dimension is 2 (we have qubits on a line), and the Hamiltonian has a special property called frustration freeness, then the spectral gap problem is easy to solve. Contrast this with the aperiodic tiling constructions: first Berger found a highly complicated case (with 20,426 tiles), then his construction was refined and simplified over and over, until Robinson got it down to 6 and Penrose showed a similar one with only 2 tiles.
Can we do the same for the undecidability of the spectral gap? At which point does the line become complex enough that the spectral gap problem is undecidable? Can we find some sort of “threshold” which separates the easy and the impossible cases? We need new ideas and new constructions in order to answer all these questions, so let’s get to work!