I tacked Dirac’s quote onto the bulletin board above my desk, the summer before senior year of high school. I’d picked quotes by T.S. Eliot and Einstein, Catullus and Hatshepsut.^{*} In a closet, I’d found amber-, peach-, and scarlet-colored paper. I’d printed the quotes and arranged them, starting senior year with inspiration that looked like a sunrise.
Not that I knew who Paul Dirac was. Nor did I evaluate his opinion. But I’d enrolled in Advanced Placement Physics C and taken the helm of my school’s literary magazine. The confluence of two passions of mine—science and literature—in Dirac’s quote tickled me.
A fiery lecturer began to alleviate my ignorance in college. Dirac, I learned, had co-invented quantum theory. The “Dee-rac Equa-shun,” my lecturer trilled in her Italian accent, describes relativistic quantum systems—tiny particles associated with high speeds. I developed a taste for spin, a quantum phenomenon encoded in Dirac’s equation. Spin serves quantum-information scientists as two-by-fours serve carpenters: Experimentalists have tried to build quantum computers from particles that have spins. Theorists keep the idea of electron spins in a mental car trunk, to tote out when illustrating abstract ideas with examples.
The next year, I learned that Dirac had predicted the existence of antimatter. Three years later, I learned to represent antimatter mathematically. I memorized the Dirac Equation, forgot it, and re-learned it.
One summer in grad school, visiting my parents, I glanced at my bulletin board.
The sun rises beyond a window across the room from the board. Had the light faded the papers’ colors? If so, I couldn’t tell.
In science one tries to tell people, in such a way as to be understood by everyone, something that no one ever knew before. But in the case of poetry, it’s the exact opposite!
Do poets try to obscure ideas everyone understands? Some poets express ideas that people intuit but feel unable, lack the attention, or don’t realize they should, articulate. Reading and hearing poetry helps me grasp the ideas. Some poets express ideas in forms that others haven’t imagined.
Did Dirac not represent physics in a form that others hadn’t imagined?
Would you have imagined that form? I didn’t imagine it until learning it. Do scientists not express ideas—about gravity, time, energy, and matter—that people feel unable, lack the attention, or don’t realize we should, articulate?
The U.S. and Canada have designated April as National Poetry Month. A hub for cousins of poets, Quantum Frontiers salutes. Carry a poem in your pocket this month. Or carry a copy of the Dirac Equation. Or tack either on a bulletin board; I doubt whether their colors will fade.
^{*}“Now my heart turns this way and that, as I think what the people will say. Those who see my monuments in years to come, and who shall speak of what I have done.” I expect to build no such monuments. But here’s to trying.
In a recent paper with Daniel Harlow, Fernando Pastawski and John Preskill, we have proposed a toy model of the AdS/CFT correspondence based on quantum error-correcting codes. Fernando has already written how this research project started after a fateful visit by Daniel to Caltech and John’s remarkable prediction in 1999. In this post, I hope to write an introduction which may serve as a reader’s guide to our paper, explaining why I’m so fascinated by the beauty of the toy model.
This is certainly a challenging task because I need to make it accessible to everyone while explaining real physics behind the paper. My personal philosophy is that a toy model must be as simple as possible while capturing key properties of the system of interest. In this post, I will try to extract some key features of the AdS/CFT correspondence and construct a toy model which captures these features. This post may be a bit technical compared to other recent posts, but anyway, let me give it a try…
Bulk locality paradox and quantum error-correction
The AdS/CFT correspondence says that there is some kind of correspondence between quantum gravity on (d+1)-dimensional asymptotically-AdS space and d-dimensional conformal field theory on its boundary. But how are they related?
The AdS-Rindler reconstruction tells us how to “reconstruct” a bulk operator from boundary operators. Consider a bulk operator φ and a boundary region A on a hyperbolic space (in other words, a negatively-curved plane). On a fixed time-slice, the causal wedge of A is the bulk region enclosed by the geodesic line of A (a curve with minimal length). The AdS-Rindler reconstruction says that φ can be represented by some integral of local boundary operators supported on A if and only if φ is contained inside the causal wedge of A. Of course, there are multiple regions A, B, C, … whose causal wedges contain φ, and the reconstruction should work for any such region.
That a bulk operator in the causal wedge can be reconstructed by local boundary operators, however, leads to a rather perplexing paradox in the AdS/CFT correspondence. Consider a bulk operator φ at the center of a hyperbolic space, and split the boundary into three pieces, A, B, C. Then the geodesic line for the union BC encloses the bulk operator; that is, φ is contained inside the causal wedge of BC. So, φ can be represented by local boundary operators supported on BC. But the same argument applies to AB and CA, implying that the bulk operator φ corresponds to local boundary operators which are supported inside AB, BC and CA simultaneously. It would seem then that the bulk operator must correspond to an identity operator times a complex phase. In fact, similar arguments apply to any bulk operators, and thus, all the bulk operators must correspond to identity operators on the boundary. Then, the AdS/CFT correspondence seems so boring…
Almheiri, Dong and Harlow have recently proposed an intriguing way of reconciling this paradox with the AdS/CFT correspondence. They proposed that the AdS/CFT correspondence can be viewed as a quantum error-correcting code. Their idea is as follows. Instead of corresponding to a single boundary operator, the bulk operator φ may correspond to different operators in different regions, say three operators living in AB, BC, CA respectively. Even though these are different boundary operators, they may be equivalent inside a certain low-energy subspace on the boundary.
This situation resembles the so-called quantum secret-sharing code. The quantum information at the center of the bulk cannot be accessed from any single party A, B or C, because the bulk operator has no representation on A, B, or C alone. It can be accessed only if multiple parties cooperate and perform joint measurements. It seems that a quantum secret is shared among three parties, and the AdS/CFT correspondence somehow realizes the three-party quantum secret-sharing code!
Entanglement wedge reconstruction?
Recently, causal wedge reconstruction has been further generalized to the notion of entanglement wedge reconstruction. Imagine we split the boundary into four pieces A, B, C, D such that A, C are larger than B, D. Then the geodesic lines for A and C alone do not form the geodesic line for the union of A and C, because we can draw shorter arcs connecting the endpoints of A and C, and these arcs form the global geodesic line. The entanglement wedge of AC is the bulk region enclosed by this global geodesic line of AC. And entanglement wedge reconstruction predicts that a bulk operator can be represented as an integral of local boundary operators on AC if and only if it lies inside the entanglement wedge of AC [1].
Building a minimal toy model; the five-qubit code
Okay, now let’s try to construct a toy model which admits causal and entanglement wedge reconstructions of bulk operators. Because I want a simple toy model, I make the rather bold assumption that the bulk consists of a single qubit while the boundary consists of five qubits, denoted by A, B, C, D, E.
What does causal wedge reconstruction teach us in this minimal setup of five and one qubits? First, we split the boundary system into two pieces, ABC and DE, and observe that the bulk operator is contained inside the causal wedge of ABC. From the rotational symmetries, we know that the bulk operator must have representations on ABC, BCD, CDE, DEA, EAB. Next, we split the boundary system into four pieces, AB, C, D and E, and observe that the bulk operator is contained inside the entanglement wedge of the union of AB and D. So, the bulk operator must have representations on ABD, BCE, CDA, DEB, EAC. In summary, the bulk operator must have a representation on any set of three of the five boundary qubits.
This is the property I want my toy model to possess.
What kinds of physical systems have such a property? Luckily, we quantum information theorists know the answer: the five-qubit code. The five-qubit code, proposed here and here, can encode one logical qubit into five-qubit entangled states and correct any single-qubit error. We can view the five-qubit code as a quantum encoding isometry from one-qubit states to five-qubit states:

α|0⟩ + β|1⟩ → α|0̃⟩ + β|1̃⟩,

where |0̃⟩ and |1̃⟩ are the basis states for a logical qubit. In quantum coding theory, logical Pauli operators X̄ and Z̄ are Pauli operators which act like Pauli X (bit flip) and Z (phase flip) on the logical qubit spanned by |0̃⟩ and |1̃⟩. In the five-qubit code, for any set of qubits R with volume 3, some representations of the logical Pauli X and Z operators, X̄_R and Z̄_R, can be found on R. While X̄_R and X̄_{R′} are different operators for R ≠ R′, they act exactly in the same manner on the codeword subspace spanned by |0̃⟩ and |1̃⟩. This is exactly the property I was looking for.
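The five-qubit code is small enough to check on a laptop. Here is a minimal numerical sketch (my own illustration, not code from the paper): it builds the codewords from the standard stabilizer generators (cyclic shifts of XZZXI), then verifies that two different representations of logical X, namely the transversal XXXXX and XXXXX times a stabilizer (which is supported on only three qubits), act identically on the codeword subspace.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P = {'I': I2, 'X': X, 'Y': Y, 'Z': Z}

def pauli(s):
    """Tensor product of single-qubit Paulis, e.g. 'XZZXI'."""
    return reduce(np.kron, [P[c] for c in s])

# Stabilizer generators of the five-qubit code (cyclic shifts of XZZXI).
gens = ['XZZXI', 'IXZZX', 'XIXZZ', 'ZXIXZ']

# Projector onto the code subspace: product of (1 + g)/2 over generators.
proj = reduce(np.dot, [(np.eye(32) + pauli(g)) / 2 for g in gens])

# Logical |0>: normalized projection of |00000> onto the code subspace.
v = proj[:, 0]
logical0 = v / np.linalg.norm(v)

# Transversal logical operators: X-bar = XXXXX, Z-bar = ZZZZZ.
Xbar, Zbar = pauli('XXXXX'), pauli('ZZZZZ')
logical1 = Xbar @ logical0

# Sanity checks: Z-bar fixes |0_L> and flips the sign of |1_L>.
assert np.allclose(Zbar @ logical0, logical0)
assert np.allclose(Zbar @ logical1, -logical1)

# A second representation of logical X: X-bar times a stabilizer.
# The product is I (XZ)(XZ) I X up to phase: supported on three qubits.
Xbar_alt = Xbar @ pauli('XZZXI')
assert not np.allclose(Xbar_alt, Xbar)  # a genuinely different operator...
for psi in (logical0, logical1):        # ...with the same action on codewords
    assert np.allclose(Xbar_alt @ psi, Xbar @ psi)
```

The two operators differ on the full 32-dimensional space but agree on the two-dimensional codeword subspace, which is exactly the region-dependent-representation property described above.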
Holographic quantum error-correcting codes
We just found possibly the smallest toy model of the AdS/CFT correspondence, the five-qubit code! The remaining task is to construct a larger model. For this goal, we view the encoding isometry of the five-qubit code as a six-leg tensor. The holographic quantum code is a network of such six-leg tensors covering a hyperbolic space where each tensor has one open leg. These open legs on the bulk are interpreted as logical input legs of a quantum error-correcting code while open legs on the boundary are identified as outputs where quantum information is encoded. Then the entire tensor network can be viewed as an encoding isometry.
The six-leg tensor has some nice properties. Imagine we inject some Pauli operator into one of the six legs of the tensor. Then, for any choice of three of the remaining legs, there always exists a Pauli operator acting on them which counteracts the effect of the injection. An example is shown below:
In other words, if an operator is injected into one tensor leg, one can “push” it onto the other three tensor legs.
Finally, let’s demonstrate causal wedge reconstruction of bulk logical operators. Pick an arbitrary open tensor leg in the bulk and inject some Pauli operator into it. We can “push” it onto three tensor legs, which then inject operators into neighboring tensors. By repeatedly pushing operators toward the boundary of the network, we eventually obtain some representation of the operator living on a piece of boundary region A, and the bulk operator is contained inside the causal wedge of A. (Here, the length of a curve can be defined as the number of tensor legs cut by the curve.) You can also push operators into the boundary by choosing different tensor legs, which leads to different representations of a logical operator. You can even obtain a rather exotic representation which is supported non-locally over two disjoint pieces of the boundary, realizing entanglement wedge reconstruction.
What’s next?
This post is already pretty long and I need to wrap it up…
Shor’s quantum factoring algorithm is a revolutionary invention which opened a whole new research avenue of quantum information science. It is often forgotten, but the first quantum error-correcting code was another important invention by Peter Shor (and independently by Andrew Steane), one which enabled a proof that quantum computation can be performed fault-tolerantly. The theory of quantum error-correcting codes has found interesting applications in studies of condensed matter physics, such as topological phases of matter. Perhaps, then, quantum coding theory will also find applications in high energy physics.
Indeed, many interesting open problems are awaiting us. Is entanglement wedge reconstruction a generic feature of tensor networks? How do we describe black holes by quantum error-correcting codes? Can we build a fast scrambler by tensor networks? Is entanglement a wormhole (or maybe a perfect tensor)? Can we resolve the firewall paradox by holographic quantum codes? Can the physics of quantum gravity be described by tensor networks? Or can the theory of quantum gravity provide us with novel constructions of quantum codes?
I feel that now is the time for quantum information scientists to jump into the research of black holes. We don’t know if we will be burned by a firewall or not … , but it is worth trying.
1. Whether entanglement wedge reconstruction is possible in the AdS/CFT correspondence remains controversial. In the spirit of the Ryu-Takayanagi formula, which relates entanglement entropy to the length of a global geodesic line, entanglement wedge reconstruction seems natural. But the claim that a bulk operator can be reconstructed non-locally from boundary operators on two separate pieces A and C sounds rather exotic. In our paper, we constructed a toy model of tensor networks which allows both causal and entanglement wedge reconstruction in many cases. For details, see our paper.
Most of you are probably familiar with holograms, those shiny flat films that represent a 3D object from essentially any desired angle. I find it quite remarkable how all the information of a 3D object can be printed on an essentially 2D film. True, the colors are not represented as faithfully as in a traditional photograph, but it looks as though we have taken a photograph from every possible angle! The speaker’s main message that day seemed even more provocative than the idea of holography itself: even if the hologram is broken into pieces, and some of these are lost, we may still use the remaining pieces to recover parts of the 3D image, or even the full image, given a sufficiently large portion of the hologram. The 3D object is not only recorded in 2D, it is recorded redundantly!
Halfway through Daniel’s exposition, Beni and I exchange a knowing glance. We recognize a familiar pattern from our latest project, a pattern which has gained the moniker “cleaning lemma” within the quantum information community and which can be thought of as a quantitative analog of reconstructing the 3D image from pieces of the hologram. Daniel makes connections using a language that we are familiar with. Beni and I discuss what we have understood and how to make it more concrete as we stride back through campus. We scribble diagrams on the whiteboard and string words such as tensor, encoder, MERA and negative curvature into our discussion. An image from the web gives us some intuition on the latter. We are onto something. We have a model. It is simple. It is new. It is exciting.
Food has not come our way, so we head to my apartment as we enthusiastically continue our discussion. I can only provide two avocados and some leftover pasta, but that is not important; we are sharing the joy of insight. We arrange a meeting with Daniel to present our progress. By Wednesday, Beni and I introduce the holographic pentagon code at the group meeting. The core of a new project is already there, but we need some help to navigate the high-energy waters. Who better to guide us in such an endeavor than our mentor, John Preskill, who recognized the importance of quantum information in holography as early as 1999 and has repeatedly proven himself a master of both trades?
“I feel that the idea of holography has a strong whiff of entanglement—for we have seen that in a profoundly entangled state the amount of information stored locally in the microscopic degrees of freedom can be far less than we would naively expect. For example, in the case of the quantum error-correcting codes, the encoded information may occupy a small ‘global’ subspace of a much larger Hilbert space. Similarly, the distinct topological phases of a fractional quantum Hall system look alike locally in the bulk, but have distinguishable edge states at the boundary.”
-J. Preskill, 1999
As Beni puts it, the time for using modern quantum information tools in high-energy physics has come. By this he means quantum error correction and maybe tensor networks. First privately, then more openly, we continue to sharpen and shape our project. Through conferences, Skype calls and emails, we further our discussion and progressively shape ideas. Many speculations mature to conjectures and fall victim to counterexamples. Some stand the test of simulations or are even promoted to theorems by virtue of mathematical proofs.
I publicly present the project for the first time at a select quantum information conference in Australia. Two months later, after a particularly intense writing, revising and editing process, the article is almost complete. As we finalize the text and relabel the figures, Daniel and Beni unveil our work to quantum entanglement experts in Puerto Rico. The talks are a hit and it is time to let all our peers read about it.
You are invited to do so and Beni will even be serving a reader’s guide in an upcoming post.
So began the personal statement in my application to Caltech’s PhD program. I didn’t mention Sir Terry Pratchett, but he belongs in the list. Pratchett wrote over 70 books, blending science fiction with fantasy, humor, and truths about humankind. Pratchett passed away last week, having completed several novels after doctors diagnosed him with early-onset Alzheimer’s. According to the San Francisco Chronicle, Pratchett “parodie[d] everything in sight.” Everything in sight included physics.
Pratchett set many novels on the Discworld, a pancake of a land perched atop four elephants, which balance on the shell of a turtle that swims through space. Discworld wizards quantify magic in units called thaums. Units impressed their importance upon me in week one of my first high-school physics class. We define one meter as “the length of the path travelled by light in vacuum during a time interval of 1/299 792 458 of a second.” Wizards define one thaum as “the amount of magic needed to create one small white pigeon or three normal-sized billiard balls.”
Wizards study the thaum in a High-Energy Magic Building reminiscent of Caltech’s Lauritsen-Downs Building. To split the thaum, the wizards built a Thaumatic Resonator. Particle physicists in our world have smashed atoms apart, discovering particles called mesons and baryons. Discworld wizards discovered that the thaum consists of resons. Mesons and baryons consist of quarks, seemingly elementary particles that we believe cannot be split. Quarks fall into six types, called flavors: up, down, charmed, strange, top (or truth), and bottom (or beauty). Resons, too, consist of quarks. The Discworld’s quarks have the flavors up, down, sideways, sex appeal, and peppermint.
Reading about the Discworld since high school, I’ve wanted to grasp Pratchett’s allusions. I’ve wanted to do more than laugh at them. In Pyramids, Pratchett describes “ideas that would make even a quantum mechanic give in and hand back his toolbox.” Pratchett’s ideas have given me a hankering for that toolbox. Pratchett nudged me toward training as a quantum mechanic.
Pratchett hasn’t only piqued my curiosity about his allusions. He’s piqued my desire to create as he did, to do physics as he wrote. While reading or writing, we build worlds in our imaginations. We visualize settings; we grow acquainted with characters; we sense a plot’s consistency or the consistency of a system of magic. We build worlds in our imaginations also when doing and studying physics and math. The Standard Model is a system that encapsulates the consistency of our knowledge about particles. We tell stories about electrons’ behaviors in magnetic fields. Theorems’ proofs have logical structures like plots’. Pratchett and other authors trained me to build worlds in my imagination. Little wonder I’m training to build worlds as a physicist.
Around the time I graduated from college, Diana Wynne Jones passed away. So did Brian Jacques (another British novelist) and Madeleine L’Engle. L’Engle wasn’t British, but I forgave her because her Time Quartet introduced me to dimensions beyond three. As I completed one stage of intellectual growth, creators who’d led me there left.
Terry Pratchett has joined Jones, Jacques, and L’Engle. I will probably create nothing as valuable as his Discworld, let alone a character in the Standard Model toward which the Discworld steered me.
But, because of Terry Pratchett, I have to try.
The task for the computer here was to produce a verbal description of the image. There are thousands of words in the vocabulary, and a computer has to try them in different combinations to make a sensible sentence. There is no way a computer can be given an exhaustive list of correct sentences with example images for each: such a list would be a database bigger than the Earth (as one can see just by counting the number of combinations). So to train the computer to use language as in the picture above, one possesses only a limited set of examples – maybe a few thousand pictures with descriptions. Yet we humans are capable of learning from just a few examples, by noticing the repeating patterns. So the computer can do the same! The score next to each word above is an estimate, based on those few thousand examples, of how relevant the word “tennis” or “woman” is to what’s in the box on the image. The algorithm produces possible sentences, scores them, and then selects the sentence with the highest total score.
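The score-and-select step can be sketched in a few lines. This is a toy illustration only: the words and scores below are hypothetical numbers I invented, not output of a real captioning model.

```python
# Hypothetical per-word relevance scores for one region of an image.
word_scores = {'woman': 0.9, 'tennis': 0.8, 'plays': 0.6, 'banana': 0.05}

def sentence_score(words):
    """Total relevance of a candidate sentence: sum of per-word scores."""
    return sum(word_scores.get(w, 0.0) for w in words)

# Generate candidate sentences, score each, keep the highest-scoring one.
candidates = [
    ['woman', 'plays', 'tennis'],
    ['woman', 'plays', 'banana'],
    ['banana', 'plays', 'tennis'],
]
best = max(candidates, key=sentence_score)
print(best, sentence_score(best))  # prints the winning sentence and its score
```

A real system scores far richer candidates with a learned language model, but the selection principle is the same: maximize the total score.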
Once the classification task is done, one needs to use all the collected information to make a prediction – as Sherlock is able to point out the most probable motive in the first picture, we also want to predict a piece of very personal information: we’d like to know how to start up a conversation with that tennis player.
Humans are actually good at classification tasks: with luck, we can notice and type into our cellphones all the details the predictor will need, like brand of clothing, hair color, height… though computers recently became better than humans at facial-expression recognition, so we don’t have to trust ourselves on that anymore. Finally, when all the data is collected, most humans can still offer only generic advice on conversation starters. Which means we are very bad at prediction tasks. We don’t notice the hidden dependencies between brand of clothing and sense of humor. But such information may not hide from the all-seeing eye of the machine-learning algorithm! So expect your cellphone to give you dating advice within 10 years…
Now how do quantum computers come into play? Well, if you look at your search results, they are still pretty irrelevant most of the time. Imagine you used them as conversation starters – you’d embarrass yourself 9 times out of 10! To make this better, a certain company needs more memory and processing power. Yet the most advanced deep learning routines remain out of reach, just because there are exponentially many hidden dependencies one would need to try and reject before the algorithm finds the right predictor. So a certain company turns to us, quantum computing people, as we deal with exponentially hard problems notoriously well! And indeed, quantum algorithms make some of the machine learning routines exponentially faster – see this Quantum Machine Learning article, as well as a talk by Seth Lloyd, for technical details. Some anonymous stock trader is already trying to intimidate their fellow quants (quantitative analysts) by calling the top trading system “Quantum machine learning”. I think we should appreciate his sense of humor and invest in his algorithm as soon as Quantiacs.com opens such functionality. Or we could invest in Teagan from Caltech – her code recently won the futures contest on the same website.
Once upon a time, I worked with a postdoc who shaped my views of mathematical physics, research, and life. Each week, I’d email him a PDF of the calculations and insights I’d accrued. He’d respond along the lines of, “Thanks so much for your notes. They look great! I think they’re mostly correct; there are just a few details that might need fixing.” My postdoc would point out the “details” over espresso, at a café table by a window. “Are you familiar with…?” he’d begin, and pull out of his back pocket some bit of math I’d never heard of. My calculations appeared to crumble like biscotti.
Some of the math involved CPTP maps. “CPTP” stands for a phrase little more enlightening than the acronym: “completely positive trace-preserving”. CPTP maps represent processes undergone by quantum systems. Imagine preparing some system—an electron, a photon, a superconductor, etc.—in a state I’ll call “ρ”. Imagine turning on a magnetic field, or coupling one electron to another, or letting the superconductor sit untouched. A CPTP map, labeled ℰ, represents every such evolution.
“Trace-preserving” means the following: Imagine that, instead of switching on the magnetic field, you measured some property of ρ. If your measurement device (your photodetector, spectrometer, etc.) worked perfectly, you’d read out one of several possible numbers. Let p_i denote the probability that you read out the i^{th} possible number. Because your device outputs some number, the probabilities sum to one: ∑_i p_i = 1. We say that ρ “has trace one.” But you don’t measure ρ; you switch on the magnetic field. ρ undergoes the process ℰ, becoming a quantum state ℰ(ρ). Imagine that, after the process ended, you measured a property of ℰ(ρ). If your measurement device worked perfectly, you’d read out one of several possible numbers. Let q_i denote the probability that you read out the i^{th} possible number. The probabilities sum to one: ∑_i q_i = 1. ℰ(ρ) “has trace one,” so the map ℰ is “trace-preserving.”
Now that we understand trace preservation, we can understand positivity. The probabilities p_i are positive (actually, nonnegative) because they lie between zero and one. Since the p_i characterize a crucial aspect of ρ, we call ρ “positive” (though we should call ρ “nonnegative”). ℰ turns the positive ρ into the positive ℰ(ρ). Since ℰ maps positive objects to positive objects, we call ℰ “positive.” ℰ also satisfies a stronger condition, so we call such maps “completely positive.”^{**}
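Both properties are easy to check numerically for a concrete channel. The sketch below (my own illustration, not from the original post) uses the single-qubit depolarizing channel, written in terms of Kraus operators, and verifies that the output of the map still has trace one and nonnegative eigenvalues.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarize(rho, p=0.25):
    """E(rho) = (1-p) rho + (p/3)(X rho X + Y rho Y + Z rho Z)."""
    kraus = [np.sqrt(1 - p) * I2] + [np.sqrt(p / 3) * K for K in (X, Y, Z)]
    return sum(K @ rho @ K.conj().T for K in kraus)

# A valid input state: trace one, positive eigenvalues.
rho = np.array([[0.7, 0.3], [0.3, 0.3]], dtype=complex)
out = depolarize(rho)

# Trace-preserving: the Kraus operators satisfy sum_k K^dagger K = identity,
# so the output trace equals the input trace (one).
assert np.isclose(np.trace(out).real, 1.0)

# Positive: each term K rho K^dagger is positive when rho is, so the sum is.
assert np.all(np.linalg.eigvalsh(out) >= -1e-12)
```

The same two checks work for any channel given in Kraus form; the depolarizing channel is just the simplest nontrivial example.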
So I called my postdoc. “It’s almost right,” he’d repeat, nudging aside his espresso and pulling out a pencil. We’d patch the holes in my calculations. We might rewrite my conclusions, strengthen my assumptions, or prove another lemma. Always, we salvaged cargo. Always, I learned.
I no longer email weekly updates to a postdoc. But I apply what I learned at that café table, about entanglement and monotones and complete positivity. “It’s almost right,” I tell myself when a hole yawns in my calculations and a week’s work appears to fly out the window. “I have to fix a few details.”
Am I certain? No. But I remain positive.
^{*}Experts: “Trace-preserving” means Tr(ℰ(ρ)) = Tr(ρ) = 1.
^{**}Experts: Suppose that ρ is defined on a Hilbert space H and that ℰ is defined on operators acting on H. “ρ is positive” means ⟨ψ|ρ|ψ⟩ ≥ 0 for every vector |ψ⟩ in H.
To understand what “completely positive” means, imagine that our quantum system interacts with an environment. For example, suppose the system consists of photons in a box. If the box leaks, the photons interact with the electromagnetic field outside the box. Suppose the system-and-environment composite begins in a state ρ_SE defined on a Hilbert space H_S ⊗ H_E. ℰ acts on the system’s part of the state. Let I denote the identity operation that maps every possible environment state to itself. Suppose that ℰ changes the system’s state while I preserves the environment’s state. The system-and-environment composite ends up in the state (ℰ ⊗ I)(ρ_SE). This state is positive for every environment and every positive ρ_SE, so we call ℰ “completely positive.”
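A standard counterexample (my addition, not part of the original footnote) shows why “positive” alone is not enough: the transpose map is positive, but applying it to one half of an entangled Bell pair produces a matrix with a negative eigenvalue, so the transpose is not completely positive.

```python
import numpy as np

# Bell state (|00> + |11>)/sqrt(2) as a density matrix.
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi, phi.conj())

# (T tensor I): partial transpose on the first qubit. Writing the 4x4
# matrix with indices (i,j),(k,l), the partial transpose swaps i and k.
pt = rho.reshape(2, 2, 2, 2).transpose(2, 1, 0, 3).reshape(4, 4)

assert np.all(np.linalg.eigvalsh(rho) >= -1e-12)  # rho is positive
assert np.linalg.eigvalsh(pt).min() < 0           # partial transpose is not
```

So a map can preserve positivity on the system alone yet fail once an entangled environment is included; CPTP maps are exactly the maps that survive this stricter test.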
Editor’s Note: Yesterday and today, Caltech is celebrating the inauguration of the Walter Burke Institute for Theoretical Physics. John Preskill made the following remarks at a dinner last night honoring the board of the Sherman Fairchild Foundation.
This is an exciting night for me and all of us at Caltech. Tonight we celebrate physics. Especially theoretical physics. And in particular the Walter Burke Institute for Theoretical Physics.
Some of our dinner guests are theoretical physicists. Why do we do what we do?
I don’t have to convince this crowd that physics has a profound impact on society. You all know that. We’re celebrating this year the 100^{th} anniversary of general relativity, which transformed how we think about space and time. It may be less well known that two years later Einstein laid the foundations of laser science. Einstein was a genius for sure, but I don’t think he envisioned in 1917 that we would use his discoveries to play movies in our houses, or print documents, or repair our vision. Or see an awesome light show at Disneyland.
And where did this phone in my pocket come from? Well, the story of the integrated circuit is fascinating, prominently involving Sherman Fairchild, and other good friends of Caltech like Arnold Beckman and Gordon Moore. But when you dig a little deeper, at the heart of the story are two theorists, Bill Shockley and John Bardeen, with an exceptionally clear understanding of how electrons move through semiconductors. Which led to transistors, and integrated circuits, and this phone. And we all know it doesn’t stop here. When the computers take over the world, you’ll know who to blame.
Incidentally, while Shockley was a Caltech grad (BS class of 1932), John Bardeen, one of the great theoretical physicists of the 20^{th} century, grew up in Wisconsin and studied physics and electrical engineering at the University of Wisconsin at Madison. I suppose that in the 1920s Wisconsin had no pressing need for physicists, but think of the return on the investment the state of Wisconsin made in the education of John Bardeen.^{1}
So, physics is a great investment, of incalculable value to society. But … that’s not why I do it. I suppose few physicists choose to do physics for that reason. So why do we do it? Yes, we like it, we’re good at it, but there is a stronger pull than just that. We honestly think there is no more engaging intellectual adventure than struggling to understand Nature at the deepest level. This requires attitude. Maybe you’ve heard that theoretical physicists have a reputation for arrogance. Okay, it’s true, we are arrogant, we have to be. But it is not that we overestimate our own prowess, our ability to understand the world. In fact, the opposite is often true. Physics works, it’s successful, and this often surprises us; we wind up being shocked again and again by the “unreasonable effectiveness of mathematics in the natural sciences.” It’s hard to believe that the equations you write down on a piece of paper can really describe the world. But they do.
And to display my own arrogance, I’ll tell you more about myself. This occasion has given me cause to reflect on my own 30+ years on the Caltech faculty, and what I’ve learned about doing theoretical physics successfully. And I’ll tell you just three principles, which have been important for me, and may be relevant to the future of the Burke Institute. I’m not saying these are universal principles – we’re all different and we all contribute in different ways, but these are principles that have been important for me.
My first principle is: We learn by teaching.
Why do physics at universities, at institutions of higher learning? Well, not all great physics is done at universities. Excellent physics is done at industrial laboratories and at our national laboratories. But the great engine of discovery in the physical sciences is still our universities, and US universities like Caltech in particular. Granted, US preeminence in science is not what it once was — it is a great national asset to be cherished and protected — but world changing discoveries are still flowing from Caltech and other great universities.
Why? Well, when I contemplate my own career, I realize I could never have accomplished what I have as a research scientist if I were not also a teacher. And it’s not just because the students and postdocs have all the great ideas. No, it’s more interesting than that. Most of what I know about physics, most of what I really understand, I learned by teaching it to others. When I first came to Caltech 30 years ago I taught advanced elementary particle physics, and I’m still reaping the return from what I learned those first few years. Later I got interested in black holes, and most of what I know about that I learned by teaching general relativity at Caltech. And when I became interested in quantum computing, a really new subject for me, I learned all about it by teaching it.^{2}
Part of what makes teaching so valuable for the teacher is that we’re forced to simplify, to strip down a field of knowledge to what is really indispensable, a tremendously useful exercise. Feynman liked to say that if you really understand something you should be able to explain it in a lecture for the freshman. Okay, he meant the Caltech freshman. They’re smart, but they don’t know all the sophisticated tools we use in our everyday work. Whether you can explain the core idea without all the peripheral technical machinery is a great test of understanding.
And of course it’s not just the teachers, but also the students and the postdocs who benefit from the teaching. They learn things faster than we do and often we’re just providing some gentle steering; the effect is to amplify greatly what we could do on our own. All the more so when they leave Caltech and go elsewhere to change the world, as they so often do, like those who are returning tonight for this Symposium. We’re proud of you!
My second principle is: The two-trick pony has a leg up.
I’m a firm believer that advances are often made when different ideas collide and a synthesis occurs. I learned this early, when as a student I was fascinated by two topics in physics, elementary particles and cosmology. Nowadays everyone recognizes that particle physics and cosmology are closely related, because when the universe was very young it was also very hot, and particles were colliding at very high energies. But back in the 1970s, the connection was less widely appreciated. By knowing something about cosmology and about particle physics, by being a two-trick pony, I was able to think through what happens as the universe cools, which turned out to be my ticket to becoming a Caltech professor.
It takes a community to produce two-trick ponies. I learned cosmology from one set of colleagues and particle physics from another set of colleagues. I didn’t know either subject as well as the real experts. But I was a two-trick pony, so I had a leg up. I’ve tried to be a two-trick pony ever since.
Another great example of a two-trick pony is my Caltech colleague Alexei Kitaev. Alexei studied condensed matter physics, but he also became intensely interested in computer science, and learned all about that. Back in the 1990s, perhaps no one else in the world combined so deep an understanding of both condensed matter physics and computer science, and that led Alexei to many novel insights. Perhaps most remarkably, he connected ideas about error-correcting codes, which protect information from damage, with ideas about novel quantum phases of matter, leading to radical new suggestions about how to operate a quantum computer using exotic particles we call anyons. These ideas had an invigorating impact on experimental physics and may someday have a transformative effect on technology. (We don’t know that yet; it’s still way too early to tell.) Alexei could produce an idea like that because he was a two-trick pony.^{3}
Which brings me to my third principle: Nature is subtle.
Yes, mathematics is unreasonably effective. Yes, we can succeed at formulating laws of Nature with amazing explanatory power. But it’s a struggle. Nature does not give up her secrets so readily. Things are often different than they seem on the surface, and we’re easily fooled. Nature is subtle.^{4}
Perhaps there is no greater illustration of Nature’s subtlety than what we call the holographic principle. This principle says that, in a sense, all the information that is stored in this room, or any room, is really encoded entirely and with perfect accuracy on the boundary of the room, on its walls, ceiling and floor. Things just don’t seem that way, and if we underestimate the subtlety of Nature we’ll conclude that it can’t possibly be true. But unless our current ideas about the quantum theory of gravity are on the wrong track, it really is true. It’s just that the holographic encoding of information on the boundary of the room is extremely complex and we don’t really understand in detail how to decode it. At least not yet.
This holographic principle, arguably the deepest idea about physics to emerge in my lifetime, is still mysterious. How can we make progress toward understanding it well enough to explain it to freshmen? Well, I think we need more two-trick ponies. Except maybe in this case we’ll need ponies who can do three tricks or even more. Explaining how spacetime might emerge from some more fundamental notion is one of the hardest problems we face in physics, and it’s not going to yield easily. We’ll need to combine ideas from gravitational physics, information science, and condensed matter physics to make real progress, and maybe completely new ideas as well. Some of our former Sherman Fairchild Prize Fellows are leading the way in bringing these ideas together, people like Guifre Vidal, who is here tonight, and Patrick Hayden, who very much wanted to be here.^{5} We’re very proud of what they and others have accomplished.
Bringing ideas together is what the Walter Burke Institute for Theoretical Physics is all about. I’m not talking about only the holographic principle, which is just one example, but all the great challenges of theoretical physics, which will require ingenuity and synthesis of great ideas if we hope to make real progress. We need a community of people coming from different backgrounds, with enough intellectual common ground to produce a new generation of two-trick ponies.
Finally, it seems to me that an occasion as important as the inauguration of the Burke Institute should be celebrated in verse. And so …
Who studies spacetime stress and strain
And excitations on a brane,
Where particles go back in time,
And physicists engage in rhyme?
Whose speedy code blows up a star
(Though it won’t quite blow up so far),
Where anyons, which braid and roam
Annihilate when they get home?
Who makes math and physics blend
Inside black holes where time may end?
Where do they do all this work?
The Institute of Walter Burke!
We’re very grateful to the Burke family and to the Sherman Fairchild Foundation. And we’re confident that your generosity will make great things happen!
However much I agree with Candidate A about social issues, I dislike his running mate. I lean toward Candidate B’s economic plans and C’s science-funding record, but nobody’s foreign policy impresses me. Must I settle on one candidate? May I not vote for a superposition of candidates?
Now you can—at least in theory. Caltech postdoc Ning Bao and I concocted quantum elections in which voters can superpose, entangle, and create probabilistic mixtures of votes.
Previous quantum-voting work has focused on privacy and cryptography. Ning and I channeled quantum game theory. Quantum game theorists ask what happens if players in classical games, such as the Prisoner’s Dilemma, could superpose strategies and share entanglement. Quantization can change the landscape of possible outcomes.
The Prisoner’s Dilemma, for example, concerns two thugs whom the police have arrested and isolated in separate cells. Each prisoner must decide whether to rat out the other. How much time each serves depends on who, if anyone, confesses. Whatever the other prisoner chooses, ratting shortens one’s own sentence, so each prisoner should rat. But both would serve less time if neither confessed. The prisoners can escape this dilemma using quantum resources.
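The classical logic can be checked mechanically. Here is a minimal sketch in Python; the particular sentence lengths are hypothetical numbers chosen only to exhibit the structure of the dilemma, not taken from any specific formulation:

```python
# Illustrative Prisoner's Dilemma. Each entry maps a pair of choices
# to (my years served, other's years served). "rat" = confess,
# "quiet" = stay silent. The numbers are made up for this sketch.
YEARS = {
    ("quiet", "quiet"): (1, 1),
    ("quiet", "rat"):   (3, 0),
    ("rat",   "quiet"): (0, 3),
    ("rat",   "rat"):   (2, 2),
}

def best_response(other_choice):
    """The choice that minimizes my sentence, given the other's choice."""
    return min(["quiet", "rat"],
               key=lambda mine: YEARS[(mine, other_choice)][0])

# Ratting is a dominant strategy: it is best whatever the other does...
assert best_response("quiet") == "rat"
assert best_response("rat") == "rat"
# ...yet mutual ratting leaves both worse off than mutual silence.
print(YEARS[("rat", "rat")], "vs", YEARS[("quiet", "quiet")])
```

Any payoff table with the same ordering of outcomes produces the same conclusion, which is exactly the structure that quantum strategies can disrupt.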
Introducing superpositions and entanglement into games helps us understand the power of quantum mechanics. Elections involve gameplay; pundits have been feeding off Hillary Clinton’s for months. So superpositions and entanglement merit introduction into elections.
How can you model elections with quantum systems? Though multiple options exist, Ning and I followed two principles: (1) A general quantum process—a preparation procedure, an evolution, and a measurement—should model a quantum election. (2) Quantum elections should remain as true as possible to their classical counterparts.
Given our quantum voting system, one can violate a quantum analogue of Arrow’s Impossibility Theorem. Arrow’s Theorem, developed by the Nobel-winning economist Kenneth Arrow during the mid-20^{th} century, is a no-go theorem about elections: If a constitution has three innocuous-seeming properties, it’s a dictatorship. Ning and I translated the theorem as faithfully as we knew how into our quantum voting scheme. The result, dubbed the Quantum Arrow Conjecture, rang false.
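Arrow’s Theorem itself takes real work to state precisely, but the classic Condorcet cycle, a standalone textbook example not taken from our paper, hints at why innocuous-seeming demands on ranked elections clash with one another:

```python
# Three voters, three candidates; each ballot lists candidates
# best-to-worst. This is the classic Condorcet profile, included
# here purely as an illustration of how ranked voting can misbehave.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

def majority_prefers(x, y):
    """True if a strict majority of ballots rank x above y."""
    wins = sum(b.index(x) < b.index(y) for b in ballots)
    return wins > len(ballots) / 2

# Pairwise majorities form a cycle: A beats B, B beats C, C beats A.
# "What the electorate prefers" is not even a consistent ranking.
assert majority_prefers("A", "B")
assert majority_prefers("B", "C")
assert majority_prefers("C", "A")
```

Arrow’s Theorem sharpens this kind of trouble into a no-go result; our quantum translation of its assumptions is what failed to hold.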
Superposing (and probabilistically mixing) votes entices me for a reason that science does: I feel ignorant. I read articles and interview political junkies about national defense; but I miss out on evidence and subtleties. I read quantum-physics books and work through papers; but I miss out on known mathematical tools and physical interpretations. Not to mention tools and interpretations that humans haven’t discovered.
Science involves identifying (and diminishing) what humanity doesn’t know. Science frees me to acknowledge my ignorance. I can’t throw all my weight behind Candidate A’s defense policy because I haven’t weighed all the arguments about defense, because I don’t know all the arguments. Believing that I do would violate my job description. How could I not vote for elections that accommodate superpositions?
Though Ning and I identified applications of superpositions and entanglement, more quantum strategies might await discovery. Monogamy of entanglement, discussed elsewhere on this blog, might limit the influence voters exert on each other. Also, we quantized ordinal voting systems (in which each voter ranks candidates, as in “A above C above B”). The quantization of cardinal voting (in which each voter grades the candidates, as in “5 points to A, 3 points to C, 2 points to B”) or another voting scheme might yield more insights.
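For concreteness, here is a toy sketch of the two classical tallies distinguished above, using the Borda count as one example of an ordinal rule. The ballots, candidates, and point values are invented for illustration:

```python
# Hypothetical three-voter election. Ordinal ballots rank candidates
# best-to-worst; cardinal ballots grade each candidate with points.
ordinal_ballots = [("A", "C", "B"), ("A", "B", "C"), ("B", "C", "A")]
cardinal_ballots = [{"A": 5, "C": 3, "B": 2},
                    {"A": 5, "B": 4, "C": 1},
                    {"B": 5, "C": 4, "A": 0}]

def borda(ballots):
    """Ordinal tally: a candidate earns (n - 1 - rank) points per ballot."""
    n = len(ballots[0])
    scores = {c: 0 for c in ballots[0]}
    for b in ballots:
        for rank, c in enumerate(b):
            scores[c] += n - 1 - rank
    return scores

def score_vote(ballots):
    """Cardinal tally: sum each voter's grades for each candidate."""
    scores = {c: 0 for c in ballots[0]}
    for b in ballots:
        for c, pts in b.items():
            scores[c] += pts
    return scores

# With these made-up ballots, the Borda winner is A but the
# score-vote winner is B: the two kinds of rule can disagree.
print(borda(ordinal_ballots))
print(score_vote(cardinal_ballots))
```

Quantizing either kind of tally means allowing the ballots themselves to be superposed or entangled rather than fixed classical lists.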
If you have such insights, drop us a line. Ideally before the presidential smack-down of 2016.
What I learned from a few others^{*}
If by some miracle I ever get to be a professor, there will be a few others I look to for teaching wisdom, some of which I’ve already made use of.
When it comes to writing problem sets, I would look to the two physicists who write the best problem sets I’ve ever seen: Kip Thorne and Andreas Ludwig. In my opinion, a problem set should not be something that just makes you apply the things you learned in class to other examples that are essentially the same as things you’ve seen before. The best problems make you work through an exciting new topic that the professor doesn’t have time to cover in lecture. I remember sitting in my office at the LIGO Hanford Observatory, where I was working during the summer after my sophomore year at Caltech, looking out at one of its 4 km long arms while working through some of Kip’s homework problems, because those were some of the best resources I could find to teach me how the amazing contraption really works.^{1} And Ludwig probably has the best problems of all. The first problem set from one of his classes on many-body field theory, a pretty typical one, consists of two problems written over five pages along with six pages of appendices. Those problem sets looked daunting, but they really weren’t. Once you read through everything and thought carefully about it, the problems weren’t so bad, and you ended up deriving some really cool results!^{2}
Finally, I’ll describe some things that I learned from Ed McCaffery who brilliantly dealt with the challenging job of teaching humanities at Caltech by accomplishing two goals. First, he “tricked” us into finding a “boring” subject interesting. Second, he had a great sense of who his audience was and taught us in a way that we would actually be receptive to once he had our attention. He accomplished the first goal by making his lectures extremely humorous and ridiculous, but in such a way that they were actually filled with content. He accomplished the second by boiling down the subject to a few key principles and essentially deriving the rest of the ideas from these while ignoring all of the technical details (it was almost as if we were in a physics or math class). The main reason I kept going to his classes was to be entertained—which is especially impressive seeing as how I would normally consider law to be a terribly boring subject—but I accidentally ended up learning a lot. Of all the hums I took at Caltech, his two classes are the ones I remember the most from to this day, and I know I’m not alone in this regard.
If I’m ever put in the challenging position of, for example, teaching an introductory physics course to students, say primarily premeds, who have no desire to learn it and are being forced to take it to satisfy some requirement, I would try to accomplish the same two goals that McCaffery did in what was essentially an equivalent scenario. It would be hard to be as funny as McCaffery, but maybe I could figure out some other way of being sufficiently ridiculous to “trick” the students into caring about the class. On the other hand, with a little experimentation and a willingness to ignore what I was told to do, I bet it wouldn’t be too hard to teach physics in a way that these students could relate to, and that they might actually remember something from after the class was over. In a bigger school like UCSB—where not everyone is going to be a scientist, a mathematician, or an engineer and there are several separate levels of introductory physics courses—I’ve often asked, “Why are we teaching premeds how to calculate the trajectory of a cannonball, the moment of inertia of a disk, or the efficiency of a heat engine in quantitative detail?” It’s pretty clear that most of them aren’t going to care about any of this, and they’re really not going to need to know how to do any of it after they pass the class anyway. So is it really that shocking that they tend to go through the class with the attitude that they’re just going to do what it takes to get a good grade, instead of with the attitude that they’d like to learn some science? I believe this is roughly equivalent to trying to teach Caltech undergrads the intricacies of the tax code the way you might teach USC law students, which I’m pretty sure wouldn’t be a huge success.
The first thing I would try would be to teach physics in a back-of-the-envelope kind of way, ignoring rigor and just trying to convey how powerful a tool physics, and really science in general, can be by applying it to problems that I thought the students would find interesting or amusing. For example, I might explain how a simple scaling argument shows that the height that most animals can jump, regardless of their size, is roughly universal (and, by observation, happens to be on the order of half a meter). Or maybe I’d explain how some basic knowledge of material properties allows you to estimate how the maximum height of a mountain depends on the properties of a planet—and when you plug in the numbers for the earth, you basically get the height of Mt. Everest.^{3} You could probably even do this in such a way that every example and every problem would actually be relevant to what premeds might use later in their careers. And maybe the most important part is that I know I would learn a lot from doing this—it could even be completely different each time it was taught—and so I would actually be excited about the material and would be motivated to explain it well.
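Both estimates can be carried out in a few lines. All inputs below are rough, assumed figures of my own choosing; the point is the scaling, not the third decimal place:

```python
# Back-of-envelope scaling estimates. Every input is a rough,
# illustrative guess, so trust only the order of magnitude.
g = 9.8  # m/s^2

# Jumping: the energy a muscle delivers scales with muscle mass,
# which scales with body mass m, so h = E/(m*g) drops out
# independent of the animal's size.
muscle_fraction = 0.12      # fraction of body mass doing the jumping (guess)
work_per_kg_muscle = 40.0   # J per kg of muscle per contraction (guess)
jump_height = muscle_fraction * work_per_kg_muscle / g
print(f"jump height ~ {jump_height:.1f} m")  # roughly half a meter

# Mountains: rock at the base crushes once the pressure rho*g*h
# exceeds its compressive strength, so h_max ~ strength/(rho*g).
rock_strength = 2e8   # Pa, rough compressive strength of granite (guess)
rock_density = 2.7e3  # kg/m^3
max_height = rock_strength / (rock_density * g)
print(f"max mountain height ~ {max_height/1e3:.0f} km")  # same order as Everest
```

Swapping in another planet’s gravity or another material’s strength immediately gives the corresponding estimate, which is exactly the kind of leverage such arguments offer.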
And if that didn’t work, I’d hope that by talking with the students and getting a sense of what was important to them, I would be able to come up with a different approach that would be successful. I simply don’t believe that it’s impossible to find a way to reach every kind of student, whether they be aspiring scientists, mathematicians, engineers, doctors, lawyers, historians, poets, artists, or politicians. Physics is just too exciting a subject for that to be true, but you’ve got to know your audience. (Maybe McCaffery feels the same way about law and economics, for all I know.)
Closing thoughts
There are two memories in particular of interactions with students that will always mean more to me than the teaching award itself, because they remind me of how I felt interacting with John. As we were leaving a particularly good lunch lecture,^{4} I remember Steve grinning from ear to ear, telling everyone within sight how awesome the lecture was. In a similar situation after one of my quantum field theory lectures,^{5} I was standing in the hallway answering a student’s question, but out of the corner of my eye I could see a small group of students with grins like Steve’s, and I overheard them saying that these were some of the best recitation sections they had ever been to. The other time was when a student came to my office hours and asked if he could ask me a question about some random advanced physics topic he was reading about, because he really wanted to understand it well and wanted to hear what I had to say about it. That student probably had no idea how good that made me feel, because that’s exactly how I felt anytime I asked John a question.
Looking back over my notes for the field theory course, I feel like I didn’t actually do that good a job overall, though I am happy that the students seemed to enjoy it and learn a lot from it. There are some things I am very proud of. Probably the biggest one was my last lecture, on the Casimir effect, which included a digression on what it means for something to be “renormalizable” or “non-renormalizable” and how there is absolutely nothing wrong with the second kind in the context of effective field theories.^{6} After an introduction to the philosophy of effective field theories (see footnote 6 for an excellent reference), that discussion mostly centered on the classic pictures, found in Figure 4.2, of the standard model superseding Fermi theory, followed by string theory superseding the effective field theory of gravity,^{7} though I also very briefly mentioned that neutrino masses and proton decay can be understood by similar arguments.
But the lectures taken as a whole were too technical for the intended audience, unlike my phase transition lecture. At the time I was giving the lectures, I was working through Weinberg’s books on QFT (and GR) and was very excited about his non-standard approach which seemed especially elegant to me.^{8} I think I let my excitement about Weinberg creep into some of my lectures without properly toning down the math^{9} as is most clearly illustrated by my attempt to explain the spin-statistics theorem. That could be much better explained to the intended audience with pictures along the lines of Feynman’s discussion in Elementary Particles and the Laws of Physics or John’s comments in his chapter on anyons.^{10} If I have the opportunity to teach ever again, I’ll try to do an even better John Preskill imitation, maybe perturbing it slightly with further wisdom gained from others over the years.
*This section lies somewhat out of the blog’s main line of development, and may be omitted in a first reading.
When it came time for me to give my last electrodynamics lecture, I remember thinking that I wanted to give a lecture that would inspire my students to go read more and which would serve as a good introduction to do just that—just as John’s lectures had done for me so many times. Now I am not nearly as quick as John,^{1} so I didn’t prepare my lecture in my head on the short walk from my office to the classroom where I taught, but I did prepare a lecture that I hoped would satisfy the above criteria. I thought that the way magnets actually work was left as a bit of a mystery in the standard course on electrodynamics. After toying around with a few magnet themes, I eventually prepared a lecture on the ferromagnetic-paramagnetic phase transition from a Landau-Ginzburg effective field theory point of view.^{2} The lecture had a strong theme about the importance and role of spontaneous symmetry breaking throughout physics, using magnets and magnons mainly as an example of the more general phenomenon. Since there was a lot of excitement about the discovery of the Higgs boson just a few months earlier, I also briefly hinted at the connection between Landau-Ginzburg and the Higgs. Keeping John’s excellent example in mind, my lecture consisted almost entirely of pictures with maybe an equation or two here and there.
This turned out to be one of the most rewarding experiences of my life. I taught identical recitation sections three times a week, and the section attendance dropped rapidly, probably to around 5–10 students per section, after the first few weeks of the quarter. Things were no different for my last lectures earlier in the week. But on the final day, presumably after news of what I was doing had time to spread and the students realized I was not just reviewing for the final, the classroom was packed. There were far more students than I had ever seen before, even on the first day of class. I remember saying something like, “Now your final is next week and I’m happy to answer any questions you may have on any of the material we have covered. Or if no one has any questions, I have prepared a lecture on the ferromagnetic-paramagnetic phase transition. So, does anyone have any questions?” The response was swift and decisive. One of the students sitting in the front row eagerly and immediately blurted out, “No, do that!”
So I proceeded to give my lecture, which seemed to go very well. The highlight of the lecture—at least for me because it is an example where a single physical idea explains a variety of physical phenomena occurring over a very large range of scales and because it demonstrated that at least some of the students had understood most of what I had already said—was when I had the class help me fill in and discuss Table 3.1. Many of the students were asking good questions throughout the lecture, several stayed after class to ask me more questions, and some came to my office hours after the break to ask even more questions—clearly after doing some reading on their own!
After that experience, I decided to completely ignore everything I was taught in TA training and to instead always try to do my best John Preskill imitation anytime I interacted with students. (Minus the voice, of course. There’s no replicating that, and even if I tried, hardly anyone would get the joke.) As I suggested earlier, a big frustration with what I was told to do was that—since UCSB has so many students with such a wide range of abilities—I should mostly aim my instruction at the middle of the bell curve, but could try to do some extra things to challenge the students on one tail and some remedial things to help the students on the other—but only if I felt like it. I was also told there’s no reason to “waste time” explaining how an idea fits into the bigger picture and is related to other physical concepts: most students aren’t going to care, and I could spend that time going over another explicit example showing the students exactly how to do something. I thought this was stupid advice and ignored it, and in my experience, explaining the wider context was usually pretty helpful even to the students struggling the most.^{3}
But when I started trying to imitate John by completely ignoring what I was supposed to do and instead just trying to explain exciting physics like he did, I seemed to become a fairly popular TA. I think most students appreciated being treated as naive equals who had the ability to learn awesome things—just as I was treated at the IQI. Instead of the null hypothesis being that your students are stupid and you need to coddle them otherwise they will be completely lost, make the null hypothesis that your students are actually pretty bright and are both interested in and able to learn exciting things, even if that means using a few concepts from other courses. But also be very willing to quickly and momentarily reject the null in a non-condescending manner. I concede the point that there are many arguments against my philosophy and that the anecdotal data to support my approach is potentially extremely biased,^{4} but I still think that it’s probably best to err on the side of allowing teachers to teach in the way that excites them the most since this will probably motivate the students to learn just by osmosis—in theory at least.
The last class I was a TA for (an advanced undergraduate elective on relativistic quantum mechanics) was probably the one that benefited the most from my attempts to imitate John, and from my regurgitating things that I learned from him and a few others (see Figure 4.2 in the last post). The course used Dirac and Bjorken and Drell as textbooks, which are not the books that I would have chosen for the intended audience: advanced undergrads who might want to take QFT in the future.^{5} By that time, I was fully into ignoring what I was supposed to do and was just going to try to teach my students some interesting physics. So there was no way I was going to spend recitation sections going over, in quantitative detail, the Dirac Sea and other such outdated topics that Julian Schwinger would describe as “best regarded as a historical curiosity, and forgotten.”^{6} Instead, I decided right from the beginning that I would give a series of lectures introducing quantum field theory, giving my best guess as to the lectures John would have given if Steve or I had asked him questions such as “Why are spin and statistics related?” or “Why are neutral conducting plates attracted to each other in vacuum?” Those lectures were often pretty crowded, one time so much so that I literally tripped over the outstretched legs of one of the students who was sitting on the ground in the front of the classroom as I was trying to work at the boards; all of the seats were taken and several people were already standing in the back. (In all honesty, that classroom was terrible and tiny.^{7} Nevertheless, at least for that one lecture, I would be surprised to learn that everyone in attendance was enrolled in the course.)
In the last post, I’ll describe some other teaching wisdom I learned from a few other good professors as well as concluding with some things I wish I had done better.