Local operations and Chinese communications

The workshop spotlighted entanglement. It began in Shanghai, paused as participants hopped the Taiwan Strait, and resumed in Taipei. We discussed quantum operations and chaos, thermodynamics and field theory.1 I planned to return from Taipei to Shanghai to Los Angeles.

Quantum thermodynamicist Nelly Ng and I drove to the Taipei airport early. News from Air China curtailed our self-congratulations: China’s military was running an operation near Shanghai. Commercial planes couldn’t land. I’d miss my flight to LA.


Two quantum thermodynamicists in Shanghai

An operation?

Quantum information theorists use a mindset called operationalism. We envision experimentalists in separate labs. Call the experimentalists Alice, Bob, and Eve (ABE). We tell stories about ABE to formulate and analyze problems. Which quantum states do ABE prepare? How do ABE evolve, or manipulate, the states? Which measurements do ABE perform? Do they communicate about the measurements’ outcomes?

Operationalism concretizes ideas. The outlook keeps us from drifting into philosophy and into abstractions to which physics tools are difficult to apply.2 Operationalism infuses our language, our framing of problems, and our mathematical proofs.

Experimentalists can perform some operations more easily than others. Suppose that Alice controls the magnets, lasers, and photodetectors in her lab; Bob controls the equipment in his; and Eve controls the equipment in hers. Each experimentalist can perform local operations (LO). Suppose that Alice, Bob, and Eve can talk on the phone and send emails. They exchange classical communications (CC).

You can’t generate entanglement using LOCC. Entanglement consists of strong correlations that quantum systems can share and that classical systems can’t. A quantum system in Alice’s lab can hold more information about a quantum system of Bob’s than any classical system could. We must create and control entanglement to operate quantum computers. Creating and controlling entanglement poses challenges. Hence quantum information scientists often model easy-to-perform operations with LOCC.
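For the mathematically inclined, here is that claim in symbols (standard textbook material, not anything special to this story). A joint state of Alice’s and Bob’s systems is unentangled, or separable, if it can be written as

\rho_{AB} = \sum_i p_i \, \rho_i^A \otimes \rho_i^B,

a probabilistic mixture of independent local states. LOCC maps separable states to separable states, so no amount of local tinkering and phoning can conjure entanglement from nothing.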

Suppose that some experimentalist Charlie loans entangled quantum systems to Alice, Bob, and Eve. How efficiently can ABE compute some quantity, exchange quantum messages, or perform other information-processing tasks, using that entanglement? Such questions underlie quantum information theory.


Taipei’s night market. Or Caltech’s neighborhood?

Local operations.

Nelly and I performed those, trying to finagle me to LA. I inquired at Air China’s check-in desk in English. Nelly inquired in Mandarin. An employee smiled sadly at each of us.

We branched out into classical communications. I called Expedia (“No, I do not want to fly to Manila”), United Airlines (“No flights for two days?”), my credit-card company, Air China’s American reservations office, Air China’s Chinese reservations office, and Air China’s Taipei reservations office. I called AT&T to ascertain why I couldn’t reach Air China (“Yes, please connect me to the airline. Could you tell me the number first? I’ll need to dial it after you connect me and the call is then dropped”).

As I called, Nelly emailed. She alerted Bob, aka Janet (Ling-Yan) Hung, who hosted half the workshop at Fudan University in Shanghai. Nelly emailed Eve, aka Feng-Li Lin, who hosted half the workshop at National Taiwan University in Taipei. Janet twiddled the magnets in her lab (investigated travel funding), and Feng-Li cooled a refrigerator in his.

ABE can process information only so efficiently, using LOCC. The time crept from 1:00 PM to 3:30.


Nelly Ng uses classical communications.

What could we have accomplished with quantum communication? Using LOCC, Alice can manipulate quantum states (like an electron’s orientation) in her lab. She can send nonquantum messages (like “My flight is delayed”) to Bob. She can’t send quantum information (like an electron’s orientation).

Alice and Bob can ape quantum communication, given entanglement. Suppose that Charlie strongly correlates two electrons. Suppose that Charlie gives Alice one electron and gives Bob the other. Alice can send one qubit, one unit of quantum information, to Bob: she measures the electron carrying her message jointly with her half of the pair, phones Bob the outcome (two ordinary bits), and Bob tweaks his electron accordingly. We call that process quantum teleportation.
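Here is a minimal numerical sketch of the protocol, my own illustration in Python rather than anything from the workshop. Whatever amplitudes alpha and beta encode Alice’s message, each of her four possible joint-measurement outcomes occurs with probability 1/4, and Bob’s phoned-in correction leaves his electron in the message state:

```python
import numpy as np

# Charlie entangles two electrons: |Phi+> = (|00> + |11>)/sqrt(2)
phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)

# The quantum message Alice wants Bob to hold (any normalized amplitudes work)
alpha, beta = 0.6, 0.8j
psi = np.array([alpha, beta])

# Qubit order: (message, Alice's half of the pair, Bob's half)
state = np.kron(psi, phi_plus)

# Bell basis for the two qubits in Alice's lab
bell = np.array([[1, 0, 0, 1],     # Phi+
                 [1, 0, 0, -1],    # Phi-
                 [0, 1, 1, 0],     # Psi+
                 [0, 1, -1, 0]]) / np.sqrt(2)

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
corrections = [np.eye(2), Z, X, Z @ X]  # Bob's fix for each phoned-in outcome

for k in range(4):  # loop over Alice's four possible outcomes
    bob = bell[k] @ state.reshape(4, 2)   # Bob's unnormalized conditional state
    prob = np.vdot(bob, bob).real         # each outcome has probability 1/4
    fixed = corrections[k] @ bob / np.sqrt(prob)
    print(k, round(prob, 3), round(abs(np.vdot(psi, fixed)), 3))  # overlap 1.0
```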

Suppose that air-traffic control had loaned entanglement to Janet, Feng-Li, and me. Could we have finagled me to LA quickly?

Quantum teleportation differs from human teleportation.


xkcd.com/465

We didn’t need teleportation. Feng-Li arranged for me to visit Taiwan’s National Center for Theoretical Sciences (NCTS) for two days. Air China agreed to return me to Shanghai afterward. United would fly me to LA, thanks to help from Janet. Nelly rescued my luggage from leaving on the wrong flight.

Would I rather have teleported? I would have avoided a bushel of stress. But I wouldn’t have learned from Janet about Chinese science funding, wouldn’t have heard Feng-Li’s views about gravitational waves, wouldn’t have glimpsed Taiwanese countryside flitting past the train we rode to the NCTS.

According to some metrics, classical resources outperform quantum.


At Taiwan’s National Center for Theoretical Sciences

The workshop organizers have generously released videos of the lectures. My lecture about quantum chaos and fluctuation relations appears here and here. More talks appear here.

With gratitude to Janet Hung, Feng-Li Lin, and Nelly Ng; to Fudan University, National Taiwan University, and Taiwan’s National Center for Theoretical Sciences for their hospitality; and to Xiao Yu for administrative support.

Glossary and other clarifications:

1Field theory describes subatomic particles and light.

2Physics and philosophy enrich each other. But I haven’t trained in philosophy. I benefit from differentiating physics problems that I’m equipped to solve from philosophy problems that I’m not.

It’s CHAOS!

My brother and I played the video game Sonic the Hedgehog on a Sega Dreamcast. The hero has spiky electric-blue fur and can run at the speed of sound.1 One of us, then the other, would battle monsters. Monster number one oozes onto a dark city street as an aquamarine puddle. The puddle spreads, then surges upward to form limbs and claws.2 The limbs splatter when Sonic attacks: Aqua globs rain onto the street.


The monster’s master, Dr. Eggman, has ginger mustachios and a body redolent of his name. He scoffs as the heroes congratulate themselves.

“Fools!” he cries, the pauses in his speech heightening the drama. “[That monster is] CHAOS…the GOD…of DE-STRUC-TION!” His cackle could put a Disney villain to shame.

Dr. Eggman’s outburst comes to mind when anyone asks what topic I’m working on.

“Chaos! And the flow of time, quantum theory, and the loss of information.”


Alexei Kitaev, a Caltech physicist, hooked me on chaos. I TAed his spring-2016 course. The registrar calls the course Ph 219c: Quantum Computation. I call the course Topics that Interest Alexei Kitaev.

“What do you plan to cover?” I asked at the end of winter term.

Topological quantum computation, Alexei replied. How you simulate Hamiltonians with quantum circuits. Or maybe…well, he was thinking of discussing black holes, information, and chaos.

If I’d had a tail, it would have wagged.

“What would you say about black holes?” I asked.


Sonic’s best friend, Tails the fox.

I fwumped down on the couch in Alexei’s office, and Alexei walked to his whiteboard. Scientists first noticed chaos in classical systems. Consider a double pendulum—a pendulum that hangs from the bottom of a pendulum that hangs from, say, a clock face. Imagine pulling the bottom pendulum far to one side, then releasing. The double pendulum will swing, bend, and loop-the-loop like a trapeze artist. Imagine freezing the trapeze artist after a time t.

What if you pulled another double pendulum a hair’s breadth less far? You could let the pendulum swing, wait for a time t, and freeze this pendulum. This pendulum would probably lie far from its brother. This pendulum would probably have been moving with a different speed than its brother, in a different direction, just before the freeze. The double pendulum’s motion changes loads if the initial conditions change slightly. This sensitivity to initial conditions characterizes classical chaos.
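The sensitivity is easy to see numerically. Below is a quick sketch of mine, not Alexei’s: two double pendulums with equal masses and lengths, integrated with the standard textbook equations of motion, released a billionth of a radian apart. Their separation in phase space balloons.

```python
import numpy as np

# Double pendulum with equal masses and lengths, angles measured from vertical.
# y = (theta1, theta2, omega1, omega2); standard equations of motion.
g, l = 9.8, 1.0

def deriv(y):
    t1, t2, w1, w2 = y
    d = t1 - t2
    den = 3 - np.cos(2 * d)   # common denominator for m1 = m2, l1 = l2
    a1 = (-3 * g * np.sin(t1) - g * np.sin(t1 - 2 * t2)
          - 2 * np.sin(d) * (w2**2 * l + w1**2 * l * np.cos(d))) / (l * den)
    a2 = (2 * np.sin(d) * (2 * w1**2 * l + 2 * g * np.cos(t1)
          + w2**2 * l * np.cos(d))) / (l * den)
    return np.array([w1, w2, a1, a2])

def rk4_step(y, dt):  # fourth-order Runge-Kutta integration step
    k1 = deriv(y)
    k2 = deriv(y + dt * k1 / 2)
    k3 = deriv(y + dt * k2 / 2)
    k4 = deriv(y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

y_a = np.array([2.0, 2.0, 0.0, 0.0])    # pulled far to one side, released
y_b = y_a + np.array([1e-9, 0, 0, 0])   # a hair's breadth less far

dt = 0.001
for step in range(1, 20001):
    y_a, y_b = rk4_step(y_a, dt), rk4_step(y_b, dt)
    if step % 5000 == 0:
        print(f"t = {step * dt:2.0f} s, separation = {np.linalg.norm(y_a - y_b):.1e}")
```

The separation grows roughly exponentially until it saturates: the hallmark of classical chaos.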

A mathematical object F(t) reflects quantum systems’ sensitivities to initial conditions. [Experts: F(t) can evolve as an exponential governed by a Lyapunov-type exponent: \sim 1 - ({\rm const.})e^{\lambda_{\rm L} t}.] F(t) encodes a hypothetical process that snakes back and forth through time. This snaking earned F(t) the name “the out-of-time-ordered correlator” (OTOC). The snaking prevents experimentalists from measuring quantum systems’ OTOCs easily. But experimentalists are trying, because F(t) reveals how quantum information spreads via entanglement. Such entanglement distinguishes black holes, cold atoms, and specially prepared light from everyday, classical systems.
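For concreteness, the form of F(t) used throughout the OTOC literature involves two operators W and V and the Heisenberg-picture evolution of one of them:

F(t) = \langle W^\dagger(t) V^\dagger W(t) V \rangle, \qquad W(t) = e^{iHt} W e^{-iHt}.

Reading the expression from right to left, the e^{-iHt}’s and e^{iHt}’s interleaved with the V’s carry the system forward and backward through time, twice: the snaking.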

Alexei illustrated, on his whiteboard, the sensitivity to initial conditions.

“In case you’re taking votes about what to cover this spring,” I said, “I vote for chaos.”

We covered chaos. A guest attended one lecture: Beni Yoshida, a former IQIM postdoc. Beni and colleagues had devised quantum error-correcting codes for black holes.3 Beni’s foray into black-hole physics had led him to F(t). He’d written an OTOC paper that Alexei presented about. Beni presented about a follow-up paper. If I’d had another tail, it would have wagged.


Sonic’s friend has two tails.

Alexei’s course ended. My research shifted to many-body localization (MBL), a quantum phenomenon that stymies the spread of information. OTOC talk burbled beyond my office door.

At the end of the summer, IQIM postdoc Yichen Huang posted on Facebook, “In the past week, five papers (one of which is ours) appeared . . . studying out-of-time-ordered correlators in many-body localized systems.”

I looked down at the MBL calculation I was performing. I looked at my computer screen. I set down my pencil.

“Fine.”

I marched to John Preskill’s office.


The bosses. Of different sorts, of course.

The OTOC kept flaring on my radar, I reported. Maybe the time had come for me to try contributing to the discussion. What might I contribute? What would be interesting?

We kicked around ideas.

“Well,” John ventured, “you’re interested in fluctuation relations, right?”

Something clicked like the “power” button on a video-game console.

Fluctuation relations are equations derived in nonequilibrium statistical mechanics. They describe systems driven far from equilibrium, like a DNA strand whose ends you’ve yanked apart. Experimentalists use fluctuation theorems to infer a difficult-to-measure quantity, a difference \Delta F between free energies. Fluctuation relations imply the Second Law of Thermodynamics. The Second Law relates to the flow of time and the loss of information.
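The paradigmatic example is Jarzynski’s equality. Yank the DNA strand many times at temperature T, record the work W spent in each trial, and the average of an exponential of the work equals an exponential of \Delta F:

\langle e^{-W / k_{\rm B} T} \rangle = e^{-\Delta F / k_{\rm B} T}.

Measuring enough work values, experimentalists can infer the equilibrium quantity \Delta F from decidedly nonequilibrium trials.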

Time…loss of information…Fluctuation relations smelled like the OTOC. The two had to join together.


I spent the next four days sitting, writing, obsessed. I’d read a paper, three years earlier, that casts a fluctuation relation in terms of a correlator. I unearthed the paper and redid the proof. Could I deform the proof until the paper’s correlator became the out-of-time-ordered correlator?

Apparently. I presented my argument to my research group. John encouraged me to clarify a point: I’d defined a mathematical object A, a probability amplitude. Did A have physical significance? Could anyone measure it? I consulted measurement experts. One identified A as a quasiprobability, a quantum generalization of a probability, used to model light in quantum optics. With the experts’ assistance, I devised two schemes for measuring the quasiprobability.

The result is a fluctuation-like relation that contains the OTOC. The OTOC, the theorem reveals, is a combination of quasiprobabilities. Experimentalists can measure quasiprobabilities with weak measurements, gentle probings that barely disturb the probed system. The theorem suggests two experimental protocols for inferring the difficult-to-measure OTOC, just as fluctuation relations suggest protocols for inferring the difficult-to-measure \Delta F. Just as fluctuation relations cast \Delta F in terms of a characteristic function of a probability distribution, this relation casts F(t) in terms of a characteristic function of a (summed) quasiprobability distribution. Quasiprobabilities reflect entanglement, as the OTOC does.


Collaborators and I are extending this work theoretically and experimentally. How does the quasiprobability look? How does it behave? What mathematical properties does it have? The OTOC is motivating questions not only about our quasiprobability, but also about quasiprobability and weak measurements. We’re pushing toward measuring the OTOC quasiprobability with superconducting qubits or cold atoms.

Chaos has evolved from an enemy to a curiosity, from a god of destruction to an inspiration. I no longer play the electric-blue hedgehog. But I remain electrified.

 

1I hadn’t started studying physics, ok?

2Don’t ask me how the liquid’s surface tension rises enough to maintain the limbs’ shapes.

3Black holes obey quantum mechanics. Quantum systems can solve certain problems more quickly than ordinary (classical) computers. Computers make mistakes. We fix mistakes using error-correcting codes. The codes required by quantum computers differ from the codes required by ordinary computers. Systems that contain black holes, we can regard as performing quantum computations. Black-hole systems’ mistakes admit of correction via the code constructed by Beni & co. 

Quantum Chess

Two years ago, as a graduate student in Physics at USC,  I began work on a game whose mechanics were based on quantum mechanics. When I had a playable version ready, my graduate adviser, Todd Brun, put me in contact with IQIM’s Spiros Michalakis, who had already worked with Google to design qCraft, a mod introducing quantum mechanics into Minecraft. Spiros must have seen potential in my clunky prototype and our initial meeting turned into weekly brainstorming lunches at Caltech’s Chandler cafeteria. More than a year later, the game had evolved into Quantum Chess and we began talking about including a video showing some gameplay at an upcoming Caltech event celebrating Feynman’s quantum legacy. The next few months were a whirlwind. Somehow this video turned into a Quantum Chess battle for the future of humanity, between Stephen Hawking and Paul Rudd. And it was being narrated by Keanu Reeves! The video, called Anyone Can Quantum, and directed by Alex Winter, premiered at Caltech’s One Entangled Evening on January 26, 2016 and has since gone viral. If you haven’t watched it, now would be a good time to do so (if you are at work, be prepared to laugh quietly).

So, what exactly is Quantum Chess and how does it make use of quantum physics? It is a modern take on the centuries-old game of strategy that endows each chess piece with quantum powers. You don’t need to know quantum mechanics to play the game. On the other hand, understanding the rules of chess might help [1].  But if you already know the basics of regular chess, you can just start playing. Over time, your brain will get used to some of the strange quantum behavior of the chess pieces and the battles you wage in Quantum Chess will make regular chess look like tic-tac-toe [2].

In this post, I will discuss the concept of quantum superposition and how it plays a part in the game. There will be more posts to follow that will discuss entanglement, interference, and quantum measurement [3].

In Quantum Chess, players have the ability to perform quantum moves in addition to the standard chess moves. Each time a player chooses to move a piece, they can indicate whether they want to perform a standard move or a quantum move. A quantum move creates a superposition of boards. If any of you ever saw Star Trek 3D Chess, you can think of this in a similar way.

Star Trek 3D Chess

There are multiple boards on which pieces exist. However, in Quantum Chess, the number of possible boards is not fixed; it can increase or decrease. All possible boards exist in a superposition. The player is presented with a single board that represents the entire superposition. In Quantum Chess, any individual move will act on all boards at the same time. Each time a player makes a quantum move, the number of possible boards present in the superposition doubles. Let’s look at some pictures that might clarify things.

The Quantum Chess board begins in the same configuration as standard chess.

All pawns move the same as they would in standard chess, but all other pieces get a choice of two movement types, standard or quantum. Standard moves act exactly as they would in standard chess. Quantum moves, however, create superpositions. Let’s look at an example of a quantum move for the white queen.

In this diagram, we see what happens when we perform a quantum move of the white queen from D1 to D3. We get two possible boards. On one board the queen did not move at all. On the other, the queen did move. Each board has a 50% chance of “existence”. Showing every possible board, though, would get quite complicated after just a few moves. So, the player view of the game is a single board. After the same quantum queen move, the player sees this:

The teal-colored “fill” of each queen shows the probability of finding the queen in that space: the same queen, existing in different locations on the board. The queen is in a superposition of being in two places at once. On their next turn, the player can choose to move any one of their pieces.

So, let’s talk about moving the queen, again. You may be wondering, “What happens if I want to move a piece that is in a superposition?” The queen exists in two spaces. You choose which of those two positions you would like to move from, and you can perform the same standard or quantum moves from that space. Let’s look at trying to perform a standard move, instead of a quantum move, on the queen that now exists in a superposition. The result would be as follows:

The move acts on all boards in the superposition. On any board where the queen is in space D3, it will be moved to B5. On any board where the queen is still in space D1, it will not be moved. There is a 50% chance that the queen is still in space D1 and a 50% chance that it is now located in B5. The player view, as illustrated below, would again be a 50/50 superposition of the queen’s position. This was just an example of a standard move on a piece in a superposition, but a quantum move would work similarly.
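For readers who like mechanics spelled out, here is a toy sketch of the bookkeeping described above. It is my own illustration, not the game’s engine: a state maps each classical board to an amplitude, a quantum move splits matching boards into stay/move halves, and a standard move acts on every board at once.

```python
import numpy as np

def quantum_move(state, piece, src, dst):
    """Split every board where `piece` occupies `src` into stay/move halves."""
    new = {}
    for board, amp in state.items():
        b = dict(board)
        if b.get(piece) == src:
            b[piece] = dst
            for result in (board, frozenset(b.items())):
                new[result] = new.get(result, 0) + amp / np.sqrt(2)
        else:
            new[board] = new.get(board, 0) + amp
    return new

def standard_move(state, piece, src, dst):
    """Move `piece` on every board where it occupies `src`; leave the rest alone."""
    new = {}
    for board, amp in state.items():
        b = dict(board)
        if b.get(piece) == src:
            b[piece] = dst
        key = frozenset(b.items())
        new[key] = new.get(key, 0) + amp
    return new

# White queen on D1: quantum move D1 -> D3, then standard move D3 -> B5
state = {frozenset({("wQ", "D1")}): 1.0}
state = quantum_move(state, "wQ", "D1", "D3")
state = standard_move(state, "wQ", "D3", "B5")
for board, amp in state.items():
    print(dict(board), "probability", round(abs(amp) ** 2, 2))
# Two boards remain: queen on D1 (probability 0.5), queen on B5 (probability 0.5)
```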


Some of you might have noticed the quantum move basically gives you a 50% chance to pass your turn. Not a very exciting thing to do for most players. That’s why I’ve given the quantum move an added bonus. With a quantum move, you can choose a target space that is up to two standard moves away! For example, the queen could choose a target that is forward two spaces and then left two spaces. Normally, this would take two turns: The first turn to move from D1 to D3 and the second turn to move from D3 to B3. A quantum move gives you a 50% chance to move from D1 to B3 in a single turn!

Let’s look at a quantum queen move from D1 to B3.

Just like the previous quantum move we looked at, we get a 50% probability that the move was successful and a 50% probability that nothing happened. As a player, we would see the board below.

There is a 50% chance the queen completed two standard moves in one turn! Don’t worry though, things are not just random. The fact that the board is a superposition of boards and that movement is unitary (just a fancy word for how quantum things evolve) can lead to some interesting effects. I’ll end this post here. Now, I hope I’ve given you some idea of how superposition is present in Quantum Chess. In the next post I’ll go into entanglement and a bit more on the quantum move!

Notes:

[1] For those who would like to know more about chess, here is a good link.

[2] If you would like to see a public release of Quantum Chess (and get a copy of the game), consider supporting the Kickstarter campaign.

[3] I am going to be describing aspects of the game in terms of probability and multiple board states. For those with a scientific or technical understanding of how quantum mechanics works, this may not appear to be very quantum. I plan to go into a more technical description of the quantum aspects of the game in a later post. Also, a reminder to the non-scientific audience. You don’t need to know quantum mechanics to play this game. In fact, you don’t even need to know what I’m going to be describing here to play! These posts are just for those with an interest in how concepts like superposition, entanglement, and interference can be related to how the game works.

Toward physical realizations of thermodynamic resource theories

“This is your arch-nemesis.”

The thank-you slide of my presentation remained onscreen, and the question-and-answer session had begun. I was presenting a seminar about thermodynamic resource theories (TRTs), models developed by quantum-information theorists for small-scale exchanges of heat and work. The audience consisted of condensed-matter physicists who studied graphene and photonic crystals. I was beginning to regret my topic’s abstractness.

The question-asker pointed at a listener.

“This is an experimentalist,” he continued, “your arch-nemesis. What implications does your theory have for his lab? Does it have any? Why should he care?”

I could have answered better. I apologized that quantum-information theorists, reared on the rarefied air of Dirac bras and kets, had developed TRTs. I recalled the baby steps with which science sometimes migrates from theory to experiment. I could have advocated for bounding, with idealizations, efficiencies achievable in labs. I should have invoked the connections being developed with fluctuation results, statistical mechanical theorems that have withstood experimental tests.

The crowd looked unconvinced, but I scored one point: The experimentalist was not my arch-nemesis.

“My new friend,” I corrected the questioner.

His question has burned in my mind for two years. Experiments have inspired, but not guided, TRTs. TRTs have yet to drive experiments. Can we strengthen the connection between TRTs and the natural world? If so, what tools must resource theorists develop to predict outcomes of experiments? If not, are resource theorists doing physics?


A Q&A more successful than mine.

I explore answers to these questions in a paper released today. Ian Durham and Dean Rickles were kind enough to request a contribution for a book of conference proceedings. The conference, “Information and Interaction: Eddington, Wheeler, and the Limits of Knowledge,” took place at the University of Cambridge (including a graveyard thereof), thanks to FQXi (the Foundational Questions Institute).

What, I asked my advisor, does one write for conference proceedings?

“Proceedings are a great opportunity to get something off your chest,” John said.

That seminar Q&A had sat on my chest, like a pet cat who half-smothers you while you’re sleeping, for two years. Theorists often justify TRTs with experiments.* Experimentalists, an argument goes, are probing limits of physics. Conventional statistical mechanics describes these regimes poorly. To understand these experiments, and to apply them to technologies, we must explore TRTs.

Does that argument not merit testing? If experimentalists observe the extremes predicted with TRTs, then the justifications for, and the timeliness of, TRT research will grow.


Something to get off your chest. Like the contents of a conference-proceedings paper, according to my advisor.

You’ve read the paper’s introduction, the first eight paragraphs of this blog post. (Who wouldn’t want to begin a paper with a mortifying anecdote?) Later in the paper, I introduce TRTs and their role in one-shot statistical mechanics, the analysis of work, heat, and entropies on small scales. I discuss whether TRTs can be realized and whether physicists should care. I identify eleven opportunities for shifting TRTs toward experiments. Three opportunities concern what merits realizing and how, in principle, we can realize it. Six adjustments to TRTs could improve TRTs’ realism. Two more-out-there opportunities, though less critical to realizations, could diversify the platforms with which we might realize TRTs.

One opportunity is the physical realization of thermal embezzlement. TRTs, like thermodynamic laws, dictate how systems can and cannot evolve. Suppose that a state R cannot transform into a state S: R \not\mapsto S. An ancilla C, called a catalyst, might facilitate the transformation: R + C \mapsto S + C. Catalysts act like engines used to extract work from a pair of heat baths.

Engines degrade, so a realistic transformation might yield S + \tilde{C}, wherein \tilde{C} resembles C. For certain definitions of “resembles,”** TRTs imply, one can extract arbitrary amounts of work by negligibly degrading C. Detecting the degradation—the work extraction’s cost—is difficult. Extracting arbitrary amounts of work at a difficult-to-detect cost contradicts the spirit of thermodynamic law.
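The canonical example is the van Dam–Hayden family of embezzling states,

| \mu_n \rangle = \frac{1}{\sqrt{H_n}} \sum_{j=1}^{n} \frac{1}{\sqrt{j}} \, | j \rangle | j \rangle, \qquad H_n = \sum_{j=1}^{n} \frac{1}{j}.

For large n, local operations alone can convert | \mu_n \rangle into a state whose trace distance from “| \mu_n \rangle plus any desired entangled pair” vanishes as n grows: the catalyst degrades negligibly while entanglement appears seemingly from nowhere.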

The spirit, not the letter. Embezzlement seems physically realizable, in principle. Detecting embezzlement could push experimentalists’ abilities to distinguish between close-together states C and \tilde{C}. I hope that that challenge, and the chance to violate the spirit of thermodynamic law, attracts researchers. Alternatively, theorists could redefine “resembles” so that C doesn’t rub the law the wrong way.


The paper’s broadness evokes a caveat of Arthur Eddington’s. In 1927, Eddington presented Gifford Lectures entitled The Nature of the Physical World. Being a physicist, he admitted, “I have much to fear from the expert philosophical critic.” Specializing in TRTs, I have much to fear from the expert experimental critic. The paper is intended to point out, and to initiate responses to, the lack of physical realizations of TRTs. Some concerns are practical; some, philosophical. I expect and hope that the discussion will continue…preferably with more cooperation and charity than during that Q&A.

If you want to continue the discussion, drop me a line.

*So do theorists-in-training. I have.

**A definition that involves the trace distance.

The mentors that shape us

Three years and three weeks ago I started my first blog. I wasn’t quite sure what to call it, so I went to John for advice. I had several names in mind, but John quickly zeroed in on one: Quantum Frontiers. The URL was available, the name was simple and to the point, it had the word quantum in it, and it was appropriate for a blog that was to provide a vantage point from which the public could view the frontiers of quantum science. But there was a problem: we had no followers, and when I first asked John if he would write something for the blog, he had said: I don’t know… I will see… maybe some day… let me think about it. The next day John uploaded More to come, the first real post on Quantum Frontiers after the introductory Hello quantum world! We had agreed on a system in order to keep the quality of the posts above some basic level: we would send each other our posts for editing before we made them public. That way, we could catch any silly typos and have a second pair of eyes do some fact-checking. So, when John sent me his first post, I set to work editing away typos. But the power that comes with being editor-in-chief corrupts. So, when I saw the following sentence in More to come…

I was in awe of Wheeler. Some students thought he sucked.

I immediately changed it to…

I was in awe of Wheeler. Some students thought less of him.

And next, when I saw John write about himself,

Though I’m 59, few students seemed awed. Some thought I sucked. Maybe I did sometimes.

I massaged it into…

Though I’m 59, few students seemed awed. Some thought I was not as good. Maybe I wasn’t sometimes.

When John published the post, I read it again for any typos I might have missed. There were no typos. I felt useful! But when I saw that all mentions of sucked had been restored to their rightful place, I felt like an idiot. John did not fire a strongly-worded email back my way asking for an explanation as to my taking liberties with his own writing. He simply trusted that I would get the message in the comfort of my own awkwardness. It worked beautifully. John had set the tone for Quantum Frontiers’ authentic voice with his very first post. It was to be personal, even if the subject matter was as scientifically hardcore as it got.

So when the time came for me to write my first post, I made it personal. I wrote about my time in Los Alamos as a postdoc, working on a problem in mathematical physics that almost broke me. It was Matt Hastings, an intellectual tornado, who helped me through these hard times. As my mentor, he didn’t say things like Well done! Great progress! Good job, Spiro! He said, You can do this. And when I finally did it, when I finally solved that damn problem, Matt came back to me and said: Beyond some typos, I cannot find any mistakes. Good job, Spiro. And it meant the world to me. The sleepless nights, the lonely days up in the Pajarito mountains of New Mexico, the times I had resolved to go work for my younger brother as a waiter in his first restaurant… those were the times that I had come upon a fork in the road and my mentor had helped me choose the path less traveled.

When the time came for me to write my next post, I ended by offering two problems for the readers to solve, with the following text as motivation:

This post is supposed to be an introduction to the insanely beautiful world of problem solving. It is not a world ruled by Kings and Queens. It is a world where commoners like you and me can become masters of their domain and even build an empire.


Doi-Inthananon temple in Chiang Mai, Thailand. A breathtaking city, host of this year’s international math olympiad.

It has been way too long since my last “problem solving” post, so I leave you with a problem from this year’s International Math Olympiad, which took place in gorgeous Chiang Mai, Thailand. FiveThirtyEight‘s recent article about the dominance of the US math olympiad team in this year’s competition gives some context about the degree of difficulty of this problem:

Determine all triples (a, b, c) of positive integers such that each of the numbers ab-c, bc-a, ca-b is a power of two.

Like Fermat’s Last Theorem, this problem is easy to describe and hard to solve. Only 5 percent of the competitors got full marks on this question, and nearly half (44 percent) got no points at all.

But, on the triumphant U.S. squad, four of the six team members nailed it.

In other words, only 1 in 20 kids in the competition solved this problem correctly and about half of the kids didn’t even know where to begin. For more perspective, each national team comprises the top 6 math prodigies in that country. In China, that means 6 out of something like 100 million kids. And only 3-4 of these kids solved the problem.

The coach of the US national team, Po-Shen Loh, a Caltech alum and an associate professor of mathematics at Carnegie Mellon University (give him tenure already), deserves some serious props. If you think this problem is too hard, I have this to say to you: Yes, it is. But, who cares? You can do this.

Note: I will work out the solution in detail in an upcoming post, unless one of you solves it in the comments section before then!

Update: Solution posted in comments below (in response to Anthony’s comment). Thank you all who posted some of the answers below. The solution is far from trivial, but I still wonder if an elegant solution exists that gives all four triples. Maybe the best solution is geometric? I hope one of you geniuses can figure that out!
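If you would like to hunt for the triples numerically while you wait, a brute-force search turns them up in seconds. This is a sketch of mine, and the bound is arbitrary: the search finds the four solutions but proves nothing about larger triples.

```python
from itertools import combinations_with_replacement

def is_power_of_two(n):
    return n >= 1 and (n & (n - 1)) == 0

# The condition is symmetric under permuting (a, b, c),
# so searching sorted triples up to a bound suffices.
for a, b, c in combinations_with_replacement(range(1, 200), 3):
    if all(is_power_of_two(x * y - z)
           for x, y, z in ((a, b, c), (b, c, a), (c, a, b))):
        print((a, b, c))
# Prints (2, 2, 2), (2, 2, 3), (2, 6, 11), (3, 5, 7)
```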

Quantum gravity from quantum error-correcting codes?

The lessons we learned from the Ryu-Takayanagi formula, the firewall paradox and the ER=EPR conjecture have convinced us that quantum information theory can become a powerful tool to sharpen our understanding of various problems in high-energy physics. But many of the concepts utilized so far rely on entanglement entropy and its generalizations, quantities developed by von Neumann more than 60 years ago. We live in the 21st century. Why don’t we use more modern concepts, such as the theory of quantum error-correcting codes?

In a recent paper with Daniel Harlow, Fernando Pastawski and John Preskill, we have proposed a toy model of the AdS/CFT correspondence based on quantum error-correcting codes. Fernando has already written how this research project started after a fateful visit by Daniel to Caltech and John’s remarkable prediction in 1999. In this post, I hope to write an introduction which may serve as a reader’s guide to our paper, explaining why I’m so fascinated by the beauty of the toy model.

This is certainly a challenging task because I need to make it accessible to everyone while explaining real physics behind the paper. My personal philosophy is that a toy model must be as simple as possible while capturing key properties of the system of interest. In this post, I will try to extract some key features of the AdS/CFT correspondence and construct a toy model which captures these features. This post may be a bit technical compared to other recent posts, but anyway, let me give it a try…

Bulk locality paradox and quantum error-correction

The AdS/CFT correspondence says that there is some kind of correspondence between quantum gravity on (d+1)-dimensional asymptotically-AdS space and d-dimensional conformal field theory on its boundary. But how are they related?

The AdS-Rindler reconstruction tells us how to “reconstruct” a bulk operator from boundary operators. Consider a bulk operator \phi and a boundary region A on a hyperbolic space (in other words, a negatively-curved plane). On a fixed time-slice, the causal wedge of A is a bulk region enclosed by the geodesic line of A (a curve with a minimal length). The AdS-Rindler reconstruction says that \phi can be represented by some integral of local boundary operators supported on A if and only if \phi is contained inside the causal wedge of A. Of course, there are multiple regions A,B,C,… whose causal wedges contain \phi, and the reconstruction should work for any such region.


The Rindler-wedge reconstruction

That a bulk operator in the causal wedge can be reconstructed by local boundary operators, however, leads to a rather perplexing paradox in the AdS/CFT correspondence. Consider a bulk operator \phi at the center of a hyperbolic space, and split the boundary into three pieces, A, B, C. Then the geodesic line for the union of BC encloses the bulk operator, that is, \phi is contained inside the causal wedge of BC. So, \phi can be represented by local boundary operators supported on BC. But the same argument applies to AB and CA, implying that the bulk operator \phi corresponds to local boundary operators which are supported inside AB, BC and CA simultaneously. An operator represented on BC commutes with every local operator on A, and likewise for B and C; an operator commuting with all local boundary operators can only be an identity operator times a complex phase. In fact, similar arguments apply to any bulk operators, and thus, all the bulk operators must correspond to identity operators on the boundary. Then, the AdS/CFT correspondence seems so boring…


The bulk operator at the center is contained inside causal wedges of BC, AB, AC. Does this mean that the bulk operator corresponds to an identity operator on the boundary?

Almheiri, Dong and Harlow have recently proposed an intriguing way of reconciling this paradox with the AdS/CFT correspondence. They proposed that the AdS/CFT correspondence can be viewed as a quantum error-correcting code. Their idea is as follows. Instead of \phi corresponding to a single boundary operator, \phi may correspond to different operators in different regions, say O_{AB}, O_{BC}, O_{CA} living in AB, BC, CA respectively. Even though O_{AB}, O_{BC}, O_{CA} are different boundary operators, they may be equivalent inside a certain low energy subspace on the boundary.

This situation resembles the so-called quantum secret-sharing code. The quantum information at the center of the bulk cannot be accessed from any single party A, B or C because \phi does not have representation on A, B, or C. It can be accessed only if multiple parties cooperate and perform joint measurements. It seems that a quantum secret is shared among three parties, and the AdS/CFT correspondence somehow realizes the three-party quantum secret-sharing code!

Entanglement wedge reconstruction?

Recently, causal wedge reconstruction has been further generalized to the notion of entanglement wedge reconstruction. Imagine we split the boundary into four pieces A,B,C,D such that A,C are larger than B,D. Then the geodesic lines for A and C do not form the geodesic line for the union of A and C because we can draw shorter arcs by connecting endpoints of A and C, which form the global geodesic line. The entanglement wedge of AC is a bulk region enclosed by this global geodesic line of AC. And the entanglement wedge reconstruction predicts that \phi can be represented as an integral of local boundary operators on AC if and only if \phi is inside the entanglement wedge of AC [1].


Causal wedge vs entanglement wedge.

Building a minimal toy model; the five-qubit code

Okay, now let’s try to construct a toy model which admits causal and entanglement wedge reconstructions of bulk operators. Because I want a simple toy model, I take a rather bold assumption that the bulk consists of a single qubit while the boundary consists of five qubits, denoted by A, B, C, D, E.


Reconstruction of a bulk operator in the “minimal” model.

What does causal wedge reconstruction teach us in this minimal setup of one bulk qubit and five boundary qubits? First, we split the boundary system into two pieces, ABC and DE, and observe that the bulk operator \phi is contained inside the causal wedge of ABC. From the rotational symmetries, we know that the bulk operator \phi must have representations on ABC, BCD, CDE, DEA, EAB. Next, we split the boundary system into four pieces, AB, C, D and E, and observe that the bulk operator \phi is contained inside the entanglement wedge of AB and D. So, the bulk operator \phi must have representations on ABD, BCE, CDA, DEB, EAC. In summary, we have the following:

  • The bulk operator must have representations on R if and only if R contains three or more qubits.

This is the property I want my toy model to possess.

What kinds of physical systems have such a property? Luckily, we quantum information theorists know the answer; the five-qubit code. The five-qubit code, proposed here and here, has an ability to encode one logical qubit into five-qubit entangled states and corrects any single qubit error. We can view the five-qubit code as a quantum encoding isometry from one-qubit states to five-qubit states:

\alpha | 0 \rangle + \beta | 1 \rangle \rightarrow \alpha | \tilde{0} \rangle + \beta | \tilde{1} \rangle

where | \tilde{0} \rangle and | \tilde{1} \rangle are the basis for a logical qubit. In quantum coding theory, logical Pauli operators \bar{X} and \bar{Z} are Pauli operators which act like Pauli X (bit flip) and Z (phase flip) on a logical qubit spanned by | \tilde{0} \rangle and | \tilde{1} \rangle. In the five-qubit code, for any set of qubits R with volume 3, some representations of logical Pauli X and Z operators, \bar{X}_{R} and \bar{Z}_{R}, can be found on R. While \bar{X}_{R} and \bar{X}_{R'} are different operators for R \not= R', they act exactly in the same manner on the codeword subspace spanned by | \tilde{0} \rangle and | \tilde{1} \rangle. This is exactly the property I was looking for.
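The five-qubit code is concrete enough to verify on a laptop. Here is a small NumPy check of mine, using the standard cyclic stabilizer generators XZZXI, IXZZX, XIXZZ, ZXIXZ and the logical operators \bar{X} = XXXXX, \bar{Z} = ZZZZZ: the generators commute with one another and with the logical operators, and the codeword subspace they stabilize is two-dimensional, exactly one encoded qubit.

```python
import numpy as np
from functools import reduce

PAULI = {"I": np.eye(2),
         "X": np.array([[0, 1], [1, 0]]),
         "Z": np.array([[1, 0], [0, -1]])}

def op(word):
    """Five-qubit Pauli operator from a string such as 'XZZXI'."""
    return reduce(np.kron, [PAULI[c] for c in word])

stabilizers = [op(s) for s in ("XZZXI", "IXZZX", "XIXZZ", "ZXIXZ")]
logical_x, logical_z = op("XXXXX"), op("ZZZZZ")

# Each stabilizer generator commutes with the others and with the logicals
for A in stabilizers:
    for B in stabilizers + [logical_x, logical_z]:
        assert np.allclose(A @ B, B @ A)

# Projector onto the codeword subspace: product of (1 + S)/2 over generators
proj = reduce(np.matmul, [(np.eye(32) + S) / 2 for S in stabilizers])
print("codespace dimension:", round(np.trace(proj).real))  # 2: one logical qubit
```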

Holographic quantum error-correcting codes

We just found possibly the smallest toy model of the AdS/CFT correspondence, the five-qubit code! The remaining task is to construct a larger model. For this goal, we view the encoding isometry of the five-qubit code as a six-leg tensor. The holographic quantum code is a network of such six-leg tensors covering a hyperbolic space where each tensor has one open leg. These open legs on the bulk are interpreted as logical input legs of a quantum error-correcting code while open legs on the boundary are identified as outputs where quantum information is encoded. Then the entire tensor network can be viewed as an encoding isometry.

The six-leg tensor has some nice properties. Imagine we inject some Pauli operator into one of six legs in the tensor. Then, for any given choice of three legs, there always exists a Pauli operator acting on them which counteracts the effect of the injection. An example is shown below:


In other words, if an operator is injected from one tensor leg, one can “push” it into other three tensor legs.

Finally, let’s demonstrate causal wedge reconstruction of bulk logical operators. Pick an arbitrary open tensor leg in the bulk and inject some Pauli operator into it. We can “push” it into three tensor legs, which are then injected into neighboring tensors. By repeatedly pushing operators to the boundary in the network, we eventually have some representation of the operator living on a piece of boundary region A. And the bulk operator is contained inside the causal wedge of A. (Here, the length of the curve can be defined as the number of tensor legs cut by the curve). You can also push operators into the boundary by choosing different tensor legs which lead to different representations of a logical operator. You can even have a rather exotic representation which is supported non-locally over two disjoint pieces of the boundary, realizing entanglement wedge reconstruction.


Causal wedge and entanglement wedge reconstruction.

What’s next?

This post is already pretty long and I need to wrap it up…

Shor’s quantum factoring algorithm is a revolutionary invention which opened a whole new research avenue of quantum information science. It is often forgotten, but the first quantum error-correcting code is another important invention by Peter Shor (and independently by Andrew Steane), one which enabled a proof that quantum computation can be performed fault-tolerantly. The theory of quantum error-correcting codes has found interesting applications in studies of condensed matter physics, such as topological phases of matter. Perhaps then, quantum coding theory will also find applications in high energy physics.

Indeed, many interesting open problems are awaiting us. Is entanglement wedge reconstruction a generic feature of tensor networks? How do we describe black holes by quantum error-correcting codes? Can we build a fast scrambler by tensor networks? Is entanglement a wormhole (or maybe a perfect tensor)? Can we resolve the firewall paradox by holographic quantum codes? Can the physics of quantum gravity be described by tensor networks? Or can the theory of quantum gravity provide us with novel constructions of quantum codes?

I feel that now is the time for quantum information scientists to jump into the research of black holes. We don’t know if we will be burned by a firewall or not … , but it is worth trying.



1. Whether entanglement wedge reconstruction is possible in the AdS/CFT correspondence or not still remains controversial. In the spirit of the Ryu-Takayanagi formula which relates entanglement entropy to the length of a global geodesic line, entanglement wedge reconstruction seems natural. But that a bulk operator can be reconstructed from boundary operators on two separate pieces A and C non-locally sounds rather exotic. In our paper, we constructed a toy model of tensor networks which allows both causal and entanglement wedge reconstruction in many cases. For details, see our paper. 

Bell’s inequality 50 years later

This is a jubilee year.* In November 1964, John Bell submitted a paper to the obscure (and now defunct) journal Physics. That paper, entitled “On the Einstein Podolsky Rosen Paradox,” changed how we think about quantum physics.

The paper was about quantum entanglement, the characteristic correlations among parts of a quantum system that are profoundly different than correlations in classical systems. Quantum entanglement had first been explicitly discussed in a 1935 paper by Einstein, Podolsky, and Rosen (hence Bell’s title). Later that same year, the essence of entanglement was nicely and succinctly captured by Schrödinger, who said, “the best possible knowledge of a whole does not necessarily include the best possible knowledge of its parts.” Schrödinger meant that even if we have the most complete knowledge Nature will allow about the state of a highly entangled quantum system, we are still powerless to predict what we’ll see if we look at a small part of the full system. Classical systems aren’t like that — if we know everything about the whole system then we know everything about all the parts as well. I think Schrödinger’s statement is still the best way to explain quantum entanglement in a single vigorous sentence.

To Einstein, quantum entanglement was unsettling, indicating that something is missing from our understanding of the quantum world. Bell proposed thinking about quantum entanglement in a different way, not just as something weird and counter-intuitive, but as a resource that might be employed to perform useful tasks. Bell described a game that can be played by two parties, Alice and Bob. It is a cooperative game, meaning that Alice and Bob are both on the same side, trying to help one another win. In the game, Alice and Bob receive inputs from a referee, and they send outputs to the referee, winning if their outputs are correlated in a particular way which depends on the inputs they receive.

But under the rules of the game, Alice and Bob are not allowed to communicate with one another between when they receive their inputs and when they send their outputs, though they are allowed to use correlated classical bits which might have been distributed to them before the game began. For a particular version of Bell’s game, if Alice and Bob play their best possible strategy then they can win the game with a probability of success no higher than 75%, averaged uniformly over the inputs they could receive. This upper bound on the success probability is Bell’s famous inequality.**


Classical and quantum versions of Bell’s game. If Alice and Bob share entangled qubits rather than classical bits, then they can win the game with a higher success probability.

There is also a quantum version of the game, in which the rules are the same except that Alice and Bob are now permitted to use entangled quantum bits (“qubits”)  which were distributed before the game began. By exploiting their shared entanglement, they can play a better quantum strategy and win the game with a higher success probability, better than 85%. Thus quantum entanglement is a useful resource, enabling Alice and Bob to play the game better than if they shared only classical correlations instead of quantum correlations.
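Both numbers are easy to check numerically. The sketch below is mine, written for the standard CHSH version of the game: the referee sends Alice and Bob uniformly random bits x and y, and they win if the XOR of their output bits equals the AND of the input bits. Brute force confirms that no deterministic classical strategy beats 75%, while the textbook measurement angles on a shared entangled pair win with probability cos²(π/8) ≈ 85.4%.

```python
import numpy as np
from itertools import product

phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)  # shared entangled pair

def projectors(theta):
    """Projectors for a measurement along angle theta in the real plane."""
    v0 = np.array([np.cos(theta), np.sin(theta)])
    v1 = np.array([-np.sin(theta), np.cos(theta)])
    return [np.outer(v0, v0), np.outer(v1, v1)]

alice = {0: projectors(0), 1: projectors(np.pi / 4)}         # Alice's settings
bob = {0: projectors(np.pi / 8), 1: projectors(-np.pi / 8)}  # Bob's settings

def p_win_quantum():
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        for a, b in product((0, 1), repeat=2):
            if a ^ b == x & y:  # winning condition
                M = np.kron(alice[x][a], bob[y][b])
                total += phi_plus @ M @ phi_plus
    return total / 4  # inputs are uniformly random

def p_win_classical():
    best = 0.0
    for fa in product((0, 1), repeat=2):      # Alice's outputs for x = 0, 1
        for fb in product((0, 1), repeat=2):  # Bob's outputs for y = 0, 1
            wins = sum(fa[x] ^ fb[y] == x & y
                       for x, y in product((0, 1), repeat=2))
            best = max(best, wins / 4)
    return best

print(p_win_classical())          # 0.75
print(round(p_win_quantum(), 4))  # 0.8536 = cos^2(pi/8)
```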

And experimental physicists have been playing the game for decades, winning with a success probability that violates Bell’s inequality. The experiments indicate that quantum correlations really are fundamentally different than, and stronger than, classical correlations.

Why is that such a big deal? Bell showed that a quantum system is more than just a probabilistic classical system, which eventually led to the realization (now widely believed though still not rigorously proven) that accurately predicting the behavior of highly entangled quantum systems is beyond the capacity of ordinary digital computers. Therefore physicists are now striving to scale up the weirdness of the microscopic world to larger and larger scales, eagerly seeking new phenomena and unprecedented technological capabilities.

1964 was a good year. Higgs and others described the Higgs mechanism, Gell-Mann and Zweig proposed the quark model, Penzias and Wilson discovered the cosmic microwave background, and I saw the Beatles on the Ed Sullivan show. Those developments continue to reverberate 50 years later. We’re still looking for evidence of new particle physics beyond the standard model, we’re still trying to unravel the large scale structure of the universe, and I still like listening to the Beatles.

Bell’s legacy is that quantum entanglement is becoming an increasingly pervasive theme of contemporary physics, important not just as the source of a quantum computer’s awesome power, but also as a crucial feature of exotic quantum phases of matter, and even as a vital element of the quantum structure of spacetime itself. 21st century physics will advance not only by probing the short-distance frontier of particle physics and the long-distance frontier of cosmology, but also by exploring the entanglement frontier, by elucidating and exploiting the properties of increasingly complex quantum states.

Sometimes I wonder how the history of physics might have been different if there had been no John Bell. Without Higgs, Brout and Englert and others would have elucidated the spontaneous breakdown of gauge symmetry in 1964. Without Gell-Mann, Zweig could have formulated the quark model. Without Penzias and Wilson, Dicke and collaborators would have discovered the primordial black-body radiation at around the same time.

But it’s not obvious which contemporary of Bell, if any, would have discovered his inequality in Bell’s absence. Not so many good physicists were thinking about quantum entanglement and hidden variables at the time (though David Bohm may have been one notable exception, and his work deeply influenced Bell.) Without Bell, the broader significance of quantum entanglement would have unfolded quite differently and perhaps not until much later. We really owe Bell a great debt.

*I’m stealing the title and opening sentence of this post from Sidney Coleman’s great 1981 lectures on “The magnetic monopole 50 years later.” (I’ve waited a long time for the right opportunity.)

**I’m abusing history somewhat. Bell did not use the language of games, and this particular version of the inequality, which has since been extensively tested in experiments, was derived by Clauser, Horne, Shimony, and Holt in 1969.

When I met with Steven Spielberg to talk about Interstellar

Today I had the awesome and eagerly anticipated privilege of attending a screening of the new film Interstellar, directed by Christopher Nolan. One can’t help but be impressed by Nolan’s fertile visual imagination. But you should know that Caltech’s own Kip Thorne also had a vital role in this project. Indeed, were there no Kip Thorne, Interstellar would never have happened.

On June 2, 2006, I participated in an unusual one-day meeting at Caltech, organized by Kip and the movie producer Lynda Obst (Sleepless in Seattle, Contact, The Invention of Lying, …). Lynda and Kip, who have been close since being introduced by their mutual friend Carl Sagan decades ago, had conceived a movie project together, and had collaborated on a “treatment” outlining the story idea. The treatment adhered to a core principle that was very important to Kip — that the movie be scientifically accurate. Though the story indulged in some wild speculations, at Kip’s insistence it skirted away from any flagrant violation of the firmly established laws of Nature. This principle of scientifically constrained speculation intrigued Steven Spielberg, who was interested in directing.

The purpose of the meeting was to brainstorm about the story and the science behind it with Spielberg, Obst, and Thorne. A remarkable group assembled, including physicists (Andrei Linde, Lisa Randall, Savas Dimopoulos, Mark Wise, as well as Kip), astrobiologists (Frank Drake, David Grinspoon), planetary scientists (Alan Boss, John Spencer, Dave Stevenson), and psychologists (Jay Buckey, James Carter, David Musson). As we all chatted and got acquainted, I couldn’t help but feel that we were taking part in the opening scene of a movie about making a movie. Spielberg came late and left early, but spent about three hours with us; he even brought along his Dad (an engineer).

Though the official release of Interstellar is still a few days away, you may already know from numerous media reports (including the cover story in this week’s Time Magazine) the essential elements of the story, which involves traveling through a wormhole seeking a new planet for humankind, a replacement for the hopelessly ravaged earth. The narrative evolved substantially as the project progressed, but traveling through a wormhole to visit a distant planet was already central to the original story.

Inevitably, some elements of the Obst/Thorne treatment did not survive in the final film. For one, Stephen Hawking was a prominent character in the original story; he joined the mission because of his unparalleled expertise at wormhole traversal, and Stephen’s ALS symptoms eased during prolonged weightlessness, only to recur upon return to earth gravity. Also, gravitational waves played a big part in the treatment; in particular the opening scene depicted LIGO scientists discovering the wormhole by detecting the gravitational waves emanating from it.

There was plenty to discuss to fill our one-day workshop, including: the rocket technology needed for the trip, the strong but stretchy materials that would allow the ship to pass through the wormhole without being torn apart by tidal gravity, how to select a crew psychologically fit for such a dangerous mission, what exotic life forms might be found on other worlds, how to communicate with an advanced civilization which resides in a higher dimensional bulk rather than the three-dimensional brane to which we’re confined, how to build a wormhole that stays open rather than pinching off and crushing those who attempt to pass through, and whether a wormhole could enable travel backward in time.

Spielberg was quite engaged in our discussions. Upon his arrival I immediately shot off a text to my daughter Carina: “Steven Spielberg is wearing a Brown University cap!” (Carina was a Brown student at the time, as Spielberg’s daughter had been.) Steven assured us of his keen interest in the project, noting wryly that “Aliens have been very good to me,” and he mentioned some of his favorite space movies, which included some I had also enjoyed as a kid, like Forbidden Planet and (the original) The Day the Earth Stood Still. In one notable moment, Spielberg asked the group “Who believes that intelligent life exists elsewhere in the universe?” We all raised our hands. “And who believes that the earth has been visited by extraterrestrial civilizations?” No one raised a hand. Steven seemed struck by our unanimity, on both questions.

I remember tentatively suggesting that the extraterrestrials had mastered M-theory, thus attaining computational power far beyond the comprehension of earthlings, and that they themselves were really advanced robots, constructed by an earlier generation of computers. Like many of the fun story ideas floated that day, this one had no apparent impact on the final version of the film.

Spielberg later brought in Jonah Nolan to write the screenplay. When Spielberg had to abandon the project because his DreamWorks production company broke up with Paramount Pictures (which owned the story), Jonah’s brother Chris Nolan eventually took over the project. Jonah and Chris Nolan transformed the story, but continued to consult extensively with Kip, who became an Executive Producer and says he is pleased with the final result.

Of the many recent articles about Interstellar, one of the most interesting is this one in Wired by Adam Rogers, which describes how Kip worked closely with the visual effects team at Double Negative to ensure that wormholes and rapidly rotating black holes are accurately depicted in the film (though liberties were taken to avoid confusing the audience). The images produced by sophisticated ray tracing computations were so surprising that at first Kip thought there must be a bug in the software, though eventually he accepted that the calculations are correct, and he is still working hard to more fully understand the results.

I can’t give away the ending of the movie, but I can safely say this: When it’s over you’re going to have a lot of questions. Fortunately for all of us, Kip’s book The Science of Interstellar will be available the same day the movie goes into wide release (November 7), so we’ll all know where to seek enlightenment.

In fact on that very same day we’ll be treated to the release of The Theory of Everything, a biopic about Stephen and Jane Hawking. So November 7 is going to be an unforgettable Black Hole Day. Enjoy!

Inflation on the back of an envelope

Last Monday was an exciting day!

After following the BICEP2 announcement via Twitter, I had to board a transcontinental flight, so I had 5 uninterrupted hours to think about what it all meant. Without Internet access or references, and having not thought seriously about inflation for decades, I wanted to reconstruct a few scraps of knowledge needed to interpret the implications of r ~ 0.2.

I did what any physicist would have done … I derived the basic equations without worrying about niceties such as factors of 3 or 2 \pi. None of what I derived was at all original —  the theory has been known for 30 years — but I’ve decided to turn my in-flight notes into a blog post. Experts may cringe at the crude approximations and overlooked conceptual nuances, not to mention the missing references. But some mathematically literate readers who are curious about the implications of the BICEP2 findings may find these notes helpful. I should emphasize that I am not an expert on this stuff (anymore), and if there are serious errors I hope better informed readers will point them out.

By tradition, careless estimates like these are called “back-of-the-envelope” calculations. There have been times when I have made notes on the back of an envelope, or a napkin or place mat. But in this case I had the presence of mind to bring a notepad with me.

Notes from a plane ride

According to inflation theory, a nearly homogeneous scalar field called the inflaton (denoted by \phi)  filled the very early universe. The value of \phi varied with time, as determined by a potential function V(\phi). The inflaton rolled slowly for a while, while the dark energy stored in V(\phi) caused the universe to expand exponentially. This rapid cosmic inflation lasted long enough that previously existing inhomogeneities in our currently visible universe were nearly smoothed out. What inhomogeneities remained arose from quantum fluctuations in the inflaton and the spacetime geometry occurring during the inflationary period.

Gradually, the rolling inflaton picked up speed. When its kinetic energy became comparable to its potential energy, inflation ended, and the universe “reheated” — the energy previously stored in the potential V(\phi) was converted to hot radiation, instigating a “hot big bang”. As the universe continued to expand, the radiation cooled. Eventually, the energy density in the universe came to be dominated by cold matter, and the relic fluctuations of the inflaton became perturbations in the matter density. Regions that were more dense than average grew even more dense due to their gravitational pull, eventually collapsing into the galaxies and clusters of galaxies that fill the universe today. Relic fluctuations in the geometry became gravitational waves, which BICEP2 seems to have detected.

Both the density perturbations and the gravitational waves have been detected via their influence on the inhomogeneities in the cosmic microwave background. The 2.726 K photons left over from the big bang have a nearly uniform temperature as we scan across the sky, but there are small deviations from perfect uniformity that have been precisely measured. We won’t worry about the details of how the size of the perturbations is inferred from the data. Our goal is to achieve a crude understanding of how the density perturbations and gravitational waves are related, which is what the BICEP2 results are telling us about. We also won’t worry about the details of the shape of the potential function V(\phi), though it’s very interesting that we might learn a lot about that from the data.

Exponential expansion

Einstein’s field equations tell us how the rate at which the universe expands during inflation is related to energy density stored in the scalar field potential. If a(t) is the “scale factor” which describes how lengths grow with time, then roughly

\left(\frac{\dot a}{a}\right)^2 \sim \frac{V}{m_P^2}.

Here \dot a means the time derivative of the scale factor, and m_P = 1/\sqrt{8 \pi G} \approx 2.4 \times 10^{18} GeV is the Planck scale associated with quantum gravity. (G is Newton’s gravitational constant.) I’ve left out a factor of 3 on purpose, and I used the symbol ~ rather than = to emphasize that we are just trying to get a feel for the order of magnitude of things. I’m using units in which Planck’s constant \hbar and the speed of light c are set to one, so mass, energy, and inverse length (or inverse time) all have the same dimensions. 1 GeV means one billion electron volts, about the mass of a proton.

(To persuade yourself that this is at least roughly the right equation, you should note that a similar equation applies to an expanding spherical ball of radius a(t) with uniform mass density V. But in the case of the ball, the mass density would decrease as the ball expands. The universe is different — it can expand without diluting its mass density, so the rate of expansion \dot a / a does not slow down as the expansion proceeds.)

During inflation, the scalar field \phi and therefore the potential energy V(\phi) were changing slowly; it’s a good approximation to assume V is constant. Then the solution is

a(t) \sim a(0) e^{Ht},

where H, the Hubble constant during inflation, is

H \sim \frac{\sqrt{V}}{m_P}.

To explain the smoothness of the observed universe, we require at least 50 “e-foldings” of inflation before the universe reheated — that is, inflation should have lasted for a time at least 50 H^{-1}.
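To get a feel for what that means: 50 e-foldings stretch every length by a factor of e^{50} \approx 5 \times 10^{21}.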

Slow rolling

During inflation the inflaton \phi rolls slowly, so slowly that friction dominates inertia — this friction results from the cosmic expansion. The speed of rolling \dot \phi is determined by

H \dot \phi \sim -V'(\phi).

Here V'(\phi) is the slope of the potential, so the right-hand side is the force exerted by the potential, which matches the frictional force on the left-hand side. The coefficient of \dot \phi has to be H on dimensional grounds. (Here I have blown another factor of 3, but let’s not worry about that.)
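If you want to see these two equations in action, here is a minimal numerical sketch (a toy example of my own, not part of the flight notes): it integrates the slow-roll equation for an assumed quadratic potential V(\phi) = m^2 \phi^2/2, with the factors of 3 restored, and counts e-foldings until slow rolling breaks down.

```python
# Toy slow-roll integration for an assumed quadratic potential
# V(phi) = m^2 phi^2 / 2, in units where m_P = 1. Uses the slow-roll
# equations with the factors of 3 restored: 3 H phidot = -V' and
# 3 H^2 = V / m_P^2, so that d(phi)/dN = phidot / H = -m_P^2 V'/V.

m = 1e-5            # inflaton mass in Planck units (hypothetical value)
phi = 16.0          # initial field value, in units of m_P
dN = 1e-3           # step in e-foldings N = H t

V  = lambda f: 0.5 * m**2 * f**2
dV = lambda f: m**2 * f

N = 0.0
# Slow rolling is a good approximation while epsilon = (1/2)(V'/V)^2 < 1
# (in Planck units); inflation ends when epsilon reaches 1.
while 0.5 * (dV(phi) / V(phi))**2 < 1.0:
    phi += -(dV(phi) / V(phi)) * dN
    N += dN

print(f"e-foldings of inflation: N ~ {N:.0f}")  # ~63 for phi_0 = 16 m_P
```

For this particular potential the answer can be checked analytically: N = (\phi_0^2 - \phi_{end}^2)/(4 m_P^2), which gives about 63 e-foldings for \phi_0 = 16 m_P.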

Density perturbations

The trickiest thing we need to understand is how inflation produced the density perturbations which later seeded the formation of galaxies. There are several steps to the argument.

Quantum fluctuations of the inflaton

As the universe inflates, the inflaton field is subject to quantum fluctuations, where the size of the fluctuation depends on its wavelength. Due to inflation, the wavelength increases rapidly, like e^{Ht}, and once the wavelength gets large compared to H^{-1}, there isn’t enough time for the fluctuation to wiggle — it gets “frozen in.” Much later, long after the reheating of the universe, the oscillation period of the wave becomes comparable to the age of the universe, and then it can wiggle again. (We say that the fluctuations “cross the horizon” at that stage.) Observations of the anisotropy of the microwave background have determined how big the fluctuations are at the time of horizon crossing. What does inflation theory say about that?

Well, first of all, how big are the fluctuations when they leave the horizon during inflation? Then the wavelength is H^{-1} and the universe is expanding at the rate H, so H is the only thing the magnitude of the fluctuations could depend on. Since the field \phi has the same dimensions as H, we conclude that fluctuations have magnitude

\delta \phi \sim H.

From inflaton fluctuations to density perturbations

Reheating occurs abruptly when the inflaton field reaches a particular value. Because of the quantum fluctuations, some horizon volumes have larger than average values of \phi and some have smaller than average values; hence different regions reheat at slightly different times. The energy density in regions that reheat earlier starts to be reduced by expansion (“red shifted”) earlier, so these regions have a smaller than average energy density. Likewise, regions that reheat later start to red shift later, and wind up having larger than average density.

When we compare different regions of comparable size, we can find the typical (root-mean-square) fluctuations \delta t in the reheating time, knowing the fluctuations in \phi and the rolling speed \dot \phi:

\delta t \sim \frac{\delta \phi}{\dot \phi} \sim \frac{H}{\dot\phi}.

Small fractional fluctuations in the scale factor a right after reheating produce comparable small fractional fluctuations in the energy density \rho. The expansion rate right after reheating roughly matches the expansion rate H right before reheating, and so we find that the characteristic size of the density perturbations is

\delta_S\equiv\left(\frac{\delta \rho}{\rho}\right)_{hor} \sim \frac{\delta a}{a} \sim \frac{\dot a}{a} \delta t\sim \frac{H^2}{\dot \phi}.

The subscript hor serves to remind us that this is the size of density perturbations as they cross the horizon, before they get a chance to grow due to gravitational instabilities. We have found our first important conclusion: The density perturbations have a size determined by the Hubble constant H and the rolling speed \dot \phi of the inflaton, up to a factor of order one which we have not tried to keep track of. Insofar as the Hubble constant and rolling speed change slowly during inflation, these density perturbations have a strength which is nearly independent of the length scale of the perturbation. From here on we will denote this dimensionless scale of the fluctuations by \delta_S, where the subscript S stands for “scalar”.

Perturbations in terms of the potential

Putting together \dot \phi \sim -V' / H and H^2 \sim V/{m_P}^2 with our expression for \delta_S, we find

\delta_S^2 \sim \frac{H^4}{\dot\phi^2}\sim \frac{H^6}{V'^2} \sim \frac{1}{{m_P}^6}\frac{V^3}{V'^2}.

The observed density perturbations are telling us something interesting about the scalar field potential during inflation.

Gravitational waves and the meaning of r

The gravitational field as well as the inflaton field is subject to quantum fluctuations during inflation. We call these tensor fluctuations to distinguish them from the scalar fluctuations in the energy density. The tensor fluctuations have an effect on the microwave anisotropy which can be distinguished in principle from the scalar fluctuations. We’ll just take that for granted here, without worrying about the details of how it’s done.

While a scalar field fluctuation with wavelength \lambda and strength \delta \phi carries energy density \sim \delta\phi^2 / \lambda^2, a fluctuation of the dimensionless gravitational field h with wavelength \lambda and strength \delta h carries energy density \sim m_P^2 \delta h^2 / \lambda^2. Applying the same dimensional analysis we used to estimate \delta \phi at horizon crossing to the rescaled field m_P h, we estimate the strength \delta_T of the tensor fluctuations (the fluctuations of h) as

\delta_T^2 \sim \frac{H^2}{m_P^2}\sim \frac{V}{m_P^4}.

From observations of the CMB anisotropy we know that \delta_S\sim 10^{-5}, and now BICEP2 claims that the ratio

r = \frac{\delta_T^2}{\delta_S^2}

is about r\sim 0.2 at an angular scale on the sky of about one degree. The conclusion (being a little more careful about the O(1) factors this time) is

V^{1/4} \sim 2 \times 10^{16}~GeV \left(\frac{r}{0.2}\right)^{1/4}.

This is our second important conclusion: The energy density during inflation defines a mass scale, which turns out to be 2 \times 10^{16}~GeV for the observed value of r. This is a very interesting finding because this mass scale is not so far below the Planck scale, where quantum gravity kicks in, and is in fact pretty close to theoretical estimates of the unification scale in supersymmetric grand unified theories. If this mass scale were a factor of 2 smaller, then r would be smaller by a factor of 16, and hence much harder to detect.
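For the numerically inclined, here is how the crude numbers hang together, with no O(1) factors at all; the gap between this estimate and the more careful 2 \times 10^{16} GeV is precisely those factors.

```python
# Rough numerical check of the crude estimates above (all O(1) factors
# dropped); input numbers as quoted in the text.
delta_S = 1e-5        # scalar perturbation amplitude, from the CMB
r       = 0.2         # tensor-to-scalar ratio claimed by BICEP2
m_P     = 2.4e18      # reduced Planck mass, in GeV

delta_T = (r * delta_S**2)**0.5    # tensor amplitude, ~4.5e-6
H       = delta_T * m_P            # Hubble scale during inflation, ~1e13 GeV
V_4     = (H * m_P)**0.5           # V^{1/4}, from V ~ H^2 m_P^2

print(f"delta_T ~ {delta_T:.1e}")
print(f"H ~ {H:.1e} GeV")
print(f"V^(1/4) ~ {V_4:.1e} GeV")  # ~5e15 GeV; careful factors give ~2e16
```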

Rolling, rolling, rolling, …

Using \delta_S^2 \sim H^4/\dot\phi^2, we can express r as

r = \frac{\delta_T^2}{\delta_S^2}\sim \frac{\dot\phi^2}{m_P^2 H^2}.

It is convenient to measure time in units of the number N = H t of e-foldings of inflation, in terms of which we find

\frac{1}{m_P^2} \left(\frac{d\phi}{dN}\right)^2\sim r.

Now, we know that for inflation to explain the smoothness of the universe we need N larger than 50, and if we assume that the inflaton rolls at a roughly constant rate during N e-foldings, we conclude that, while rolling, the change in the inflaton field is

\frac{\Delta \phi}{m_P} \sim N \sqrt{r}.

This is our third important conclusion — the inflaton field had to roll a long, long, way during inflation — it changed by much more than the Planck scale! Putting in the O(1) factors we have left out reduces the required amount of rolling by about a factor of 3, but we still conclude that the rolling was super-Planckian if r\sim 0.2. That’s curious, because when the scalar field strength is super-Planckian, we expect the kind of effective field theory we have been implicitly using to be a poor approximation because quantum gravity corrections are large. One possible way out is that the inflaton might have rolled round and round in a circle instead of in a straight line, so the field strength stayed sub-Planckian even though the distance traveled was super-Planckian.
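In numbers (a two-line check using the crude formula, plus the effect of the O(1) factors just mentioned):

```python
# Size of the inflaton excursion, from Delta(phi)/m_P ~ N sqrt(r).
N, r = 50, 0.2
print(f"crude: Delta(phi) ~ {N * r**0.5:.0f} m_P")  # ~22 Planck units
# Restoring the O(1) factors (the bound usually attributed to Lyth
# replaces sqrt(r) by sqrt(r/8)) cuts this by about a factor of 3:
print(f"with O(1) factors: Delta(phi) ~ {N * (r/8)**0.5:.0f} m_P")  # ~8
```

Either way, the excursion is super-Planckian.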

Spectral tilt

As the inflaton rolls, the potential energy, and hence also the Hubble constant H, change during inflation. That means that both the scalar and tensor fluctuations have a strength which is not quite independent of length scale. We can parametrize the scale dependence in terms of how the fluctuations change per e-folding of inflation, which is equivalent to the change per logarithmic length scale and is called the “spectral tilt.”

To keep things simple, let’s suppose that the rate of rolling is constant during inflation, at least over the length scales for which we have data. Using \delta_S^2 \sim H^4/\dot\phi^2, and assuming \dot\phi is constant, we estimate the scalar spectral tilt as

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim - \frac{4 \dot H}{H^2}.

Using \delta_T^2 \sim H^2/m_P^2, we conclude that the tensor spectral tilt is half as big.

From H^2 \sim V/m_P^2, we find

\dot H \sim \frac{1}{2} \dot \phi \frac{V'}{V} H,

and using \dot \phi \sim -V'/H we find

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim \frac{V'^2}{H^2V}\sim m_P^2\left(\frac{V'}{V}\right)^2\sim \left(\frac{V}{m_P^4}\right)\left(\frac{m_P^6 V'^2}{V^3}\right)\sim \delta_T^2 \delta_S^{-2}\sim r.

Putting in the numbers more carefully we find a scalar spectral tilt of r/4 and a tensor spectral tilt of r/8.

This is our last important conclusion: A relatively large value of r means a significant spectral tilt. In fact, even before the BICEP2 results, the CMB anisotropy data already supported a scalar spectral tilt of about .04, which suggested something like r \sim .16. The BICEP2 detection of the tensor fluctuations (if correct) has confirmed that suspicion.
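The arithmetic, for the record:

```python
# Tilt versus r, in the constant-rolling-rate approximation used above.
r = 0.2
print(f"scalar tilt ~ r/4 = {r/4}")   # 0.05
print(f"tensor tilt ~ r/8 = {r/8}")   # 0.025
# Running it backward: the pre-BICEP2 scalar tilt of ~0.04
# suggests r ~ 4 * 0.04 = 0.16.
```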

Summing up

If you have stuck with me this far, and you haven’t seen this stuff before, I hope you’re impressed. Of course, everything I’ve described can be done much more carefully. I’ve tried to convey, though, that the emerging story seems to hold together pretty well. Compared to last week, we have stronger evidence now that inflation occurred, that the mass scale of inflation is high, and that the scalar and tensor fluctuations produced during inflation have been detected. One prediction is that the tensor fluctuations, like the scalar ones, should have a notable spectral tilt, though a lot more data will be needed to pin that down.

I apologize to the experts again, for the sloppiness of these arguments. I hope that I have at least faithfully conveyed some of the spirit of inflation theory in a way that seems somewhat accessible to the uninitiated. And I’m sorry there are no references, but I wasn’t sure which ones to include (and I was too lazy to track them down).

It should also be clear that much can be done to sharpen the confrontation between theory and experiment. A whole lot of fun lies ahead.

Added notes (3/25/2014):

Okay, here’s a good reference, a useful review article by Baumann. (I found out about it on Twitter!)

From Baumann’s lectures I learned a convenient notation. The rolling of the inflaton can be characterized by two “potential slow-roll parameters” defined by

\epsilon = \frac{m_P^2}{2}\left(\frac{V'}{V}\right)^2,\quad \eta = m_P^2\left(\frac{V''}{V}\right).

Both parameters are small during slow rolling, but the relationship between them depends on the shape of the potential. My crude approximation (\epsilon = \eta) would hold for a quadratic potential.

We can express the spectral tilt (as I defined it) in terms of these parameters, finding 2\epsilon for the tensor tilt, and 6 \epsilon - 2\eta for the scalar tilt. To derive these formulas it suffices to know that \delta_S^2 is proportional to V^3/V'^2, and that \delta_T^2 is proportional to H^2; we also use

3H\dot \phi = -V', \quad 3H^2 = V/m_P^2,

keeping factors of 3 that I left out before. (As a homework exercise, check these formulas for the tensor and scalar tilt.)
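Here is a sketch of that check, in case you would rather not do the homework; it uses only the relations just quoted. Since d/dN = H^{-1} d/dt, the slow-roll equations give

\frac{d\phi}{dN} = \frac{\dot\phi}{H} = -\frac{V'}{3H^2} = -m_P^2 \frac{V'}{V}.

For the tensor tilt, \delta_T^2 \propto H^2 \propto V implies

-\frac{d\ln \delta_T^2}{dN} = -\frac{V'}{V}\,\frac{d\phi}{dN} = m_P^2\left(\frac{V'}{V}\right)^2 = 2\epsilon.

For the scalar tilt, \delta_S^2 \propto V^3/V'^2 implies

-\frac{d\ln \delta_S^2}{dN} = -\left(3\frac{V'}{V} - 2\frac{V''}{V'}\right)\frac{d\phi}{dN} = m_P^2\left(3\frac{V'^2}{V^2} - 2\frac{V''}{V}\right) = 6\epsilon - 2\eta.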

It is also easy to see that r is proportional to \epsilon; it turns out that r = 16 \epsilon. To get that factor of 16 we need more detailed information about the relative size of the tensor and scalar fluctuations than I explained in the post; I can’t think of a handwaving way to derive it.

We see, though, that the conclusion that the tensor tilt is r/8 does not depend on the details of the potential, while the relation between the scalar tilt and r does depend on the details. Nevertheless, it seems fair to claim (as I did) that, already before we knew the BICEP2 results, the measured nonzero scalar spectral tilt indicated a reasonably large value of r.

Once again, we’re lucky. On the one hand, it’s good to have a robust prediction (for the tensor tilt). On the other hand, it’s good to have a handle (the scalar tilt) for distinguishing among different inflationary models.

One last point is worth mentioning. We have set Planck’s constant \hbar equal to one so far, but it is easy to put the powers of \hbar back in using dimensional analysis (we’ll continue to assume the speed of light c is one). Since Newton’s constant G has the dimensions of length/energy, and the potential V has the dimensions of energy/volume, while \hbar has the dimensions of energy times length, we see that

\delta_T^2 \sim \hbar G^2V.

Thus the production of gravitational waves during inflation is a quantum effect, which would disappear in the limit \hbar \to 0. Likewise, the scalar fluctuation strength \delta_S^2 is also O(\hbar), and hence also a quantum effect.
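As a quick dimensional check: with c = 1, G has dimensions of length/energy, V has dimensions of energy/length^3, and \hbar has dimensions of energy \times length, so

\hbar \, G^2 \, V \sim (\text{energy}\cdot\text{length})\cdot\frac{\text{length}^2}{\text{energy}^2}\cdot\frac{\text{energy}}{\text{length}^3} = 1,

which is dimensionless, as a fluctuation strength must be.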

Therefore the detection of primordial gravitational waves by BICEP2, if correct, confirms that gravity is quantized just like the other fundamental forces. That shouldn’t be a surprise, but it’s nice to know.

My 10 biggest thrills

Wow!

Evidence for gravitational waves produced during cosmic inflation. BICEP2 results for the ratio r of gravitational wave perturbations to density perturbations, and the density perturbation spectral tilt n.

Like many physicists, I have been reflecting a lot the past few days about the BICEP2 results, trying to put them in context. Other bloggers have been telling you all about it (here, here, and here, for example); what can I possibly add?

The hoopla this week reminds me of other times I have been really excited about scientific advances. And I recall some wise advice I received from Sean Carroll: blog readers like lists.  So here are (in chronological order)…

My 10 biggest thrills (in science)

This is a very personal list — your results may vary. I’m not saying these are necessarily the most important discoveries of my lifetime (there are conspicuous omissions), just that, as best I can recall, these are the developments that really started my heart pounding at the time.

1) The J/Psi from below (1974)

I was a senior at Princeton during the November Revolution. I was too young to appreciate fully what it was all about — having just learned about the Weinberg-Salam model, I thought at first that the Z boson had been discovered. But by stalking the third floor of Jadwin I picked up the buzz. No, it was charm! The discovery of a very narrow charmonium resonance meant we were on the right track in two ways — charm itself confirmed ideas about the electroweak gauge theory, and the narrowness of the resonance fit in with the then recent idea of asymptotic freedom. Theory triumphant!

2) A magnetic monopole in Palo Alto (1982)

By 1982 I had been thinking about the magnetic monopoles in grand unified theories for a few years. We thought we understood why no monopoles seemed to be around. Sure, monopoles would be copiously produced in the very early universe, but then cosmic inflation would blow them away, diluting their density to a hopelessly undetectable value. Then somebody saw one … a magnetic monopole obediently passed through Blas Cabrera’s loop of superconducting wire, producing a sudden jump in the persistent current. On Valentine’s Day!

According to then-current theory, the monopole mass was expected to be about 10^16 GeV (10 million billion times heavier than a proton). Had Nature really been so kind as to bless us with this spectacular message from a staggeringly high energy scale? It seemed too good to be true.

It was. Blas never detected another monopole. As far as I know he never understood what glitch had caused the aberrant signal in his device.

3) “They’re green!” High-temperature superconductivity (1987)

High-temperature superconductors were discovered in 1986 by Bednorz and Mueller, but I did not pay much attention until Paul Chu found one in early 1987 with a critical temperature above 77 K, the boiling point of liquid nitrogen. Then for a while the critical temperature seemed to be creeping higher and higher on an almost daily basis, eventually topping 130 K … one wondered whether it might go up, up, up forever.

It didn’t. Today 138 K still seems to be the record.

My most vivid memory is that David Politzer stormed into my office one day with a big grin. “They’re green!” he squealed. David did not mean that high-temperature superconductors would be good for the environment. He was passing on information he had just learned from Phil Anderson, who happened to be visiting Caltech: Chu’s samples were copper oxides.

4) “Now I have mine” Supernova 1987A (1987)

What was most remarkable and satisfying about the 1987 supernova in the nearby Large Magellanic Cloud was that the neutrinos released in a ten-second burst during the stellar core collapse were detected here on earth, by gigantic water Cerenkov detectors that had been built to test grand unified theories by looking for proton decay! Not a truly fundamental discovery, but very cool nonetheless.

Soon after it happened some of us were loafing in the Lauritsen seminar room, relishing the good luck that had made the detection possible. Then Feynman piped up: “Tycho Brahe had his supernova, Kepler had his, … and now I have mine!” We were all silent for a few seconds, and then everyone burst out laughing, with Feynman laughing the hardest. It was funny because Feynman was making fun of his own gargantuan ego. Feynman knew a good gag, and I heard him use this line at a few other opportune times thereafter.

5) Science by press conference: Cold fusion (1989)

The New York Times was my source for the news that two chemists claimed to have produced nuclear fusion in heavy water using an electrochemical cell on a tabletop. I was interested enough to consult that day with our local nuclear experts Charlie Barnes, Bob McKeown, and Steve Koonin, none of whom believed it. Still, could it be true?

I decided to spend a quiet day in my office, trying to imagine ways to induce nuclear fusion by stuffing deuterium into a palladium electrode. I came up empty.

My interest dimmed when I heard that they had done a “control” experiment using ordinary water, had observed the same excess heat as with heavy water, and remained just as convinced as before that they were observing fusion. Later, Caltech chemist Nate Lewis gave a clear and convincing talk to the campus community debunking the original experiment.

6) “The face of God” COBE (1992)

I’m often too skeptical. When I first heard in the early 1980s about proposals to detect the anisotropy in the cosmic microwave background, I doubted it would be possible. The signal is so small! It will be blurred by reionization of the universe! What about the galaxy! What about the dust! Blah, blah, blah, …

The COBE DMR instrument showed it could be done, at least at large angular scales, and set the stage for the spectacular advances in observational cosmology we’ve witnessed over the past 20 years. George Smoot infamously declared that he had glimpsed “the face of God.” Overly dramatic, perhaps, but he was excited! And so was I.

7) “83 SNU” Gallex solar neutrinos (1992)

Until 1992 the only neutrinos from the sun ever detected were the relatively high energy neutrinos produced by nuclear reactions involving boron and beryllium — these account for just a tiny fraction of all neutrinos emitted. Fewer than expected were seen, a puzzle that could be resolved if neutrinos have mass and oscillate to another flavor before reaching earth. But it made me uncomfortable that the evidence for solar neutrino oscillations was based on the boron-beryllium side show, and might conceivably be explained just by tweaking the astrophysics of the sun’s core.

The Gallex experiment was the first to detect the lower energy pp neutrinos, the predominant type coming from the sun. The results seemed to confirm that we really did understand the sun and that solar neutrinos really oscillate. (More compelling evidence, from SNO, came later.) I stayed up late the night I heard about the Gallex result, and gave a talk the next day to our particle theory group explaining its significance. The talk title was “83 SNU” — that was the initially reported neutrino flux in Solar Neutrino Units, later revised downward somewhat.

8) Awestruck: Shor’s algorithm (1994)

I’ve written before about how Peter Shor’s discovery of an efficient quantum algorithm for factoring numbers changed my life. This came at a pivotal time for me, as the SSC had been cancelled six months earlier, and I was growing pessimistic about the future of particle physics. I realized that observational cosmology would have a bright future, but I sensed that theoretical cosmology would be dominated by data analysis, where I would have little comparative advantage. So I became a quantum informationist, and have not regretted it.

9) The Higgs boson at last (2012)

The discovery of the Higgs boson was exciting because we had been waiting soooo long for it to happen. Unable to stream the live feed of the announcement, I followed developments via Twitter. That was the first time I appreciated the potential value of Twitter for scientific communication, and soon after I started to tweet.

10) A lucky universe: BICEP2 (2014)

Many past experiences prepared me to appreciate the BICEP2 announcement this past Monday.

I first came to admire Alan Guth‘s distinctive clarity of thought in the fall of 1973 when he was the instructor for my classical mechanics course at Princeton (one of the best classes I ever took). I got to know him better in the summer of 1979 when I was a graduate student, and Alan invited me to visit Cornell because we were both interested in magnetic monopole production  in the very early universe. Months later Alan realized that cosmic inflation could explain the isotropy and flatness of the universe, as well as the dearth of magnetic monopoles. I recall his first seminar at Harvard explaining his discovery. Steve Weinberg had to leave before the seminar was over, and Alan called as Steve walked out, “I was hoping to hear your reaction.” Steve replied, “My reaction is applause.” We all felt that way.

I was at a wonderful workshop in Cambridge during the summer of 1982, where Alan and others made great progress in understanding the origin of primordial density perturbations produced from quantum fluctuations during inflation (Bardeen, Steinhardt, Turner, Starobinsky, and Hawking were also working on that problem, and they all reached a consensus by the end of the three-week workshop … meanwhile I was thinking about the cosmological implications of axions).

I also met Andrei Linde at that same workshop, my first encounter with his mischievous grin and deadpan wit. (There was a delegation of Russians, who split their time between Xeroxing papers and watching the World Cup on TV.) When Andrei visited Caltech in 1987, I took him to Disneyland, and he had even more fun than my two-year-old daughter.

During my first year at Caltech in 1984, Mark Wise and Larry Abbott told me about their calculations of the gravitational waves produced during inflation, which they used to derive a bound on the characteristic energy scale driving inflation, a few times 10^16 GeV. We mused about whether the signal might turn out to be detectable someday. Would Nature really be so kind as to place that mass scale below the Abbott-Wise bound, yet high enough (above 10^16 GeV) to be detectable? It seemed unlikely.

Last week I caught up with the rumors about the BICEP2 results by scanning my Twitter feed on my iPad, while still lying in bed during the early morning. I immediately leapt up and stumbled around the house in the dark, mumbling to myself over and over again, “Holy Shit! … Holy Shit! …” The dog cast a curious glance my way, then went back to sleep.

Like millions of others, I was frustrated Monday morning, trying to follow the live feed of the discovery announcement broadcast from the hopelessly overtaxed Center for Astrophysics website. I was able to join in the moment, though, by following on Twitter, and I indulged in a few breathless tweets of my own.

Many of Andrew Lange’s friends have been thinking a lot about him these past few days; Andrew had been the leader of the BICEP team (current senior team members John Kovac and Chao-Lin Kuo were Caltech postdocs under Andrew in the mid-2000s). One day in September 2007 he sent me an unexpected email, with the subject heading “the bard of cosmology.” Having discovered on the Internet a poem I had written to introduce a seminar by Craig Hogan, Andrew wrote:

“John,

just came across this – I must have been out of town for the event.

l love it.

it will be posted prominently in our lab today (with “LISA” replaced by “BICEP”, and remain our rallying cry till we detect the B-mode.

have you set it to music yet?

a”

I lifted a couplet from that poem for one of my tweets (while rumors were swirling prior to the official announcement):

We’ll finally know how the cosmos behaves
If we can detect gravitational waves.

Assuming the BICEP2 measurement r ~ 0.2 is really a detection of primordial gravitational waves, we have learned that the characteristic mass scale during inflation is an astonishingly high 2 × 10^16 GeV. Were it a factor of 2 smaller, the signal would have been far too small to detect in current experiments. This time, Nature really is on our side, eagerly revealing secrets about physics at a scale far, far beyond what we will ever explore using particle accelerators. We feel lucky.

We physicists can never quite believe that the equations we scrawl on a notepad actually have something to do with the real universe. You would think we’d be used to that by now, but we’re not — when it happens we’re amazed. In my case, never more so than this time.

The BICEP2 paper, a historic document (if the result holds up), ends just the way it should:

“We dedicate this paper to the memory of Andrew Lange, whom we sorely miss.”