Making predictions in the multiverse


I am a theoretical physicist at the University of California, Berkeley. Last month, I attended a very interesting conference organized by the Foundational Questions Institute (FQXi) in Puerto Rico, and presented a talk about making predictions in cosmology, especially in the eternally inflating multiverse. I very much enjoyed the discussions with the people at the conference, and it was there that I was invited to post a non-technical account of the issue, as well as my own view of it. So here I am.

I find it quite remarkable that some of us in the physics community are thinking with some “confidence” that we live in the multiverse, more specifically one of the many universes in which low-energy physical laws take different forms. (For example, these universes have different elementary particles with different properties, possibly different spacetime dimensions, and so on.) This idea of the multiverse, as we currently think, is not simply a result of random imagination by theorists, but is based on several pieces of observational and theoretical evidence.

Observationally, we have learned more and more that we live in a highly special universe—it seems that the “physical laws” of our universe (summarized in the form of the standard models of particle physics and cosmology) take such a special form that if their structure were varied slightly, then there would be no interesting structure in the universe, let alone intelligent life. It is hard to understand this fact unless there are many universes with varying “physical laws,” and we simply happen to emerge in a universe which allows for intelligent life to develop (which seems to require special conditions). With multiple universes, we can understand the “specialness” of our universe precisely as we understand the “specialness” of our planet Earth (e.g. the ideal distance from the sun), which is only one of the many planets out there.

Perhaps more nontrivial is the fact that our current theory of fundamental physics leads to this picture of the multiverse in a very natural way. Imagine that at some point in the history of the universe, space is exponentially expanding. This expansion—called inflation—occurs when space is filled with a “positive vacuum energy” (which happens quite generally). We have known, already since the ’80s, that such inflation is generically eternal. During inflation, various non-inflating regions called bubble universes—of which our own universe could be one—may form, much like bubbles in boiling water. Since the ambient space expands exponentially, however, these bubbles do not percolate; rather, the process of creating bubble universes lasts forever in an eternally inflating background. Now, recent progress in string theory suggests that the low-energy theories describing physics in these bubble universes (such as the elementary particle content and their properties) may differ bubble by bubble. This is precisely the setup needed to understand the “specialness” of our universe through the selection effect associated with our own existence, as described above.


A schematic depiction of the eternally inflating multiverse. The horizontal and vertical directions correspond to spatial and time directions, respectively, and the various regions with the inverted triangle or argyle shape represent different universes. While regions closer to the upper edge of the diagram look smaller, this is an artifact of the rescaling made to fit the large spacetime into a finite drawing—the fractal structure near the upper edge actually corresponds to an infinite number of large universes.

This particular version of the multiverse—called the eternally inflating multiverse—is very attractive. It is theoretically motivated and has the potential to explain various features seen in our universe. The eternal nature of inflation, however, causes a serious problem for predictivity. Because the process of creating bubble universes occurs infinitely many times, “In an eternally inflating universe, anything that can happen will happen; in fact, it will happen an infinite number of times,” as phrased in an article by Alan Guth. Suppose we want to calculate the relative probability for (any) events A and B to happen in the multiverse. Following the standard notion of probability, we might define it as the ratio of the numbers of times events A and B happen throughout the whole spacetime:

P = \frac{N_A}{N_B}.

In the eternally inflating multiverse, however, both A and B occur infinitely many times: N_A, N_B = \infty. This expression, therefore, is ill-defined. One might think that this is merely a technical problem—we simply need to “regularize” to make both N_{A,B} finite at an intermediate stage of the calculation, and then we get a well-defined answer. This is, however, not the case. One finds that, depending on the details of this regularization procedure, one can obtain any “prediction” one wants, and there is no a priori reason to prefer one procedure over the others—the predictivity of physical theory seems to be lost!
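To see how a ratio of two infinite counts can depend on the regularization, here is a toy analogy of my own (a deliberately simple stand-in, not the cosmological calculation): ask what fraction of the positive integers are even. Both the evens and the odds are infinite, so the answer depends entirely on the order in which you enumerate the integers before taking the limit—just as N_A/N_B depends on how the infinite multiverse is cut off.

```python
def fraction_even(enumeration, n_terms):
    """Fraction of even numbers among the first n_terms of an enumeration."""
    count = 0
    for _ in range(n_terms):
        if next(enumeration) % 2 == 0:
            count += 1
    return count / n_terms

def natural_order():
    """1, 2, 3, 4, ..."""
    k = 1
    while True:
        yield k
        k += 1

def two_evens_then_one_odd():
    """2, 4, 1, 6, 8, 3, ...: the same set of integers, enumerated differently."""
    even, odd = 2, 1
    while True:
        yield even; even += 2
        yield even; even += 2
        yield odd;  odd += 2

print(fraction_even(natural_order(), 300000))           # 0.5
print(fraction_even(two_evens_then_one_odd(), 300000))  # 2/3
```

Same set, same question, two different "predictions"—which is the essence of the trouble with the regularized N_A/N_B.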

Over the past decades, some physicists and cosmologists have been thinking about many aspects of this so-called measure problem in eternal inflation. (There are indeed many aspects to the problem, and I’m omitting most of them in my simplified presentation above.) Many of the people who contributed were in the session at the conference, including Aguirre, Albrecht, Bousso, Carroll, Guth, Page, Tegmark, and Vilenkin. My own view, which I think is shared by some others, is that this problem offers a window into deep issues associated with spacetime and gravity. In my 2011 paper I suggested that quantum mechanics plays a crucial role in understanding the multiverse, even at the largest distance scales. (A similar idea was also discussed here around the same time.) In particular, I argued that the eternally inflating multiverse and the quantum mechanical many worlds à la Everett are the same concept:

Multiverse = Quantum Many Worlds

in a specific, and literal, sense. In this picture, the global spacetime of general relativity appears only as a derived concept at the cost of overcounting true degrees of freedom; in particular, infinitely large space associated with eternal inflation is a sort of “illusion.” A “true” description of the multiverse must be “intrinsically” probabilistic in a quantum mechanical sense—probabilities in cosmology and quantum measurements have the same origin.

To illustrate the basic idea, let us first consider an (apparently unrelated) system with a black hole. Suppose we drop some book A into the black hole and observe the subsequent evolution of the system from a distance. The book will be absorbed into (the horizon of) the black hole, which will then eventually evaporate, leaving Hawking radiation. Now, let us consider another process of dropping a different book B, instead of A, and see what happens. The subsequent evolution in this case is similar to the case with A, and we will be left with Hawking radiation. However, this final-state Hawking radiation arising from B is (believed by many to be) different from that arising from A in its subtle quantum correlation structure, so that if we have perfect knowledge about the final-state radiation then we can reconstruct what the original book was. This property is called unitarity and is considered to provide the correct picture for black hole dynamics, based on recent theoretical progress. To recap: the information about the original book will not be lost—it will simply be distributed in the final-state Hawking radiation in a highly scrambled form.

A puzzling thing occurs, however, if we observe the same phenomenon from the viewpoint of an observer who is falling into the black hole with a book. In this case, the equivalence principle says that the book does not feel gravity (except for the tidal force which is tiny for a large black hole), so it simply passes through the black hole horizon without any disruption. (Recently, this picture was challenged by the so-called firewall argument—the book might hit a collection of higher energy quanta called a firewall, rather than freely fall. Even if so, it does not affect our basic argument below.) This implies that all the information about the book (in fact, the book itself) will be inside the horizon at late times. On the other hand, we have just argued that from a distant observer’s point of view, the information will be outside—first on the horizon and then in Hawking radiation. Which is correct?

One might think that the information is simply duplicated: one copy inside and the other outside. This, however, cannot be the case. Quantum mechanics prohibits faithful copying of full quantum information, the so-called no-cloning theorem. Therefore, it seems that the two pictures by the two observers cannot both be correct.

The proposed solution to this puzzle is interesting—both pictures are correct, but not at the same time. The point is that one cannot be both a distant observer and a falling observer at the same time. If you are a distant observer, the information will be outside, and the interior spacetime must be viewed as non-existent, since you can never access it even in principle (because of the existence of the horizon). On the other hand, if you are a falling observer, then you have the interior spacetime in which the information (the book itself) will fall, but this happens only at the cost of losing a part of spacetime in which Hawking radiation lies, which you can never access since you yourself are falling into the black hole. There is no inconsistency in either of these two pictures; only if you artificially “patch” the two pictures together, which you cannot physically do, does the apparent inconsistency of information duplication occur. This somewhat surprising aspect of a system with gravity is called black hole complementarity, pioneered by ’t Hooft, Susskind, and their collaborators.

What does this discussion of black holes have to do with cosmology and, in particular, the eternally inflating multiverse? In cosmology, our space is surrounded by a cosmological horizon. (For example, imagine that space is expanding exponentially; this makes it impossible for us to obtain any signal from regions farther than some distance, because objects in these regions recede faster than the speed of light. The definition of appropriate horizons in general cases is more subtle, but it can be made.) The situation, therefore, is the “inside out” version of the black hole case viewed from a distant observer. As in the case of the black hole, quantum mechanics requires that spacetime on the other side of the horizon—in this case, the exterior of the cosmological horizon—must be viewed as non-existent. (In the paper I made this claim based on some simple supportive calculations.) In more technical terms, a quantum state describing the system represents only the region within the horizon—there is no infinite space in any single, consistent description of the system!

If a quantum state represents only space within the horizon, then where is the multiverse, which we thought exists in an eternally inflating space further away from our own horizon? The answer is—probability! The process of creating bubble universes is a probabilistic process in the quantum mechanical sense—it occurs through quantum mechanical tunneling. This implies that, starting from some initially inflating space, we could end up with different universes probabilistically. All the different universes—including our own—live in probability space. In more technical terms, a state representing eternally inflating space evolves into a superposition of terms—or branches—representing different universes, but with each of them representing only the region within its own horizon. Note that there is no concept of infinitely large space here, which is what led to the ill-definedness of probability. The picture of an infinitely large multiverse, naively suggested by general relativity, appears only after “patching” pictures based on different branches together; but this vastly overcounts the true degrees of freedom, as was the case when we included both the interior spacetime and the Hawking radiation in our description of a black hole.

The description of the multiverse presented here provides complete unification of the eternally inflating multiverse and the many worlds interpretation in quantum mechanics. Suppose the multiverse starts from some initial state |\Psi(t_0)\rangle. This state evolves into a superposition of states in which various bubble universes nucleate in various locations. As time passes, a state representing each universe further evolves into a superposition of states representing various possible cosmic histories, including different outcomes of “experiments” performed within that universe. (These “experiments” may, but need not, be scientific experiments—they can be any physical processes.) At late times, the multiverse state |\Psi(t)\rangle will thus contain an enormous number of terms, each of which represents a possible world that may arise from |\Psi(t_0)\rangle consistently with the laws of physics. Probabilities in cosmology and microscopic processes are then both given by quantum mechanical probabilities in the same manner. The multiverse and quantum many worlds are really the same thing—they simply refer to the same phenomenon occurring at (vastly) different scales.


A schematic picture for the evolution of the multiverse state. As t increases, the state evolves into a superposition of states in which various bubble universes nucleate in various locations. Each of these states then evolves further into a superposition of states representing various possible cosmic histories, including different outcomes of experiments performed within that universe.

The picture presented here does not solve all the problems in eternally inflating cosmology. What is the actual quantum state of the multiverse? What are its “initial conditions”? What is time? How does it emerge? The picture, however, does provide a framework to address these further, deep questions, and I have recently made some progress: the basic idea is that the state of the multiverse (which may be selected uniquely by the normalizability condition) never changes, and yet time appears as an emergent concept locally in branches, as physical correlations among objects (along the lines of an old idea by DeWitt). Given the length already, I will not elaborate on this new development here. If you are interested, you might want to read my paper.

It is fascinating that physicists can talk about big and deep questions like the ones discussed here based on concrete theoretical progress. Nobody really knows where these explorations will finally lead us. It seems clear, however, that we live in an exciting era in which our scientific explorations reach beyond what we thought to be the entire physical world, our universe.

Reporting from the ‘Frontiers of Quantum Information Science’

What am I referring to with this title? It is similar to the name of this blog–but that’s not where this particular title comes from–although there is a common denominator. Frontiers of Quantum Information Science was the theme for the 31st Jerusalem winter school in theoretical physics, which takes place annually at the Israel Institute for Advanced Studies, located on the Givat Ram campus of the Hebrew University of Jerusalem. The school took place from December 30, 2013 through January 9, 2014, but some of the attendees are still trickling back to their home institutions. The common denominator is that our very own John Preskill was the director of this school, with Michael Ben-Or and Patrick Hayden as co-directors. John mentioned in a previous post, and reiterated during his opening remarks, that this is the first time the IIAS has chosen quantum information to be the topic for its prestigious advanced school–another sign of quantum information’s emergence as an important sub-field of physics. In this blog post, I’m going to do my best to recount these festivities while John protects his home from forest fires, prepares a talk for the Simons Institute’s workshop on Hamiltonian complexity, teaches his quantum information course, and celebrates his birthday (60+1).

The school was mainly targeted at physicists, but it was diversely represented. Proof of the value of this diversity came in an interaction between a computer scientist and a physicist, which led to one of the school’s most memorable moments. Both of my most memorable moments started with the talent show. (I was surprised that so many talents were on display at a physics conference…) Anyway, towards the end of the show, Mateus Araújo Santos, a PhD student in Vienna, entered the stage and mentioned that he could channel “the ghost of Feynman” to serve as an oracle for NP-complete decision problems. After making this claim, people naturally turned to Scott Aaronson, hoping that he’d be able to break the oracle. However, in order for this to happen, we had to wait until Scott’s third lecture, about linear optics and boson sampling, the next day. You can watch Scott bombard the oracle with decision problems from 1:00-2:15 during the video from his third lecture.


Scott Aaronson grilling the oracle with a string of NP-complete decision problems! From 1:00-2:15 during this video.

The other most memorable moment was when John briefly danced Gangnam style during Soonwon Choi’s talent show performance. Unfortunately, I thought I had this on video, but the video didn’t record. If anyone has video evidence of this, then please share!

Jostling the unreal in Oxford

Oxford, where the real and the unreal jostle in the streets, where windows open into other worlds…

So wrote Philip Pullman, author of The Golden Compass and its sequels. In the series, a girl wanders from the Oxford in another world to the Oxford in ours.

I’ve been honored to wander Oxford this fall. Visiting Oscar Dahlsten and Jon Barrett, I’ve been moonlighting in Vlatko Vedral’s QI group. We’re interweaving 21st-century knowledge about electrons and information with a Victorian fixation on energy and engines. This research program, quantum thermodynamics, should open a window onto our world.

The Radcliffe Camera

A new world. At least, a world new to the author.

To study our world from another angle, Oxford researchers are jostling the unreal. Oscar, Jon, Andrew Garner, and others are studying generalized probabilistic theories, or GPTs.

What’s a specific probabilistic theory, let alone a generalized one? In everyday, classical contexts, probabilities combine according to rules you know. Suppose you have a 90% chance of arriving in London-Heathrow Airport at 7:30 AM next Sunday. Suppose that, if you arrive in Heathrow at 7:30 AM, you’ll have a 70% chance of catching the 8:05 AM bus to Oxford. You have a probability 0.9 * 0.7 = 0.63 of arriving in Heathrow at 7:30 and catching the 8:05 bus. Why 0.9 * 0.7? Why not 0.9^0.7, or 0.9/(2 * 0.7)? How might probabilities combine, GPT researchers ask, and why do they combine as they do?

Not that, in GPTs, probabilities combine as in 0.9/(2 * 0.7). Consider the 0.9/(2 * 0.7) plucked from a daydream inspired by this City of Dreaming Spires. But probabilities do combine in ways we wouldn’t expect. By entangling two particles, separating them, and measuring one, you immediately change the probability that a measurement of Particle 2 yields some outcome. John Bell explored, and experimentalists have checked, statistics generated by entanglement. These statistics disobey rules that govern Heathrow-and-bus statistics. So do the effects of quantum phenomena like discord, negative Wigner functions, and weak measurements. Quantum theory and its contrast with classicality force us to reconsider probability.
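As a concrete illustration of statistics that break the classical rules, here is a short sketch of my own (using standard quantum mechanics, not any GPT machinery). It computes the CHSH combination of correlations for two spins in a singlet state, with the standard Tsirelson-optimal measurement angles; classically the combination cannot exceed 2.

```python
import numpy as np

# Pauli matrices and the two-qubit singlet state (|01> - |10>)/sqrt(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def spin(theta):
    """Spin observable along angle theta in the x-z plane (eigenvalues ±1)."""
    return np.cos(theta) * sz + np.sin(theta) * sx

def corr(theta_a, theta_b):
    """Correlation <A ⊗ B> of the two ±1 outcomes in the singlet state."""
    op = np.kron(spin(theta_a), spin(theta_b))
    return float(np.real(singlet.conj() @ op @ singlet))

# Angles that maximize the CHSH combination of four correlations
a, a2, b, b2 = 0.0, np.pi / 2, np.pi / 4, 3 * np.pi / 4
S = abs(corr(a, b) - corr(a, b2) + corr(a2, b) + corr(a2, b2))
print(S)  # 2*sqrt(2) ≈ 2.828, above the classical (local) bound of 2
```

No assignment of pre-existing ±1 values to the four measurements can produce a combination above 2, so the quantum value 2√2 is exactly the kind of "impossible" statistics that motivates studying the space of all conceivable probabilistic theories.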

Polarizer: Rise of the Efficiency

How should a visitor to Zürich spend her weekend?

Launch this question at a Swiss lunchtable, and you split diners into two camps. To take advantage of Zürich, some say, visit Geneva, Lucerne, or another spot outside Zürich. Other locals suggest museums, the lake, and the 19th-century ETH building.


The 19th-century ETH building

ETH, short for a German name I’ve never pronounced, is the polytechnic from which Einstein graduated. The polytechnic houses a quantum-information (QI) theory group that’s pioneering ideas I’ve blogged about: single-shot information, epsilonification, and small-scale thermodynamics. While visiting the group this August, I triggered an avalanche of tourism advice. Caught between two camps, I chose Option Three: Contemplate polar codes.

Polar codes compress information into the smallest space possible. Imagine you write a message (say, a Zürich travel guide) and want to encode it in the fewest possible symbols (so it fits in my camera bag). The longer the message, the fewer encoding symbols you need per encoded symbol: The more punch each code letter can pack. As the message grows, the encoding-to-encoded ratio decreases. The lowest possible ratio is a number, represented by H, called the Shannon entropy.

So established Claude E. Shannon in 1948. But Shannon didn’t know how to code at efficiency H. Not for 61 years did we know.
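The compression floor H is easy to compute for any concrete message. The snippet below is my own illustration (the sample text is invented): it computes the Shannon entropy of a message's empirical character distribution, the best achievable bits-per-character under that source model.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Entropy in bits per character of text's empirical letter distribution."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

guide = "grossmunster church: climb the tower for a panorama of zurich"
H = shannon_entropy(guide)
# No lossless code for this source model beats H bits per character on
# average; log2(alphabet size) is the trivial upper bound.
assert 0 < H <= math.log2(len(set(guide)))
print(round(H, 3))
```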

I learned how, just before that weekend. ETH student David Sutter walked me through polar codes as though down Zürich’s Bahnhofstrasse.


The Bahnhofstrasse, one of Zürich’s trendiest streets, early on a Sunday.

Say you’re encoding n copies of a random variable. When I say, “random variable,” think, “character in the travel guide.” Just as each character is one of 26 letters, each variable has one of many possible values.

Suppose the variables are independent and identically distributed. Even if you know some variables’ values, you can’t guess others’. Cryptoquote players might object that we can infer unknown from known letters. For example, a three-letter word that begins with “th” likely ends with “e.” But our message lacks patterns.

Think of the variables as diners at my lunchtable. Asking how to fill a weekend in Zürich—splitting the diners—I resembled the polarizer.

The polarizer is a mathematical object that sounds like an Arnold Schwarzenegger film and acts on the variables. Just as some diners pointed me outside Zürich, the polarizer gives some variables one property. Just as other diners pointed me to spots within Zürich, the polarizer gives some variables another property. Just as I pointed myself at polar codes, the polarizer gives some variables a third property.

These properties involve entropy. Entropy quantifies uncertainty about a variable’s value—about which of the 26 letters a character represents. Even if you know the early variables’ values, you can’t guess the later variables’. But we can guess some polarized variables’ values. Call the first polarized variable u_1, the second u_2, etc. If we can guess the value of some u_i, that u_i has low entropy. If we can’t guess the value, u_i has high entropy. The Nicole-esque variables have entropies like the earnings of Terminator Salvation: noteworthy but not chart-topping.

To recap: We want to squeeze a message into the tiniest space possible. Even if we know early variables’ values, we can’t infer later variables’. Applying the polarizer, we split the variables into low-, high-, and middling-entropy flocks. We can guess the value of each low-entropy u_i, if we know the foregoing u’s.
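One polarization step can be sketched in a few lines. This is my own toy illustration for i.i.d. biased bits (the bias 0.11 is an arbitrary choice): combining two bits as u1 = x1 XOR x2, u2 = x2 leaves the total entropy unchanged but pushes the two per-bit entropies apart—the seed of the low/high/middling split above.

```python
import math

def h(p):
    """Binary entropy in bits of a coin with bias p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.11                      # bias of each i.i.d. source bit
p_xor = 2 * p * (1 - p)       # u1 = x1 XOR x2 is Bernoulli(2p(1-p))

h_x  = h(p)                   # entropy of one source bit
h_u1 = h(p_xor)               # u1 alone: *more* uncertain than a source bit
h_u2 = 2 * h_x - h_u1         # u2 given u1 (chain rule): *less* uncertain

assert h_u1 > h_x > h_u2               # entropies pushed apart
assert abs((h_u1 + h_u2) - 2 * h_x) < 1e-12  # total entropy conserved
print(h_u1, h_x, h_u2)
```

Recursing this step on longer blocks drives almost every variable's entropy toward 0 or 1, which is what lets the code transcribe only the high-entropy u's.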

Almost finished!

In your camera-size travel guide, transcribe the high-entropy u_i’s. These u_i’s suggest the values of the low-entropy u_i’s. When you want to decode the guide, guess the low-entropy u_i’s. Then reverse the polarizer to reconstruct much of the original text.

The longer the original travel guide, the fewer errors you make while decoding, and the smaller the ratio of the encoded guide’s length to the original guide’s length. That ratio shrinks–as the guide’s length grows–to H. You’ve compressed a message maximally efficiently. As the Swiss say: Glückwünsche.

How does compression relate to QI? Quantum states form messages. Polar codes, ETH scientists have shown, compress quantum messages maximally efficiently. Researchers are exploring decoding strategies and relationships among (quantum) polar codes. With their help, Shannon-coded travel guides might fit not only in my camera bag, but also on the tip of my water bottle.

Should you need a Zürich travel guide, I recommend Grossmünster Church. Not only does the name fulfill your daily dose of umlauts. Not only did Ulrich Zwingli channel the Protestant Reformation into Switzerland there. Climbing a church tower affords a panorama of Zürich. After oohing over the hills and ahhing over the lake, you can shift your gaze toward ETH. The worldview being built there bewitches as much as the vista from any tower.


A tower with a view.

With gratitude to ETH’s QI-theory group (particularly to Renato Renner) for its hospitality. And for its travel advice. With gratitude to David Sutter for his explanations and patience.


The author and her neue Freunde.

The cost and yield of moving from (quantum) state to (quantum) state

The countdown had begun.

In ten days, I’d move from Florida, where I’d spent the summer with family, to Caltech. Unfolded boxes leaned against my dresser, and suitcases yawned on the floor. I was working on a paper. Even if I’d turned around from my desk, I wouldn’t have seen the stacked books and folded sheets. I’d have seen Lorenz curves, because I’d drawn Lorenz curves all week, and the curves seemed imprinted on my eyeballs.

Using Lorenz curves, we illustrate how much we know about a quantum state. Say you have an electron that you’ll measure using a magnet, and you can’t predict any measurement’s outcome. Whether you orient the magnet up-and-down, left-to-right, etc., you haven’t a clue what number you’ll read out. We represent this electron’s state by a straight line from (0, 0) to (1, 1).


Say you know the electron’s state. Say you know that, if you orient the magnet up-and-down, you’ll read out +1. This state, we call “pure.” We represent it by a tented curve.


The more you know about a state, the more the state’s Lorenz curve deviates from the straight line.


If Curve A fails to dip below Curve B, we know at least as much about State A as about State B. We can transform State A into State B by manipulating and/or discarding information.
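This comparison of curves is the majorization order, and it is easy to check numerically. The sketch below is my own illustration, assuming the states are represented by classical probability distributions (the function names are mine): a Lorenz curve is the running sum of probabilities sorted largest-first, and A can be transformed into B exactly when A's curve never dips below B's.

```python
import numpy as np

def lorenz(p):
    """Lorenz curve: cumulative sums of probabilities sorted largest-first."""
    q = np.sort(np.asarray(p, dtype=float))[::-1]
    return np.cumsum(q)

def can_convert(p_a, p_b):
    """A -> B is possible iff A's Lorenz curve never dips below B's."""
    return bool(np.all(lorenz(p_a) >= lorenz(p_b) - 1e-12))

pure    = [1.0, 0.0, 0.0, 0.0]       # full knowledge: tented curve
uniform = [0.25, 0.25, 0.25, 0.25]   # no knowledge: straight line
mixed   = [0.5, 0.3, 0.1, 0.1]       # partial knowledge

assert can_convert(pure, mixed)         # more-known -> less-known: allowed
assert can_convert(mixed, uniform)
assert not can_convert(uniform, mixed)  # can't create knowledge for free
```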


By the time I’d drawn those figures, I’d listed the items that needed packing. A coauthor had moved from North America to Europe during the same time. If he could hop continents without impeding the paper, I could hop states. I unzipped the suitcases, packed a box, and returned to my desk.

Say Curve A dips below Curve B. We know too little about State A to transform it into State B. But we might combine State A with a state we know lots about. The latter state, C, might be pure. We have so much information about A + C that the amalgam can turn into B.


What’s the least amount of information we need about C to ensure that A + C can turn into B? That number, we call the “cost of transforming State A into State B.”

We call it that usually. But late in the evening, after I’d miscalculated two transformation costs and deleted four curves, days before my flight, I didn’t type the cost’s name into emails to coauthors. I typed “the cost of turning A into B” or “the cost of moving from state to state.”

The million dollar conjecture you’ve never heard of…

Curating a blog like this one and writing about imaginary stuff like Fermat’s Lost Theorem means that you get the occasional comment of the form: I have a really short proof of a famous open problem in math. Can you check it for me? Usually, the answer is no. But, about a week ago, a reader of the blog who had caught an omission in a proof contained within one of my previous posts asked me to do just that: check out a short proof of Beal’s Conjecture. Many of you probably haven’t heard of billionaire Mr. Beal and his $1,000,000 conjecture, so here it is:

Let a,b,c and x,y,z > 2 be positive integers satisfying a^x+b^y=c^z. Then, gcd(a,b,c) > 1; that is, the numbers a,b,c have a common factor.
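Before any proving, one can sanity-check the conjecture by brute force. The sketch below is my own illustration (not from Beal or the submitted proof): it enumerates all small solutions of a^x + b^y = c^z with every exponent at least 3 and confirms that each shares a common factor, e.g. 3^3 + 6^3 = 3^5 with common factor 3.

```python
from math import gcd
from itertools import product

def beal_solutions(max_base=30, max_exp=5):
    """All (a,x,b,y,c,z) with a^x + b^y = c^z, bases <= max_base, exponents 3..max_exp."""
    limit = max_base ** max_exp
    powers = {}  # value -> list of (c, z) with c^z == value
    for c in range(2, max_base + 1):
        for z in range(3, max_exp + 1):
            v = c ** z
            if v <= limit:
                powers.setdefault(v, []).append((c, z))
    sols = []
    pairs = product(range(2, max_base + 1), range(3, max_exp + 1))
    for (a, x), (b, y) in product(list(pairs), repeat=2):
        total = a ** x + b ** y
        for c, z in powers.get(total, []):
            sols.append((a, x, b, y, c, z))
    return sols

sols = beal_solutions()
assert sols  # e.g. 3^3 + 6^3 = 3^5 appears in this range
# Every solution found shares a common factor, as the conjecture demands.
assert all(gcd(gcd(a, b), c) > 1 for (a, x, b, y, c, z) in sols)
```

Of course, no finite search proves anything; it only shows the conjecture survives in this small range.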

After reading the “short proof” of the conjecture, I realized that this was a pretty cool conjecture! Also, the short proof was wrong, though the ideas within were non-trivial. But, partial progress had been made by others, so I thought I would take a crack at it on the 10 hour flight from Athens to Philadelphia. In particular, I convinced myself that if I could prove the conjecture for all even exponents x,y,z, then I could claim half the prize. Well, I didn’t quite get there, but I made some progress using knowledge found in these two blog posts: Redemption: Part I and Fermat’s Lost Theorem. In particular, one can show that the conjecture holds true for x=y=2n and z = 2k, for n \ge 3, k \ge 1. Moreover, the general case of even exponents can be reduced to the cases x=y=p \ge 3 and y=z=q \ge 3, for p,q primes. Which makes one wonder if the general case has a similar reduction, where two of the three exponents can be assumed equal.

The proof is pretty trivial, since most of the heavy lifting is done by Fermat’s Last Theorem (which itself has a rather elegant, short proof I wanted to post in the margins – alas, WordPress has a no-writing-on-margins policy). Moreover, it turns out that the general case of even exponents follows from a combination of results obtained by others over the past two decades (see the Partial Results section of the Wikipedia article on the conjecture linked above – in particular, the (n,n,2) case). So why am I even bothering to write about my efforts? Because it’s math! And math equals magic. Also, in case this proof is not known and in the off chance that some of the ideas can be used in the general case. Okay, here we go…

Proof. The idea is to assume that the numbers a,b,c have no common factor and then reach a contradiction. We begin by noting that a^{2m}+b^{2n}=c^{2k} is equivalent to (a^m)^2+(b^n)^2=(c^k)^2. In other words, the triplet (a^m,b^n,c^k) is a Pythagorean triple (sides of a right triangle), so, taking a^m to be the even leg (one of the two legs must be even), we must have a^m=2rs, b^n=r^2-s^2, c^k =r^2+s^2, for some positive integers r,s with no common factors (otherwise, our assumption that a,b,c have no common factor would be violated). There are two cases to consider now:

Case I: r is even. This implies that 2r=a_0^m and s=a_1^m, where a=a_0\cdot a_1 and a_0,a_1 have no factors in common. Moreover, since b^n=r^2-s^2=(r+s)(r-s) and r,s have no common factors, then r+s,r-s have no common factors either (why?) Hence, r+s = b_0^n, r-s=b_1^n, where b=b_0\cdot b_1 and b_0,b_1 have no factors in common. But, a_0^m = 2r = (r+s)+(r-s)=b_0^n+b_1^n, implying that a_0^m=b_0^n+b_1^n, where b_0,b_1,a_0 have no common factors.

Case II: s is even. This implies that 2s=a_1^m and r=a_0^m, where a=a_0\cdot a_1 and a_0,a_1 have no factors in common. As in Case I, r+s = b_0^n, r-s=b_1^n, where b=b_0\cdot b_1 and b_0,b_1 have no factors in common. But, a_1^m = 2s = (r+s)-(r-s)=b_0^n-b_1^n, implying that a_1^m+b_1^n=b_0^n, where b_0,b_1,a_1 have no common factors.

We have shown, then, that if Beal’s conjecture holds for the exponents (x,y,z)=(n,n,m) and (x,y,z)=(m,n,n), then it holds for (x,y,z)=(2m,2n,2k), for arbitrary k \ge 1. As it turns out, when m=n, Beal’s conjecture becomes Fermat’s Last Theorem, implying that the conjecture holds for all exponents (x,y,z)=(2n,2n,2k), with n\ge 3 and k\ge 1.

Open Problem: Are there any solutions to a^p+b^p= c\cdot (a+b)^q, for a,b,c positive integers and primes p,q\ge 3?

PS: If you find a mistake in the proof above, please let everyone know in the comments. I would really appreciate it!

What’s inside a black hole?

I have a multiple choice question for you.

What’s inside a black hole?

(A) An unlimited amount of stuff.
(B) Nothing at all.
(C) A huge but finite amount of stuff, which is also outside the black hole.
(D) None of the above.

The first three answers all seem absurd, boosting the credibility of (D). Yet … at the “Rapid Response Workshop” on black holes I attended last week at the KITP in Santa Barbara (and which continues this week), most participants were advocating some version of (A), (B), or (C), with varying degrees of conviction.

When physicists get together to talk about black holes, someone is bound to draw a cartoon like this one:


Part of a Penrose diagram depicting the causal structure of a black hole spacetime.

I’m sure I’ve drawn and contemplated some version of this diagram hundreds of times over the past 25 years in the privacy of my office, and many times in public discussions (including at least five times during the talk I gave at the KITP). This picture vividly captures the defining property of a black hole, found by solving Einstein’s classical field equations for gravitation: once you go inside there is no way out. Instead you are unavoidably drawn to the dreaded singularity, where known laws of physics break down (and the picture can no longer be trusted). If taken seriously, the picture says that whatever falls into a black hole is gone forever, at least from the perspective of observers who stay outside.

But for nearly 40 years now, we have known that black holes can shed their mass by emitting radiation, and presumably this process continues until the black hole disappears completely. If we choose to, we can maintain the black hole for as long as we please by feeding it new stuff at the same rate that radiation carries energy away. What I mean by option (A) is that the radiation is completely featureless, carrying no information about what kind of stuff fell in. That means we can hide as much information as we please inside a black hole of a given mass.

On the other hand, the beautiful theory of black hole thermodynamics indicates that the entropy of a black hole is determined by its mass. For all other systems we know of besides black holes, the entropy of the system quantifies how much information we can hide in the system. If (A) is the right answer, then black holes would be fundamentally different in this respect, able to hide an unlimited amount of information even though their entropy is finite. Maybe that’s possible, but it would be rather disgusting, a reason to dislike answer (A).

There is another way to argue that (A) is not the right answer, based on what we call AdS/CFT duality. AdS (anti-de Sitter space) just describes a consistent way to put a black hole in a “bottle,” so we can regard the black hole together with the radiation outside it as a closed system. Now, in gravitation it is crucial to focus on properties of spacetime that do not depend on the observer’s viewpoint; otherwise we can easily get very confused. The best way to be sure we have a solid way of describing things is to pay attention to what happens at the boundary of the spacetime, the walls of the bottle — that’s what the CFT (conformal field theory) refers to. AdS/CFT provides us with tools for describing what happens when a black hole forms and evaporates, phrased entirely in terms of what happens on the walls of the bottle. If we can describe the physics perfectly by sticking to the walls of the bottle, always staying far away from the black hole, there doesn’t seem to be anyplace to hide an unlimited amount of stuff.

At the KITP, both Bill Unruh and Bob Wald argued forcefully for (A). They acknowledge the challenge of understanding the meaning of black hole entropy and of explaining why the AdS/CFT argument is wrong. But neither is willing to disavow the powerful message conveyed by that telling diagram of the black hole spacetime. As Bill said: “There is all that stuff that fell in and it crashed into the singularity and that’s it. Bye-bye.”

Adherents of (B) and (C) like to think about black hole physics from the perspective of an observer who stays outside the black hole. From that viewpoint, they say, the black hole behaves like any other system with a temperature and a finite entropy. Stuff falling in sticks to the black hole’s outer edge and gets rapidly mixed in with other stuff the black hole absorbed previously. For a black hole of a given mass, though, there is a limit to how much stuff it can hold. Eventually, what fell in comes out again, but in a form so highly scrambled as to be nearly unrecognizable.

The (B) and (C) camps differ over what happens to a brave observer who falls into a black hole. According to (C), an observer falling in crosses from the outside to the inside of a black hole peacefully, which poses a puzzle I discussed here. The puzzle arises because an uneventful crossing implies strong quantum entanglement between the region A just inside the black hole and region B just outside. On the other hand, as information leaks out of a black hole, region B should be strongly entangled with the radiation system R emitted by the black hole long ago. Entanglement can’t be shared, so it does not make sense for B to be entangled with both A and R. What’s going on? Answer (C) resolves the puzzle by positing that A and R are not really different systems, but rather two ways to describe the same system, as I discussed here. That seems pretty crazy, because R could be far, far away from the black hole.

Answer (B) resolves the puzzle differently, by positing that region A does not actually exist, because the black hole has no interior. An observer who attempts to fall in gets a very rude surprise, striking a seething “firewall” at the last moment before passing to the inside. That seems pretty crazy, because no firewall is predicted by Einstein’s trusty equations, which are normally very successful at describing spacetime geometry.

At the workshop, Don Marolf and Raphael Bousso gave some new arguments supporting (B). Both acknowledge that we still lack a concrete picture of how firewalls are created as black holes form, but Bousso insisted that “It is time to constrain and construct the dynamics of firewalls.” Joe Polchinski emphasized that, while AdS/CFT provides a very satisfactory description of physics outside a black hole, it has not yet been able to tell us enough about the black hole interior to settle whether there are firewalls or not, at least for generic black holes formed from collapsing matter.

Lenny Susskind, Juan Maldacena, Ted Jacobson, and I all offered different perspectives on how (C) could turn out to be the right answer. We all told different stories, but perhaps each of us had at least part of the right answer. I’m not at KITP this week, but there have been further talks supporting (C) by Raju, Nomura, and the Verlindes.

I had a fun week at the KITP. If you watch the videos of the talks, you might get an occasional glimpse of me typing furiously on my laptop. It looks like I’m doing my email, but actually that’s how I take notes, which helps me to pay attention. Every once in a while I was inspired to tweet.

I have felt for a while that ideas from quantum information can help us to grasp the mysteries of quantum gravity, so I appreciated that quantum information concepts came up in many of the talks. Susskind invoked quantum error-correcting codes in discussing how sensitively the state of the Hawking radiation depends on the information it encodes, and Maldacena used tensor networks to explain how to build spacetime geometry from quantum entanglement. Scott Aaronson proposed the appropriate acronym HARD for HAwking Radiation Decoding, and argued (following Harlow and Hayden) that this task is as hard as inverting an injective one-way function, something we don’t expect quantum computers to be able to do.

In the organizational session that launched the meeting, Polchinski remarked regarding firewalls that “Nobody has the slightest idea what is going on,” and Gary Horowitz commented that “I’m still getting over the shock over how little we’ve learned in the past 30 years.” I guess that’s fair. Understanding what’s inside black holes has turned out to be remarkably subtle, making the problem more and more tantalizing. Maybe the current state of confusion regarding black hole information means that we’re on the verge of important discoveries about quantum gravity, or maybe not. In any case, invigorating discussions like what I heard last week are bound to facilitate progress.

Steampunk quantum

A dark-haired man leans over a marble balustrade. In the ballroom below, his assistants tinker with animatronic elephants that trumpet and with potions for improving black-and-white photographs. The man is an inventor near the turn of the 20th century. Cape swirling about him, he watches technology wed fantasy.

Welcome to the steampunk genre. A stew of science fiction and Victorianism, steampunk has invaded literature, film, and the Wall Street Journal. A few years after James Watt improved the steam engine, protagonists build animatronics, clone cats, and time-travel. At sci-fi conventions, top hats and blast goggles distinguish steampunkers from superheroes.


The closest the author has come to dressing steampunk.

I’ve never read steampunk other than H. G. Wells’s The Time Machine—and other than the scene recapped above. The scene features in The Wolsenberg Clock, a novel by Canadian poet Jay Ruzesky. The novel caught my eye at an Ontario library.

In Ontario, I began researching the intersection of quantum information (QI) with thermodynamics. Thermodynamics is the study of energy, efficiency, and entropy. Entropy quantifies uncertainty about a system’s small-scale properties, given large-scale properties. Consider a room of air molecules. Knowing that the room has a temperature of 75°F, you don’t know whether some molecule is skimming the floor, poking you in the eye, or elsewhere. Ambiguities in molecules’ positions and momenta endow the gas with entropy. Whereas entropy suggests lack of control, work is energy that accomplishes tasks.

Monopoles passing through Flatland!

Like many mathematically inclined teenagers, I was charmed when I first read the book Flatland by Edwin Abbott Abbott.* It’s a story about a Sphere who visits a two-dimensional world and tries to awaken its inhabitants to the existence of a third dimension. As perceived by Flatlanders, the Sphere is a circle which appears as a point, grows to maximum size, then shrinks and disappears.

My memories of Flatland were aroused as I read a delightful recent paper by Max Metlitski, Charlie Kane, and Matthew Fisher about magnetic monopoles and three-dimensional bosonic topological insulators. To explain why, I’ll need to recall a few elements of the theory of monopoles and of topological insulators, before returning to the connection between the two and why that reminds me of Flatland.


Flatlanders, confined to the two-dimensional surface of a topological insulator, are convinced by a magnetic monopole that a third dimension must exist.

Monopoles

Paul Dirac was no ordinary genius. Aside from formulating relativistic electron theory and predicting the existence of antimatter, Dirac launched the quantum theory of magnetic monopoles in a famous 1931 paper. Dirac envisioned a magnetic monopole as a semi-infinitely long, infinitesimally thin string of magnetic flux, such that the end of the string, where the flux spills out, seems to be a magnetic charge. For this picture to make sense, the string should be invisible. Dirac pointed out that an electron with electric charge e, transported around a string carrying flux \Phi, could detect the string (via what later came to be called the Aharonov-Bohm effect) unless the flux is an integer multiple of 2\pi\hbar /e, where \hbar is the reduced Planck constant. Conversely, in order for the string to be invisible, if a magnetic monopole exists with magnetic charge g_D = 2\pi\hbar /e, then all electric charges must be integer multiples of e. Thus the existence of magnetic monopoles (which have never been observed) could explain quantization of electric charge (which has been observed).
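
Dirac’s invisibility condition is easy to play with numerically. Here is a small Python sketch (my own illustration, not Dirac’s, in units where \hbar = e = 1) of the Aharonov-Bohm phase an electron picks up when it encircles the string:

```python
import cmath
from math import pi

HBAR, E = 1.0, 1.0  # illustrative units with hbar = e = 1

def ab_phase(flux):
    """Aharonov-Bohm phase for a charge-e electron encircling flux Phi:
    exp(i * e * Phi / hbar)."""
    return cmath.exp(1j * E * flux / HBAR)

g_D = 2 * pi * HBAR / E  # Dirac's flux quantum, 2*pi*hbar/e

# An integer number of Dirac quanta on the string is invisible (phase 1) ...
print(abs(ab_phase(3 * g_D) - 1) < 1e-12)    # True
# ... while a fractional flux would betray the string (phase -1 here).
print(abs(ab_phase(0.5 * g_D) + 1) < 1e-12)  # True
```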

Captivated by the beauty of his own proposal, Dirac concluded his paper by remarking, “One would be surprised if Nature had made no use of it.”

Our understanding of quantized magnetic monopoles advanced again in 1979 when another extraordinary physicist, Edward Witten, discussed a generalization of Dirac’s quantization condition. Witten noted that the Lagrangian density of electrodynamics could contain a term of the form

\frac{\theta e^2\hbar}{4\pi^2}~\vec{E}\cdot\vec{B},

where \vec{E} is the electric field and \vec{B} is the magnetic field. This “\theta term” may also be expressed as

\frac{\theta e^2\hbar}{8\pi^2}~ \partial^\mu\left(\epsilon_{\mu\nu\lambda\sigma}A^\nu\partial^\lambda A^\sigma \right),

where A is the vector potential, and hence is a total derivative which makes no contribution to the classical field equations of electrodynamics. But Witten realized that it can have important consequences for the quantum properties of magnetic monopoles. Specifically, the \theta term modifies the field momentum conjugate to the vector potential, which becomes

\vec{E}+\frac{\theta e^2\hbar}{4\pi^2}\vec{B}.

Because the Gauss law condition satisfied by physical quantum states is altered, for a monopole with magnetic charge m g_D , where g_D is Dirac’s minimal charge 2\pi\hbar /e and m is an integer, the allowed values of the electric charge become

q = e\left( n - \frac{\theta m}{2\pi}\right),

where n is an integer. This spectrum of allowed charges remains invariant if \theta advances by 2\pi, suggesting that the parameter \theta is actually an angular variable with period 2\pi. This periodicity of \theta can be readily verified in a theory admitting fermions with the minimal charge e. But if the charged particles are bosons then \theta turns out to be a periodic variable with period 4\pi instead.
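
To make that periodicity concrete, here is a quick Python check (my illustration, not from Witten’s paper) that shifting \theta by 2\pi merely relabels the integer n and leaves the set of allowed charges unchanged:

```python
import math

def allowed_charges(theta, m, n_range, e=1.0):
    """Witten's spectrum q = e*(n - theta*m/(2*pi)) for magnetic charge m*g_D."""
    return sorted(e * (n - theta * m / (2 * math.pi)) for n in n_range)

# Shifting theta by 2*pi shifts every charge by -m*e, which the relabeling
# n -> n + m exactly undoes, so the spectrum as a set is invariant:
m = 1
a = allowed_charges(0.0, m, range(-10, 11))
b = allowed_charges(2 * math.pi, m, range(-9, 12))  # same physics, shifted labels
print(a == b)  # True
```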

That \theta has a different period for a bosonic theory than a fermionic one has an interesting interpretation. As Goldhaber noticed in 1976, dyons carrying both magnetic and electric charge can exhibit statistical transmutation. That is, in a purely bosonic theory, a dyon with magnetic charge g_D= 2\pi\hbar/e and electric charge ne is a fermion if n is an odd integer — when two dyons are exchanged, transport of each dyon’s electric charge in the magnetic field of the other dyon induces a sign change in the wave function. In a fermionic theory the story is different; now we can think of the dyon as a fermionic electric charge bound to a bosonic monopole. There are two canceling contributions to the exchange phase of the dyon, which is therefore a boson for any integer value of n, whether even or odd.

As \theta smoothly increases from 0 to 2\pi, the statistics (whether bosonic or fermionic) of a dyon remains fixed even as the dyon’s electric charge increases by e. For the bosonic theory with \theta = 2\pi, then, dyons with magnetic charge g_D and electric charge ne are bosons for n odd and fermions for n even, the opposite of what happens when \theta=0. For the bosonic theory, unlike the fermionic theory, we need to increase \theta by 4\pi for the physics of dyons to be fully invariant.
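
The statistics flip is simple bookkeeping, which this Python sketch records (my own illustration, with e = 1): the key fact, taken from the argument above, is that a dyon’s statistics tracks the integer label n while its electric charge q = e(n - \theta/2\pi) drifts with \theta.

```python
import math

E = 1.0  # unit electric charge (illustrative units)

def dyon(n, theta):
    """Dyon with magnetic charge g_D in a purely bosonic theory: its electric
    charge is q = e*(n - theta/(2*pi)), while its statistics is fixed by the
    integer n, which does not change as theta varies continuously."""
    q = E * (n - theta / (2 * math.pi))
    return q, ("fermion" if n % 2 else "boson")

# theta = 0: the charge-n*e dyons are fermions exactly when n is odd ...
print([dyon(n, 0.0) for n in range(3)])
# ... theta = 2*pi: the same set of charges appears, but with statistics
# reversed, so full periodicity requires theta -> theta + 4*pi.
print([dyon(n, 2 * math.pi) for n in range(3)])
```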

In 1979 Ed Witten was a postdoc at Harvard, where I was a student, though he was visiting CERN for the summer when he wrote his paper about the \theta-dependent monopole charge. I always read Ed’s papers carefully, but I gave special scrutiny to this one because magnetic monopoles were a pet interest of mine. At the time, I wondered whether the Witten effect might clarify how to realize the \theta parameter in a lattice gauge theory. But it certainly did not occur to me that the \theta-dependent electric charge of a magnetic monopole could have important implications for quantum condensed matter physics. Theoretical breakthroughs often have unexpected consequences, which may take decades to emerge.

Symmetry-protected topological phases

Okay, now let’s talk about topological insulators, a very hot topic in condensed matter physics these days. Actually, a topological insulator is a particular instance of a more general concept called a symmetry-protected topological phase of matter (or SPT phase). Consider a d-dimensional hunk of material with a (d-1)-dimensional boundary. If the material is in an SPT phase, then the physics of the d-dimensional bulk is boring — it’s just an insulator with an energy gap, admitting no low-energy propagating excitations. But the physics of the (d-1)-dimensional edge is exotic and exciting — for example the edge might support “gapless” excitations of arbitrarily low energy which can conduct electricity. The exotica exhibited by the edge is a consequence of a symmetry, and is destroyed if the symmetry is broken either explicitly or spontaneously; that is why we say the phase is “symmetry protected.”

The low-energy edge excitations can be described by a (d-1)-dimensional effective field theory. But for a typical SPT phase, this effective field theory is what we call anomalous, which means that for one reason or another the theory does not really make sense. The anomaly tells us something interesting and important, namely that the (d-1)-dimensional theory cannot be really, truly (d-1) dimensional; it can arise only at the edge of a higher-dimensional system.

This phenomenon, in which the edge does not make sense by itself without the bulk, is nicely illustrated by the integer quantum Hall effect, which occurs in a two-dimensional electron system in a high magnetic field and at low temperature, if the sample is sufficiently clean so that the electrons are highly mobile and rarely scattered by impurities. In this case the relevant symmetry is electron number, or equivalently the electric charge. At the one-dimensional edge of a two-dimensional quantum Hall sample, charge carriers move in only one direction — to the right, say, but not to the left. A theory with such chiral electric charges does not really make sense. One problem is that electric charge is not conserved — an electric field along the edge causes charge to be locally created, which makes the theory inconsistent.

The way the theory resolves this conundrum is quite remarkable. A two-dimensional strip of quantum Hall fluid has two edges, one at the top, the other at the bottom. While the top edge has only right-moving excitations, the bottom edge has only left-moving excitations. When electric charge appears on the top edge, it is simultaneously removed from the bottom edge. Rather miraculously, charge can be conveyed across the bulk from one edge to the other, even though the bulk does not have any low-energy excitations at all.
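
Here is a cartoon of that charge bookkeeping in Python (purely illustrative, in units where the Hall conductance is 1): each edge taken alone violates charge conservation, but the bulk Hall current carries exactly the compensating charge from one edge to the other.

```python
# Anomaly inflow cartoon for the integer quantum Hall strip.

def evolve(q_top, q_bottom, e_field, dt, sigma_xy=1.0):
    """Charge appears on the right-moving top edge and disappears from the
    left-moving bottom edge at the rate set by the bulk Hall current."""
    pumped = sigma_xy * e_field * dt
    return q_top + pumped, q_bottom - pumped

q_top, q_bot = 0.0, 0.0
for _ in range(100):
    q_top, q_bot = evolve(q_top, q_bot, e_field=0.5, dt=0.01)

print(q_top, q_bot)             # each edge's charge has changed ...
print(round(q_top + q_bot, 12)) # ... but the total is conserved: 0.0
```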

I first learned about this interplay of edge and bulk physics from a beautiful 1985 paper by Curt Callan and Jeff Harvey. They explained very lucidly how an edge theory with an anomaly and a bulk theory with an anomaly can fit together, with each solving the other’s problems. Curiously, the authors did not mention any connection with the quantum Hall effect, which had been discovered five years earlier, and I didn’t appreciate the connection myself until years later.

Topological insulators

In the case of topological insulators, the symmetries which protect the gapless edge excitations are time-reversal invariance and conserved particle number, i.e. U(1) symmetry. Though the particle number might not be coupled to an electromagnetic gauge field, it is instructive for the purpose of understanding the properties of the symmetry-protected phase to imagine that the U(1) symmetry is gauged, and then to consider the potential anomalies that could afflict this gauge symmetry. The first topological insulators conceived by theorists were envisioned as systems of non-interacting electrons whose properties were relatively easy to understand using band theory. But it was not so clear at first how interactions among the electrons might alter their exotic behavior. The wonderful thing about anomalies is that they are robust with respect to interactions. In many cases we can infer the features of anomalies by studying a theory of non-interacting particles, assured that these features survive even when the particles interact.

Like many previous authors, Metlitski et al. argue that when we couple the conserved particle number to a U(1) gauge field, the effective theory describing the bulk physics of a topological insulator in three dimensions may contain a \theta term. But wait … since the electric field is even under time reversal and the magnetic field is odd, the \theta term is T-odd; under T, \theta is mapped to -\theta, so T seems to be violated if \theta has any nonzero value. Except … we have to remember that \theta is really a periodic variable. For a fermionic topological insulator the period is 2\pi; therefore the theory with \theta = \pi is time reversal invariant; \theta = \pi maps to \theta = -\pi under T, which is equivalent to a rotation of \theta by 2\pi. For a bosonic topological insulator the period is 4\pi, which means that \theta = 2\pi is the nontrivial T-invariant value.

If we say that a “trivial” insulator (e.g., the vacuum) has \theta = 0, then we may say that a bulk material with \theta = \pi (fermionic case) or \theta = 2\pi (bosonic case) is a “nontrivial” (a.k.a. topological) insulator. At the edge of the sample, where bulk material meets vacuum, \theta must rotate suddenly by \pi (fermions) or by 2\pi (bosons). The exotic edge physics is a consequence of this abrupt change in \theta.

Monopoles in Flatland

To understand the edge physics, and in particular to grasp how fermionic and bosonic topological insulators differ, Metlitski et al. invite us to imagine a magnetic monopole with magnetic charge g_D passing through the boundary between the bulk and the surrounding vacuum. To the Flatlanders confined to the surface of the bulk sample, the passing monopole induces a sudden change in the magnetic flux through the surface by a single flux quantum g_D, which could arise due to a quantum tunneling event. What does the Flatlander see?

In a fermionic topological insulator, there is a monopole that carries charge e/2 when inside the sample (where \theta=-\pi) and charge 0 when outside (where \theta=0). Since electric charge is surely conserved in the full three-dimensional theory, the change in the monopole’s charge must be compensated by a corresponding change in the charge residing on the surface. Flatlanders are puzzled to witness a spontaneously arising excitation with charge e/2. This is an anomaly — electric charge conservation is violated, which can only make sense if Flatlanders are confined to a surface in a higher-dimensional world. Though unable to escape their surface world, the Flatlanders can be convinced by the Monopole that an extra dimension must exist.

In a bosonic topological insulator, the story is somewhat different: there is a monopole that carries electric charge 0 when inside the sample (where \theta=-2\pi) and charge -e when outside (where \theta=0). In this case, though, there are bosonic charge-e particles living on the surface. A monopole can pick up a charged particle as it passes through Flatland, so that its charge is 0 both inside the bulk sample and outside in the vacuum. Flatlanders are happy — electric charge is conserved!

But hold on … there’s still something wrong. Inside the bulk (where \theta= -2\pi) a monopole with electric charge 0 is a fermion, while outside in the vacuum (where \theta = 0) it is a boson. In the three-dimensional theory it is not possible for any local process to create an isolated fermion, so if the fermionic monopole becomes a bosonic monopole as it passes through Flatland, it must leave a fermion behind. Flatlanders are puzzled to witness a spontaneously arising fermion. This is an anomaly — conservation of fermionic parity is violated, which can only make sense if Flatlanders are confined to a surface in a higher-dimensional world. Once again, the clever residents of Flatland learn from the Monopole about an extra spatial dimension, without ever venturing outside their two-dimensional home.
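
Both stories follow from plugging the surface values of \theta into Witten’s charge formula, a bit of arithmetic this Python check makes explicit (my illustration; the integer n labels a particular monopole, which keeps its n as it crosses the surface):

```python
import math

def monopole_charge(theta, n, m=1, e=1.0):
    """Witten's formula q = e*(n - theta*m/(2*pi)) for a monopole of charge m*g_D."""
    return e * (n - theta * m / (2 * math.pi))

# Fermionic topological insulator: the n = 0 monopole carries charge e/2
# inside the sample (theta = -pi) and charge 0 outside (theta = 0).
print(monopole_charge(-math.pi, n=0))  # 0.5
print(monopole_charge(0.0, n=0))       # 0.0

# Bosonic topological insulator: the n = -1 monopole carries charge 0
# inside (theta = -2*pi) and charge -e outside.
print(monopole_charge(-2 * math.pi, n=-1))  # 0.0
print(monopole_charge(0.0, n=-1))           # -1.0
```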

Topological order gets edgy

This post is already pretty long and I should wrap it up. Before concluding I’ll remark that the theory of symmetry-protected phases has been developing rapidly in recent months.

In particular, a new idea, introduced last fall by Vishwanath and Senthil, has been attracting increasing attention. While in most previously studied SPT phases the unbroken symmetry protects gapless excitations confined to the edge of the sample, Vishwanath and Senthil pointed out another possibility — a gapped edge exhibiting topological order. The surface can support anyons with exotic braiding statistics.

Here, too, anomalies are central to the discussion. While anyons in two-dimensional media are already a much-studied subject, the anyon models that can be realized at the edges of three-dimensional SPT phases are different than anyon models realized in really, truly two-dimensional systems. What’s new are not the braiding properties of the anyons, but rather how the anyons transform under the symmetry. Flatlanders who study the symmetry realization in their gapped two-dimensional world should be able to infer the existence of the three-dimensional bulk.

The pace of discovery picked up this month when four papers appeared simultaneously on the preprint arXiv, by Metlitski-Kane-Fisher, Chen-Fidkowski-Vishwanath, Bonderson-Nayak-Qi, and Wang-Potter-Senthil, all proposing and analyzing models of SPT phases with gapped edges. It remains to be seen, though, whether this physics will be realized in actual materials.

Are we on the edge?

In Flatland, our two-dimensional friend, finally able to perceive the third dimension thanks to the Sphere’s insistent tutelage, begs to enter a world of still higher dimensions, “where thine own intestines, and those of kindred Spheres, will lie exposed to … view.” The Sphere is baffled by the Flatlander’s request, protesting, “There is no such land. The very idea of it is utterly inconceivable.”

Let’s not be so dogmatic as the Sphere. The lessons learned from the quantum Hall effect and the topological insulator have prepared us to take the next step, envisioning our own three-dimensional world as the edge of a higher-dimensional bulk system. The existence of an unseen bulk may be inferred in the future by us edgelings, if experimental explorations of our three-dimensional effective theory reveal anomalies begging for an explanation.

Perhaps we are on the edge … of a great discovery. At least it’s conceivable.

*Disclaimer: The gender politics of Flatland, to put it mildly, is outdated and offensive. I don’t wish to endorse the idea that women are one dimensional! I included the reference to Flatland because the imagery of two-dimensional beings struggling to imagine the third dimension is a perfect fit to the scientific content of this post.

This single-shot life

The night before defending my Masters thesis, I ran out of shampoo. I ran out late enough that I wouldn’t defend from beneath a mop like Jack Sparrow’s; but, belonging to the Luxuriant Flowing-Hair Club for Scientists (technically, if not officially), I’d have to visit Shopper’s Drug Mart.


The author’s unofficially Luxuriant Flowing Scientist Hair

Before visiting Shopper’s Drug Mart, I had to defend my thesis. The thesis, as explained elsewhere, concerns epsilons, the mathematical equivalents of seed pearls. The thesis also concerns single-shot information theory.

Ordinary information theory emerged in 1948, midwifed by American engineer Claude E. Shannon. Shannon calculated how efficiently we can pack information into symbols when encoding long messages. Consider encoding this article in the fewest possible symbols. Because “the” appears many times, you might represent “the” by one symbol. Longer strings of symbols suit misfits like “luxuriant” and “oobleck.” The longer the article, the fewer encoding symbols you need per encoded word. The encoding-to-encoded ratio decreases, toward a number called the Shannon entropy, as the message grows infinitely long.

Claude Shannon
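
As an aside not in the original post, the compression limit Shannon identified is easy to compute for a toy message; here is a minimal Python sketch of the entropy of a message’s empirical symbol distribution:

```python
from collections import Counter
from math import log2

def shannon_entropy(message):
    """Average bits per symbol needed to encode symbols drawn from the
    message's empirical distribution (the long-message compression limit)."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# A repetitive message compresses well; a varied one does not.
print(shannon_entropy("ababababab"))  # 1.0 bit per symbol
print(shannon_entropy("abcdefgh"))    # 3.0 bits per symbol
```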

We don’t send infinitely long messages, excepting teenagers during phone conversations. How efficiently can we encode just one article or sentence? The answer involves single-shot information theory, or—to those stuffing long messages into the shortest possible emails to busy colleagues—“1-shot info.” Pioneered within the past few years, single-shot theory concerns short messages and single trials, the Twitter to Shannon’s epic. Like articles, quantum states can form messages. Hence single-shot theory blended with quantum information in my thesis.
