Reading the sub(linear) text

Physicists are not known for finesse. “Even if it cost us our funding,” I’ve heard a physicist declare, “we’d tell you what we think.” Little wonder I irked the porter who directed me toward central Cambridge.

The University of Cambridge consists of colleges as the US consists of states. Each college has a porter’s lodge, where visitors check in and students beg for help after locking their keys in their rooms. And where physicists ask for directions.

Last March, I ducked inside a porter’s lodge that bustled with deliveries. The woman behind the high wooden desk volunteered to help me, but I asked too many questions. By my fifth, her pointing at a map had devolved to jabbing.

Read the subtext, I told myself. Leave.

Or so I would have told myself, if not for that afternoon.

That afternoon, I’d visited Cambridge’s CMS, which merits every letter in “Centre for Mathematical Sciences.” Home to Isaac Newton’s intellectual offspring, the CMS consists of eight soaring, glass-walled, blue-topped pavilions. Their majesty walloped me as I turned off the road toward the gatehouse. So did the congratulatory letter from Queen Elizabeth II that decorated the route to the restroom.

P1040733

I visited Nilanjana Datta, an affiliated lecturer of Cambridge’s Faculty of Mathematics, and her student, Felix Leditzky. Nilanjana and Felix specialize in entropies and one-shot information theory. Entropies quantify uncertainties and efficiencies. Imagine compressing many copies of a message into the smallest possible number of bits (units of memory). How few bits can you use per copy? That number, we call the optimal compression rate. It shrinks as the number of copies compressed grows. As the number of copies approaches infinity, that compression rate drops toward a number called the message’s Shannon entropy. If the message is quantum, the compression rate approaches the von Neumann entropy.

Good luck squeezing infinitely many copies of a message onto a hard drive. How efficiently can we compress fewer copies? According to one-shot information theory, the answer involves entropies other than Shannon’s and von Neumann’s. In addition to describing data compression, entropies describe the charging of batteriesthe concentration of entanglementthe encrypting of messages, and other information-processing tasks.

Speaking of compressing messages: Suppose one-shot information theory posted status updates on Facebook. Suppose that that panel on your Facebook page’s right-hand side showed news weightier than celebrity marriages. The news feed might read, “TRENDING: One-shot information theory: Second-order asymptotics.”

Second-order asymptotics, I learned at the CMS, concerns how the optimal compression rate decays as the number of copies compressed grows. Imagine compressing a billion copies of a quantum message ρ. The number of bits needed about equals a billion times the von Neumann entropy HvN(ρ). Since a billion is less than infinity, 1,000,000,000 HvN(ρ) bits won’t suffice. Can we estimate the compression rate more precisely?

The question reminds me of gas stations’ hidden pennies. The last time I passed a station’s billboard, some number like $3.65 caught my eye. Each gallon cost about $3.65, just as each copy of ρ costs about HvN(ρ) bits. But a 9/10, writ small, followed the $3.65. If I’d budgeted $3.65 per gallon, I couldn’t have filled my tank. If you budget HvN(ρ) bits per copy of ρ, you can’t compress all your copies.

Suppose some station’s owner hatches a plan to promote business. If you buy one gallon, you pay $3.654. The more you purchase, the more the final digit drops from four. By cataloguing receipts, you calculate how a tank’s cost varies with the number of gallons, n. The cost equals $3.65 × n to a first approximation. To a second approximation, the cost might equal $3.65 × n + an, wherein a represents some number of cents. Compute a, and you’ll have computed the gas’s second-order asymptotics.

Nilanjana and Felix computed a’s associated with data compression and other quantum tasks. Second-order asymptotics met information theory when Strassen combined them in nonquantum problems. These problems developed under attention from Hayashi, Han, Polyanski, Poor, Verdu, and others. Tomamichel and Hayashi, as well as Li, introduced quantumness.

In the total-cost expression, $3.65 × n depends on n directly, or “linearly.” The second term depends on √n. As the number of gallons grows, so does √n, but √n grows more slowly than n. The second term is called “sublinear.”

Which is the word that rose to mind in the porter’s lodge. I told myself, Read the sublinear text.

Little wonder I irked the porter. At least—thanks to quantum information, my mistake, and facial expressions’ contagiousness—she smiled.

 

 

With thanks to Nilanjana Datta and Felix Leditzky for explanations and references; to Nilanjana, Felix, and Cambridge’s Centre for Mathematical Sciences for their hospitality; and to porters everywhere for providing directions.

BICEP2 and soccer

I’m a Caltech physics graduate student who has been working on BICEP2/Keck Array since 2009. BICEP2 reported the first detection of B-mode polarization on degree angular scales, which John Preskill lists as one of his 10 biggest thrills. In my first post here on Quantum Frontiers, I will allegorically describe the current state of affairs of cosmic microwave background (CMB) cosmology. In a future post, I may share with you some of the insider history of how we arrived at this result and how it has felt to be a graduate student through it all.

BICEP2 is an experimental victory. The interpretation of the result is a work in progress. Emphasizing this distinction, I’ll present a soccer analogy in the spirit of the times. It’s as if there are goals on the board at halftime in a soccer game where neither team is expected to score any goals whatsoever. Although there’s no guarantee, the most likely interpretation of having goals on the board at halftime is that one of the two teams, namely inflationary gravitational waves, will win the game.

FIFA represents particle physics. It has been an international powerhouse for about a century. Fans around the globe are tuned into the World Cup. The discovery of the Higgs boson was the champion last time. Who will win this year?

Major League Soccer (MLS), on the other hand, represents CMB cosmology. It is a young league, only around since the 1990s. Its roots go back to the 1960s via the North American Soccer League, best remembered for the New York Cosmos, (or perhaps their New Jersey neighbors) who won an amazing championship in 1978. If you’re not from North America, you probably don’t pay much attention to MLS. You’ve heard of the LA Galaxy thanks to European superstar “David Planckham,” but you couldn’t name any other team. You may even be annoyed that they call it “soccer” instead of “football.” MLS is divided into Western and Eastern Conferences just as CMB cosmologists are concerned with large and small angular scales. Our story concerns an exhibition game between two Western Conference teams. The San Jose Earthquakes represent the ground-shaking possibility of inflationary gravitational waves. The LA Galaxy represent astronomical foregrounds. (Pick your favorite teams and adjust the analogy to taste.)

Let’s pretend that every time the Earthquakes and the Galaxy have ever played against each other over the years, the result has always been 0-0. Not just a draw. Scoreless. The matches become so boring year after year that they become a comedic punchline. Both sides’ defenses are too strong. To bring the fans back to the stands, the two teams agree to a publicity stunt. They’ll play a game at the South Pole! Crazy, right? The event is sponsored by BICEP, a brand of energy drink.*

Bookmakers take bets on the final score. Since neither team has ever scored against the other, the rules are simplified. You don’t have to bet on who will win or by how many goals. You only have to bet on the total score (detection significance in “sigma”). The odds for zero total score are high, so it’s the safe choice. Payoffs increase a little bit if you’re willing to bet against odds on a positive score, maybe one or two goals. You’ll win a small fortune if you correctly bet that three goals (“evidence”) are on the board by the end. A game ending with five goals or above (“discovery”) earns you a jackpot. Dizzle Mapo**, owner of the BICEP brand and an Earthquakes fan, decides to make it more interesting by putting a sizable chunk of her own money down on the long shot bet that at least five goals will be scored by the end. She doesn’t expect to get her money back. But why not go for it? She has the potential to win the largest sum in the history of sports gambling.

They play the game. It’s a slow one with lots of stoppage time. When the game is over, the final score is still a disappointing nil. Oh well, at least we all had a laugh watching the players stuff hand warmers in their shinguards.

One year later, Dizzle organizes for the two teams to return to South Pole in the BICEP2 friendly. She doesn’t want another stalemate, so as organizer decides to modify the net. Since she’s an Earthquakes fan, she widens only the Galaxy’s net. (The noise-equivalent temperature or NET is a measure of a CMB telescope’s sensitivity. BICEP2 looked only at the 150 GHz band in a field where the CMB was predicted to be brighter than the foregrounds.)

Dizzle discloses the widening of the net to the bookmakers. They will allow her to repeat her wager on the condition that they modify the scoreboard to display only the total score and not the individual team scores. She agrees. If there are any goals on the board, she’ll be pretty sure that they are goals for the Earthquakes. She won’t need to keep close track of which team scores which goals. After all, the Galaxy are the ones with the handicap. Few other bets come in. Most of the world is distracted by the World Cup. Besides, the common wisdom is that the game will end with zero goals, same as always. OK, maybe it’ll end with one or two goals. A week from now hardly anyone will remember the score anyway. Big whoop.

The BICEP2 players themselves are more excited about the game than the fans are. They’re playing to win. They kick in some goals. By minute twenty, the total score climbs to 5.

Wow! (It’s a discovery!) Dizzle screams so loudly that people can hear her through the thin walls of her luxury box. Word spreads across social media that something exciting is happening at some obscure soccer game at South Pole although it’s unclear what. Most of the mainstream media weren’t following it too closely until the rumors. Even the sports journalists were focused on the FIFA World Cup instead. It’s about ten minutes before halftime, and they’re all catching up. By the time they make any sense of the situation, that giant 5 is still on display.

In the next few minutes of playtime, events external to the game start to get weirder. First, an earthquake strikes the city of LA. For real. Second, reporters prepare the news item with a misleading headline, “Earthquakes score big win in BICEP2 soccer match at South Pole.” A more descriptive headline could have read, “Earthquakes likely, Galaxy possibly, scored big gambling win for Dizzle Mapo in BICEP2 soccer match at South Pole.” When asked for comment, Dizzle says, “I just won the biggest sports bet ever! Go Earthquakes!” What she means, of course, is, “I just won a bet that this game would have more than 5 goals scored in total. I won that bet, and my favorite team is likely winning by a strong margin. I’ll keep cheering for them through the second half. I may be able to figure out the exact score for each team if I ask what other spectators around the stadium saw. Next year, I’ll widen both nets and display both teams’ scores.” What the interviewers think she means, however, is, “I just won a bet that the Earthquakes would win big time. They did.” Journalists want to be accurate, but the story is unfamiliar and complicated. They are racing against the clock to break the story.

Soccer fans around the world pause a moment from the World Cup to read the piece. “Earthquakes did well in that BICEP2 game? Oh cool.” Even people who aren’t sports fans at all see the headline. “A soccer match at South Pole? Sounds fascinating!” Only people who read the entire news article learn about the wider net and the simplified scoreboard, and only the dedicated fans actively watching the game know that it’s still ongoing.

A TV camera is set up to broadcast the remainder of the game live. With one minute left in the first half, the referee blows the whistle. A player for the Galaxy is down. It’s unclear from the replay whether it was a foul or a flop. In any case the Galaxy are awarded a penalty kick. The shot goes in.

Now it’s halftime. In an impromptu ceremony, the bookmaker P. R. Letters presents Dizzle with an oversized check. Asked what she plans to do next, she exclaims, “I’m going to watch Spice World!”*** Expert sports commentators at the scene give an accurate halftime report. Five goals were scored, the majority of which were likely scored by the Earthquakes. At least one goal was scored by the Galaxy. Meanwhile, in the rest of the media, a new headline appears, even more misleading than the last one, “What the Dizzle? BICEP stock price drops as Earthquakes proven not to have shut out Galaxy.”

A few hooligans jeer from the sidelines that Dizzle was irresponsible for speaking to media before halftime, nay, before consulting everyone in the stadium for a detailed reconstruction of every play from start to finish. Ignore them. It’s a soccer game at the freakin’ South Pole. It’s the biggest win in sports gambling history. It’s news.  The public deserves to hear it. The confused media wouldn’t even be here in time for the jumbo check and halftime report if it weren’t for all the advance buzz. Is it so wrong that they take a breather from FIFA to learn about MSL and stick around for part two?

I can’t predict what will happen in the second half. Based on the balance of information about the unusual rules and the total score, the Earthquakes are likely in the lead and favored to win. Conceivably, the Galaxy can make a comeback. In fact – again conceivably – the Galaxy could already be in the lead. Maybe they trained much harder than anyone expected to offset their disadvantageously wide net. No matter the outcome, both teams will be good sports about it. Fans on both sides will demand a rematch. Dizzle has announced her plans to host BICEP3 because South Pole soccer is simply fun. There are many other brands of energy drink hoping to sponsor their own exhibition matches. It’s a competitive industry. You can also follow the Eastern Conference, where there’s an entirely different game being played. I’d pay to watch a game in outer space.

* BICEP = CAPITAL MUSCLE!!
** “Dizzle Mapo” refers to the Dark Sector Laboratory (DSL), which has housed the BICEP series of telescopes, and the Martin A. Pomerantz Observatory (MAPO), which currently houses Keck Array.
*** The five Keck Array receivers are named after the Spice Girls.

Ten reasons why black holes exist

I spent the past two weeks profoundly confused. I’ve been trying to get up to speed on this firewall business and I wanted to understand the picture below.

Much confuse. Such lost. [Is doge out of fashion now? I wouldn't know because I've been trapped in a black hole!]

[Technical paragraph that you can skip.] I’ve been trying to understand why the picture on the left is correct, even though my intuition said the middle picture should be (intuition should never be trusted when thinking about quantum gravity.) The details of these pictures are technical and tangential to this post, but the brief explanation is that these pictures are called Penrose diagrams and they provide an intuitive way to think about the time dynamics of black holes. The two diagrams on the left represent the same physics as the schematic diagram on the right. I wanted to understand why during Hawking radiation, the radial momentum for partner modes is in the same direction. John Preskill gave me the initial reasoning, that “radial momentum is not an isometry of Schwarzchild or Rindler geometries,” then I struggled for a few weeks to unpack this, and then Dan Harlow rescued me with some beautiful derivations that make it crystal clear that the picture on the left is indeed correct. I wanted to understand this because if the central picture is correct, then it would be hard to catch up to an infalling Hawking mode and therefore to verify firewall weirdness. The images above are simple enough, but maybe the image below will give you a sense for how much of an uphill battle this was!

leftorrightinfallingmodes

This pretty much sums up my last two weeks (with the caveat that each of these scratch sheets is double sided!) Or in case you wanted to know what a theoretical physicist does all day.

After four or five hours of maxing out my brain, it starts to throb. For the past couple of weeks, after breaking my brain with firewalls each day, I’ve been switching gears and reading about black hole astronomy (real-life honest-to-goodness science with data!) Beyond wanting to know the experimental state-of-the-art related to the fancy math I’ve been thinking about, I also had the selfish motivation that I wanted to do some PR maintenance after Nature’s headline: “Stephen Hawking: ‘There are no black holes’.” I found this headline infuriating when Nature posted it back in January. When taken out of context, this quote makes it seem like Stephen Hawking was saying “hey guys, my bad, we’ve been completely wrong all this time. Turn off the telescopes.” When in reality what he was saying was more like: “hey guys, I think this really hard modern firewall paradox is telling us that we’ve misunderstood an extremely subtle detail and we need to make corrections on the order of a few Planck lengths, but it matters!” When you combine this sensationalism with Nature’s lofty credibility, the result is that even a few of my intelligent scientist peers have interpreted this as the non-existence of astrophysical black holes. Not to mention that it opens a crack for the news media to say things like: ‘if even Stephen Hawking has been wrong all this time, then how can we possibly trust the rest of this scientist lot, especially related to climate change?’ So brain throbbing + sensationalism => learning black hole astronomy + PR maintenance.

Before presenting the evidence, I should wave my hands about what we’re looking for. You have all heard about black holes. They are objects where so much mass gets concentrated in such a small volume that Einstein’s general theory of relativity predicts that once an object passes beyond a certain distance (called the event horizon), then said object will never be able to escape, and must proceed to the center of the black hole. Even photons cannot escape once they pass beyond the event horizon (except when you start taking quantum mechanics into account, but this is a small correction which we won’t focus on here.) All of our current telescopes collect photons, and as I just mentioned, when photons get close to a black hole, they fall in, so this means a detection with current technology will only be indirect. What are these indirect detections we have made? Well, general relativity makes numerous predictions about black holes. After we confirm enough of these predictions to a high enough precision, and without a viable alternative theory, we can safely conclude that we have detected black holes. This is similar to how many other areas of science work, like particle physics finding new particles through detecting a particle’s decay products, for example.

Without further ado, I hope the following experimental evidence will convince you that black holes permeate our universe (and if not black holes, then something even weirder and more interesting!)

1. Sgr A*: There is overwhelming evidence that there is a supermassive black hole at the center of our galaxy, the Milky Way. As a quick note, most of the black holes we have detected are broken into two categories, solar mass, where they are only a few times more massive than our sun (5-30 solar masses), or supermassive, where the mass is about 10^5-10^{10} solar masses. Some of the most convincing evidence comes from the picture below. Andrea Ghez and others tracked the orbits of several stars around the center of the Milky Way for over twenty years. We have learned that these stars orbit around a point-like object with a mass on the order of 4\times 10^6 solar masses. Measurements in the radio spectrum show that there is a radio source located in the same location which we call Sagittarius A* (Sgr A*). Sgr A* is moving at less than 1km/s and has a mass of at least 10^5 solar masses. These bounds make it pretty clear that Sgr A* is the same object as what is at the focus of these orbits. A radio source is exactly what you would expect for this system because as dust particles get pulled towards the black hole, they collide and friction causes them to heat up, and hot objects radiate photons. These arguments together make it pretty clear that Sgr A* is a supermassive black hole at the center of the Milky Way!

This plot shows the orbits of a few stars

What are you looking at? This plot shows the orbits of a few stars around the center of our galaxy, tracked over 17 years!

2. Orbit of S2: During a recent talk that Andrea Ghez gave at Caltech, she said that S2 is “her favorite star.” S2 is a 15 solar mass star located near the black hole at the center of our galaxy. S2’s distance from this black hole is only about four times the distance from Neptune to the Sun (at closest point in orbit), and it’s orbital period is only 15 years. The Keck telescopes in Mauna Kea have followed almost two complete orbits of S2. This piece of evidence is redundant compared to point 1, but it’s such an amazing technological feat that I couldn’t resist including it.

We've followed S2's complete orbit. Is it orbiting around nothing? Something exotic that we have no idea about? Or much more likely around a black hole.

We’ve followed S2’s complete orbit. Is it orbiting around nothing? Something exotic that we have no idea about? Or much more likely around a black hole.

3. Numerical studies: astrophysicists have done numerous numerical simulations which provide a different flavor of test. Christian Ott at Caltech is pretty famous for these types of studies.

Image from a numerical simulation that Christian Ott and his student Evan O'Connor performed.

Image from a numerical simulation that Christian Ott and his student Evan O’Connor performed.

4. Cyg A: Cygnus A is a galaxy located in the Cygnus constellation. It is an exceptionally bright radio source. As I mentioned in point 1, as dust falls towards a black hole, friction causes it to heat up and then hot objects radiate away photons. The image below demonstrates this. We are able to use the Eddington limit to convert luminosity measurements into estimates of the mass of Cyg A. Not necessarily in the case of Cyg A, but in the case of its cousins Active Galactic Nuclei (AGNs) and Quasars, we are also able to put bounds on their sizes. These two things together show that there is a huge amount of mass trapped in a small volume, which is therefore probably a black hole (alternative models can usually be ruled out.)

There is a supermassive black hole at the center of this image. Dust falls towards it, gets heated up

There is a supermassive black hole at the center of this image which powers the rest of this action! The black hole is spinning and it emits relativistic jets along its axis of rotation. The blobs come from the jets colliding with the intergalactic medium.

5. AGNs and Quasars: these are bright sources which are powered by supermassive black holes. Arguments similar to those used for Cyg A make us confident that they really are powered by black holes and not some alternative.

6. X-ray binaries: astronomers have detected ~20 stellar mass black holes by finding pairs consisting of a star and a black hole, where the star is close enough that the black hole is sucking in its mass. This leads to accretion which leads to the emission of X-Rays which we detect on Earth. Cygnus X-1 is a famous example of this.

7. Water masers: Messier 106 is the quintessential example.

8. Gamma ray bursts: most gamma ray bursts occur when a rapidly spinning high mass star goes supernova (or hypernova) and leaves a neutron star or black hole in its wake. However, it is believed that some of the “long” duration gamma ray bursts are powered by accretion around rapidly spinning black holes.

That’s only eight reasons but I hope you’re convinced that black holes really exist! To round out this list to include ten things, here are two interesting open questions related to black holes:

1. Firewalls: I mentioned this paradox at the beginning of this post. This is the cutting edge of quantum gravity which is causing hundreds of physicists to pull their hair out!

2. Feedback: there is an extremely strong correlation between the size of a galaxy’s supermassive black hole and many of the other properties in the galaxy. This connection was only realized about a decade ago and trying to understand how the black hole (which has a mass much smaller than the total mass of the galaxy) affects galaxy formation is an active area of research in astrophysics.

In addition to everything mentioned above, I want to emphasize that most of these results are only from the past decade. Not to mention that we seem to be close to the dawn of gravitational wave astronomy which will allow us to probe black holes more directly. There are also exciting instruments that have recently come online, such as NuSTAR. In other words, this is an extremely exciting time to be thinking about black holes–both from observational and theoretical perspectives–we have data and a paradox! In conclusion, black holes exist. They really do. And let’s all make a pact to read critically in the 21st century!

Cool resource from Sky and Telescope.

[* I want to thank my buddy Kaes Van't Hof for letting me crash on his couch in NYC last week, which is where I did most of this work. ** I also want to thank Dan Harlow for saving me months of confusion by sharing a draft of his notes from his course on firewalls at the Israeli Institute for Advanced Study's winter school in theoretical physics.]

“Feveral kinds of hairy mouldy fpots”

The book had a sheepskin cover, and mold was growing on the sheepskin. Robert Hooke, a pioneering microbiologist, slid the cover under one of the world’s first microscopes. Mold, he discovered, consists of “nothing elfe but feveral kinds of fmall and varioufly figur’d Mufhroms.” He described the Mufhroms in his treatise Micrographia, a 1665 copy of which I found in “Beautiful Science.” An exhibition at San Marino’s Huntington Library, “Beautiful Science” showcases the physics of rainbows, the stars that enthralled Galileo, and the world visible through microscopes.

Hooke image copy

Beautiful science of yesterday: An illustration, from Hooke’s Micrographia, of the mold.

“[T]hrough a good Microfcope,” Hooke wrote, the sheepskin’s spots appeared “to be a very pretty fhap’d Vegetative body.”

How like a scientist, to think mold pretty. How like quantum noise, I thought, Hooke’s mold sounds.

Quantum noise hampers systems that transmit and detect light. To phone a friend or send an email—“Happy birthday, Sarah!” or “Quantum Frontiers has released an article”—we encode our message in light. The light traverses a fiber, buried in the ground, then hits a detector. The detector channels the light’s energy into a current, a stream of electrons that flows down a wire. The variations in the current’s strength is translated into Sarah’s birthday wish.

If noise doesn’t corrupt the signal. From encoding “Happy birthday,” the light and electrons might come to encode “Hsappi birthdeay.” Quantum noise arises because light consists of packets of energy, called “photons.” The sender can’t control how many photons hit the detector.

To send the letter H, we send about 108 photons.* Imagine sending fifty H’s. When we send the first, our signal might contain 108- 153 photons; when we send the second, 108 + 2,083; when we send the third, 108 – 6; and so on. Receiving different numbers of photons, the detector generates different amounts of current. Different amounts of current can translate into different symbols. From H, our message can morph into G.

This spring, I studied quantum noise under the guidance of IQIM faculty member Kerry Vahala. I learned to model quantum noise, to quantify it, when to worry about it, and when not. From quantum noise, we branched into Johnson noise (caused by interactions between the wire and its hot environment); amplified-spontaneous-emission, or ASE, noise (caused by photons belched by ions in the fiber); beat noise (ASE noise breeds with the light we sent, spawning new noise); and excess noise (the “miscellaneous” folder in the filing cabinet of noise types).

Vahala image copy

Beautiful science of today: A microreso-nator—a tiny pendulum-like device— studied by the Vahala group.

Noise, I learned, has structure. It exhibits patterns. It has personalities. I relished studying those patterns as I relish sending birthday greetings while battling noise. Noise types, I see as a string of pearls unearthed in a junkyard. I see them as “pretty fhap[es]” in Hooke’s treatise. I see them—to pay a greater compliment—as “hairy mouldy fpots.”

P1040754

*Optical-communications ballpark estimates:

  • Optical power: 1 mW = 10-3 J/s
  • Photon frequency: 200 THz = 2 × 1014 Hz
  • Photon energy: h𝜈 = (6.626 × 10-34 J . s)(2 × 1014 Hz) = 10-19 J
  • Bit rate: 1 GB = 109 bits/s
  • Number of bits per H: 10
  • Number of photons per H: (1 photon / 10-19 J) (10-3 J/s)(1 s / 109 bits)(10 bits / 1 H) = 108

 

An excerpt from this post was published today on Verso, the blog of the Huntington Library, Art Collection, and Botanical Gardens.

With thanks to Bassam Helou, Dan Lewis, Matt Stevens, and Kerry Vahala for feedback. With thanks to the Huntington Library (including Catherine Wehrey) and the Vahala group for the Micrographia image and the microresonator image, respectively.

The theory of everything: Help wanted

When Scientific American writes that physicists are working on a theory of everything, does it sound ambitious enough to you? Do you lie awake at night thinking that a theory of everything should be able to explain, well, everything? What if that theory is founded on quantum mechanics and finds a way to explain gravitation through the microscopic laws of the quantum realm? Would that be a grand unified theory of everything?

The answer is no, for two different, but equally important reasons. First, there is the inherent assumption that quantum systems change in time according to Schrodinger’s evolution: i \hbar \partial_t \psi(t) = H \psi(t). Why? Where does that equation come from? Is it a fundamental law of nature, or is it an emergent relationship between different states of the universe? What if the parameter t, which we call time, as well as the linear, self-adjoint operator H, which we call the Hamiltonian, are both emergent from a more fundamental, and highly typical phenomenon: the large amount of entanglement that is generically found when one decomposes the state space of a single, static quantum wavefunction, into two (different in size) subsystems: a clock and a space of configurations (on which our degrees of freedom live)? So many questions, so few answers.

The static multiverse

The perceptive reader may have noticed that I italicized the word ‘static’ above, when referring to the quantum wavefunction of the multiverse. The emphasis on static is on purpose. I want to make clear from the beginning that a theory of everything can only be based on axioms that are truly fundamental, in the sense that they cannot be derived from more general principles as special cases. How would you know that your fundamental principles are irreducible? You start with set theory and go from there. If that assumes too much already, then you work on your set theory axioms. On the other hand, if you can exhibit a more general principle from which your original concept derives, then you are on the right path towards more fundamentalness.

In that sense, time and space as we understand them, are not fundamental concepts. We can imagine an object that can only be in one state, like a switch that is stuck at the OFF position, never changing or evolving in any way, and we can certainly consider a complete graph of interactions between subsystems (the equivalent of a black hole in what we think of as space) with no local geometry in our space of configurations. So what would be more fundamental than time and space? Let’s start with time: The notion of an unordered set of numbers, such as \{4,2,5,1,3,6,8,7,12,9,11,10\}, is a generalization of a clock, since we are only keeping the labels, but not their ordering. If we can show that a particular ordering emerges from a more fundamental assumption about the very existence of a theory of everything, then we have an understanding of time as a set of ordered labels, where each label corresponds to a particular configuration in the mathematical space containing our degrees of freedom. In that sense, the existence of the labels in the first place corresponds to a fundamental notion of potential for change, which is a prerequisite for the concept of time, which itself corresponds to constrained (ordered in some way) change from one label to the next. Our task is first to figure out where the labels of the clock come from, then where the illusion of evolution comes from in a static universe (Heisenberg evolution), and finally, where the arrow of time comes from in a macroscopic world (the illusion of irreversible evolution).

The axioms we ultimately choose must satisfy the following conditions simultaneously: 1. the implications stemming from these assumptions are not contradicted by observations, 2. replacing any one of these assumptions by its negation would lead to observable contradictions, and 3. the assumptions contain enough power to specify non-trivial structures in our theory. In short, as Immanuel Kant put it in his accessible bedtime story The critique of Pure Reason, we are looking for synthetic a priori knowledge that can explain space and time, which ironically were Kant’s answer to that same question.

The fundamental ingredients of the ultimate theory

Before someone decides to delve into the math behind the emergence of unitarity (Heisenberg evolution) and the nature of time, there is another reason why the grand unified theory of everything has to do more than just give a complete theory of how the most elementary subsystems in our universe interact and evolve. What is missing is the fact that quantity has a quality all its own. In other words, patterns emerge from seemingly complex data when we zoom out enough. This “zooming out” procedure manifests itself in two ways in physics: as coarse-graining of the data and as truncation and renormalization. These simple ideas allow us to reduce the computational complexity of evaluating the next state of a complex system: If most of the complexity of the system is hidden at a level you cannot even observe (think pre retina-display era), then all you have to keep track of is information at the macroscopic, coarse-grained level. On top of that, you can use truncation and renormalization to zero in on the most likely/ highest weight configurations your coarse-grained data can be in – you can safely throw away a billion configurations, if their combined weight is less than 0.1% of the total, because your super-compressed data will still give you the right answer with a fidelity of 99.9%. This is how you get to reduce a 9 GB raw video file down to a 300 MB Youtube video that streams over your WiFi connection without losing too much of the video quality.

I will not focus on the second requirement for the “theory of everything”, the dynamics of apparent complexity. I think that this fundamental task is the purview of other sciences, such as chemistry, biology, anthropology and sociology, which look at the “laws” of physics from higher and higher vantage points (increasingly coarse-graining the topology of the space of possible configurations). Here, I would like to argue that the foundation on which a theory of everything rests, at the basement level if such a thing exists, consists of four ingredients: Math, Hilbert spaces with tensor decompositions into subsystems, stability and compressibility. Now, you know about math (though maybe not of Zermelo-Fraenkel set theory), you may have heard of Hilbert spaces if you majored in math and/or physics, but you don’t know what stability, or compressibility mean in this context. So let me motivate the last two with a question and then explain in more detail below: What are the most fundamental assumptions that we sweep under the rug whenever we set out to create a theory of anything that can fit in a book – or ten thousand books – and still have predictive power? Stability and compressibility.

Math and Hilbert spaces are fundamental in the following sense: A theory needs a Language in order to encode the data one can extract from that theory through synthesis and analysis. The data will be statistical in the most general case (with every configuration/state we attach a probability/weight of that state conditional on an ambient configuration space, which will often be a subset of the total configuration space), since any observer creating a theory of the universe around them only has access to a subset of the total degrees of freedom. The remaining degrees of freedom, what quantum physicists group as the Environment, affect our own observations through entanglement with our own degrees of freedom. To capture this richness of correlations between seemingly uncorrelated degrees of freedom, the mathematical space encoding our data requires more than just a metric (i.e. an ability to measure distances between objects in that space) – it requires an inner-product: a way to measure angles between different objects, or equivalently, the ability to measure the amount of overlap between an input configuration and an output configuration, thus quantifying the notion of incremental change. Such mathematical spaces are precisely the Hilbert spaces mentioned above and contain states (with wavefunctions being a special case of such states) and operators acting on the states (with measurements, rotations and general observables being special cases of such operators). But, let’s get back to stability and compressibility, since these two concepts are not standard in physics.

Stability

Stability is that quality that says that if the theory makes a prediction about something observable, then we can test our theory by making observations on the state of the world and, more importantly, new observations do not contradict our theory. How can a theory fall apart if it is unstable? One simple way is to make predictions that are untestable, since they are metaphysical in nature (think of religious tenets). Another way is to make predictions that work for one level of coarse-grained observations and fail for a lower level of finer coarse-graining (think of Newtonian Mechanics). A more extreme case involves quantum mechanics assumed to be the true underlying theory of physics, which could still fail to produce a stable theory of how the world works from our point of view. For example, say that your measurement apparatus here on earth is strongly entangled with the current state of a star that happens to go supernova 100 light-years from Earth during the time of your experiment. If there is no bound on the propagation speed of the information between these two subsystems, then your apparatus is engulfed in flames for no apparent reason and you get random data, where you expected to get the same “reproducible” statistics as last week. With no bound on the speed with which information can travel between subsystems of the universe, our ability to explain and/or predict certain observations goes out the window, since our data on these subsystems will look like white noise, an illusion of randomness stemming from the influence of inaccessible degrees of freedom acting on our measurement device. But stability has another dimension; that of continuity. We take for granted our ability to extrapolate the curve that fits 1000 data points on a plot. If we don’t assume continuity (and maybe even a certain level of smoothness) of the data, then all bets are off until we make more measurements and gather additional data points. But even then, we can never gather an infinite (let alone, uncountable) number of data points – we must extrapolate from what we have and assume that the full distribution of the data is close in norm to our current dataset (a norm is a measure of distance between states in the Hilbert space).

The emergence of the speed of light

The assumption of stability may seem trivial, but it holds within it an anthropic-style explanation for the bound on the speed of light. If there is no finite speed of propagation for the information between subsystems that are “far apart”, from our point of view, then we will most likely see randomness where there is order. A theory needs order. So, what does it mean to be “far apart” if we have made no assumption for the existence of an underlying geometry, or spacetime for that matter? There is a very important concept in mathematical physics that generalizes the concept of the speed of light for non-relativistic quantum systems whose subsystems live on a graph (i.e. where there may be no spatial locality or apparent geometry): the Lieb-Robinson velocity. Those of us working at the intersection of mathematical physics and quantum many-body physics, have seen first-hand the powerful results one can get from the existence of such an effective and emergent finite speed of propagation of information between quantum subsystems that, in principle, can signal to each other instantaneously through the action of a non-local unitary operator (rotation of the full system under Heisenberg evolution). It turns out that under certain natural assumptions on the graph of interactions between the different subsystems of a many-body quantum system, such a finite speed of light emerges naturally. The main requirement on the graph comes from the following intuitive picture: If each node in your graph is connected to only a few other nodes and the number of paths between any two nodes is bounded above in some nice way (say, polynomially in the distance between the nodes), then communication between two distant nodes will take time proportional to the distance between the nodes (in graph distance units, the smallest number of nodes among all paths connecting the two nodes). Why? Because at each time step you can only communicate with your neighbors and in the next time step they will communicate with theirs and so on, until one (and then another, and another) of these communication cascades reaches the other node. Since you have a bound on how many of these cascades will eventually reach the target node, the intensity of the communication wave is bounded by the effective action of a single messenger traveling along a typical path with a bounded speed towards the destination. There should be generalizations to weighted graphs, but this area of mathematical physics is still really active and new results on bounds on the Lieb-Robinson velocity gather attention very quickly.

Escaping black holes

If this idea holds any water, then black holes are indeed nearly complete graphs, where the notion of space and time breaks down, since there is no effective bound on the speed with which information propagates from one node to another. The only way to escape is to find yourself at the boundary of the complete graph, where the nodes of the black hole’s apparent horizon are connected to low-degree nodes outside. Once you get to a low-degree node, you need to keep moving towards other low-degree nodes in order to escape the “gravitational pull” of the black hole’s super-connectivity. In other words, gravitation in this picture is an entropic force: we gravitate towards massive objects for the same reason that we “gravitate” towards the direction of the arrow of time: we tend towards higher entropy configurations – the probability of reaching the neighborhood of a set of highly connected nodes is much, much higher than hanging out for long near a set of low-degree nodes in the same connected component of the graph. If a graph has disconnected components, then their is no way to communicate between the corresponding spacetimes – their states are in a tensor product with each other. One has to carefully define entanglement between components of a graph, before giving a unified picture of how spatial geometry arises from entanglement. Somebody get to it.

Erik Verlinde has introduced the idea of gravity as an entropic force and Fotini Markopoulou, et al. have introduced the notion of quantum graphity (gravity emerging from graph models). I think these approaches must be taken seriously, if only because they work with more fundamental principles than the ones found in Quantum Field Theory and General Relativity. After all, this type of blue sky thinking has led to other beautiful connections, such as ER=EPR (the idea that whenever two systems are entangled, they are connected by a wormhole). Even if we were to disagree with these ideas for some technical reason, we must admit that they are at least trying to figure out the fundamental principles that guide the things we take for granted. Of course, one may disagree with certain attempts at identifying unifying principles simply because the attempts lack the technical gravitas that allows for testing and calculations. Which is why a technical blog post on the emergence of time from entanglement is in the works.

Compressibility

So, what about that last assumption we seem to take for granted? How can you have a theory you can fit in a book about a sequence of events, or snapshots of the state of the observable universe, if these snapshots look like the static noise on a TV screen with no transmission signal? Well, you can’t! The fundamental concept here is Kolmogorov complexity and its connection to randomness/predictability. A sequence of data bits like:

10011010101101001110100001011010011101010111010100011010110111011110

has higher complexity (and hence looks more random/less predictable) than the sequence:

10101010101010101010101010101010101010101010101010101010101010101010

because there is a small computer program that can output each successive bit of the latter sequence (even if it had a million bits), but (most likely) not of the former. In particular, to get the second sequence with one million bits one can write the following short program:

string s = ’10′;
for n=1 to 499,999:
s.append(’10’);
n++;
end
print s;

As the number of bits grows, one may wonder if the number of iterations (given above by 499,999), can be further compressed to make the program even smaller. The answer is yes: The number 499,999 in binary requires \log_2 499,999 bits, but that binary number is a string of 0s and 1s, so it has its own Kolmogorov complexity, which may be smaller than \log_2 499,999. So, compressibility has a strong element of recursion, something that in physics we associate with scale invariance and fractals.

You may be wondering whether there are truly complex sequences of 0,1 bits, or if one can always find a really clever computer program to compress any N bit string down to, say, N/100 bits. The answer is interesting: There is no computer program that can compute the Kolmogorov complexity of an arbitrary string (the argument has roots in Berry’s Paradox), but there are strings of arbitrarily large Kolmogorov complexity (that is, no matter what program we use and what language we write it in, the smallest program (in bits) that outputs the N-bit string will be at least N bits long). In other words, there really are streams of data (in the form of bits) that are completely incompressible. In fact, a typical string of 0s and 1s will be almost completely incompressible!

Stability, compressibility and the arrow of time

So, what does compressibility have to do with the theory of everything? It has everything to do with it. Because, if we ever succeed in writing down such a theory in a physics textbook, we will have effectively produced a computer program that, given enough time, should be able to compute the next bit in the string that represents the data encoding the coarse-grained information we hope to extract from the state of the universe. In other words, the only reason the universe makes sense to us is because the data we gather about its state is highly compressible. This seems to imply that this universe is really, really special and completely atypical. Or is it the other way around? What if the laws of physics were non-existent? Would there be any consistent gravitational pull between matter to form galaxies and stars and planets? Would there be any predictability in the motion of the planets around suns? Forget about life, let alone intelligent life and the anthropic principle. Would the Earth, or Jupiter even know where to go next if it had no sense that it was part of a non-random plot in the movie that is spacetime? Would there be any notion of spacetime to begin with? Or an arrow of time? When you are given one thousand frames from one thousand different movies, there is no way to make a single coherent plot. Even the frames of a single movie would make little sense upon reshuffling.

What if the arrow of time emerged from the notions of stability and compressibility, through coarse-graining that acts as a compression algorithm for data that is inherently highly-complex and, hence, highly typical as the next move to make? If two strings of data look equally complex upon coarse-graining, but one of them has a billion more ways of appearing from the underlying raw data, then which one will be more likely to appear in the theory-of-everything book of our coarse-grained universe? Note that we need both high compressibility after coarse-graining in order to write down the theory, as well as large entropy before coarse-graining (from a large number of raw strings that all map to one string after coarse-graining), in order to have an arrow of time. It seems that we need highly-typical, highly complex strings that become easy to write down once we coarse grain the data in some clever way. Doesn’t that seem like a contradiction? How can a bunch of incompressible data become easily compressible upon coarse-graining? Here is one way: Take an N-bit string and define its 1-bit coarse-graining as the boolean AND of its digits. All but one strings will default to 0. The all 1s string will default to 1. Equally compressible, but the probability of seeing the 1 after coarse-graining is 2^{-N}. With only 300 bits, finding the coarse-grained 1 is harder than looking for a specific atom in the observable universe. In other words, if the coarse-graining rule at time t is the one given above, then you can be pretty sure you will be seeing a 0 come up next in your data. Notice that before coarse-graining, all 2^N strings are equally likely, so there is no arrow of time, since there is no preferred string from a probabilistic point of view.

Conclusion, for now

When we think about the world around us, we go to our intuitions first as a starting point for any theory describing the multitude of possible experiences (observable states of the world). If we are to really get to the bottom of this process, it seems fruitful to ask “why do I assume this?” and “is that truly fundamental or can I derive it from something else that I already assumed was an independent axiom?” One of the postulates of quantum mechanics is the axiom corresponding to the evolution of states under Schrodinger’s equation. We will attempt to derive that equation from the other postulates in an upcoming post. Until then, your help is wanted with the march towards more fundamental principles that explain our seemingly self-evident truths. Question everything, especially when you think you really figured things out. Start with this post. After all, a theory of everything should be able to explain itself.

UP NEXT: Entanglement, Schmidt decomposition, concentration measure bounds and the emergence of discrete time and unitary evolution.

Top 10 questions for your potential PhD adviser/group

Everyone in grad school has taken on the task of picking the perfect research group at some point.  Then some among us had the dubious distinction of choosing the perfect research group twice.  Luckily for me, a year of grad research taught me a lot and I found myself asking group members and PIs (primary investigators) very different questions.  And luckily for you, I wrote these questions down to share with future generations.  My background as an experimental applied physicist showed through initially, so I got Shaun Maguire and Spiros Michalakis to help make it applicable for theorists too, and most of them should be useful outside physics as well.

Questions to break that silence when your potential advisor asks “So, do you have any questions for me?”

1. Are you taking new students?
– 2a. if yes: How many are you looking to take?
– 2b. if no: Ask them about the department or other professors.  They’ve been there long enough to have opinions.  Alternatively, ask what kinds of questions they would suggest you ask other PIs
3. What is the procedure for joining the group?
4. (experimental) Would you have me TA?  (This is the nicest way I thought of to ask if a PI can fund you with a research assistance-ship (RA), though sometimes they just like you to TA their class.)
4. (theory) Funding routes will often be covered by question 3 since TAs are the dominant funding method for theory students, unlike for experimentalists. If relevant, you can follow up with: How does funding for your students normally work? Do you have funding for me?
5. Do new students work for/report to other grad students, post docs, or you directly?
6. How do you like students to arrange time to meet with you?
7. How often do you have group meetings?
8. How much would you like students to prepare for them?
9. Would you suggest I take any specific classes?
10. What makes someone a good fit for this group?

And then for the high bandwidth information transfer.  Grill the group members themselves, and try to ask more than one group member if you can.

1. How much do you prepare for meetings with PI?
2. How long until people lead their own project? – Equivalently, who’s working on what projects.
3. How much do people on different projects communicate? (only group meeting or every day)
4. Is the PI hands on (how often PI wants to meet with you)?
5. Is the PI accessible (how easily can you meet with the PI if you want to)?
6. What is the average time to graduation? (if it’s important to you personally)
7. Does the group/subgroup have any bonding activities?
8. Do you think I should join this group?
9. What are people’s backgrounds?
10. What makes someone a good fit for this group?

Hope that helps.  If you have any other suggested questions, be sure to leave them in the comments.

John Preskill and the dawn of the entanglement frontier

Editor’s Note: John Preskill’s recent election to the National Academy of Sciences generated a lot of enthusiasm among his colleagues and students. In an earlier post today, famed Stanford theoretical physicist, Leonard Susskind, paid tribute to John’s early contributions to physics ranging from magnetic monopoles to the quantum mechanics of black holes. In this post, Daniel Gottesman, a faculty member at the Perimeter Institute, takes us back to the formative years of the Institute for Quantum Information at Caltech, the precursor to IQIM and a world-renowned incubator for quantum information and quantum computation research. Though John shies away from the spotlight, we, at IQIM, believe that the integrity of his character and his role as a mentor and catalyst for science are worthy of attention and set a good example for current and future generations of theoretical physicists.

Preskill's legacy may well be the incredible number of preeminent research scientists in quantum physics he has mentored throughout his extraordinary career.

Preskill’s legacy may well be the incredible number of preeminent research scientists in quantum physics he has mentored throughout his extraordinary career.

When someone wins a big award, it has become traditional on this blog for John Preskill to write something about them. The system breaks down, though, when John is the one winning the award. Therefore I’ve been brought in as a pinch hitter (or should it be pinch lionizer?).

The award in this case is that John has been elected to the National Academy of Sciences, along with Charlie Kane and a number of other people that don’t work on quantum information. Lenny Susskind has already written about John’s work on other topics; I will focus on quantum information.

On the research side of quantum information, John is probably best known for his work on fault-tolerant quantum computation, particularly topological fault tolerance. John jumped into the field of quantum computation in 1994 in the wake of Shor’s algorithm, and brought me and some of his other grad students with him. It was obvious from the start that error correction was an important theoretical challenge (emphasized, for instance, by Unruh), so that was one of the things we looked at. We couldn’t figure out how to do it, but some other people did. John and I embarked on a long drawn-out project to get good bounds on the threshold error rate. If you can build a quantum computer with an error rate below the threshold value, you can do arbitrarily large quantum computations. If not, then errors will eventually overwhelm you. Early versions of my project with John suggested that the threshold should be about 10^{-4}, and the number began floating around (somewhat embarrassingly) as the definitive word on the threshold value. Our attempts to bound the higher-order terms in the computation became rather grotesque, and the project proceeded very slowly until a new approach and the recruitment of Panos Aliferis finally let us finish a paper with a rigorous proof of a slightly lower threshold value.

Meanwhile, John had also been working on topological quantum computation. John has already written about his excitement when Kitaev visited Caltech and talked about the toric code. The two of them, plus Eric Dennis and Andrew Landahl, studied the application of this code for fault tolerance. If you look at the citations of this paper over time, it looks rather … exponential. For a while, topological things were too exotic for most quantum computer people, but over time, the virtues of surface codes have become obvious (apparently high threshold, convenient for two-dimensional architectures). It’s become one of the hot topics in recent years and there are no signs of flagging interest in the community.

John has also made some important contributions to security proofs for quantum key distribution, known to the cognoscenti just by its initials. QKD allows two people (almost invariably named Alice and Bob) to establish a secret key by sending qubits over an insecure channel. If the eavesdropper Eve tries to live up to her name, her measurements of the qubits being transmitted will cause errors revealing her presence. If Alice and Bob don’t detect the presence of Eve, they conclude that she is not listening in (or at any rate hasn’t learned much about the secret key) and therefore they can be confident of security when they later use the secret key to encrypt a secret message. With Peter Shor, John gave a security proof of the best-known QKD protocol, known as the “Shor-Preskill” proof. Sometimes we scientists lack originality in naming. It was not the first proof of security, but earlier ones were rather complicated. The Shor-Preskill proof was conceptually much clearer and made a beautiful connection between the properties of quantum error-correcting codes and QKD. The techniques introduced in their paper got adopted into much later work on quantum cryptography.

Collaborating with John is always an interesting experience. Sometimes we’ll discuss some idea or some topic and it will be clear that John does not understand the idea clearly or knows little about the topic. Then, a few days later we discuss the same subject again and John is an expert, or at least he knows a lot more than me. I guess this ability to master
topics quickly is why he was always able to answer Steve Flammia’s random questions after lunch. And then when it comes time to write the paper … John will do it. It’s not just that he will volunteer to write the first draft — he keeps control of the whole paper and generally won’t let you edit the source, although of course he will incorporate your comments. I think this habit started because of incompatibilities between the TeX editor he was using and any other program, but he maintains it (I believe) to make sure that the paper meets his high standards of presentation quality.

This also explains why John has been so successful as an expositor. His
lecture notes for the quantum computation class at Caltech are well-known. Despite being incomplete and not available on Amazon, they are probably almost as widely read as the standard textbook by Nielsen and Chuang.

Before IQIM, there was IQI, and before that was QUIC.

Before IQIM, there was IQI, and before that was QUIC.

He apparently is also good at writing grants. Under his leadership and Jeff Kimble’s, Caltech has become one of the top places for quantum computation. In my last year of graduate school, John and Jeff, along with Steve Koonin, secured the QUIC grant, and all of a sudden Caltech had money for quantum computation. I got a research assistantship and could write my thesis without having to worry about TAing. Postdocs started to come — first Chris Fuchs, then a long stream of illustrious others. The QUIC grant grew into IQI, and that eventually sprouted an M and drew in even more people. When I was a student, John’s group was located in Lauritsen with the particle theory group. We had maybe three grad student offices (and not all the students were working on quantum information), plus John’s office. As the Caltech quantum effort grew, IQI acquired territory in another building, then another, and then moved into a good chunk of the new Annenberg building. Without John’s efforts, the quantum computing program at Caltech would certainly be much smaller and maybe completely lacking a theory side. It’s also unlikely this blog would exist.

The National Academy has now elected John a member, probably more for his research than his twitter account (@preskill), though I suppose you never know. Anyway, congratulations, John!

-D. Gottesman