Reading the sub(linear) text

Physicists are not known for finesse. “Even if it cost us our funding,” I’ve heard a physicist declare, “we’d tell you what we think.” Little wonder I irked the porter who directed me toward central Cambridge.

The University of Cambridge consists of colleges as the US consists of states. Each college has a porter’s lodge, where visitors check in and students beg for help after locking their keys in their rooms. And where physicists ask for directions.

Last March, I ducked inside a porter’s lodge that bustled with deliveries. The woman behind the high wooden desk volunteered to help me, but I asked too many questions. By my fifth, her pointing at a map had devolved to jabbing.

Read the subtext, I told myself. Leave.

Or so I would have told myself, if not for that afternoon.

That afternoon, I’d visited Cambridge’s CMS, which merits every letter in “Centre for Mathematical Sciences.” Home to Isaac Newton’s intellectual offspring, the CMS consists of eight soaring, glass-walled, blue-topped pavilions. Their majesty walloped me as I turned off the road toward the gatehouse. So did the congratulatory letter from Queen Elizabeth II that decorated the route to the restroom.


I visited Nilanjana Datta, an affiliated lecturer of Cambridge’s Faculty of Mathematics, and her student, Felix Leditzky. Nilanjana and Felix specialize in entropies and one-shot information theory. Entropies quantify uncertainties and efficiencies. Imagine compressing many copies of a message into the smallest possible number of bits (units of memory). How few bits can you use per copy? That number, we call the optimal compression rate. It shrinks as the number of copies compressed grows. As the number of copies approaches infinity, that compression rate drops toward a number called the message’s Shannon entropy. If the message is quantum, the compression rate approaches the von Neumann entropy.
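For concreteness, here is the Shannon entropy as a few lines of Python — a toy sketch of the quantity, not of Nilanjana and Felix's machinery, with an invented two-letter source for illustration:

import math

def shannon_entropy(probabilities):
    # Bits needed per copy, in the limit of infinitely many compressed copies.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A hypothetical source that emits 'a' with probability 0.9 and 'b' with 0.1:
print(shannon_entropy([0.9, 0.1]))   # ~0.469 bits per copy, far below 1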

Good luck squeezing infinitely many copies of a message onto a hard drive. How efficiently can we compress fewer copies? According to one-shot information theory, the answer involves entropies other than Shannon’s and von Neumann’s. In addition to describing data compression, entropies describe the charging of batteries, the concentration of entanglement, the encrypting of messages, and other information-processing tasks.

Speaking of compressing messages: Suppose one-shot information theory posted status updates on Facebook. Suppose the panel on your Facebook page’s right-hand side showed news weightier than celebrity marriages. The news feed might read, “TRENDING: One-shot information theory: Second-order asymptotics.”

Second-order asymptotics, I learned at the CMS, concerns how the optimal compression rate decays as the number of copies compressed grows. Imagine compressing a billion copies of a quantum message ρ. The number of bits needed about equals a billion times the von Neumann entropy H_vN(ρ). Since a billion is less than infinity, 1,000,000,000 H_vN(ρ) bits won’t suffice. Can we estimate the compression rate more precisely?

The question reminds me of gas stations’ hidden pennies. The last time I passed a station’s billboard, some number like $3.65 caught my eye. Each gallon cost about $3.65, just as each copy of ρ costs about H_vN(ρ) bits. But a 9/10, writ small, followed the $3.65. If I’d budgeted $3.65 per gallon, I couldn’t have filled my tank. If you budget H_vN(ρ) bits per copy of ρ, you can’t compress all your copies.

Suppose some station’s owner hatches a plan to promote business. If you buy one gallon, you pay $3.654. The more you purchase, the more the final digit drops from four. By cataloguing receipts, you calculate how a tank’s cost varies with the number of gallons, n. The cost equals $3.65 × n to a first approximation. To a second approximation, the cost might equal $3.65 × n + a√n, wherein a represents some number of cents. Compute a, and you’ll have computed the gas’s second-order asymptotics.
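To make the receipt-cataloguing concrete, here is a sketch with hypothetical numbers: if the totals follow $3.65 × n + a√n, a least-squares fit over the receipts recovers a.

import numpy as np

gallons = np.arange(1, 101)                 # receipts for 1 to 100 gallons
a_true = 0.004                              # hypothetical: 0.4 cents per sqrt(gallon)
cost = 3.65 * gallons + a_true * np.sqrt(gallons)

# Fit cost ~ c1*n + c2*sqrt(n) and read off the sublinear coefficient c2:
design = np.column_stack([gallons, np.sqrt(gallons)])
c1, c2 = np.linalg.lstsq(design, cost, rcond=None)[0]
print(c1, c2)                               # ~3.65 and ~0.004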

Nilanjana and Felix computed the a’s associated with data compression and other quantum tasks. Second-order asymptotics met information theory when Strassen combined them in nonquantum problems. These problems developed under attention from Hayashi, Han, Polyanskiy, Poor, Verdú, and others. Tomamichel and Hayashi, as well as Li, introduced quantumness.

In the total-cost expression, $3.65 × n depends on n directly, or “linearly.” The second term depends on √n. As the number of gallons grows, so does √n, but √n grows more slowly than n. The second term is called “sublinear.”

Which is the word that rose to mind in the porter’s lodge. I told myself, Read the sublinear text.

Little wonder I irked the porter. At least—thanks to quantum information, my mistake, and facial expressions’ contagiousness—she smiled.


With thanks to Nilanjana Datta and Felix Leditzky for explanations and references; to Nilanjana, Felix, and Cambridge’s Centre for Mathematical Sciences for their hospitality; and to porters everywhere for providing directions.

“Feveral kinds of hairy mouldy fpots”

The book had a sheepskin cover, and mold was growing on the sheepskin. Robert Hooke, a pioneering microbiologist, slid the cover under one of the world’s first microscopes. Mold, he discovered, consists of “nothing elfe but feveral kinds of fmall and varioufly figur’d Mufhroms.” He described the Mufhroms in his treatise Micrographia, a 1665 copy of which I found in “Beautiful Science.” An exhibition at San Marino’s Huntington Library, “Beautiful Science” showcases the physics of rainbows, the stars that enthralled Galileo, and the world visible through microscopes.


Beautiful science of yesterday: An illustration, from Hooke’s Micrographia, of the mold.

“[T]hrough a good Microfcope,” Hooke wrote, the sheepskin’s spots appeared “to be a very pretty fhap’d Vegetative body.”

How like a scientist, to think mold pretty. How like quantum noise, I thought, Hooke’s mold sounds.

Quantum noise hampers systems that transmit and detect light. To phone a friend or send an email—“Happy birthday, Sarah!” or “Quantum Frontiers has released an article”—we encode our message in light. The light traverses a fiber, buried in the ground, then hits a detector. The detector channels the light’s energy into a current, a stream of electrons that flows down a wire. The variations in the current’s strength are translated into Sarah’s birthday wish.

If noise doesn’t corrupt the signal. From encoding “Happy birthday,” the light and electrons might come to encode “Hsappi birthdeay.” Quantum noise arises because light consists of packets of energy, called “photons.” The sender can’t control how many photons hit the detector.

To send the letter H, we send about 10^8 photons.* Imagine sending fifty H’s. When we send the first, our signal might contain 10^8 - 153 photons; when we send the second, 10^8 + 2,083; when we send the third, 10^8 - 6; and so on. Receiving different numbers of photons, the detector generates different amounts of current. Different amounts of current can translate into different symbols. From H, our message can morph into G.
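As a toy model of those fluctuations (assuming Poisson-distributed photon counts, which describe ideal laser light; for a mean of 10^8 photons, the typical deviation is around √(10^8) = 10^4):

import numpy as np

rng = np.random.default_rng(7)
mean_photons = 10**8
counts = rng.poisson(mean_photons, size=50)   # fifty transmissions of the letter H

print(counts - mean_photons)   # deviations the sender can't control, each ~ +/- 10^4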

This spring, I studied quantum noise under the guidance of IQIM faculty member Kerry Vahala. I learned to model quantum noise, to quantify it, when to worry about it, and when not. From quantum noise, we branched into Johnson noise (caused by interactions between the wire and its hot environment); amplified-spontaneous-emission, or ASE, noise (caused by photons belched by ions in the fiber); beat noise (ASE noise breeds with the light we sent, spawning new noise); and excess noise (the “miscellaneous” folder in the filing cabinet of noise types).


Beautiful science of today: A microresonator—a tiny pendulum-like device—studied by the Vahala group.

Noise, I learned, has structure. It exhibits patterns. It has personalities. I relished studying those patterns as I relish sending birthday greetings while battling noise. Noise types, I see as a string of pearls unearthed in a junkyard. I see them as “pretty fhap[es]” in Hooke’s treatise. I see them—to pay a greater compliment—as “hairy mouldy fpots.”


*Optical-communications ballpark estimates:

  • Optical power: 1 mW = 10^-3 J/s
  • Photon frequency: 200 THz = 2 × 10^14 Hz
  • Photon energy: hν = (6.626 × 10^-34 J·s)(2 × 10^14 Hz) ≈ 10^-19 J
  • Bit rate: 1 Gb/s = 10^9 bits/s
  • Number of bits per H: 10
  • Number of photons per H: (1 photon / 10^-19 J)(10^-3 J/s)(1 s / 10^9 bits)(10 bits / 1 H) = 10^8
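For readers who like to verify such estimates, the same arithmetic in Python (using the ballpark figures above):

power = 1e-3                         # optical power, J/s
photon_energy = 6.626e-34 * 2e14     # h * nu, in J (~1.3e-19 J)
bit_rate = 1e9                       # bits/s
bits_per_H = 10

joules_per_H = (power / bit_rate) * bits_per_H   # ~1e-11 J
photons_per_H = joules_per_H / photon_energy
print(f"{photons_per_H:.1e}")                    # ~7.5e7, i.e. about 10^8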


An excerpt from this post was published today on Verso, the blog of the Huntington Library, Art Collection, and Botanical Gardens.

With thanks to Bassam Helou, Dan Lewis, Matt Stevens, and Kerry Vahala for feedback. With thanks to the Huntington Library (including Catherine Wehrey) and the Vahala group for the Micrographia image and the microresonator image, respectively.

The theory of everything: Help wanted

When Scientific American writes that physicists are working on a theory of everything, does it sound ambitious enough to you? Do you lie awake at night thinking that a theory of everything should be able to explain, well, everything? What if that theory is founded on quantum mechanics and finds a way to explain gravitation through the microscopic laws of the quantum realm? Would that be a grand unified theory of everything?

The answer is no, for two different, but equally important reasons. First, there is the inherent assumption that quantum systems change in time according to Schrödinger’s evolution: i \hbar \partial_t \psi(t) = H \psi(t). Why? Where does that equation come from? Is it a fundamental law of nature, or is it an emergent relationship between different states of the universe? What if the parameter t, which we call time, as well as the linear, self-adjoint operator H, which we call the Hamiltonian, are both emergent from a more fundamental, highly typical phenomenon: the large amount of entanglement that is generically found when one decomposes the state space of a single, static quantum wavefunction into two subsystems of different sizes: a clock and a space of configurations (on which our degrees of freedom live)? So many questions, so few answers.

The static multiverse

The perceptive reader may have noticed that I italicized the word ‘static’ above, when referring to the quantum wavefunction of the multiverse. The emphasis on static is on purpose. I want to make clear from the beginning that a theory of everything can only be based on axioms that are truly fundamental, in the sense that they cannot be derived from more general principles as special cases. How would you know that your fundamental principles are irreducible? You start with set theory and go from there. If that assumes too much already, then you work on your set theory axioms. On the other hand, if you can exhibit a more general principle from which your original concept derives, then you are on the right path towards more fundamentalness.

In that sense, time and space, as we understand them, are not fundamental concepts. We can imagine an object that can only be in one state, like a switch that is stuck at the OFF position, never changing or evolving in any way, and we can certainly consider a complete graph of interactions between subsystems (the equivalent of a black hole in what we think of as space) with no local geometry in our space of configurations. So what would be more fundamental than time and space? Let’s start with time: The notion of an unordered set of numbers, such as \{4,2,5,1,3,6,8,7,12,9,11,10\}, is a generalization of a clock, since we are only keeping the labels, but not their ordering. If we can show that a particular ordering emerges from a more fundamental assumption about the very existence of a theory of everything, then we have an understanding of time as a set of ordered labels, where each label corresponds to a particular configuration in the mathematical space containing our degrees of freedom. In that sense, the existence of the labels in the first place corresponds to a fundamental notion of potential for change, which is a prerequisite for the concept of time, which itself corresponds to constrained (ordered in some way) change from one label to the next. Our task is first to figure out where the labels of the clock come from, then where the illusion of evolution comes from in a static universe (Heisenberg evolution), and finally, where the arrow of time comes from in a macroscopic world (the illusion of irreversible evolution).

The axioms we ultimately choose must satisfy the following conditions simultaneously: 1. the implications stemming from these assumptions are not contradicted by observations, 2. replacing any one of these assumptions by its negation would lead to observable contradictions, and 3. the assumptions contain enough power to specify non-trivial structures in our theory. In short, as Immanuel Kant put it in his accessible bedtime story The Critique of Pure Reason, we are looking for synthetic a priori knowledge that can explain space and time, which ironically were Kant’s answer to that same question.

The fundamental ingredients of the ultimate theory

Before someone decides to delve into the math behind the emergence of unitarity (Heisenberg evolution) and the nature of time, there is another reason why the grand unified theory of everything has to do more than just give a complete theory of how the most elementary subsystems in our universe interact and evolve. What is missing is the fact that quantity has a quality all its own. In other words, patterns emerge from seemingly complex data when we zoom out enough. This “zooming out” procedure manifests itself in two ways in physics: as coarse-graining of the data and as truncation and renormalization. These simple ideas allow us to reduce the computational complexity of evaluating the next state of a complex system: If most of the complexity of the system is hidden at a level you cannot even observe (think pre retina-display era), then all you have to keep track of is information at the macroscopic, coarse-grained level. On top of that, you can use truncation and renormalization to zero in on the most likely/highest-weight configurations your coarse-grained data can be in – you can safely throw away a billion configurations, if their combined weight is less than 0.1% of the total, because your super-compressed data will still give you the right answer with a fidelity of 99.9%. This is how you get to reduce a 9 GB raw video file down to a 300 MB YouTube video that streams over your WiFi connection without losing too much of the video quality.
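As a toy illustration of the truncation step (a sketch only — the Zipf-like weights w_k ∝ 1/k² are assumed for illustration, not drawn from any physical model):

import numpy as np

k = np.arange(1, 1_000_001)
weights = 1.0 / k**2                 # assumed heavy-tailed weights, heaviest first
weights /= weights.sum()             # normalize to probabilities

cumulative = np.cumsum(weights)
keep = np.searchsorted(cumulative, 0.999) + 1   # smallest set carrying 99.9% of the weight

print(f"keep {keep:,} of 1,000,000 configurations; "
      f"discarded weight = {1 - cumulative[keep - 1]:.2e}")
# Keeps only a few hundred configurations while preserving 99.9% fidelity.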

I will not focus on the second requirement for the “theory of everything”, the dynamics of apparent complexity. I think that this fundamental task is the purview of other sciences, such as chemistry, biology, anthropology and sociology, which look at the “laws” of physics from higher and higher vantage points (increasingly coarse-graining the topology of the space of possible configurations). Here, I would like to argue that the foundation on which a theory of everything rests, at the basement level if such a thing exists, consists of four ingredients: Math, Hilbert spaces with tensor decompositions into subsystems, stability and compressibility. Now, you know about math (though maybe not of Zermelo-Fraenkel set theory), you may have heard of Hilbert spaces if you majored in math and/or physics, but you don’t know what stability or compressibility mean in this context. So let me motivate the last two with a question and then explain in more detail below: What are the most fundamental assumptions that we sweep under the rug whenever we set out to create a theory of anything that can fit in a book – or ten thousand books – and still have predictive power? Stability and compressibility.

Math and Hilbert spaces are fundamental in the following sense: A theory needs a Language in order to encode the data one can extract from that theory through synthesis and analysis. The data will be statistical in the most general case (with every configuration/state we attach a probability/weight of that state conditional on an ambient configuration space, which will often be a subset of the total configuration space), since any observer creating a theory of the universe around them only has access to a subset of the total degrees of freedom. The remaining degrees of freedom, what quantum physicists group as the Environment, affect our own observations through entanglement with our own degrees of freedom. To capture this richness of correlations between seemingly uncorrelated degrees of freedom, the mathematical space encoding our data requires more than just a metric (i.e. an ability to measure distances between objects in that space) – it requires an inner-product: a way to measure angles between different objects, or equivalently, the ability to measure the amount of overlap between an input configuration and an output configuration, thus quantifying the notion of incremental change. Such mathematical spaces are precisely the Hilbert spaces mentioned above and contain states (with wavefunctions being a special case of such states) and operators acting on the states (with measurements, rotations and general observables being special cases of such operators). But, let’s get back to stability and compressibility, since these two concepts are not standard in physics.

Stability

Stability is that quality that says that if the theory makes a prediction about something observable, then we can test our theory by making observations on the state of the world and, more importantly, new observations do not contradict our theory. How can a theory fall apart if it is unstable? One simple way is to make predictions that are untestable, since they are metaphysical in nature (think of religious tenets). Another way is to make predictions that work for one level of coarse-grained observations and fail for a lower level of finer coarse-graining (think of Newtonian Mechanics). A more extreme case involves quantum mechanics assumed to be the true underlying theory of physics, which could still fail to produce a stable theory of how the world works from our point of view. For example, say that your measurement apparatus here on Earth is strongly entangled with the current state of a star that happens to go supernova 100 light-years from Earth during the time of your experiment. If there is no bound on the propagation speed of the information between these two subsystems, then your apparatus is engulfed in flames for no apparent reason and you get random data, where you expected to get the same “reproducible” statistics as last week. With no bound on the speed with which information can travel between subsystems of the universe, our ability to explain and/or predict certain observations goes out the window, since our data on these subsystems will look like white noise, an illusion of randomness stemming from the influence of inaccessible degrees of freedom acting on our measurement device. But stability has another dimension: that of continuity. We take for granted our ability to extrapolate the curve that fits 1000 data points on a plot. If we don’t assume continuity (and maybe even a certain level of smoothness) of the data, then all bets are off until we make more measurements and gather additional data points. But even then, we can never gather an infinite (let alone, uncountable) number of data points – we must extrapolate from what we have and assume that the full distribution of the data is close in norm to our current dataset (a norm is a measure of distance between states in the Hilbert space).

The emergence of the speed of light

The assumption of stability may seem trivial, but it holds within it an anthropic-style explanation for the bound on the speed of light. If there is no finite speed of propagation for the information between subsystems that are “far apart”, from our point of view, then we will most likely see randomness where there is order. A theory needs order. So, what does it mean to be “far apart” if we have made no assumption for the existence of an underlying geometry, or spacetime for that matter? There is a very important concept in mathematical physics that generalizes the concept of the speed of light for non-relativistic quantum systems whose subsystems live on a graph (i.e. where there may be no spatial locality or apparent geometry): the Lieb-Robinson velocity. Those of us working at the intersection of mathematical physics and quantum many-body physics have seen first-hand the powerful results one can get from the existence of such an effective and emergent finite speed of propagation of information between quantum subsystems that, in principle, can signal to each other instantaneously through the action of a non-local unitary operator (rotation of the full system under Heisenberg evolution). It turns out that under certain natural assumptions on the graph of interactions between the different subsystems of a many-body quantum system, such a finite speed of light emerges naturally. The main requirement on the graph comes from the following intuitive picture: If each node in your graph is connected to only a few other nodes and the number of paths between any two nodes is bounded above in some nice way (say, polynomially in the distance between the nodes), then communication between two distant nodes will take time proportional to the distance between the nodes (in graph distance units, the smallest number of nodes among all paths connecting the two nodes). Why? Because at each time step you can only communicate with your neighbors and in the next time step they will communicate with theirs and so on, until one (and then another, and another) of these communication cascades reaches the other node. Since you have a bound on how many of these cascades will eventually reach the target node, the intensity of the communication wave is bounded by the effective action of a single messenger traveling along a typical path with a bounded speed towards the destination. There should be generalizations to weighted graphs, but this area of mathematical physics is still really active and new results on bounds on the Lieb-Robinson velocity gather attention very quickly.
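A minimal sketch of that intuition (not a Lieb-Robinson bound itself — just a cascade that hops one edge per time step): on a sparse graph the signal needs time proportional to graph distance, while on a complete graph every node is one step away.

from collections import deque

def arrival_times(neighbors, source):
    # Breadth-first search: the time step at which the cascade reaches each node.
    times = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in neighbors[node]:
            if nbr not in times:
                times[nbr] = times[node] + 1
                queue.append(nbr)
    return times

n = 100
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
complete = {i: [j for j in range(n) if j != i] for i in range(n)}

print(arrival_times(path, 0)[n - 1])      # 99: time grows with distance
print(arrival_times(complete, 0)[n - 1])  # 1: no notion of "far apart"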

Escaping black holes

If this idea holds any water, then black holes are indeed nearly complete graphs, where the notion of space and time breaks down, since there is no effective bound on the speed with which information propagates from one node to another. The only way to escape is to find yourself at the boundary of the complete graph, where the nodes of the black hole’s apparent horizon are connected to low-degree nodes outside. Once you get to a low-degree node, you need to keep moving towards other low-degree nodes in order to escape the “gravitational pull” of the black hole’s super-connectivity. In other words, gravitation in this picture is an entropic force: we gravitate towards massive objects for the same reason that we “gravitate” towards the direction of the arrow of time: we tend towards higher entropy configurations – the probability of reaching the neighborhood of a set of highly connected nodes is much, much higher than hanging out for long near a set of low-degree nodes in the same connected component of the graph. If a graph has disconnected components, then there is no way to communicate between the corresponding spacetimes – their states are in a tensor product with each other. One has to carefully define entanglement between components of a graph, before giving a unified picture of how spatial geometry arises from entanglement. Somebody get to it.

Erik Verlinde has introduced the idea of gravity as an entropic force and Fotini Markopoulou, et al. have introduced the notion of quantum graphity (gravity emerging from graph models). I think these approaches must be taken seriously, if only because they work with more fundamental principles than the ones found in Quantum Field Theory and General Relativity. After all, this type of blue sky thinking has led to other beautiful connections, such as ER=EPR (the idea that whenever two systems are entangled, they are connected by a wormhole). Even if we were to disagree with these ideas for some technical reason, we must admit that they are at least trying to figure out the fundamental principles that guide the things we take for granted. Of course, one may disagree with certain attempts at identifying unifying principles simply because the attempts lack the technical gravitas that allows for testing and calculations. Which is why a technical blog post on the emergence of time from entanglement is in the works.

Compressibility

So, what about that last assumption we seem to take for granted? How can you have a theory you can fit in a book about a sequence of events, or snapshots of the state of the observable universe, if these snapshots look like the static noise on a TV screen with no transmission signal? Well, you can’t! The fundamental concept here is Kolmogorov complexity and its connection to randomness/predictability. A sequence of data bits like:

10011010101101001110100001011010011101010111010100011010110111011110

has higher complexity (and hence looks more random/less predictable) than the sequence:

10101010101010101010101010101010101010101010101010101010101010101010

because there is a small computer program that can output each successive bit of the latter sequence (even if it had a million bits), but (most likely) not of the former. In particular, to get the second sequence with one million bits one can write the following short program:

s = "10"
for n in range(499_999):   # append "10" another 499,999 times
    s += "10"
print(s)                   # 500,000 repetitions of "10": one million bits

As the number of bits grows, one may wonder if the number of iterations (given above by 499,999), can be further compressed to make the program even smaller. The answer is yes: The number 499,999 in binary requires \log_2 499,999 bits, but that binary number is a string of 0s and 1s, so it has its own Kolmogorov complexity, which may be smaller than \log_2 499,999. So, compressibility has a strong element of recursion, something that in physics we associate with scale invariance and fractals.
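Concretely, the iteration count itself shrinks from six decimal digits to 19 bits:

n = 499_999
print(format(n, 'b'))        # '1111010000100011111'
print(len(format(n, 'b')))   # 19 bits, i.e. ceil(log2(499,999))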

You may be wondering whether there are truly complex sequences of 0,1 bits, or if one can always find a really clever computer program to compress any N bit string down to, say, N/100 bits. The answer is interesting: There is no computer program that can compute the Kolmogorov complexity of an arbitrary string (the argument has roots in Berry’s Paradox), but there are strings of arbitrarily large Kolmogorov complexity (that is, no matter what program we use and what language we write it in, the smallest program (in bits) that outputs the N-bit string will be at least N bits long). A counting argument shows why: there are 2^N strings of N bits, but fewer than 2^{N-10} programs shorter than N-10 bits, so fewer than one string in a thousand can be compressed by even 10 bits. In other words, there really are streams of data (in the form of bits) that are completely incompressible. In fact, a typical string of 0s and 1s will be almost completely incompressible!

Stability, compressibility and the arrow of time

So, what does compressibility have to do with the theory of everything? It has everything to do with it. Because, if we ever succeed in writing down such a theory in a physics textbook, we will have effectively produced a computer program that, given enough time, should be able to compute the next bit in the string that represents the data encoding the coarse-grained information we hope to extract from the state of the universe. In other words, the only reason the universe makes sense to us is because the data we gather about its state is highly compressible. This seems to imply that this universe is really, really special and completely atypical. Or is it the other way around? What if the laws of physics were non-existent? Would there be any consistent gravitational pull between matter to form galaxies and stars and planets? Would there be any predictability in the motion of the planets around suns? Forget about life, let alone intelligent life and the anthropic principle. Would the Earth, or Jupiter even know where to go next if it had no sense that it was part of a non-random plot in the movie that is spacetime? Would there be any notion of spacetime to begin with? Or an arrow of time? When you are given one thousand frames from one thousand different movies, there is no way to make a single coherent plot. Even the frames of a single movie would make little sense upon reshuffling.

What if the arrow of time emerged from the notions of stability and compressibility, through coarse-graining that acts as a compression algorithm for data that is inherently highly-complex and, hence, highly typical as the next move to make? If two strings of data look equally complex upon coarse-graining, but one of them has a billion more ways of appearing from the underlying raw data, then which one will be more likely to appear in the theory-of-everything book of our coarse-grained universe? Note that we need both high compressibility after coarse-graining in order to write down the theory, as well as large entropy before coarse-graining (from a large number of raw strings that all map to one string after coarse-graining), in order to have an arrow of time. It seems that we need highly-typical, highly complex strings that become easy to write down once we coarse grain the data in some clever way. Doesn’t that seem like a contradiction? How can a bunch of incompressible data become easily compressible upon coarse-graining? Here is one way: Take an N-bit string and define its 1-bit coarse-graining as the boolean AND of its digits. All but one string will map to 0. The all-1s string will map to 1. Equally compressible, but the probability of seeing the 1 after coarse-graining is 2^{-N}. With only 300 bits, finding the coarse-grained 1 is harder than looking for a specific atom in the observable universe. In other words, if the coarse-graining rule at time t is the one given above, then you can be pretty sure you will be seeing a 0 come up next in your data. Notice that before coarse-graining, all 2^N strings are equally likely, so there is no arrow of time, since there is no preferred string from a probabilistic point of view.
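Here is that coarse-graining in a few lines of Python (a direct simulation of the AND rule above):

import numpy as np

rng = np.random.default_rng(42)
N = 300                                           # bits per raw string
strings = rng.integers(0, 2, size=(100_000, N))   # uniformly random raw strings

coarse = strings.all(axis=1).astype(int)   # 1-bit coarse-graining: AND of all bits
print(coarse.sum())                        # 0: seeing a 1 has probability 2^-300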

Conclusion, for now

When we think about the world around us, we go to our intuitions first as a starting point for any theory describing the multitude of possible experiences (observable states of the world). If we are to really get to the bottom of this process, it seems fruitful to ask “why do I assume this?” and “is that truly fundamental or can I derive it from something else that I already assumed was an independent axiom?” One of the postulates of quantum mechanics is the axiom corresponding to the evolution of states under Schrödinger’s equation. We will attempt to derive that equation from the other postulates in an upcoming post. Until then, your help is wanted with the march towards more fundamental principles that explain our seemingly self-evident truths. Question everything, especially when you think you have really figured things out. Start with this post. After all, a theory of everything should be able to explain itself.

UP NEXT: Entanglement, Schmidt decomposition, concentration measure bounds and the emergence of discrete time and unitary evolution.

Top 10 questions for your potential PhD adviser/group

Everyone in grad school has taken on the task of picking the perfect research group at some point. Then some among us had the dubious distinction of choosing the perfect research group twice. Luckily for me, a year of grad research taught me a lot and I found myself asking group members and PIs (principal investigators) very different questions. And luckily for you, I wrote these questions down to share with future generations. My background as an experimental applied physicist showed through initially, so I got Shaun Maguire and Spiros Michalakis to help make it applicable for theorists too, and most of them should be useful outside physics as well.

Questions to break that silence when your potential advisor asks “So, do you have any questions for me?”

1. Are you taking new students?
– 2a. if yes: How many are you looking to take?
– 2b. if no: Ask them about the department or other professors. They’ve been there long enough to have opinions. Alternatively, ask what kinds of questions they would suggest you ask other PIs.
3. What is the procedure for joining the group?
4. (experimental) Would you have me TA? (This is the nicest way I thought of to ask if a PI can fund you with a research assistantship (RA), though sometimes they just like you to TA their class.)
4. (theory) Funding routes will often be covered by question 3 since TAs are the dominant funding method for theory students, unlike for experimentalists. If relevant, you can follow up with: How does funding for your students normally work? Do you have funding for me?
5. Do new students work for/report to other grad students, post docs, or you directly?
6. How do you like students to arrange time to meet with you?
7. How often do you have group meetings?
8. How much would you like students to prepare for them?
9. Would you suggest I take any specific classes?
10. What makes someone a good fit for this group?

And then for the high bandwidth information transfer.  Grill the group members themselves, and try to ask more than one group member if you can.

1. How much do you prepare for meetings with PI?
2. How long until people lead their own project? – Equivalently, who’s working on what projects.
3. How much do people on different projects communicate? (only group meeting or every day)
4. Is the PI hands on (how often PI wants to meet with you)?
5. Is the PI accessible (how easily can you meet with the PI if you want to)?
6. What is the average time to graduation? (if it’s important to you personally)
7. Does the group/subgroup have any bonding activities?
8. Do you think I should join this group?
9. What are people’s backgrounds?
10. What makes someone a good fit for this group?

Hope that helps.  If you have any other suggested questions, be sure to leave them in the comments.

Clocking in at a Cambridge conference

Science evolves on Facebook.

On Facebook last fall, I posted about statistical mechanics. Statistical mechanics is the physics of hordes of particles. Hordes of molecules, for example, form the stench seeping from a clogged toilet. Hordes change in certain ways but not in the reverse ways, suggesting time points in a direction. Once a stink diffuses into the hall, it won’t regroup in the bathroom. The molecules’ locations distinguish past from future.

The post attracted a comment by Ian Durham, associate professor of physics at St. Anselm College. Minutes later, we were instant-messaging about infinitely long evolutions.*

The next day, I sent Ian a paper draft. His reply made me jump more than a whiff of a toilet would. Would I discuss the paper at a conference he was co-organizing?

I almost replied, Are you sure?

Then I almost replied, Yes, please!

The conference, “Eddington and Wheeler: Information and Interaction,” unfolded this March at the University of Cambridge. Cambridge employed Sir Arthur Eddington, the astronomer whose 1919 observation of starlight during an eclipse catapulted Einstein’s general relativity to fame. Decades later, John Wheeler laid groundwork for quantum information.

Though aware of Eddington’s observation, I hadn’t known he’d researched stat mech. I hadn’t known his opinions about time. Time owns a high-rise in my heart; see the fussiness with which I catalogue “last fall,” “minutes later,” and “the next day.” Conference-goers shared news about time in the Old Combination Room at Cambridge’s Trinity College. Against the room’s wig-filled portraits, our projector resembled a souvenir misplaced by a time traveler.


Trinity College, Cambridge.

Presenter one, Huw Price, argued that time has no arrow. It appears to in our universe: We remember the past and anticipate the future. Once a stench diffuses, it doesn’t regroup. The stench illustrates the Second Law of Thermodynamics, the assumption that entropy increases.

If “entropy” doesn’t ring a bell, never mind; we’ll dissect it in future articles. Suffice it to say that (1) thermodynamics is a branch of physics related to stat mech; (2) according to the Second Law of Thermodynamics, something called “entropy” increases; (3) entropy’s rise distinguishes the past from the future by associating the former with a low entropy and the latter with a large entropy; and (4) a stench’s diffusion illustrates the Second Law and time’s flow.

In as many universes as entropy increases (as time flows in one direction), in so many universes does entropy decrease (does time flow oppositely). So, said Huw Price, postulated the 19th-century stat-mech founder Ludwig Boltzmann. Why would universes pair up? For the same reason that, driving across a pothole, you not only fall but also rise. Each fluctuation from equilibrium—from a flat road—involves an upward path and a downward one. The upward path resembles a universe in which entropy increases; the downward, a universe in which entropy decreases. Every down pairs with an up. Averaged over universes, time has no arrow.

Friedel Weinert, presenter five, argued the opposite. Time has an arrow, he said, and not because of entropy.

Ariel Caticha discussed an impersonator of time. Using a cousin of MaxEnt, he derived an equation identical to Schrödinger’s. MaxEnt, short for “the Maximum Entropy Principle,” is a tool used in stat mech. Schrödinger’s Equation describes how quantum systems evolve. To draw from Schrödinger’s Equation predictions about electrons and atoms, physicists assume that features of reality resemble certain bits of math. We assume, for example, that the t in Schrödinger’s Equation represents time.

A t appeared in Ariel’s twin of Schrödinger’s Equation. But Ariel didn’t assume what physicists usually assume. MaxEnt motivated his assumptions. Interpreting Ariel’s equation poses a challenge. If a variable acts like time and smells like time, does it represent time?**


A presenter uses the anachronistic projector. The head between screen and camera belongs to David Finkelstein, who helped develop the theory of general relativity checked by Eddington.

Like Ariel, Bill Wootters questioned time’s role in arguments. The co-creator of quantum teleportation wondered why one tenet of quantum physics has the form it has. Using quantum mechanics, we can’t predict certain experiments’ outcomes. We can predict probabilities—the chance that some experiment will yield Possible Outcome 1, the chance that the experiment will yield Possible Outcome 2, and so on. To calculate these probabilities, we square numbers. Why square? Why don’t the probabilities depend on cubes?

To explore this question, Bill told a story. Suppose some experimenter runs these experiments on Monday and those on Tuesday. When evaluating his story, Bill pointed out a hole: Replacing “Monday” and “Tuesday” with “eight o’clock” and “nine” wouldn’t change his conclusion. Which replacements wouldn’t change it, and which would? To what can we generalize those days?

We couldn’t answer his questions on the Sunday he asked them.

Little of presentation twelve concerned time. Rüdiger Schack introduced QBism, an interpretation of quantum mechanics that sounds like “cubism.” Casting quantum physics in terms of experimenters’ actions, Rüdiger mentioned time. By the time of the mention, I couldn’t tell what anyone meant by “time.” Raising a hand, I asked for clarification.

“You are young,” Rüdiger said. “But you will grow old and die.”

The comment clanged like the slam of a door. It echoed when I followed Ian into Ascension Parish Burial Ground. On Cambridge’s outskirts, conference-goers visited Eddington’s headstone. We found Wittgenstein’s near an uneven footpath; near tangles of undergrowth, Nobel laureates’. After debating about time, we marked its footprints. Paths of glory lead but to the grave.


Here lies one whose name was writ in a conference title: Sir Arthur Eddington’s grave.

Paths touched by little glory, I learned, have perks. As Rüdiger noted, I was the greenest participant. As he had the manners not to note, I was the least distinguished and the most ignorant. Studenthood freed me to raise my hand, to request clarification, to lack opinions about time. Perhaps I’ll evolve opinions at some t, some Monday down the road. That Monday feels infinitely far off. These days, I’ll stick to evolving science—using that other boon of youth, Facebook.


* You know you’re a theoretical physicist (or a physicist-in-training) when you debate about processes that last till kingdom come.

** As long as the variable doesn’t smell like a clogged toilet.


For videos of the presentations—including the public lecture by best-selling author Neal Stephenson—stay tuned to http://informationandinteraction.wordpress.com.

With gratitude to Ian Durham and Dean Rickles for organizing “Information and Interaction” and for the opportunity to participate. With thanks to the other participants for sharing their ideas and time.

Inflation on the back of an envelope

Last Monday was an exciting day!

After following the BICEP2 announcement via Twitter, I had to board a transcontinental flight, so I had 5 uninterrupted hours to think about what it all meant. Without Internet access or references, and having not thought seriously about inflation for decades, I wanted to reconstruct a few scraps of knowledge needed to interpret the implications of r ~ 0.2.

I did what any physicist would have done … I derived the basic equations without worrying about niceties such as factors of 3 or 2 \pi. None of what I derived was at all original —  the theory has been known for 30 years — but I’ve decided to turn my in-flight notes into a blog post. Experts may cringe at the crude approximations and overlooked conceptual nuances, not to mention the missing references. But some mathematically literate readers who are curious about the implications of the BICEP2 findings may find these notes helpful. I should emphasize that I am not an expert on this stuff (anymore), and if there are serious errors I hope better informed readers will point them out.

By tradition, careless estimates like these are called “back-of-the-envelope” calculations. There have been times when I have made notes on the back of an envelope, or a napkin or place mat. But in this case I had the presence of mind to bring a notepad with me.

Notes from a plane ride

According to inflation theory, a nearly homogeneous scalar field called the inflaton (denoted by \phi)  filled the very early universe. The value of \phi varied with time, as determined by a potential function V(\phi). The inflaton rolled slowly for a while, while the dark energy stored in V(\phi) caused the universe to expand exponentially. This rapid cosmic inflation lasted long enough that previously existing inhomogeneities in our currently visible universe were nearly smoothed out. What inhomogeneities remained arose from quantum fluctuations in the inflaton and the spacetime geometry occurring during the inflationary period.

Gradually, the rolling inflaton picked up speed. When its kinetic energy became comparable to its potential energy, inflation ended, and the universe “reheated” — the energy previously stored in the potential V(\phi) was converted to hot radiation, instigating a “hot big bang”. As the universe continued to expand, the radiation cooled. Eventually, the energy density in the universe came to be dominated by cold matter, and the relic fluctuations of the inflaton became perturbations in the matter density. Regions that were more dense than average grew even more dense due to their gravitational pull, eventually collapsing into the galaxies and clusters of galaxies that fill the universe today. Relic fluctuations in the geometry became gravitational waves, which BICEP2 seems to have detected.

Both the density perturbations and the gravitational waves have been detected via their influence on the inhomogeneities in the cosmic microwave background. The 2.726 K photons left over from the big bang have a nearly uniform temperature as we scan across the sky, but there are small deviations from perfect uniformity that have been precisely measured. We won’t worry about the details of how the size of the perturbations is inferred from the data. Our goal is to achieve a crude understanding of how the density perturbations and gravitational waves are related, which is what the BICEP2 results are telling us about. We also won’t worry about the details of the shape of the potential function V(\phi), though it’s very interesting that we might learn a lot about that from the data.

Exponential expansion

Einstein’s field equations tell us how the rate at which the universe expands during inflation is related to energy density stored in the scalar field potential. If a(t) is the “scale factor” which describes how lengths grow with time, then roughly

\left(\frac{\dot a}{a}\right)^2 \sim \frac{V}{m_P^2}.

Here \dot a means the time derivative of the scale factor, and m_P = 1/\sqrt{8 \pi G} \approx 2.4 \times 10^{18} GeV is the Planck scale associated with quantum gravity. (G is Newton’s gravitational constant.) I’ve left out a factor of 3 on purpose, and I used the symbol ~ rather than = to emphasize that we are just trying to get a feel for the order of magnitude of things. I’m using units in which Planck’s constant \hbar and the speed of light c are set to one, so mass, energy, and inverse length (or inverse time) all have the same dimensions. 1 GeV means one billion electron volts, about the mass of a proton.

(To persuade yourself that this is at least roughly the right equation, you should note that a similar equation applies to an expanding spherical ball of radius a(t) with uniform mass density V. But in the case of the ball, the mass density would decrease as the ball expands. The universe is different — it can expand without diluting its mass density, so the rate of expansion \dot a / a does not slow down as the expansion proceeds.)

During inflation, the scalar field \phi and therefore the potential energy V(\phi) were changing slowly; it’s a good approximation to assume V is constant. Then the solution is

a(t) \sim a(0) e^{Ht},

where H, the Hubble constant during inflation, is

H \sim \frac{\sqrt{V}}{m_P}.

To explain the smoothness of the observed universe, we require at least 50 “e-foldings” of inflation before the universe reheated — that is, inflation should have lasted for a time at least 50 H^{-1}.

Slow rolling

During inflation the inflaton \phi rolls slowly, so slowly that friction dominates inertia — this friction results from the cosmic expansion. The speed of rolling \dot \phi is determined by

H \dot \phi \sim -V'(\phi).

Here V'(\phi) is the slope of the potential, so the right-hand side is the force exerted by the potential, which matches the frictional force on the left-hand side. The coefficient of \dot \phi has to be H on dimensional grounds. (Here I have blown another factor of 3, but let’s not worry about that.)

Density perturbations

The trickiest thing we need to understand is how inflation produced the density perturbations which later seeded the formation of galaxies. There are several steps to the argument.

Quantum fluctuations of the inflaton

As the universe inflates, the inflaton field is subject to quantum fluctuations, where the size of the fluctuation depends on its wavelength. Due to inflation, the wavelength increases rapidly, like e^{Ht}, and once the wavelength gets large compared to H^{-1}, there isn’t enough time for the fluctuation to wiggle — it gets “frozen in.” Much later, long after the reheating of the universe, the oscillation period of the wave becomes comparable to the age of the universe, and then it can wiggle again. (We say that the fluctuations “cross the horizon” at that stage.) Observations of the anisotropy of the microwave background have determined how big the fluctuations are at the time of horizon crossing. What does inflation theory say about that?

Well, first of all, how big are the fluctuations when they leave the horizon during inflation? Then the wavelength is H^{-1} and the universe is expanding at the rate H, so H is the only thing the magnitude of the fluctuations could depend on. Since the field \phi has the same dimensions as H, we conclude that fluctuations have magnitude

\delta \phi \sim H.

From inflaton fluctuations to density perturbations

Reheating occurs abruptly when the inflaton field reaches a particular value. Because of the quantum fluctuations, some horizon volumes have larger than average values of \phi and some have smaller than average values; hence different regions reheat at slightly different times. The energy density in regions that reheat earlier starts to be reduced by expansion (“red shifted”) earlier, so these regions have a smaller than average energy density. Likewise, regions that reheat later start to red shift later, and wind up having larger than average density.

When we compare different regions of comparable size, we can find the typical (root-mean-square) fluctuations \delta t in the reheating time, knowing the fluctuations in \phi and the rolling speed \dot \phi:

\delta t \sim \frac{\delta \phi}{\dot \phi} \sim \frac{H}{\dot\phi}.

Small fractional fluctuations in the scale factor a right after reheating produce comparable small fractional fluctuations in the energy density \rho. The expansion rate right after reheating roughly matches the expansion rate H right before reheating, and so we find that the characteristic size of the density perturbations is

\delta_S\equiv\left(\frac{\delta \rho}{\rho}\right)_{hor} \sim \frac{\delta a}{a} \sim \frac{\dot a}{a} \delta t\sim \frac{H^2}{\dot \phi}.

The subscript hor serves to remind us that this is the size of density perturbations as they cross the horizon, before they get a chance to grow due to gravitational instabilities. We have found our first important conclusion: The density perturbations have a size determined by the Hubble constant H and the rolling speed \dot \phi of the inflaton, up to a factor of order one which we have not tried to keep track of. Insofar as the Hubble constant and rolling speed change slowly during inflation, these density perturbations have a strength which is nearly independent of the length scale of the perturbation. From here on we will denote this dimensionless scale of the fluctuations by \delta_S, where the subscript S stands for “scalar”.

Perturbations in terms of the potential

Putting together \dot \phi \sim -V' / H and H^2 \sim V/{m_P}^2 with our expression for \delta_S, we find

\delta_S^2 \sim \frac{H^4}{\dot\phi^2}\sim \frac{H^6}{V'^2} \sim \frac{1}{{m_P}^6}\frac{V^3}{V'^2}.

The observed density perturbations are telling us something interesting about the scalar field potential during inflation.

Gravitational waves and the meaning of r

The gravitational field as well as the inflaton field is subject to quantum fluctuations during inflation. We call these tensor fluctuations to distinguish them from the scalar fluctuations in the energy density. The tensor fluctuations have an effect on the microwave anisotropy which can be distinguished in principle from the scalar fluctuations. We’ll just take that for granted here, without worrying about the details of how it’s done.

While a scalar field fluctuation with wavelength \lambda and strength \delta \phi carries energy density \sim \delta\phi^2 / \lambda^2, a fluctuation of the dimensionless gravitation field h with wavelength \lambda and strength \delta h carries energy density \sim m_P^2 \delta h^2 / \lambda^2. Applying the same dimensional analysis we used to estimate \delta \phi at horizon crossing to the rescaled field h/m_P, we estimate the strength \delta_T of the tensor fluctuations as

\delta_T^2 \sim \frac{H^2}{m_P^2}\sim \frac{V}{m_P^4}.

From observations of the CMB anisotropy we know that \delta_S\sim 10^{-5}, and now BICEP2 claims that the ratio

r = \frac{\delta_T^2}{\delta_S^2}

is about r\sim 0.2 at an angular scale on the sky of about one degree. The conclusion (being a little more careful about the O(1) factors this time) is

V^{1/4} \sim 2 \times 10^{16}~GeV \left(\frac{r}{0.2}\right)^{1/4}.

This is our second important conclusion: The energy density during inflation defines a mass scale, which turns out to be 2 \times 10^{16}~GeV for the observed value of r. This is a very interesting finding because this mass scale is not so far below the Planck scale, where quantum gravity kicks in, and is in fact pretty close to theoretical estimates of the unification scale in supersymmetric grand unified theories. If this mass scale were a factor of 2 smaller, then r would be smaller by a factor of 16, and hence much harder to detect.
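Here is that estimate as a quick numerical check — a crude version using the relations above with all O(1) factors dropped, so it lands within an order of magnitude of the more careful value:

m_P = 2.4e18          # reduced Planck scale, GeV
delta_S = 1e-5        # observed scalar fluctuation amplitude
r = 0.2               # claimed tensor-to-scalar ratio

delta_T_sq = r * delta_S**2            # tensor fluctuation strength
V_quarter = delta_T_sq**0.25 * m_P     # V^{1/4} ~ (delta_T^2)^{1/4} m_P
print(f"V^(1/4) ~ {V_quarter:.1e} GeV")    # ~5e15 GeV; careful O(1) factors give ~2e16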

Rolling, rolling, rolling, …

Using \delta_S^2 \sim H^4/\dot\phi^2, we can express r as

r = \frac{\delta_T^2}{\delta_S^2}\sim \frac{\dot\phi^2}{m_P^2 H^2}.

It is convenient to measure time in units of the number N = H t of e-foldings of inflation, in terms of which we find

\frac{1}{m_P^2} \left(\frac{d\phi}{dN}\right)^2\sim r;

Now, we know that for inflation to explain the smoothness of the universe we need N larger than 50, and if we assume that the inflaton rolls at a roughly constant rate during N e-foldings, we conclude that, while rolling, the change in the inflaton field is

\frac{\Delta \phi}{m_P} \sim N \sqrt{r}.

This is our third important conclusion — the inflaton field had to roll a long, long, way during inflation — it changed by much more than the Planck scale! Putting in the O(1) factors we have left out reduces the required amount of rolling by about a factor of 3, but we still conclude that the rolling was super-Planckian if r\sim 0.2. That’s curious, because when the scalar field strength is super-Planckian, we expect the kind of effective field theory we have been implicitly using to be a poor approximation because quantum gravity corrections are large. One possible way out is that the inflaton might have rolled round and round in a circle instead of in a straight line, so the field strength stayed sub-Planckian even though the distance traveled was super-Planckian.
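In numbers (again with the O(1) factors dropped):

N, r = 50, 0.2          # minimum e-foldings; claimed tensor-to-scalar ratio
print(N * r**0.5)       # ~22.4: the inflaton rolls tens of Planck units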

Spectral tilt

As the inflaton rolls, the potential energy, and hence also the Hubble constant H, change during inflation. That means that both the scalar and tensor fluctuations have a strength which is not quite independent of length scale. We can parametrize the scale dependence in terms of how the fluctuations change per e-folding of inflation, which is equivalent to the change per logarithmic length scale and is called the “spectral tilt.”

To keep things simple, let’s suppose that the rate of rolling is constant during inflation, at least over the length scales for which we have data. Using \delta_S^2 \sim H^4/\dot\phi^2, and assuming \dot\phi is constant, we estimate the scalar spectral tilt as

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim - \frac{4 \dot H}{H^2}.

Using \delta_T^2 \sim H^2/m_P^2, we conclude that the tensor spectral tilt is half as big.

From H^2 \sim V/m_P^2, we find

\dot H \sim \frac{1}{2} \dot \phi \frac{V'}{V} H,

and using \dot \phi \sim -V'/H we find

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim \frac{V'^2}{H^2V}\sim m_P^2\left(\frac{V'}{V}\right)^2\sim \left(\frac{V}{m_P^4}\right)\left(\frac{m_P^6 V'^2}{V^3}\right)\sim \delta_T^2 \delta_S^{-2}\sim r.

Putting in the numbers more carefully we find a scalar spectral tilt of r/4 and a tensor spectral tilt of r/8.

This is our last important conclusion: A relatively large value of r means a significant spectral tilt. In fact, even before the BICEP2 results, the CMB anisotropy data already supported a scalar spectral tilt of about .04, which suggested something like r \sim .16. The BICEP2 detection of the tensor fluctuations (if correct) has confirmed that suspicion.

Summing up

If you have stuck with me this far, and you haven’t seen this stuff before, I hope you’re impressed. Of course, everything I’ve described can be done much more carefully. I’ve tried to convey, though, that the emerging story seems to hold together pretty well. Compared to last week, we have stronger evidence now that inflation occurred, that the mass scale of inflation is high, and that the scalar and tensor fluctuations produced during inflation have been detected. One prediction is that the tensor fluctuations, like the scalar ones, should have a notable spectral tilt, though a lot more data will be needed to pin that down.

I apologize to the experts again, for the sloppiness of these arguments. I hope that I have at least faithfully conveyed some of the spirit of inflation theory in a way that seems somewhat accessible to the uninitiated. And I’m sorry there are no references, but I wasn’t sure which ones to include (and I was too lazy to track them down).

It should also be clear that much can be done to sharpen the confrontation between theory and experiment. A whole lot of fun lies ahead.

Added notes (3/25/2014):

Okay, here’s a good reference, a useful review article by Baumann. (I found out about it on Twitter!)

From Baumann’s lectures I learned a convenient notation. The rolling of the inflaton can be characterized by two “potential slow-roll parameters” defined by

\epsilon = \frac{m_P^2}{2}\left(\frac{V'}{V}\right)^2,\quad \eta = m_P^2\left(\frac{V''}{V}\right).

Both parameters are small during slow rolling, but the relationship between them depends on the shape of the potential. My crude approximation (\epsilon = \eta) would hold for a quadratic potential.

We can express the spectral tilt (as I defined it) in terms of these parameters, finding 2\epsilon for the tensor tilt, and 6 \epsilon - 2\eta for the scalar tilt. To derive these formulas it suffices to know that \delta_S^2 is proportional to V^3/V'^2, and that \delta_T^2 is proportional to H^2; we also use

3H\dot \phi = -V', \quad 3H^2 = V/m_P^2,

keeping factors of 3 that I left out before. (As a homework exercise, check these formulas for the tensor and scalar tilt.)
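For instance, the tensor half of that exercise goes like this (a sketch): \delta_T^2 \propto H^2 \propto V, and the two formulas above give d\phi/dN = \dot\phi/H = -m_P^2\, V'/V, so

-\frac{1}{\delta_T^2}\frac{d\delta_T^2}{dN} = -\frac{V'}{V}\frac{d\phi}{dN} = m_P^2\left(\frac{V'}{V}\right)^2 = 2\epsilon.

The scalar tilt works the same way, starting from \delta_S^2 \propto V^3/V'^2; the \eta term comes from differentiating V'.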

It is also easy to see that r is proportional to \epsilon; it turns out that r = 16 \epsilon. To get that factor of 16 we need more detailed information about the relative size of the tensor and scalar fluctuations than I explained in the post; I can’t think of a handwaving way to derive it.

We see, though, that the conclusion that the tensor tilt is r/8 does not depend on the details of the potential, while the relation between the scalar tilt and r does depend on the details. Nevertheless, it seems fair to claim (as I did) that, already before we knew the BICEP2 results, the measured nonzero scalar spectral tilt indicated a reasonably large value of r.

Once again, we’re lucky. On the one hand, it’s good to have a robust prediction (for the tensor tilt). On the other hand, it’s good to have a handle (the scalar tilt) for distinguishing among different inflationary models.

One last point is worth mentioning. We have set Planck’s constant \hbar equal to one so far, but it is easy to put the powers of \hbar back in using dimensional analysis (we’ll continue to assume the speed of light c is one). Since Newton’s constant G has the dimensions of length/energy, and the potential V has the dimensions of energy/volume, while \hbar has the dimensions of energy times length, we see that

\delta_T^2 \sim \hbar G^2V.

Thus the production of gravitational waves during inflation is a quantum effect, which would disappear in the limit \hbar \to 0. Likewise, the scalar fluctuation strength \delta_S^2 is also O(\hbar), and hence also a quantum effect.
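The dimensional bookkeeping behind that formula is quick to verify:

\left[\hbar\, G^2 V\right] = (\text{energy} \times \text{length}) \times \left(\frac{\text{length}}{\text{energy}}\right)^2 \times \frac{\text{energy}}{\text{length}^3} = 1,

so \delta_T^2 comes out dimensionless, as a fluctuation strength should.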

Therefore the detection of primordial gravitational waves by BICEP2, if correct, confirms that gravity is quantized just like the other fundamental forces. That shouldn’t be a surprise, but it’s nice to know.

Oh, the Places You’ll Do Theoretical Physics!

I won’t run lab tests in a box.
I won’t run lab tests with a fox.
But I’ll prove theorems here or there.
Yes, I’ll prove theorems anywhere…

Physicists occupy two camps. Some—theorists—model the world using math. We try to predict experiments’ outcomes and to explain natural phenomena. Others—experimentalists—gather data using supermagnets, superconductors, the world’s coldest atoms, and other instruments deserving of superlatives. Experimentalists confirm that our theories deserve trashing or—for this we pray—might not model the world inaccurately.

Theorists, people say, can work anywhere. We need no million-dollar freezers. We need no multi-pound magnets.* We need paper, pencils, computers, and coffee. Though I would add “quiet,” colleagues would add “iPods.”

Theorists’ mobility reminds me of the book Green Eggs and Ham. Sam-I-am, the antagonist, drags the protagonist to spots as outlandish as our workplaces. Today marks the author’s birthday. Since Theodor Geisel stimulated imaginations, and since imagination drives physics, Quantum Frontiers is paying its respects. In honor of Oh, the Places You’ll Go!, I’m spotlighting places you can do theoretical physics. You judge whose appetite for exotica exceeds whose: Dr. Seuss’s or theorists’.


I've most looked out-of-place doing physics by a dirt road between sheep-populated meadows outside Lancaster, UK. Lancaster, the Wars of the Roses victor, is a city in northern England. The year after graduating from college, I worked at Lancaster University as a research assistant. I studied a crystal that resembles graphene, a material whose superlatives include "superstrong," "supercapacitor," and "superconductor." From morning to evening, I'd submerge myself in math till it poured out my ears. Then I'd trek from "uni," as Brits say, to the "city centre," as they write.

The trek wound between trees; fields; and, because I was in England, puddles. Many evenings, a rose or a sunset would arrest me. Other evenings, physics would. I’d realize how to solve an equation, or that I should quit banging my head against one. Stepping off the road, I’d fish out a notebook and write. Amidst the puddles and lambs. Cyclists must have thought me the queerest sight since a cloudless sky.

A colleague loves doing theory in the sky. On planes, he explained, hardly anyone interrupts his calculations. And who minds interruptions by pretzels and coffee?

“A mathematician is a device for turning coffee into theorems,” some have said, and theoretical physicists live down the block from mathematicians in the neighborhood of science. Turn a Pasadena café upside-down and shake it, and out will fall theorists. Since Hemingway’s day, the romanticism has faded from the penning of novels in cafés. But many a theorist trumpets about an equation derived on a napkin.

Trumpeting filled my workplace in Oxford. One of Clarendon Lab's few theorists, I neighbored lasers, circuits, and signs that read "DANGER! RADIATION." Though radiation didn't leak through our walls (I hope), what did leak contributed more to that office's eccentricity than radiation would have. As early as 9:10 AM, the experimentalists next door blasted "Born to Be Wild" and Animal House tunes. If you can concentrate over there, you can concentrate anywhere.

One paper I concentrated on had a Crumple-Horn Web-Footed Green-Bearded Schlottz of an acknowledgements section. In a physics paper’s last paragraph, one thanks funding agencies and colleagues for support and advice. “The authors would like to thank So-and-So for insightful comments,” papers read. This paper referenced a workplace: “[One coauthor] is grateful to the Half Moon Pub.” Colleagues of the coauthor confirmed the acknowledgement’s aptness.

Though I’ve dwelled on theorists’ physical locations, our minds roost elsewhere. Some loiter in atoms; others, in black holes; some, on four-dimensional surfaces; others, in hypothetical universes. I hobnob with particles in boxes. As Dr. Seuss whisks us to a Bazzim populated by Nazzim, theorists tell of function spaces populated by Rényi entropies.

The next time you see someone standing in a puddle, or in a ditch, or outside Buckingham Palace, scribbling equations, feel free to laugh. You might be seeing a theoretical physicist. You might be seeing me. To me, physics has relevance everywhere. Scribbling there and here should raise eyebrows no more than any setting in a Dr. Seuss book.

The author would like to thank this emporium of Seussoria. And Java & Co.

*We need for them to confirm that our theories deserve trashing, but we don’t need them with us. Just as, when considering quitting school to break into the movie business, you need for your mother to ask, “Are you sure that’s a good idea, dear?” but you don’t need for her to hang on your elbow. Except experimentalists don’t say “dear” when crushing theorists’ dreams.

Guns versus butter in quantum information

From my college’s computer-science club, I received a T-shirt that reads:

while(not_dead){

sleep--;

time--;

awesome++;

}

/*There’s a reason we can’t hang out with you…*/

The message is written in Java, a programming language. Even if you’ve never programmed, you likely catch the drift: CS majors are the bees’ knees because, at the expense of sleep and social lives, they code. I disagree with part of said drift: CS majors hung out with me despite being awesome.


The rest of the drift—you have to give some to get some—synopsizes the physics I encountered this fall. To understand tradeoffs, you needn't study quantum information (QI). But what trades off with what, according to QI, can surprise us.

The T-shirt haunted me at the University of Nottingham, where researchers are blending QI with Einstein’s theory of relativity. Relativity describes accelerations, gravity, and space-time’s curvature. In other sources, you can read about physicists’ attempts to unify relativity and quantum mechanics, the Romeo and Tybalt of modern physics, into a theory of quantum gravity. In this article, relativity tangos with quantum mechanics in relativistic quantum information (RQI). If I move my quantum computer, RQIers ask, how do I change its information processing? How does space-time’s curvature affect computation? How can motion affect measurements?

Answers to these questions involve tradeoffs.


Nottingham researchers kindly tolerating a seminar by me

For example, acceleration entangles particles. Decades ago, physicists learned that acceleration creates particles. Say you’re gazing into a vacuum—not empty space, but nearly empty space, the lowest-energy system that can exist. Zooming away on a rocket, I accelerate relative to you. From my perspective, more particles than you think—and higher-energy particles—surround us.

Have I created matter? Have I violated the Principle of Conservation of Energy (and Mass)? I created particles in a sense, but at the expense of rocket fuel. You have to give some to get some:

Fuel--;
Particles++;
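Though the post doesn't name it, the effect described here is the Unruh effect: to an observer undergoing uniform acceleration a, the vacuum looks like a thermal bath. Restoring the speed of light c and Boltzmann's constant k_B, the bath's temperature is

T = \frac{\hbar a}{2 \pi c k_B}.

That temperature is astonishingly small for everyday accelerations (about 4 \times 10^{-20} K at one g), which is why you must burn so much fuel to notice any particles.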

The math that describes my particles relates to the math that describes entanglement.* Entanglement is a relationship between quantum systems. Say you entangle two particles, then separate them. If you measure one, you instantaneously affect the other, even if the other occupies another city.

Say we encode information in quantum particles stored in a box.** Just as you encode messages by writing letters, we write messages in the ink of quantum particles. Say the box zooms off on a rocket. Just as acceleration led me to see particles in a vacuum, acceleration entangles the particles in our box. Since entanglement facilitates computation, you can process information by shaking a box. And performing another few steps.

When an RQIer told me so, she might as well have added that space-time has 10^6 dimensions and the US would win the World Cup. Then my T-shirt came to mind. To get some, you have to give some. When you give something, you might get something. Giving fuel gets you entanglement. To prove that statement, I need to do and interpret math. Till I have time to,

Fuel--;
Entanglement++;

offers intuition.

After cropping up in Nottingham, my T-shirt reared its head (collar?) in physics problem after physics problem. By “consuming entanglement”—forfeiting that ability to affect the particle in another city—you can teleport quantum information.

Entanglement--;
Quantum teleportation++;
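For the record, the exchange rate compressed into that snippet is the standard one: teleporting one qubit consumes one maximally entangled pair (an "ebit") shared between sender and receiver, plus two classical bits of communication:

1 \text{ ebit} + 2 \text{ cbits} \rightarrow 1 \text{ qubit teleported}.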

My research involves tradeoffs between information and energy. As the Hungarian physicist Leó Szilárd showed, you can exchange information for work. Say you learn which half of a box*** a particle occupies, and you trap the particle in that half. Upon freeing the particle—forfeiting your knowledge about its location—you can lift a weight, charge a battery, or otherwise store energy.

Information--;
Energy++;

If you expend energy, Rolf Landauer showed, you can gain knowledge.

Energy--;
Information++;
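Szilárd's and Landauer's tradeoffs come with the same exchange rate, which the snippets above leave implicit: at temperature T, learning one bit lets you extract at most k_B T \ln 2 of work, and erasing one bit costs at least as much:

W_{\text{extracted}} \leq k_B T \ln 2, \qquad W_{\text{erasure}} \geq k_B T \ln 2.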

No wonder my computer-science friends joked about sleep deprivation. But information can energize. For fuel, I forage in the blending of fields like QI and relativity, and in physical intuitions like those encapsulated in the pseudo-Java above. Much as Szilárd's physics enchants me, I'm glad that the pursuit of physics contradicts his conclusion:

while(not_dead){

Information++;

Energy++;

}

The code includes awesome++ implicitly.

*Bogoliubov transformations, to readers familiar with the term.

**In the fields in a cavity, to readers familiar with the terms.

***Physicists adore boxes, you might have noticed.

With thanks to Ivette Fuentes and the University of Nottingham for their hospitality and for their introduction to RQI.

Making predictions in the multiverse


I am a theoretical physicist at the University of California, Berkeley. Last month, I attended a very interesting conference organized by the Foundational Questions Institute (FQXi) in Puerto Rico, and presented a talk about making predictions in cosmology, especially in the eternally inflating multiverse. I very much enjoyed discussions with people at the conference, and I was invited to post a non-technical account of the issue, as well as my own view of it, here. So here I am.

I find it quite remarkable that some of us in the physics community are thinking with some “confidence” that we live in the multiverse, more specifically one of the many universes in which low-energy physical laws take different forms. (For example, these universes have different elementary particles with different properties, possibly different spacetime dimensions, and so on.) This idea of the multiverse, as we currently think, is not simply a result of random imagination by theorists, but is based on several pieces of observational and theoretical evidence.

Observationally, we have learned more and more that we live in a highly special universe—it seems that the "physical laws" of our universe (summarized in the form of the standard models of particle physics and cosmology) take such a special form that if their structure were varied slightly, then there would be no interesting structure in the universe, let alone intelligent life. It is hard to understand this fact unless there are many universes with varying "physical laws," and we simply happen to emerge in a universe which allows for intelligent life to develop (which seems to require special conditions). With multiple universes, we can understand the "specialness" of our universe precisely as we understand the "specialness" of our planet Earth (e.g. its ideal distance from the sun), which is only one of the many planets out there.

Perhaps more nontrivial is the fact that our current theory of fundamental physics leads to this picture of the multiverse in a very natural way. Imagine that at some point in the history of the universe, space is exponentially expanding. This expansion—called inflation—occurs when space is filled with a "positive vacuum energy" (which happens quite generally). We have known since the 1980s that such inflation is generically eternal. During inflation, various non-inflating regions called bubble universes—of which our own universe could be one—may form, much like bubbles in boiling water. Since the ambient space expands exponentially, however, these bubbles do not percolate; rather, the process of creating bubble universes lasts forever in an eternally inflating background. Now, recent progress in string theory suggests that the low-energy theories describing physics in these bubble universes (such as the elementary particle content and their properties) may differ bubble by bubble. This is precisely the setup needed to understand the "specialness" of our universe because of the selection effect associated with our own existence, as described above.


A schematic depiction of the eternally inflating multiverse. The horizontal and vertical directions correspond to spatial and time directions, respectively, and the various regions with the inverted-triangle or argyle shape represent different universes. Regions closer to the upper edge of the diagram look smaller, but this is an artifact of the rescaling made to fit the large spacetime into a finite drawing—the fractal structure near the upper edge actually corresponds to an infinite number of large universes.

This particular version of the multiverse—called the eternally inflating multiverse—is very attractive. It is theoretically motivated and has the potential to explain various features seen in our universe. The eternal nature of inflation, however, causes a serious problem for predictivity. Because the process of creating bubble universes occurs infinitely many times, "In an eternally inflating universe, anything that can happen will happen; in fact, it will happen an infinite number of times," as phrased in an article by Alan Guth. Suppose we want to calculate the relative probability for (any) events A and B to happen in the multiverse. Following the standard notion of probability, we might define it as the ratio of the numbers of times events A and B happen throughout the whole spacetime:

P = \frac{N_A}{N_B}.

In the eternally inflating multiverse, however, both A and B occur infinitely many times: N_A, N_B = \infty. This expression, therefore, is ill-defined. One might think that this is merely a technical problem—we simply need to "regularize" to make both N_{A,B} finite at an intermediate stage of the calculation, and then we get a well-defined answer. This is, however, not the case. One finds that, depending on the details of this regularization procedure, one can obtain any "prediction" one wants, and there is no a priori reason to prefer one procedure over the others—the predictivity of physical theory seems lost!
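To see how a regularization can smuggle in the answer, here is a toy illustration of my own, written in plain Java in the spirit of the pseudo-code earlier on this blog (the class name, the cutoff, and the whole setup are invented for this sketch; it is an analogy, not the cosmological calculation). Count "events": A for even integers, B for odd integers, stopping at a finite cutoff. The same infinite set, enumerated two different ways, yields two different limiting ratios.

// Toy analogue of the regularization ambiguity: the ratio N_A/N_B
// depends on how we enumerate an infinite collection before cutting off.
public class MeasureToy {
    public static void main(String[] args) {
        final int cutoff = 3_000_000; // the "regularization"

        // Enumeration 1: natural order 0, 1, 2, 3, ...
        // A = even, B = odd; the ratio N_A/N_B tends to 1.
        long a1 = 0, b1 = 0;
        for (int n = 0; n < cutoff; n++) {
            if (n % 2 == 0) a1++; else b1++;
        }

        // Enumeration 2: the same integers, listed two evens per odd
        // (0, 2, 1, 4, 6, 3, ...); now N_A/N_B tends to 2 instead.
        long a2 = 0, b2 = 0;
        for (int n = 0; n < cutoff; n++) {
            if (n % 3 == 2) b2++; else a2++;
        }

        System.out.printf("natural order:    N_A/N_B = %.3f%n", (double) a1 / b1);
        System.out.printf("reshuffled order: N_A/N_B = %.3f%n", (double) a2 / b2);
    }
}

Both loops count the same infinite set; only the bookkeeping differs, yet the "prediction" changes. That, in miniature, is the problem.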

Over the past decades, some physicists and cosmologists have been thinking about many aspects of this so-called measure problem in eternal inflation. (There are indeed many aspects to the problem, and I'm omitting most of them in my simplified presentation above.) Many of the people who contributed were in the session at the conference, including Aguirre, Albrecht, Bousso, Carroll, Guth, Page, Tegmark, and Vilenkin. My own view, which I think is shared by some others, is that this problem offers a window into deep issues associated with spacetime and gravity. In my 2011 paper I suggested that quantum mechanics plays a crucial role in understanding the multiverse, even at the largest distance scales. (A similar idea was also discussed here around the same time.) In particular, I argued that the eternally inflating multiverse and the quantum mechanical many worlds à la Everett are the same concept:

Multiverse = Quantum Many Worlds

in a specific, and literal, sense. In this picture, the global spacetime of general relativity appears only as a derived concept at the cost of overcounting true degrees of freedom; in particular, infinitely large space associated with eternal inflation is a sort of “illusion.” A “true” description of the multiverse must be “intrinsically” probabilistic in a quantum mechanical sense—probabilities in cosmology and quantum measurements have the same origin.

To illustrate the basic idea, let us first consider an (apparently unrelated) system with a black hole. Suppose we drop some book A into the black hole and observe subsequent evolution of the system from a distance. The book will be absorbed into (the horizon of) the black hole, which will then eventually evaporate, leaving Hawking radiation. Now, let us consider another process of dropping a different book B, instead of A, and see what happens. The subsequent evolution in this case is similar to the case with A, and we will be left with Hawking radiation. However, this final-state Hawking radiation arising from B is (believed by many to be) different from that arising from A in its subtle quantum correlation structure, so that if we have perfect knowledge about the final-state radiation then we can reconstruct what the original book was. This property is called unitarity and is considered to provide the correct picture for black hole dynamics, based on recent theoretical progress. To recap, the information about the original book will not be lost—it will simply be distributed in final-state Hawking radiation in a highly scrambled form.

A puzzling thing occurs, however, if we observe the same phenomenon from the viewpoint of an observer who is falling into the black hole with a book. In this case, the equivalence principle says that the book does not feel gravity (except for the tidal force, which is tiny for a large black hole), so it simply passes through the black hole horizon without any disruption. (Recently, this picture was challenged by the so-called firewall argument—the book might hit a collection of higher-energy quanta called a firewall, rather than fall freely. Even if so, it does not affect our basic argument below.) This implies that all the information about the book (in fact, the book itself) will be inside the horizon at late times. On the other hand, we have just argued that from a distant observer's point of view, the information will be outside—first on the horizon and then in Hawking radiation. Which is correct?

One might think that the information is simply duplicated: one copy inside and the other outside. This, however, cannot be the case. Quantum mechanics prohibits faithful copying of full quantum information, the so-called no-cloning theorem. Therefore, it seems that the two pictures by the two observers cannot both be correct.
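A two-line version of the no-cloning argument: suppose some linear evolution U could copy arbitrary quantum states, U|\psi\rangle|0\rangle = |\psi\rangle|\psi\rangle for every |\psi\rangle. Applying U to a superposition then gives

U\big(|\psi\rangle + |\phi\rangle\big)|0\rangle = |\psi\rangle|\psi\rangle + |\phi\rangle|\phi\rangle \neq \big(|\psi\rangle + |\phi\rangle\big)\big(|\psi\rangle + |\phi\rangle\big)

(up to normalization), so the linearity of quantum mechanics itself rules out a universal copier.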

The proposed solution to this puzzle is interesting—both pictures are correct, but not at the same time. The point is that one cannot be both a distant observer and a falling observer at the same time. If you are a distant observer, the information will be outside, and the interior spacetime must be viewed as non-existent, since you can never access it even in principle (because of the existence of the horizon). On the other hand, if you are a falling observer, then you have the interior spacetime into which the information (the book itself) will fall, but this happens only at the cost of losing the part of spacetime in which the Hawking radiation lies, which you can never access since you yourself are falling into the black hole. There is no inconsistency in either of these two pictures; only if you artificially "patch" the two pictures together, which you cannot physically do, does the apparent inconsistency of information duplication occur. This somewhat surprising aspect of a system with gravity is called black hole complementarity, pioneered by 't Hooft, Susskind, and their collaborators.

What does this discussion of black holes have to do with cosmology and, in particular, the eternally inflating multiverse? In cosmology, our space is surrounded by a cosmological horizon. (For example, imagine that space is expanding exponentially; this makes it impossible for us to obtain any signal from regions farther than some distance, because objects in those regions recede faster than the speed of light. The definition of appropriate horizons in general cases is more subtle, but can be made.) The situation, therefore, is the "inside out" version of the black hole case viewed from a distant observer. As in the case of the black hole, quantum mechanics requires that spacetime on the other side of the horizon—in this case the exterior of the cosmological horizon—must be viewed as non-existent. (In the paper I made this claim based on some simple supportive calculations.) In more technical terms, a quantum state describing the system represents only the region within the horizon—there is no infinite space in any single, consistent description of the system!

If a quantum state represents only space within the horizon, then where is the multiverse, which we thought exists in an eternally inflating space far beyond our own horizon? The answer is—probability! The process of creating bubble universes is a probabilistic process in the quantum mechanical sense—it occurs through quantum mechanical tunneling. This implies that, starting from some initially inflating space, we could end up with different universes probabilistically. All the different universes—including our own—live in probability space. In more technical terms, a state representing eternally inflating space evolves into a superposition of terms—or branches—representing different universes, with each of them representing only the region within its own horizon. Note that there is no concept of infinitely large space here, which is what led to the ill-definedness of probability. The picture of an infinitely large multiverse, naively suggested by general relativity, appears only after "patching" pictures based on different branches together; but this vastly overcounts the true degrees of freedom, just as it would if we included both the interior spacetime and the Hawking radiation in our description of a black hole.

The description of the multiverse presented here provides complete unification of the eternally inflating multiverse and the many worlds interpretation in quantum mechanics. Suppose the multiverse starts from some initial state |\Psi(t_0)\rangle. This state evolves into a superposition of states in which various bubble universes nucleate in various locations. As time passes, a state representing each universe further evolves into a superposition of states representing various possible cosmic histories, including different outcomes of “experiments” performed within that universe. (These “experiments” may, but need not, be scientific experiments—they can be any physical processes.) At late times, the multiverse state |\Psi(t)\rangle will thus contain an enormous number of terms, each of which represents a possible world that may arise from |\Psi(t_0)\rangle consistently with the laws of physics. Probabilities in cosmology and microscopic processes are then both given by quantum mechanical probabilities in the same manner. The multiverse and quantum many worlds are really the same thing—they simply refer to the same phenomenon occurring at (vastly) different scales.
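Schematically (my shorthand, not notation taken from the paper):

|\Psi(t)\rangle = \sum_i c_i(t)\, |(\text{cosmic history})_i\rangle,

where each branch describes only the region within its own horizon, and |c_i(t)|^2 supplies the well-defined probability that the naive counting of bubbles failed to provide.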


A schematic picture for the evolution of the multiverse state. As t increases, the state evolves into a superposition of states in which various bubble universes nucleate in various locations. Each of these states then evolves further into a superposition of states representing various possible cosmic histories, including different outcomes of experiments performed within that universe.

The picture presented here does not solve all the problems in eternally inflating cosmology. What is the actual quantum state of the multiverse? What are its "initial conditions"? What is time? How does it emerge? The picture, however, does provide a framework for addressing these further, deep questions, and I have recently made some progress: the basic idea is that the state of the multiverse (which may be selected uniquely by the normalizability condition) never changes, and yet time appears as an emergent concept locally in branches, as physical correlations among objects (along the lines of an old idea by DeWitt). Given the length of this post already, I will not elaborate on this new development here. If you are interested, you might want to read my paper.

It is fascinating that physicists can talk about big and deep questions like the ones discussed here based on concrete theoretical progress. Nobody really knows where these explorations will finally lead us. It seems clear, however, that we live in an exciting era in which our scientific explorations reach beyond what we thought to be the entire physical world, our universe.

Reporting from the ‘Frontiers of Quantum Information Science’

What am I referring to with this title? It is similar to the name of this blog, but that's not where this particular title comes from, although there is a common denominator. Frontiers of Quantum Information Science was the theme for the 31st Jerusalem winter school in theoretical physics, which takes place annually at the Israel Institute for Advanced Studies, located on the Givat Ram campus of the Hebrew University of Jerusalem. The school took place from December 30, 2013 through January 9, 2014, but some of the attendees are still trickling back to their home institutions. The common denominator is that our very own John Preskill was the director of this school, co-directed by Michael Ben-Or and Patrick Hayden. John mentioned in a previous post, and reiterated during his opening remarks, that this is the first time the IIAS has chosen quantum information as the topic for its prestigious advanced school, another sign of quantum information's emergence as an important sub-field of physics. In this blog post, I'm going to do my best to recount these festivities while John protects his home from forest fires, prepares a talk for the Simons Institute's workshop on Hamiltonian complexity, teaches his quantum information course, and celebrates his birthday (60+1).

The school was mainly targeted at physicists, but it was diversely represented. Proof of the value of this diversity came in an interaction between a computer scientist and a physicist, which led to one of the school's most memorable moments. Both of my most memorable moments started with the talent show. (I was surprised that so many talents were on display at a physics conference.) Anyway, toward the end of the show, Mateus Araújo Santos, a PhD student in Vienna, entered the stage and mentioned that he could channel "the ghost of Feynman" to serve as an oracle for NP-complete decision problems. After he made this claim, people naturally turned to Scott Aaronson, hoping that he'd be able to break the oracle. In order for this to happen, however, we had to wait until Scott's third lecture, about linear optics and boson sampling, the next day. You can watch Scott bombard the oracle with decision problems from 1:00-2:15 in the video from his third lecture.


Scott Aaronson grilling the oracle with a string of NP-complete decision problems! From 1:00-2:15 during this video.

The other most memorable moment was when John briefly danced Gangnam style during Soonwon Choi‘s talent show performance. Unfortunately, I thought I had this on video, but the video didn’t record. If anyone has video evidence of this, then please share!