# I spy with my little eye…something algebraic.

Look at this picture.

Does any part of it surprise you? Look more closely.

Do you see a boy’s name?

I spell “Peter” with two e’s, but “Piotr” and “Pyotr” appear as authors’ names in papers’ headers. Finding “Petr” in a paper shouldn’t have startled me. But how often does “Gretchen” or “Amadeus” materialize in an equation?

When I was little, my reading list included Eye Spy, Where’s Waldo?, and Puzzle Castle. The books teach children to pay attention, notice details, and evaluate ambiguities.

That’s what physicists do. The first time I saw the picture above, I saw a variation on “Peter.” I was reading (when do I not?) about the intersection of quantum information and thermodynamics. The authors were discussing heat and algebra, not saints or boys who picked pecks of pickled peppers. So I looked more closely.

Each letter resolved into part of a story about a physical system.

The P represents a projector. A projector is a mathematical object that narrows one’s focus to a particular space, as blinders on a horse do. The E tells us which space to focus on: a space associated with an amount E of energy, like a country associated with a GDP of \$500 billion.

Some of the energy E belongs to a heat reservoir. We know so because “reservoir” begins with r, and R appears in the picture. A heat reservoir is a system, like a colossal bathtub, whose temperature remains constant. The Greek letter $\tau$, pronounced “tau,” represents the reservoir’s state. The reservoir occupies an equilibrium state: The bath’s large-scale properties (its average energy, volume, etc.) remain constant. Never mind about jacuzzis.

Piecing together the letters, we interpret the picture as follows: Imagine a vast, constant-temperature bathtub (R). Suppose we shut the tap long enough ago that the water in the tub has calmed ($\tau$). Suppose the tub neighbors a smaller system, say, a glass of Perrier.* Imagine measuring how much energy the bath-and-Perrier composite contains (P). Our measurement device reports the number E.

Quite a story to pack into five letters. Didn’t Peter deserve a second glance?

The equation’s right-hand side forms another story. I haven’t seen Peters on that side, nor Poseidons nor Galahads. But look closely, and you will find a story.

The images above appear in “Fundamental limitations for quantum and nanoscale thermodynamics,” published by Michał Horodecki and Jonathan Oppenheim in Nature Communications in 2013.

*Experts: The $\rho_S$ that appears in the first two images represents the smaller system. The tensor product represents the reservoir-and-smaller-system composite.

# Generally speaking

My high-school calculus teacher had a mustache like a walrus’s and shoulders like a rower’s. At 8:05 AM, he would demand my class’s questions about our homework.
Students would yawn, and someone’s hand would drift into the air.

“I have a general question,” the hand’s owner would begin.

“Only private questions from you,” my teacher would snap. “You’ll be a general someday, but you’re not a colonel, or even a captain, yet.”

Then his eyes would twinkle; his voice would soften; and, after the student asked the question, his answer would epitomize why I’ve chosen a life in which I use calculus more often than laundry detergent.

Many times though I witnessed the “general” trap, I fell into it once. Little wonder: I relish generalization as other people relish hiking or painting or Michelin-worthy relish. When inferring general principles from examples, I abstract away details as though they’re tomato stains. My veneration of generalization led me to quantum information (QI) theory. One abstract theory can model many physical systems: electrons, superconductors, ion traps, etc.

Little wonder that generalizing a QI model swallowed my summer.

QI has shed light on statistical mechanics and thermodynamics, which describe energy, information, and efficiency. Models called resource theories describe small systems’ energies, information, and efficiencies. Resource theories help us calculate a quantum system’s value (what you can and can’t create from a quantum system) if you can manipulate systems in only certain ways.

Suppose you can perform only operations that preserve energy. According to the Second Law of Thermodynamics, systems evolve toward equilibrium. Equilibrium amounts roughly to stasis: Averages of properties like energy remain constant.

Out-of-equilibrium systems have value because you can suck energy from them to power laundry machines. How much energy can you draw, on average, from a system in a constant-temperature environment? Technically: How much “work” can you draw? We denote this average work by $\langle W \rangle$. According to thermodynamics, $\langle W \rangle$ equals the change $\Delta F$ in the system’s Helmholtz free energy.
The Helmholtz free energy is a thermodynamic property similar to the energy stored in a coiled spring.

One reason to study thermodynamics?

Suppose you want to calculate more than the average extractable work. How much work will you probably extract during some particular trial? Though statistical physics offers no answer, resource theories do. One answer derived from resource theories resembles $\Delta F$ mathematically but involves one-shot information theory, which I’ve discussed elsewhere. If you average this one-shot extractable work, you recover $\langle W \rangle = \Delta F$. “Helmholtz” resource theories recapitulate statistical-physics results while offering new insights about single trials.

Helmholtz resource theories sit atop a silver-tasseled pillow in my heart. Why not, I thought, spread the joy to the rest of statistical physics? Why not generalize thermodynamic resource theories?

The average work $\langle W \rangle$ extractable equals $\Delta F$ if heat can leak into your system. If heat and particles can leak, $\langle W \rangle$ equals the change in your system’s grand potential. The grand potential, like the Helmholtz free energy, is a free energy that resembles the energy in a coiled spring. The grand potential characterizes Bose-Einstein condensates, low-energy quantum systems that may have applications to metrology and quantum computation. If your system responds to a magnetic field, or has mass and occupies a gravitational field, or has other properties, $\langle W \rangle$ equals the change in another free energy.

A collaborator and I designed resource theories that describe heat-and-particle exchanges. In our paper “Beyond heat baths: Generalized resource theories for small-scale thermodynamics,” we propose that different thermodynamic resource theories correspond to different interactions, environments, and free energies. I detailed the proposal in “Beyond heat baths II: Framework for generalized thermodynamic resource theories.”

“II” generalizes enough to satisfy my craving for patterns and universals.
“II” generalizes enough to merit a hand-slap of a pun from my calculus teacher. We can test abstract theories only by applying them to specific systems. If thermodynamic resource theories describe situations as diverse as heat-and-particle exchanges, magnetic fields, and polymers, some specific system should shed light on resource theories’ accuracy.

If you find such a system, let me know. Much as generalization pleases aesthetically, the detergent is in the details.

# Reading the sub(linear) text

Physicists are not known for finesse. “Even if it cost us our funding,” I’ve heard a physicist declare, “we’d tell you what we think.” Little wonder I irked the porter who directed me toward central Cambridge.

The University of Cambridge consists of colleges as the US consists of states. Each college has a porter’s lodge, where visitors check in and students beg for help after locking their keys in their rooms. And where physicists ask for directions.

Last March, I ducked inside a porter’s lodge that bustled with deliveries. The woman behind the high wooden desk volunteered to help me, but I asked too many questions. By my fifth, her pointing at a map had devolved to jabbing.

Read the subtext, I told myself. Leave.

Or so I would have told myself, if not for that afternoon.

That afternoon, I’d visited Cambridge’s CMS, which merits every letter in “Centre for Mathematical Sciences.” Home to Isaac Newton’s intellectual offspring, the CMS consists of eight soaring, glass-walled, blue-topped pavilions. Their majesty walloped me as I turned off the road toward the gatehouse. So did the congratulatory letter from Queen Elizabeth II that decorated the route to the restroom.

I visited Nilanjana Datta, an affiliated lecturer of Cambridge’s Faculty of Mathematics, and her student, Felix Leditzky. Nilanjana and Felix specialize in entropies and one-shot information theory. Entropies quantify uncertainties and efficiencies.
Imagine compressing many copies of a message into the smallest possible number of bits (units of memory). How few bits can you use per copy? That number, we call the optimal compression rate. It shrinks as the number of copies compressed grows. As the number of copies approaches infinity, that compression rate drops toward a number called the message’s Shannon entropy. If the message is quantum, the compression rate approaches the von Neumann entropy.

Good luck squeezing infinitely many copies of a message onto a hard drive. How efficiently can we compress fewer copies? According to one-shot information theory, the answer involves entropies other than Shannon’s and von Neumann’s. In addition to describing data compression, entropies describe the charging of batteries, the concentration of entanglement, the encrypting of messages, and other information-processing tasks.

Speaking of compressing messages: Suppose one-shot information theory posted status updates on Facebook. Suppose that that panel on your Facebook page’s right-hand side showed news weightier than celebrity marriages. The news feed might read, “TRENDING: One-shot information theory: Second-order asymptotics.”

Second-order asymptotics, I learned at the CMS, concerns how the optimal compression rate decays as the number of copies compressed grows. Imagine compressing a billion copies of a quantum message $\rho$. The number of bits needed about equals a billion times the von Neumann entropy $H_{vN}(\rho)$. Since a billion is less than infinity, 1,000,000,000 $H_{vN}(\rho)$ bits won’t suffice. Can we estimate the compression rate more precisely?

The question reminds me of gas stations’ hidden pennies. The last time I passed a station’s billboard, some number like \$3.65 caught my eye. Each gallon cost about \$3.65, just as each copy of $\rho$ costs about $H_{vN}(\rho)$ bits. But a 9/10, writ small, followed the \$3.65. If I’d budgeted \$3.65 per gallon, I couldn’t have filled my tank.
If you budget $H_{vN}(\rho)$ bits per copy of $\rho$, you can’t compress all your copies.

Suppose some station’s owner hatches a plan to promote business. If you buy one gallon, you pay \$3.654. The more you purchase, the more the final digit drops from four. By cataloguing receipts, you calculate how a tank’s cost varies with the number of gallons, n. The cost equals \$3.65 × n to a first approximation. To a second approximation, the cost might equal \$3.65 × n + a√n, wherein a represents some number of cents. Compute a, and you’ll have computed the gas’s second-order asymptotics.

Nilanjana and Felix computed a’s associated with data compression and other quantum tasks. Second-order asymptotics met information theory when Strassen combined them in nonquantum problems. These problems developed under attention from Hayashi, Han, Polyanskiy, Poor, Verdú, and others. Tomamichel and Hayashi, as well as Li, introduced quantumness.

In the total-cost expression, \$3.65 × n depends on n directly, or “linearly.” The second term depends on √n. As the number of gallons grows, so does √n, but √n grows more slowly than n. The second term is called “sublinear.”
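To make the two-term cost concrete, here is a minimal Python sketch. It assumes the hypothetical model from the gas-station story, cost(n) = 3.65 n + a√n, with a = 0.004 dollars read off the one-gallon price of \$3.654. The function names and the made-up receipts are illustrative assumptions, not results from Nilanjana and Felix’s work.

```python
import math

PRICE = 3.65  # first-order rate in dollars per gallon (from the story above)


def total_cost(n, a):
    """Hypothetical second-order model: linear term plus a sublinear sqrt(n) term."""
    return PRICE * n + a * math.sqrt(n)


def estimate_a(receipts):
    """Least-squares fit of a from (gallons, dollars) pairs.

    Minimizing sum((cost - PRICE*n - a*sqrt(n))**2) over a gives
    a = sum(residual * sqrt(n)) / sum(n).
    """
    numerator = sum((cost - PRICE * n) * math.sqrt(n) for n, cost in receipts)
    denominator = sum(n for n, _ in receipts)
    return numerator / denominator


# Made-up receipts generated from the model with a = 0.004 (an assumed value).
receipts = [(n, total_cost(n, 0.004)) for n in (1, 4, 9, 16, 25)]
a_hat = estimate_a(receipts)

# The per-gallon surcharge a / sqrt(n) shrinks as n grows.
surcharges = [a_hat / math.sqrt(n) for n in (1, 100, 10_000)]
```

The shrinking surcharge list is the sense in which the second term is sublinear: the total correction a√n keeps growing, but its share of each gallon’s price fades.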

Which is the word that rose to mind in the porter’s lodge. I told myself, Read the sublinear text.

Little wonder I irked the porter. At least—thanks to quantum information, my mistake, and facial expressions’ contagiousness—she smiled.

With thanks to Nilanjana Datta and Felix Leditzky for explanations and references; to Nilanjana, Felix, and Cambridge’s Centre for Mathematical Sciences for their hospitality; and to porters everywhere for providing directions.

# “Feveral kinds of hairy mouldy fpots”

The book had a sheepskin cover, and mold was growing on the sheepskin. Robert Hooke, a pioneering microbiologist, slid the cover under one of the world’s first microscopes. Mold, he discovered, consists of “nothing elfe but feveral kinds of fmall and varioufly figur’d Mufhroms.” He described the Mufhroms in his treatise Micrographia, a 1665 copy of which I found in “Beautiful Science.” An exhibition at San Marino’s Huntington Library, “Beautiful Science” showcases the physics of rainbows, the stars that enthralled Galileo, and the world visible through microscopes.

Beautiful science of yesterday: An illustration, from Hooke’s Micrographia, of the mold.

“[T]hrough a good Microfcope,” Hooke wrote, the sheepskin’s spots appeared “to be a very pretty fhap’d Vegetative body.”

How like a scientist, to think mold pretty. How like quantum noise, I thought, Hooke’s mold sounds.

Quantum noise hampers systems that transmit and detect light. To phone a friend or send an email (“Happy birthday, Sarah!” or “Quantum Frontiers has released an article”), we encode our message in light. The light traverses a fiber, buried in the ground, then hits a detector. The detector channels the light’s energy into a current, a stream of electrons that flows down a wire. The variations in the current’s strength are translated into Sarah’s birthday wish.

If noise doesn’t corrupt the signal. From encoding “Happy birthday,” the light and electrons might come to encode “Hsappi birthdeay.” Quantum noise arises because light consists of packets of energy, called “photons.” The sender can’t control how many photons hit the detector.

To send the letter H, we send about $10^8$ photons.* Imagine sending fifty H’s. When we send the first, our signal might contain $10^8 - 153$ photons; when we send the second, $10^8 + 2{,}083$; when we send the third, $10^8 - 6$; and so on. Receiving different numbers of photons, the detector generates different amounts of current. Different amounts of current can translate into different symbols. From H, our message can morph into G.

This spring, I studied quantum noise under the guidance of IQIM faculty member Kerry Vahala. I learned how to model quantum noise, how to quantify it, when to worry about it, and when not to. From quantum noise, we branched into Johnson noise (caused by interactions between the wire and its hot environment); amplified-spontaneous-emission, or ASE, noise (caused by photons belched by ions in the fiber); beat noise (ASE noise breeds with the light we sent, spawning new noise); and excess noise (the “miscellaneous” folder in the filing cabinet of noise types).

Beautiful science of today: A microresonator (a tiny pendulum-like device) studied by the Vahala group.

Noise, I learned, has structure. It exhibits patterns. It has personalities. I relished studying those patterns as I relish sending birthday greetings while battling noise. Noise types, I see as a string of pearls unearthed in a junkyard. I see them as “pretty fhap[es]” in Hooke’s treatise. I see them—to pay a greater compliment—as “hairy mouldy fpots.”

*Optical-communications ballpark estimates:

• Optical power: 1 mW = $10^{-3}$ J/s
• Photon frequency: 200 THz = $2 \times 10^{14}$ Hz
• Photon energy: $h\nu = (6.626 \times 10^{-34}\ \mathrm{J \cdot s})(2 \times 10^{14}\ \mathrm{Hz}) \approx 10^{-19}$ J
• Bit rate: 1 Gb/s = $10^9$ bits/s
• Number of bits per H: 10
• Number of photons per H: (1 photon / $10^{-19}$ J)($10^{-3}$ J/s)(1 s / $10^9$ bits)(10 bits / 1 H) = $10^8$
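
The ballpark above can be re-run in a few lines of Python. This is a rough sketch that uses only the numbers in the list; the variable names are mine. The last line adds one assumption not stated in the list: that the photon count follows Poisson statistics, as is standard for laser light, so that the typical fluctuation is the square root of the mean.

```python
import math

PLANCK = 6.626e-34     # Planck's constant, J*s
power = 1e-3           # optical power, J/s (1 mW)
frequency = 2e14       # photon frequency, Hz (200 THz)
bit_rate = 1e9         # bits per second
bits_per_symbol = 10   # bits used to encode one letter H

photon_energy = PLANCK * frequency            # about 1.3e-19 J
photons_per_second = power / photon_energy
photons_per_symbol = photons_per_second / bit_rate * bits_per_symbol

# Assumption: Poisson counting statistics, so fluctuations scale as sqrt(mean).
typical_fluctuation = math.sqrt(photons_per_symbol)
```

Keeping the unrounded photon energy gives a little under $10^8$ photons per H, the same order of magnitude as the estimate above, with fluctuations of order $10^4$ photons from one H to the next.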

An excerpt from this post was published today on Verso, the blog of the Huntington Library, Art Collection, and Botanical Gardens.

With thanks to Bassam Helou, Dan Lewis, Matt Stevens, and Kerry Vahala for feedback. With thanks to the Huntington Library (including Catherine Wehrey) and the Vahala group for the Micrographia image and the microresonator image, respectively.

# The theory of everything: Help wanted

When Scientific American writes that physicists are working on a theory of everything, does it sound ambitious enough to you? Do you lie awake at night thinking that a theory of everything should be able to explain, well, everything? What if that theory is founded on quantum mechanics and finds a way to explain gravitation through the microscopic laws of the quantum realm? Would that be a grand unified theory of everything?

The answer is no, for two different but equally important reasons. First, there is the inherent assumption that quantum systems change in time according to Schrödinger’s evolution: $i \hbar \partial_t \psi(t) = H \psi(t)$. Why? Where does that equation come from? Is it a fundamental law of nature, or is it an emergent relationship between different states of the universe? What if the parameter $t$, which we call time, as well as the linear, self-adjoint operator $H$, which we call the Hamiltonian, are both emergent from a more fundamental, and highly typical, phenomenon: the large amount of entanglement that is generically found when one decomposes the state space of a single, *static* quantum wavefunction into two subsystems of different sizes: a clock and a space of configurations (on which our degrees of freedom live)? So many questions, so few answers.

## The static multiverse

The perceptive reader may have noticed that I italicized the word ‘static’ above, when referring to the quantum wavefunction of the multiverse. The emphasis on static is on purpose. I want to make clear from the beginning that a theory of everything can only be based on axioms that are truly fundamental, in the sense that they cannot be derived from more general principles as special cases. How would you know that your fundamental principles are irreducible? You start with set theory and go from there. If that assumes too much already, then you work on your set theory axioms. On the other hand, if you can exhibit a more general principle from which your original concept derives, then you are on the right path towards more fundamentalness.

In that sense, time and space as we understand them, are not fundamental concepts. We can imagine an object that can only be in one state, like a switch that is stuck at the OFF position, never changing or evolving in any way, and we can certainly consider a complete graph of interactions between subsystems (the equivalent of a black hole in what we think of as space) with no local geometry in our space of configurations. So what would be more fundamental than time and space? Let’s start with time: The notion of an unordered set of numbers, such as $\{4,2,5,1,3,6,8,7,12,9,11,10\}$, is a generalization of a clock, since we are only keeping the labels, but not their ordering. If we can show that a particular ordering emerges from a more fundamental assumption about the very existence of a theory of everything, then we have an understanding of time as a set of ordered labels, where each label corresponds to a particular configuration in the mathematical space containing our degrees of freedom. In that sense, the existence of the labels in the first place corresponds to a fundamental notion of potential for change, which is a prerequisite for the concept of time, which itself corresponds to constrained (ordered in some way) change from one label to the next. Our task is first to figure out where the labels of the clock come from, then where the illusion of evolution comes from in a static universe (Heisenberg evolution), and finally, where the arrow of time comes from in a macroscopic world (the illusion of irreversible evolution).

The axioms we ultimately choose must satisfy the following conditions simultaneously: (1) the implications stemming from these assumptions are not contradicted by observations, (2) replacing any one of these assumptions by its negation would lead to observable contradictions, and (3) the assumptions contain enough power to specify non-trivial structures in our theory. In short, as Immanuel Kant put it in his accessible bedtime story The Critique of Pure Reason, we are looking for synthetic a priori knowledge that can explain space and time, which ironically were Kant’s answer to that same question.

## The fundamental ingredients of the ultimate theory

Before someone decides to delve into the math behind the emergence of unitarity (Heisenberg evolution) and the nature of time, there is another reason why the grand unified theory of everything has to do more than just give a complete theory of how the most elementary subsystems in our universe interact and evolve. What is missing is the fact that quantity has a quality all its own. In other words, patterns emerge from seemingly complex data when we zoom out enough. This “zooming out” procedure manifests itself in two ways in physics: as coarse-graining of the data and as truncation and renormalization. These simple ideas allow us to reduce the computational complexity of evaluating the next state of a complex system: If most of the complexity of the system is hidden at a level you cannot even observe (think of the pre-Retina-display era), then all you have to keep track of is information at the macroscopic, coarse-grained level. On top of that, you can use truncation and renormalization to zero in on the most likely (highest-weight) configurations your coarse-grained data can be in. You can safely throw away a billion configurations if their combined weight is less than 0.1% of the total, because your super-compressed data will still give you the right answer with a fidelity of 99.9%. This is how you get to reduce a 9 GB raw video file down to a 300 MB YouTube video that streams over your WiFi connection without losing too much of the video quality.
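
The truncation step can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the function name `truncate`, the dictionary of weights, and the geometric example are all hypothetical, and “fidelity” here simply means the fraction of the total weight that survives.

```python
def truncate(weights, keep_fraction=0.999):
    """Keep the heaviest configurations until they hold `keep_fraction` of the total weight.

    `weights` maps a configuration label to a non-negative weight.
    Returns the kept configurations and the fraction of weight they retain.
    """
    total = sum(weights.values())
    kept = {}
    running = 0.0
    # Visit configurations from heaviest to lightest.
    for config, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        if running >= keep_fraction * total:
            break
        kept[config] = weight
        running += weight
    return kept, running / total


# Toy example: 40 configurations whose weights halve at every step.
weights = {k: 2.0 ** -k for k in range(1, 41)}
kept, retained = truncate(weights)
```

In this toy example the ten heaviest configurations already carry more than 99.9% of the weight, so the remaining thirty can be discarded. The same logic is what lets a billion low-weight configurations be dropped at a cost of less than 0.1% of the total.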

I will not focus on the second requirement for the “theory of everything”, the dynamics of apparent complexity. I think that this fundamental task is the purview of other sciences, such as chemistry, biology, anthropology and sociology, which look at the “laws” of physics from higher and higher vantage points (increasingly coarse-graining the topology of the space of possible configurations). Here, I would like to argue that the foundation on which a theory of everything rests, at the basement level if such a thing exists, consists of four ingredients: Math, Hilbert spaces with tensor decompositions into subsystems, stability and compressibility. Now, you know about math (though maybe not of Zermelo-Fraenkel set theory), you may have heard of Hilbert spaces if you majored in math and/or physics, but you don’t know what stability, or compressibility mean in this context. So let me motivate the last two with a question and then explain in more detail below: What are the most fundamental assumptions that we sweep under the rug whenever we set out to create a theory of anything that can fit in a book – or ten thousand books – and still have predictive power? Stability and compressibility.

Math and Hilbert spaces are fundamental in the following sense: A theory needs a Language in order to encode the data one can extract from that theory through synthesis and analysis. The data will be statistical in the most general case (with every configuration/state we attach a probability/weight of that state conditional on an ambient configuration space, which will often be a subset of the total configuration space), since any observer creating a theory of the universe around them only has access to a subset of the total degrees of freedom. The remaining degrees of freedom, what quantum physicists group as the Environment, affect our own observations through entanglement with our own degrees of freedom. To capture this richness of correlations between seemingly uncorrelated degrees of freedom, the mathematical space encoding our data requires more than just a metric (i.e. an ability to measure distances between objects in that space) – it requires an inner-product: a way to measure angles between different objects, or equivalently, the ability to measure the amount of overlap between an input configuration and an output configuration, thus quantifying the notion of incremental change. Such mathematical spaces are precisely the Hilbert spaces mentioned above and contain states (with wavefunctions being a special case of such states) and operators acting on the states (with measurements, rotations and general observables being special cases of such operators). But, let’s get back to stability and compressibility, since these two concepts are not standard in physics.

## Stability

Stability is that quality that says that if the theory makes a prediction about something observable, then we can test our theory by making observations on the state of the world and, more importantly, new observations do not contradict our theory. How can a theory fall apart if it is unstable? One simple way is to make predictions that are untestable, since they are metaphysical in nature (think of religious tenets). Another way is to make predictions that work for one level of coarse-grained observations and fail for a lower level of finer coarse-graining (think of Newtonian Mechanics). A more extreme case involves quantum mechanics assumed to be the true underlying theory of physics, which could still fail to produce a stable theory of how the world works from our point of view. For example, say that your measurement apparatus here on earth is strongly entangled with the current state of a star that happens to go supernova 100 light-years from Earth during the time of your experiment. If there is no bound on the propagation speed of the information between these two subsystems, then your apparatus is engulfed in flames for no apparent reason and you get random data, where you expected to get the same “reproducible” statistics as last week. With no bound on the speed with which information can travel between subsystems of the universe, our ability to explain and/or predict certain observations goes out the window, since our data on these subsystems will look like white noise, an illusion of randomness stemming from the influence of inaccessible degrees of freedom acting on our measurement device. But stability has another dimension; that of continuity. We take for granted our ability to extrapolate the curve that fits 1000 data points on a plot. If we don’t assume continuity (and maybe even a certain level of smoothness) of the data, then all bets are off until we make more measurements and gather additional data points. 
But even then, we can never gather an infinite (let alone, uncountable) number of data points – we must extrapolate from what we have and assume that the full distribution of the data is close in norm to our current dataset (a norm is a measure of distance between states in the Hilbert space).

## The emergence of the speed of light

The assumption of stability may seem trivial, but it holds within it an anthropic-style explanation for the bound on the speed of light. If there is no finite speed of propagation for the information between subsystems that are “far apart”, from our point of view, then we will most likely see randomness where there is order. A theory needs order. So, what does it mean to be “far apart” if we have made no assumption for the existence of an underlying geometry, or spacetime for that matter? There is a very important concept in mathematical physics that generalizes the concept of the speed of light for non-relativistic quantum systems whose subsystems live on a graph (i.e. where there may be no spatial locality or apparent geometry): the Lieb-Robinson velocity. Those of us working at the intersection of mathematical physics and quantum many-body physics, have seen first-hand the powerful results one can get from the existence of such an effective and emergent finite speed of propagation of information between quantum subsystems that, in principle, can signal to each other instantaneously through the action of a non-local unitary operator (rotation of the full system under Heisenberg evolution). It turns out that under certain natural assumptions on the graph of interactions between the different subsystems of a many-body quantum system, such a finite speed of light emerges naturally. The main requirement on the graph comes from the following intuitive picture: If each node in your graph is connected to only a few other nodes and the number of paths between any two nodes is bounded above in some nice way (say, polynomially in the distance between the nodes), then communication between two distant nodes will take time proportional to the distance between the nodes (in graph distance units, the smallest number of nodes among all paths connecting the two nodes). Why? 
Because at each time step you can only communicate with your neighbors and in the next time step they will communicate with theirs and so on, until one (and then another, and another) of these communication cascades reaches the other node. Since you have a bound on how many of these cascades will eventually reach the target node, the intensity of the communication wave is bounded by the effective action of a single messenger traveling along a typical path with a bounded speed towards the destination. There should be generalizations to weighted graphs, but this area of mathematical physics is still really active and new results on bounds on the Lieb-Robinson velocity gather attention very quickly.
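
The intuitive picture above can be mimicked with a purely classical toy model in Python. This sketch is not the Lieb-Robinson bound itself, which concerns quantum systems and allows exponentially small leakage outside the effective light cone. It only assumes that a signal hops to neighbors once per time step, and the helper names `arrival_times`, `ring`, and `complete` are hypothetical.

```python
from collections import deque


def arrival_times(adjacency, source):
    """Breadth-first search: the time step at which a signal from `source`
    first reaches each node, if nodes talk only to neighbors once per step."""
    times = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in times:
                times[neighbor] = times[node] + 1
                queue.append(neighbor)
    return times


def ring(n):
    """Sparse graph: every node has exactly two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def complete(n):
    """Fully connected graph: every node neighbors every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


ring_times = arrival_times(ring(12), source=0)
complete_times = arrival_times(complete(12), source=0)
```

On the twelve-node ring, the signal needs six steps to reach the opposite node, so arrival time grows with graph distance. On the complete graph, every node is reached in one step, and no notion of “far apart” survives.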

## Escaping black holes

If this idea holds any water, then black holes are indeed nearly complete graphs, where the notion of space and time breaks down, since there is no effective bound on the speed with which information propagates from one node to another. The only way to escape is to find yourself at the boundary of the complete graph, where the nodes of the black hole’s apparent horizon are connected to low-degree nodes outside. Once you get to a low-degree node, you need to keep moving towards other low-degree nodes in order to escape the “gravitational pull” of the black hole’s super-connectivity. In other words, gravitation in this picture is an entropic force: we gravitate towards massive objects for the same reason that we “gravitate” towards the direction of the arrow of time: we tend towards higher-entropy configurations. The probability of reaching the neighborhood of a set of highly connected nodes is much, much higher than that of hanging out for long near a set of low-degree nodes in the same connected component of the graph. If a graph has disconnected components, then there is no way to communicate between the corresponding spacetimes: their states are in a tensor product with each other. One has to carefully define entanglement between components of a graph before giving a unified picture of how spatial geometry arises from entanglement. Somebody get to it.

Erik Verlinde has introduced the idea of gravity as an entropic force and Fotini Markopoulou, et al. have introduced the notion of quantum graphity (gravity emerging from graph models). I think these approaches must be taken seriously, if only because they work with more fundamental principles than the ones found in Quantum Field Theory and General Relativity. After all, this type of blue sky thinking has led to other beautiful connections, such as ER=EPR (the idea that whenever two systems are entangled, they are connected by a wormhole). Even if we were to disagree with these ideas for some technical reason, we must admit that they are at least trying to figure out the fundamental principles that guide the things we take for granted. Of course, one may disagree with certain attempts at identifying unifying principles simply because the attempts lack the technical gravitas that allows for testing and calculations. Which is why a technical blog post on the emergence of time from entanglement is in the works.

Compressibility

So, what about that last assumption we seem to take for granted? How can you have a theory you can fit in a book about a sequence of events, or snapshots of the state of the observable universe, if these snapshots look like the static noise on a TV screen with no transmission signal? Well, you can’t! The fundamental concept here is Kolmogorov complexity and its connection to randomness/predictability. A sequence of data bits like:

10011010101101001110100001011010011101010111010100011010110111011110

has higher complexity (and hence looks more random/less predictable) than the sequence:

10101010101010101010101010101010101010101010101010101010101010101010

because there is a small computer program that can output each successive bit of the latter sequence (even if it had a million bits), but (most likely) not of the former. In particular, to get the second sequence with one million bits one can write the following short program:

s = "10"
for n in range(499_999):
    s += "10"  # append one more "10" pair on each pass
print(s)

As the number of bits grows, one may wonder if the number of iterations (given above by $499{,}999$) can be further compressed to make the program even smaller. The answer is yes: The number $499{,}999$ written in binary takes about $\log_2 499{,}999 \approx 19$ bits, but that binary number is itself a string of 0s and 1s, so it has its own Kolmogorov complexity, which may be smaller than 19 bits. So, compressibility has a strong element of recursion, something that in physics we associate with scale invariance and fractals.
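To make the size comparison concrete, here is a minimal sketch in Python. It uses the length of a program's source text as a stand-in for description length, which is only an illustration (true Kolmogorov complexity is defined relative to a fixed universal machine). The helper names `generator_source` and `run` are made up for this example.

```python
import contextlib
import io


def generator_source(pairs: int) -> str:
    """Return Python source text that prints "10" repeated `pairs` times."""
    return f'print("10" * {pairs})'


def run(source: str) -> str:
    """Execute the source and return what it printed, minus the final newline."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source)
    return buffer.getvalue().rstrip("\n")


source = generator_source(500_000)   # one initial "10" plus 499,999 appended pairs
output = run(source)

program_length = len(source)         # characters in the program text
output_length = len(output)          # bits in the printed sequence
counter_bits = (499_999).bit_length()  # bits needed to write the loop count in binary

print(program_length, output_length, counter_bits)
```

The program text stays a few dozen characters long while its output runs to a million bits, and the only part of the program that grows with the output is the loop count, which grows logarithmically.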

You may be wondering whether there are truly complex sequences of bits, or if one can always find a really clever computer program to compress any N-bit string down to, say, N/100 bits. The answer is interesting: There is no computer program that can compute the Kolmogorov complexity of an arbitrary string (the argument has roots in Berry’s Paradox), but there are strings of arbitrarily large Kolmogorov complexity (that is, for every N there is an N-bit string such that, whatever language we fix for writing programs, the smallest program (in bits) that outputs the string is at least N bits long). The counting argument is short: there are $2^N$ strings of N bits, but only $2^N - 1$ programs shorter than N bits, so at least one N-bit string has no shorter description. In other words, there really are streams of data (in the form of bits) that are completely incompressible. In fact, a typical string of 0s and 1s will be almost completely incompressible!
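A rough way to see typical incompressibility in practice is to hand both kinds of data to an off-the-shelf compressor. The sketch below uses Python's `zlib` as a proxy. This is an assumption of the illustration, not a computation of Kolmogorov complexity (which is uncomputable): a compressor only gives an upper bound on description length, and a seeded pseudorandom generator is itself a short program, so the "random" data here is only random-looking to `zlib`.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


n_bytes = 100_000

# Random-looking data: 800,000 pseudorandom bits from a seeded generator.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(n_bytes))

# Periodic data: the byte 0xAA is the bit pattern 10101010, repeated.
periodic_data = b"\xaa" * n_bytes

random_size = compressed_size(random_data)
periodic_size = compressed_size(periodic_data)

print("random-looking:", random_size, "bytes after compression")
print("periodic:      ", periodic_size, "bytes after compression")
```

The periodic string shrinks to a tiny fraction of its original size, while the random-looking one does not shrink at all.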

Stability, compressibility and the arrow of time

So, what does compressibility have to do with the theory of everything? It has everything to do with it. Because, if we ever succeed in writing down such a theory in a physics textbook, we will have effectively produced a computer program that, given enough time, should be able to compute the next bit in the string that represents the data encoding the coarse-grained information we hope to extract from the state of the universe. In other words, the only reason the universe makes sense to us is that the data we gather about its state is highly compressible. This seems to imply that this universe is really, really special and completely atypical. Or is it the other way around? What if the laws of physics were non-existent? Would there be any consistent gravitational pull between matter to form galaxies and stars and planets? Would there be any predictability in the motion of the planets around suns? Forget about life, let alone intelligent life and the anthropic principle. Would the Earth, or Jupiter, even know where to go next if it had no sense that it was part of a non-random plot in the movie that is spacetime? Would there be any notion of spacetime to begin with? Or an arrow of time? When you are given one thousand frames from one thousand different movies, there is no way to make a single coherent plot. Even the frames of a single movie would make little sense upon reshuffling.

What if the arrow of time emerged from the notions of stability and compressibility, through coarse-graining that acts as a compression algorithm for data that is inherently highly complex and, hence, highly typical as the next move to make? If two strings of data look equally complex upon coarse-graining, but one of them has a billion more ways of appearing from the underlying raw data, then which one will be more likely to appear in the theory-of-everything book of our coarse-grained universe? Note that we need both high compressibility after coarse-graining, in order to write down the theory, and large entropy before coarse-graining (from a large number of raw strings that all map to one string after coarse-graining), in order to have an arrow of time. It seems that we need highly typical, highly complex strings that become easy to write down once we coarse-grain the data in some clever way. Doesn’t that seem like a contradiction? How can a bunch of incompressible data become easily compressible upon coarse-graining? Here is one way: Take an N-bit string and define its 1-bit coarse-graining as the boolean AND of its digits. All but one of the $2^N$ strings map to 0. The all-1s string maps to 1. The two outputs are equally compressible, but if every raw string is equally likely, the probability of seeing the 1 after coarse-graining is $2^{-N}$. With only 300 bits, finding the coarse-grained 1 is harder than looking for a specific atom in the observable universe. In other words, if the coarse-graining rule at time t is the one given above, then you can be pretty sure you will be seeing a 0 come up next in your data. Notice that before coarse-graining, all $2^N$ strings are equally likely, so there is no arrow of time, since there is no preferred string from a probabilistic point of view.
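The AND coarse-graining is small enough to check by brute force. The sketch below enumerates every 10-bit string, applies the rule, and counts the outcomes. The function names are made up for this example, and the figure of $10^{80}$ atoms in the observable universe is a commonly quoted rough estimate, used here only for scale.

```python
from collections import Counter
from itertools import product


def coarse_grain(bits) -> int:
    """1-bit coarse-graining: the boolean AND of all the bits."""
    return int(all(bits))


def coarse_grained_counts(n: int) -> Counter:
    """Count how many n-bit strings map to 0 and how many map to 1."""
    return Counter(coarse_grain(bits) for bits in product((0, 1), repeat=n))


n = 10
counts = coarse_grained_counts(n)
probability_of_one = counts[1] / 2**n   # uniform distribution over raw strings

# Scale comparison for N = 300 (exact integer arithmetic).
strings_300 = 2**300
atoms_estimate = 10**80                 # assumed rough order of magnitude

print(counts[0], counts[1], probability_of_one)
print(strings_300 > atoms_estimate)
```

For N = 10, exactly one of the 1024 raw strings lands on the coarse-grained 1. Every other string collapses onto 0, which is the entropy imbalance the argument relies on.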

Conclusion, for now

When we think about the world around us, we go to our intuitions first as a starting point for any theory describing the multitude of possible experiences (observable states of the world). If we are to really get to the bottom of this process, it seems fruitful to ask “why do I assume this?” and “is that truly fundamental or can I derive it from something else that I already assumed was an independent axiom?” One of the postulates of quantum mechanics is the axiom corresponding to the evolution of states under Schrödinger’s equation. We will attempt to derive that equation from the other postulates in an upcoming post. Until then, your help is wanted with the march towards more fundamental principles that explain our seemingly self-evident truths. Question everything, especially when you think you have really figured things out. Start with this post. After all, a theory of everything should be able to explain itself.

UP NEXT: Entanglement, Schmidt decomposition, concentration-of-measure bounds and the emergence of discrete time and unitary evolution.

Everyone in grad school has taken on the task of picking the perfect research group at some point. Then some among us had the dubious distinction of choosing the perfect research group twice. Luckily for me, a year of grad research taught me a lot and I found myself asking group members and PIs (principal investigators) very different questions. And luckily for you, I wrote these questions down to share with future generations. My background as an experimental applied physicist showed through initially, so I got Shaun Maguire and Spiros Michalakis to help make the list applicable for theorists too, and most of the questions should be useful outside physics as well.

Questions to break that silence when your potential advisor asks “So, do you have any questions for me?”

1. Are you taking new students?
- 2a. If yes: How many are you looking to take?
- 2b. If no: Ask them about the department or other professors. They’ve been there long enough to have opinions. Alternatively, ask what kinds of questions they would suggest you ask other PIs.
3. What is the procedure for joining the group?
4. (experimental) Would you have me TA? (This is the nicest way I thought of to ask if a PI can fund you with a research assistantship (RA), though sometimes they just like you to TA their class.)
4. (theory) Funding routes will often be covered by question 3 since TAs are the dominant funding method for theory students, unlike for experimentalists. If relevant, you can follow up with: How does funding for your students normally work? Do you have funding for me?
5. Do new students work for/report to other grad students, post docs, or you directly?
6. How do you like students to arrange time to meet with you?
7. How often do you have group meetings?
8. How much would you like students to prepare for them?
9. Would you suggest I take any specific classes?
10. What makes someone a good fit for this group?

And then for the high bandwidth information transfer.  Grill the group members themselves, and try to ask more than one group member if you can.

1. How much do you prepare for meetings with the PI?
2. How long until people lead their own project? Equivalently, who’s working on what projects?
3. How much do people on different projects communicate? (only group meeting or every day)
4. Is the PI hands on (how often PI wants to meet with you)?
5. Is the PI accessible (how easily can you meet with the PI if you want to)?
6. What is the average time to graduation? (if it’s important to you personally)
7. Does the group/subgroup have any bonding activities?
8. Do you think I should join this group?
9. What are people’s backgrounds?
10. What makes someone a good fit for this group?

Hope that helps.  If you have any other suggested questions, be sure to leave them in the comments.

# Clocking in at a Cambridge conference

On Facebook last fall, I posted about statistical mechanics. Statistical mechanics is the physics of hordes of particles. Hordes of molecules, for example, form the stench seeping from a clogged toilet. Hordes change in certain ways but not in the reverse ways, suggesting that time points in a direction. Once a stink diffuses into the hall, it won’t regroup in the bathroom. The molecules’ locations distinguish past from future.

The post attracted a comment by Ian Durham, associate professor of physics at St. Anselm College. Minutes later, we were instant-messaging about infinitely long evolutions.*

The next day, I sent Ian a paper draft. His reply made me jump more than a whiff of a toilet would. Would I discuss the paper at a conference he was co-organizing?

I almost replied, Are you sure?

Then I almost replied, Yes, please!

The conference, “Eddington and Wheeler: Information and Interaction,” unfolded this March at the University of Cambridge. Cambridge employed Sir Arthur Eddington, the astronomer whose 1919 observation of starlight during an eclipse catapulted Einstein’s general relativity to fame. Decades later, John Wheeler laid groundwork for quantum information.

Though aware of Eddington’s observation, I hadn’t known he’d researched stat mech. I hadn’t known his opinions about time. Time owns a high-rise in my heart; see the fussiness with which I catalogue “last fall,” “minutes later,” and “the next day.” Conference-goers shared news about time in the Old Combination Room at Cambridge’s Trinity College. Against the room’s wig-filled portraits, our projector resembled a souvenir misplaced by a time traveler.

Trinity College, Cambridge.

Presenter one, Huw Price, argued that time has no arrow. It appears to in our universe: We remember the past and anticipate the future. Once a stench diffuses, it doesn’t regroup. The stench illustrates the Second Law of Thermodynamics, the assumption that entropy increases.

If “entropy” doesn’t ring a bell, never mind; we’ll dissect it in future articles. Suffice it to say that (1) thermodynamics is a branch of physics related to stat mech; (2) according to the Second Law of Thermodynamics, something called “entropy” increases; (3) entropy’s rise distinguishes the past from the future by associating the former with a low entropy and the latter with a large entropy; and (4) a stench’s diffusion illustrates the Second Law and time’s flow.

In as many universes in which entropy increases (time flows in one direction), in so many universes does entropy decrease (does time flow oppositely). So, said Huw Price, postulated the 19th-century stat-mech founder Ludwig Boltzmann. Why would universes pair up? For the reason why, driving across a pothole, you not only fall, but also rise. Each fluctuation from equilibrium, from a flat road, involves an upward path and a downward. The upward path resembles a universe in which entropy increases; the downward, a universe in which entropy decreases. Every down pairs with an up. Averaged over universes, time has no arrow.

Friedel Weinert, presenter five, argued the opposite. Time has an arrow, he said, and not because of entropy.

Ariel Caticha discussed an impersonator of time. Using a cousin of MaxEnt, he derived an equation identical to Schrödinger’s. MaxEnt, short for “the Maximum Entropy Principle,” is a tool used in stat mech. Schrödinger’s Equation describes how quantum systems evolve. To draw from Schrödinger’s Equation predictions about electrons and atoms, physicists assume that features of reality resemble certain bits of math. We assume, for example, that the t in Schrödinger’s Equation represents time.

A t appeared in Ariel’s twin of Schrödinger’s Equation. But Ariel didn’t assume what physicists usually assume. MaxEnt motivated his assumptions. Interpreting Ariel’s equation poses a challenge. If a variable acts like time and smells like time, does it represent time?**

A presenter uses the anachronistic projector. The head between screen and camera belongs to David Finkelstein, who helped develop the theory of general relativity checked by Eddington.

Like Ariel, Bill Wootters questioned time’s role in arguments. The co-creator of quantum teleportation wondered why one tenet of quantum physics has the form it has. Using quantum mechanics, we can’t predict certain experiments’ outcomes. We can predict probabilities—the chance that some experiment will yield Possible Outcome 1, the chance that the experiment will yield Possible Outcome 2, and so on. To calculate these probabilities, we square numbers. Why square? Why don’t the probabilities depend on cubes?

To explore this question, Bill told a story. Suppose some experimenter runs these experiments on Monday and those on Tuesday. When evaluating his story, Bill pointed out a hole: Replacing “Monday” and “Tuesday” with “eight o’clock” and “nine” wouldn’t change his conclusion. Which replacements wouldn’t change it, and which would? To what can we generalize those days?

Little of presentation twelve concerned time. Rüdiger Schack introduced QBism, an interpretation of quantum mechanics that sounds like “cubism.” Casting quantum physics in terms of experimenters’ actions, Rüdiger mentioned time. By the time of the mention, I couldn’t tell what anyone meant by “time.” Raising a hand, I asked for clarification.

“You are young,” Rüdiger said. “But you will grow old and die.”

The comment clanged like the slam of a door. It echoed when I followed Ian into Ascension Parish Burial Ground. On Cambridge’s outskirts, conference-goers visited Eddington’s headstone. We found Wittgenstein’s near an uneven footpath; near tangles of undergrowth, Nobel laureates’. After debating about time, we marked its footprints. Paths of glory lead but to the grave.

Here lies one whose name was writ in a conference title: Sir Arthur Eddington’s grave.

Paths touched by little glory, I learned, have perks. As Rüdiger noted, I was the greenest participant. As he had the manners not to note, I was the least distinguished and the most ignorant. Studenthood freed me to raise my hand, to request clarification, to lack opinions about time. Perhaps I’ll evolve opinions at some t, some Monday down the road. That Monday feels infinitely far off. These days, I’ll stick to evolving science—using that other boon of youth, Facebook.

* You know you’re a theoretical physicist (or a physicist-in-training) when you debate about processes that last till kingdom come.

** As long as the variable doesn’t smell like a clogged toilet.

For videos of the presentations—including the public lecture by best-selling author Neal Stephenson—stay tuned to http://informationandinteraction.wordpress.com.

With gratitude to Ian Durham and Dean Rickles for organizing “Information and Interaction” and for the opportunity to participate. With thanks to the other participants for sharing their ideas and time.