The return of the superconducting high school teacher

Last summer, I was blessed with the opportunity to learn the basics of high temperature superconductors in the Yeh Group under the tutelage of visiting Professor Feng. We formed superconducting samples using a process known as Pulsed Laser Deposition, then began testing their properties using X-Ray Diffraction, AC Susceptibility, and SQUIDs (superconducting quantum interference devices). I brought my new-found knowledge of these laboratory techniques and processes back into the classroom during this past school year, and I was able to answer questions about the formation, research, and applications of superconductors that I had been unable to address before this valuable experience.

This summer I returned to the IQIM Summer Research Institute to continue my exploration of superconductors and gain even deeper research experience. This time around I have accompanied Caltech second-year graduate student Kyle Chen in testing samples with the Scanning Tunneling Microscope (STM), some of which I helped form using Pulsed Laser Deposition with Professor Feng last summer. I have always been curious how atomic resolution is possible; this has been my big chance to get hands-on experience with the instrument that makes it so!

The Scanning Tunneling Microscope was invented by the late Heinrich Rohrer and Gerd Binnig at IBM Research in Zurich, Switzerland in 1981. An STM maps the surface contours of a substance using a sharp conductive tip. The electron tunneling current through the tip depends exponentially on the distance (a few Angstroms) between the tip and the surface. The changing currents at different locations can then be compiled to produce three-dimensional images of the surface topography on the nano-scale. Conversely, the tip height can be recorded while the current is held constant. STM achieves far higher resolution than optical microscopy and avoids the diffraction and spherical aberration introduced by lenses. This level of control and precision has given scientists tools with nanometer precision, even allowing them to manipulate individual atoms and their bonds. STM has been instrumental in forming the field of nanotechnology and the modern study of DNA, semiconductors, graphene, topological insulators, and much more! Just five years after building their first STM, Rohrer and Binnig's work rightfully earned them the 1986 Nobel Prize in Physics.
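
To get a feel for this exponential sensitivity, here is a quick back-of-the-envelope sketch in Python. The numbers are my own illustrative assumptions (a work function of ~4 eV, typical for metals), not the Yeh Lab's actual parameters:

    import numpy as np

    # Toy model of STM sensitivity: tunneling current I ~ exp(-2*kappa*d),
    # where kappa = sqrt(2*m_e*phi)/hbar and phi is the work function.
    phi_eV = 4.0                            # assumed work function (typical metal)
    kappa = 0.5123 * np.sqrt(phi_eV)        # decay constant in 1/Angstrom
    for d in (4.0, 5.0, 6.0):               # tip-sample gap in Angstroms
        print(f"gap = {d:.0f} A -> relative current = {np.exp(-2 * kappa * d):.1e}")
    # Each extra Angstrom of gap cuts the current by roughly a factor of 8,
    # which is why the feedback loop can resolve sub-Angstrom height changes.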

Descending into the Sloan basement, Kyle and I work to prepare and scan several high temperature superconducting (HTSC) Calcium-doped YBCO (\rm Y_{1-x} Ca_x Ba_2 Cu_3 O_{7-\delta}) samples in order to better understand the pairing mechanism that binds electrons into the Cooper pairs responsible for superconductivity. In conventional superconductors, pairing via phonon lattice vibrations is fairly well understood by physicists; the pairing mechanism for HTSC, meanwhile, is still a mystery. We are also investigating how this pairing changes with doping, as well as how the magnetic field is channeled through vortices within HTSC.

One of our first tasks is to make probe tips for the STM. Adding Calcium Chloride to de-ionized water, we prepare a conductive liquid path for chemically etching the probe tip. A wire bent into a ring is connected to a 10 V battery and placed in the Calcium Chloride solution. Then a thin platinum-iridium wire, also connected to the voltage source, is placed at the center of the conductive ring. The circuit is complete, and a current of about half an Ampere uniformly erodes the outer surface of the platinum-iridium wire, forming a sharp tip. We examine the tip under a traditional microscope to scrutinize our work. Ideally, the tip ends in a single atom! If not, we are charged with re-etching until we reach a suitably straight, uniform, sharp tip. As we work to prepare the platinum-iridium tips, a stoic picture of Niels Bohr looks down at our work with the appropriate adjacent quotation: "When it comes to atoms, language can be used only as in poetry. The poet, too, is not nearly so concerned with describing facts as with creating images." After making two or three nearly perfect tips, we clean and store them in the tip case and proceed to the next step of preparation.

We are now ready to clean the sample to be tested. Bromine etching removes any oxidation or impurities that have formed on our sample, leaving a thin bromine film on top. We remove this residue layer with ethanol and then plunge further into the (sub)basement to load the sample into the STM casing before oxidation begins again. The STM in the Yeh Lab was built by Professor Nai-Chang Yeh and her students eleven years ago. There are multiple layers of vacuum chambers and separate dewars, each with its own meticulous series of steps to prepare for STM testing. At the center is a long central STM tube; surrounding this is a large cylindrical dewar; and on the perimeter is a large exterior vacuum chamber.

First we must load the newly etched YBCO sample and tip into the central STM tube. The inner tube currently lies across a workbench beneath desk lamps. We must transfer the tiny tip from the tip case to just above the sample. While loading the tip with an equally minuscule flathead screwdriver, it became quite clear to me that I could never be a surgeon! The superconducting sample is secured in place with a small cover plate and screw. A series of electronic tests for resistance and capacitance must be conducted to confirm that there are no shorts in the numerous circuits. Next we must vacuum pump the inner cylindrical tube holding the sample, tip, and circuitry until the pressure is 10^{-4} bar. Then we "bake" the inner chamber, using a heater to expel any remaining gas while the vacuum pump continues, until we reach approximately 10^{-5} bar. The heater is turned off and the vacuum continues to pump until we reach 10^{-6} bar. This entire vacuum process takes approximately 15 hours…

During this span of time, I have the opportunity to observe the dark, cold STM room. The door, walls and ceiling are covered with black rubber and spongy padding to absorb vibration; the STM room sits in the lowest level of the basement for the same reason. The vibration from human steps near the testing generates noise in the data, so every precaution is taken to minimize it. Giant cement blocks lie across the STM metal box to increase inertia and decrease noise. I ask Kyle what he usually does with this "down" time. We discuss the importance of reading equipment manuals to better grasp the myriad tools in the lab. He says he needs to continue reading the papers published by the Yeh Lab Group: in knowing what questions your research group has previously answered, one better understands the history and direction of the current work.

The next day, the vacuum-pumped inner chamber is loaded into the center of the STM dewar. We flush the surrounding chambers with nitrogen gas to purge any moisture or impurities that may have entered since our last testing. Next we set up the equipment for a liquid nitrogen transfer, which lasts approximately 2 hours, depending on the transfer rate. As the liquid nitrogen is added to the system, we meticulously monitor the temperature of the STM system. It must reach 80 Kelvin before we again test the electronics. Eventually it is time to add the liquid helium. Since liquid helium is quite expensive, additional precautions are taken to ensure maximum efficiency of helium use. It is beautiful to watch the moisture in the air deposit as frost along the tubing connecting the nitrogen and helium tanks to the STM dewar. The stillness of the quiet basement as we wait for the transfer is calming. Again, we carefully monitor the temperature as it drops, eventually reaching 4.2 Kelvin. For this research, the STM must be cooled below the critical temperature of the sample in order to observe superconductivity; the lower the temperature, the more the superconducting component manifests itself, and hence the higher the resolution of the spectrum. Liquid nitrogen is added first because its much larger latent heat lets it carry over 90% of the heat away, and because it is significantly cheaper than liquid helium. The liquid helium is added later, because it is far colder still (4.2 K versus 77 K).

After adding additional layers of rubber padding on top of the closed STM, we can move over to the computer that controls the STM tip. It takes approximately one hour for the tip to be slowly lowered within range for a tunneling current. Kyle examines the data from the approach to the surface. If all seems normal, we can begin the actual scan of the sample!
An important part of the lab work is troubleshooting. I have listed the ideal order of steps, but as with life, things do not always proceed as expected. I have grown in awe of the perseverance and ingenuity required for daily troubleshooting, and of how meticulous one must be to avoid error. I love that some common household items can be valuable tools in the lab. For example, copper scrubbers used in the kitchen serve as a simple conducting path around the inner STM chamber, and floss can be used to tie down the most delicate thin wires. I have certainly grown in my respect for the patience and brilliance required in real research.

I find irony in the quiet simplicity of recording and analyzing data, the stillness of carefully transferring liquid helium, juxtaposed with the immense complexity and importance of this groundbreaking research. I appreciate the moments of simple quiet in the STM room, the fast-paced group meetings where everyone chimes in on their progress, and the boisterous collaborative brainstorming to troubleshoot a new problem. The summer weeks in the Sloan basement have been a welcome retreat from the exciting, transformative, and exhausting year in the classroom. I am grateful for the opportunity to learn more about superconductors, quantum tunneling, vacuum pumps, sonicators, lab safety, and more. While I will not be bromine etching, chemically forming STM tips, or doing liquid helium transfers come September, I have a new-found love for the process of research that I will radiate to my students.

High School Physics Teacher Embedded on a Quest to Squash Quantum Noise

Date: 8/22/2013

Location: Caltech Cryo Lab, West Bridge

Hello: I am Steve Maloney, a Physics and Chemistry teacher intern from Duarte High School, sponsored by IQIM (Institute for Quantum Information and Matter), doing whatever I can to be of assistance to Dr. Nicolas Smith-Lefebvre. Upon meeting him in mid-June, I soon learned that my mission for the length of my visit was to assist him in determining, with a greater degree of certainty, the linear expansion coefficient of silicon at and around 125 K (see Fig. 1, below).


Fig. 1: Silicon cavity.

The temperature of 125 K is of special interest to operators of LIGO (Laser Interferometer Gravitational-Wave Observatory), because that is one of two temperatures where the thermal expansion coefficient, \alpha, of silicon is equal to zero. A zero linear expansion coefficient is of special interest to LIGO researchers because a small change in temperature inside the cryostat (see Fig. 2, below) will not result in a significant change in length of the silicon cavity shown in Fig. 1.


Fig. 2: Inside the Cryostat

Scientists working at LIGO need to know the length of the resonance cavity with great precision, because a passing gravitational wave simultaneously compresses the cavity along one direction and stretches space along the perpendicular direction (warp). The arrival of a gravitational wave produces a signal in the Fabry-Perot interferometer shown in Fig. 3, below.

Fig. 3: LIGO set-up with Fabry-Perot cavity.

Because the interferometer is sensitive to changes in length as small as 1 \times 10^{-15} m, sources of noise must be reduced to an absolute minimum. This brings us back to establishing the thermal coefficient of linear expansion, \alpha. Knowing the value of \alpha with greater certainty will give LIGO researchers the mathematical tools to better correct for small changes in the cavity's temperature, thus reducing the noise and increasing the sensitivity of the gravitational-wave detector.

So Where Do I Fit In?

In the Cryo-Lab on Thursday, July 11, 2013, Nicolas Smith-Lefebvre, with my assistance, fed a 160.13 MHz radio-frequency signal into a frequency-to-voltage transducer. The frequency fed into the transducer was changed by a fixed amount, and the change in voltage was noted. The conversion constant obtained was 253.9 Hz/mV.

Nicolas then locked the east-west cryo-cavities so that the beat signal was approximately 0 (zero) volts.  See the plot, below:

[Plot of the beat signal]

We then sent an approximately 3.16-second pulse of a 3.6 mW, 532 nm (green) laser onto the surface of a mirror that reflects in the infrared but absorbs at visible wavelengths (note the top graph). I manufactured the electrical power interface for the laser by modifying the casing of a BIC disposable pen. The mirror was situated at the aperture of a silicon spacer.

The goal of the experiment was to determine the absorbance of the silicon mirror at 532 nm.

Assuming we know the quantity of energy pulsed into the mirror:

3.6 mW \times 3.16 s = 0.011376 J,

the change in length of the cavity was determined by \Delta f / f_{1550 nm} = \Delta L / L_0.

The change in voltage (0.025 V) gave us a change in frequency of \Delta f = 6.3475 \times 10^{3} Hz.

With L_0 having a value of 10 cm, that means \Delta L was 3.2794 \times 10^{-10} cm.

The specific heat capacity of Si is 700 J/(kg·K).

The coefficient of linear expansion for Si is \alpha = 2.6 \times 10^{-6}/K.

To calculate the increase in temperature, we use the change in L:

If 2.6 \times 10^{-6}/K \times 10 cm \times \Delta T = \Delta L = 3.2794 \times 10^{-10} cm, then \Delta T must be 1.2613 \times 10^{-5} K.

If \Delta T = 1.2613 \times 10^{-5} K, then Q must equal 0.41 kg \times 700 J/(kg·K) \times 1.2613 \times 10^{-5} K = 3.6 \times 10^{-3} J,

Q / E_{pulse} = 3.6 \times 10^{-3} J / 1.1376 \times 10^{-2} J = 0.316 absorbance.

In other words, the silicon mirror reflected about 68% of the green light that struck it.
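
For readers who want to check the arithmetic, here is the entire chain in a few lines of Python. It is just a restatement of the numbers above (the 0.41 kg mass and the other constants are taken from the text):

    # Reproducing the absorbance estimate above (all inputs from the text).
    c = 3.0e8                      # speed of light, m/s
    f0 = c / 1550e-9               # laser frequency at 1550 nm, Hz

    E_pulse = 3.6e-3 * 3.16        # 3.6 mW for 3.16 s -> joules
    df = 0.025 * 1e3 * 253.9       # 0.025 V at 253.9 Hz/mV -> Hz
    L0 = 0.10                      # cavity length, m
    dL = L0 * df / f0              # from df/f = dL/L0

    alpha = 2.6e-6                 # linear expansion coefficient of Si, 1/K
    dT = dL / (alpha * L0)         # temperature rise, from dL = alpha * L0 * dT
    Q = 0.41 * 700 * dT            # heat absorbed: mass * c_Si * dT, joules

    print(f"E_pulse = {E_pulse:.4e} J, Q = {Q:.2e} J")
    print(f"absorbance = {Q / E_pulse:.3f}")   # ~0.32, i.e. ~68% reflected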

What have I come away with from this experience?

What struck me first and foremost during this summer internship in the Cryo-Lab was the importance for future knowledge workers of having certain key skills. Among them:

Proficiency in Language

Proficiency in Math

Proficiency in Science

Proficiency in Coding

I will share my insights with my local school district and I intend to capitalize on the connections I made during my experience at Caltech.

Acknowledgements:

I would like to thank the Duarte Unified School District for giving me a leave of absence, Rana Adhikari for, yet again, finding space for me in spite of my general ineptitude regarding General Relativity, Spyridon Michalakis (Spiros) for inviting me back and letting me participate in cutting-edge science, and most of all I would like to thank Nicolas Smith-Lefebvre (softball savant),  and David Yeaton-Massey (D-Mass), for their patience, generosity, and mentoring.

 


Hacking nature: loopholes in the laws of physics

I spent my childhood hacking computers. When I was seven, my cousin showed up for Thanksgiving with a box filled with computer parts and we built my first computer. I got into competitive computer gaming around age eleven, and hacking was a natural extension of these activities. Then when I was sixteen, after doing poorly at a Counter-Strike tournament, I decided that I should probably apply myself to other things. Needless to say, my parents were thrilled. So that's when I bought my first computer (instead of building my own), which for deliberate but now antediluvian reasons was a Mac. A few years later, when I was taking CS 106 at Stanford, I was the first student in the course's history whose reason for buying a Mac was "so that I couldn't play computer games!" And now you know the story of my childhood.

The hacker mentality is quite different from the norm, and my childhood trained me to look at absolutist laws as opportunities to find loopholes (of course, only when legal and socially responsible!). I've applied this same mentality while doing physics, and I'd like to share with you some of the loopholes I've gathered.


Scharnhorst effect enables light to travel faster than in vacuum (c = 299,792,458 m/s): this one takes on the granddaddy of all laws, that nothing can travel faster than light in a vacuum! This effect is the most controversial on my list, because it hasn't yet been experimentally verified, but it seems obvious with the right picture in mind. Most people's mental model for light traveling in a vacuum is of little particles/waves called photons traveling through empty space. However, the vacuum is not empty! It is filled with pairs of virtual particles which momentarily fleet into existence. Interactions with these virtual particles create a small amount of 'resistance' as photons zoom through the vacuum (photons get absorbed into virtual electron-positron pairs and then spit back out as photons, ad infinitum). Thus, if we could somehow reduce the rate at which virtual particles are created, photons would interact less strongly with the vacuum, and would be able to travel marginally faster than c. But this is exactly what happens in the Casimir effect: the experimentally verified fact that if you take two mirrors and put them ~10 nanometers apart, then they will attract each other, because more virtual particles are created outside the cavity than inside [low-momentum virtual modes are inaccessible inside because the uncertainty principle requires \Delta x \cdot \Delta p = 10 nm \cdot \Delta p \geq \hbar/2]. This effect is extremely small, only predicting that light would travel one part in 10^{36} faster than c. However, it should remind us all to deeply question assumptions.

This first loophole used quantum effects to beat a relativistic bound, but the next few loopholes are purely quantum, and are mainly related to that most quantum of all limits, the Heisenberg uncertainty principle.

Smashing the standard quantum limit (SQL) with squeezed measurements: the Heisenberg uncertainty principle tells us that there is a fundamental tradeoff in nature: the more precise your information about an object's position, the less precise your knowledge about its momentum. Or vice versa, or replace x and p with E and t, or any other pair of conjugate variables. This uncertainty principle is oftentimes written as \Delta x \cdot \Delta p \geq \hbar/2. For a variety of reasons, in the early days of quantum mechanics, it was hard enough to imagine creating a state with \Delta x \cdot \Delta p = \hbar/2, but there was some hope because this is obtained in the ground state of a quantum harmonic oscillator. In this case, we have \Delta x = \Delta p = \sqrt{\hbar/2}. It was harder still to imagine creating states with \Delta x < \sqrt{\hbar/2}; such states are said to 'go beyond the standard quantum limit' (SQL). Over the intervening years, not only have we figured out how to go beyond the SQL using squeezed coherent states, but this is actually essential in some of our most exciting current experiments, like LIGO.
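
A quick numerical illustration of the idea (my own sketch, in the convention above where the oscillator ground state has \Delta x = \Delta p = \sqrt{\hbar/2}): squeezing pushes \Delta x below the SQL while the uncertainty product stays pinned at \hbar/2.

    import numpy as np

    hbar = 1.0                                  # natural units
    for r in (0.0, 0.5, 1.0):                   # r = squeezing parameter
        dx = np.exp(-r) * np.sqrt(hbar / 2)     # squeezed quadrature
        dp = np.exp(+r) * np.sqrt(hbar / 2)     # anti-squeezed quadrature
        print(f"r = {r}: dx = {dx:.3f}, dp = {dp:.3f}, dx*dp = {dx * dp:.3f}")
    # For r > 0, dx drops below sqrt(hbar/2) -- beyond the SQL -- while
    # Heisenberg's bound dx*dp = hbar/2 is still exactly saturated.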

LIGO is an incredibly ambitious experiment which has been discussed multiple times on this blog. It is trying to usher in a new era of astronomy–moving beyond detecting photons–to detecting gravitational waves, ripples in spacetime which are generated as exceptionally massive objects merge, such as when two black holes collide. The effects of these waves on our local spacetime as they travel past Earth are minuscule, on the scale of 10^{-18} m, which is about one thousand times shorter than the 'diameter' of a proton, and is the same order of magnitude as \sqrt{\hbar/2}. Remarkably, LIGO has exploited squeezed light to demonstrate sensitivities beyond the SQL. LIGO expects to start detecting gravitational waves on a frequent basis as its upgrades, deemed 'Advanced LIGO', are completed over the next few years.

Compressed sensing beats Nyquist-Shannon: let's play a game. Imagine I'm sending you a radio signal. How often do you need to measure the signal in order to reconstruct it perfectly? The Nyquist-Shannon sampling theorem is a path-breaking result which Claude Shannon proved in 1949. If you sample at least twice as fast as the highest frequency component, then you are guaranteed perfect recovery of the signal. This incredibly profound result laid the foundation for modern communications. It is also important to realize that your signal can be much more general than radio waves; it could be a sequence of images, for example. This theorem gives a sufficient condition for reconstruction, but is it necessary? Not even close. And it took us over 50 years to understand this in generality.

Compressed sensing was proposed between 2004 and 2006 by Emmanuel Candes, David Donoho and Terry Tao, with important early contributions by Justin Romberg. I should note that Candes and Romberg were at Caltech during this period. The Nyquist-Shannon theorem told us that with a small amount of knowledge (a bound on the highest frequency), we could reconstruct a signal perfectly by measuring only at a rate twice the highest frequency, instead of needing to measure continuously. Compressed sensing says that with one extra assumption, namely that only sparsely few of your frequencies are being used (say 10 out of 1000), you can recover your signal with high accuracy using dramatically fewer measurements. And it turns out that this assumption is valid for a huge range of applications: enabling real-time MRIs using conventional technology or, more relevant to this blog, increasing our ability to distinguish quantum states via tomography.

Unlike the other topics in this blog post, I have never worked with compressed sensing, but my intuition goes like this: instead of measuring in the basis in which you are sparse (frequency, for example), measure in a different basis. With high probability, each of these measurements will pick up a little piece from each of the occupied modes. Then, to reconstruct your signal, you want to use the L0-"norm" to interpolate in such a way that you use the fewest frequency components possible. Computing the L0-"norm" is not efficient, so one of the major breakthroughs of compressed sensing was showing that, with high probability, minimizing the L1-norm approximates the L0 solution, and all of this can be done using a highly efficient linear program. However, I really shouldn't be speculating, because I've never invested much time into mastering this new tool, and I'm friends with a couple of the quantum state tomography authors, so maybe they'll chime in?
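
Here is a minimal sketch of that L1 story in Python (a toy example of my own, not the tomography papers' code): a signal occupying 5 of 128 components is recovered from only 40 random measurements by solving the L1 problem as a linear program.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n, k, m = 128, 5, 40                  # signal length, sparsity, measurements

    x_true = np.zeros(n)                  # signal sparse in this basis
    x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)

    A = rng.normal(size=(m, n))           # measure in a random (incoherent) basis
    b = A @ x_true

    # Basis pursuit: minimize ||x||_1 subject to A x = b.
    # Standard LP trick: write x = u - v with u, v >= 0, minimize sum(u + v).
    res = linprog(np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=b, bounds=(0, None))
    x_hat = res.x[:n] - res.x[n:]
    print(f"max recovery error: {np.abs(x_hat - x_true).max():.2e}")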


Brahms is a cool dude. Brahms as a height map where cliffs=Gibbs phenomena=oh no! First three levels of Brahms as a Haar wavelet.

Wavelets as the mother of all bases: I previously wrote a post about the importance of choosing a convenient basis. Imagine you have an image with a bunch of sharp contrasts, such as the outline of a person, a horizon, or a table, basically anything. How do you store it efficiently? Due to the Gibbs phenomenon, the Fourier basis is pretty awful for these applications. Here's another motivating problem: imagine someone plays one note on an instrument. The sound is localized in both time and frequency, and the Fourier basis is also pretty awful at storing/detecting it. Wavelets to the rescue! The theory of wavelets uses some beautiful math to solve the longstanding problem of finding a basis which is localized in both position and momentum space (or very close to it). Wavelets have profound applications; some of my favorites include: modern image compression (JPEG 2000 onwards) is based on wavelets; Ingrid Daubechies and her colleagues used wavelets to detect forged paintings; recovering previously unrecoverable recordings of Brahms at the piano (I heard about this from Barry Simon, of Reed-Simon fame, who is currently teaching his last class ever); and even the FBI uses wavelets to compress images of fingerprints, obtaining a compression ratio of 20:1.
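
To see why edges are easy for wavelets, here is a toy one-level Haar transform (my own minimal sketch): a step function, which the Fourier basis smears across many modes, yields detail coefficients that vanish everywhere except at the jump.

    import numpy as np

    def haar_step(x):
        """One level of the Haar transform: pairwise averages and differences."""
        pairs = x.reshape(-1, 2)
        avg = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)   # coarse approximation
        det = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)   # localized detail
        return avg, det

    x = np.concatenate([np.zeros(7), np.ones(9)])   # a sharp edge at sample 7
    avg, det = haar_step(x)
    print(det)   # nonzero in exactly one slot: the edge lives in one coefficient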

Postselection enables quantum cloning: the no-cloning theorem is well known in the field of quantum information. It says that you cannot find a machine (unitary operation U) which takes an arbitrary input state |\psi\rangle and a known state |0\rangle, and maps |\psi\rangle \otimes |0\rangle to |\psi\rangle \otimes |\psi\rangle, thereby cloning |\psi\rangle. This is very easy to prove using the linearity of quantum mechanics. However, there are loopholes. One of the most trivial loopholes is realizing that one can take the state |\psi\rangle and perform something called unambiguous state discrimination, which either spits out exactly which state |\psi\rangle is with some probability, or otherwise spits out "I don't know which state." You can postselect on the unambiguous state discrimination having succeeded and prepare a unitary which clones the relevant states. Peter Shor has a comment on Physics Stack Exchange describing this. Seth Lloyd and John Preskill outlined a less trivial version of this in their recent paper, which tries to circumvent firewalls by using postselected quantum teleportation.
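
To make the "trivial loophole" concrete, here is a small numpy sketch of unambiguous state discrimination for two non-orthogonal qubit states (my own toy example; the POVM below succeeds with probability 1 - |\langle\psi_1|\psi_2\rangle| and never misidentifies):

    import numpy as np

    theta = 0.6
    psi1 = np.array([1.0, 0.0])                      # first candidate state
    psi2 = np.array([np.cos(theta), np.sin(theta)])  # second, non-orthogonal state
    s = abs(psi1 @ psi2)                             # overlap |<psi1|psi2>|

    I2 = np.eye(2)
    E1 = (I2 - np.outer(psi2, psi2)) / (1 + s)       # clicks only for psi1
    E2 = (I2 - np.outer(psi1, psi1)) / (1 + s)       # clicks only for psi2
    E_fail = I2 - E1 - E2                            # the "I don't know" outcome

    print(f"P(identify psi1 | psi1)  = {psi1 @ E1 @ psi1:.3f}")      # 1 - s
    print(f"P(mistake psi1 for psi2) = {psi1 @ E2 @ psi1:.3f}")      # exactly 0
    print(f"P(don't know | psi1)     = {psi1 @ E_fail @ psi1:.3f}")  # s

Postselecting on the first two outcomes leaves you knowing the state exactly, at which point "cloning" it is just state preparation.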

In this blog post, I've only described a tiny fraction of the quantum loopholes that have been discovered. If I had more space/time, the next two examples I would describe are beating classical correlations with quantum entanglement in order to win CHSH games, and weak measurements along with some of the peculiar things they lead to. Beyond that, I would probably refer you to Yakir Aharonov's amazingly fun book about quantum paradoxes.

After reading this, I hope that the next time you encounter an inviolable law of nature, you’ll apply the hacker mentality and attempt to strip it down to its essence, isolate assumptions, and potentially find a loophole. But while you’re doing this, remember that you should never argue with your mother, or with mathematics!

A TED experience

Around one year ago, I unexpectedly received an e-mail asking if I would speak at a local TEDx Youth event themed “Daring Discoveries”.  I hadn’t attended a TEDx conference before (sadly I couldn’t make either of the previous ones held at Caltech).  But I was familiar with the high-profile brand and so enthusiastically accepted the invitation.  A few weeks ago, following a lot of preparation by the speakers and no doubt vastly more by the organizers, the event finally took place.  On many levels it proved to be an unforgettable experience. 

One thing that really struck me was that the conference was organized entirely by a team of local high school students.   I find this truly remarkable, especially given the amount of work involved in putting together this sort of thing.  (Finding speakers, fundraising, obtaining a venue, arranging innumerable technical logistics, putting together a webpage, sifting through applications, etc.  I couldn’t imagine keeping track of all those details, much less at that stage!)  The audience was also noteworthy: mostly other high school students from the area, their families, and other community members.  In total there were about 100 participants.  The vast majority reflected underrepresented groups in the sciences, which made it a particularly appealing outreach opportunity.  

The organizers secured a venue at Puente Hills Mall in City of Industry.  To get the mental juices flowing, numerous classic brain teasers decorated the walls near the entrance.  This one was my favorite:

This is an unusual paragraph. I’m curious how quickly you can find out what is so unusual about it.  It looks so plain you would think nothing was wrong with it! In fact, nothing is wrong with it! It is unusual though. Study it, and think about it, but you still may not find anything odd. But if you work at it a bit, you might find out. Try to do so without any coaching.

Other interesting activities also awaited the participants, including a scavenger hunt and a “big ideas wall” where anyone could jot down ideas they viewed as worth spreading.  It was fun reading what everyone had to say.  

The list of speakers was eclectic and, among others, included a college student/entrepreneur, mathematicians, engineers, and educators.  I found everyone’s talks absolutely riveting and felt really honored to be part of such an accomplished group.  For my part I decided to tell a story about quantum computing—in particular the topological approach (what else?).  Preparing was no easy task.  I had to figure out a way to explain what quantum computers are, what they can do for us, why building one is hard, how “non-Abelian anyons” might one day prove to be the salvation, and why this direction is now looking increasingly promising.  Of course without assuming any prior knowledge of quantum mechanics.  And in about 15 minutes or so.  

Given where we are in the quest for a quantum computer, I had no choice but to conclude on a tentative yet optimistic note.  I made sure though to convey what I think is an extremely important message: namely, that the journey towards realizing quantum computing technology is as exciting as, if not more exciting than, the finish line.  That journey will undoubtedly be paved with groundbreaking discoveries that reveal spectacular new insights about how the universe works, forcing us to develop new physics paradigms along the way.  It's the prospect of such discoveries that energizes me to think about how we might achieve mastery over materials on large scales to hopefully overcome one of our generation's greatest technological challenges.  The Saturday Morning Breakfast Cereal comic below, which I very recently learned about from one of our colloquium speakers, perfectly encapsulates my view on the problem, both as a science advocate and a physicist working in the trenches.  I thought showing this (censorship mine!) was a good message to leave the audience with.

[Saturday Morning Breakfast Cereal comic]

Talking quantum mechanics with second graders

“What’s the hardest problem you’ve ever solved?”

Kids focus right in. Driven by a ruthless curiosity, they ask questions from which adults often shy away. Which is great, if you think you know the answer to everything a 7-year-old can possibly ask you…

Two Wednesdays ago, I was invited to participate in three Q&A sessions that quickly turned into Reddit-style AMA (ask-me-anything) sessions over Skype with four 5th grade classes and one 2nd grade class at Medina Elementary in Medina, Washington. When asked by the organizers what I would like the sessions to focus on, I initially thought of introducing students to the mod I helped design for Minecraft, called qCraft, which brings concepts like quantum entanglement and quantum superposition into the world of Minecraft. But then I changed my mind. I told the organizers that I would talk about anything the kids wanted to know more about. It dawned on me that maybe not all 5th graders are as excited about quantum physics as I am. Yet.

The students took the bait. They peppered me with questions for over two hours—everything from "What is a quantum physicist and how do you become one?" to "What is it like to work with a fashion designer (about my collaboration with Project Runway's Alicia Hardesty on Project X Squared)?" and, of course, "Why did you steal the cannon?" (Learn more about the infamous Cannon Heist – yes kids, there is an ongoing war between the two schools, and Caltech took the last (hot) shot just days ago.)

Caltech students visited MIT during pre-frosh weekend, bearing some clever gifts.

Then they dug a little deeper: "If we have a quantum computer that knows the answer to everything, why do we need to go to school?" This question was a little tricky, so I framed the answer like this: I compared the computer to a sidekick, and the kids—the future scientists, artists and engineers—to superheroes. Sidekicks always look up to the superheroes for guidance and leadership. And then I got this question from a young girl: "If we are superheroes, what should we do with all this power?" I thought about it for a second, and though my initial inclination was to go with "You should make Angry Birds 3D!", I went with this instead: "People often say, 'Study hard so that one day you can cure cancer, figure out the theory of everything and save the world!' But I would rather see you all do things to understand the world. Sometimes you think you are saving the world when it does not need saving—it is just misunderstood. Find ways to understand one another and to look for the value in others. Because there is always value in others, often hiding from us behind powerful emotions." The kids listened in silence and, in that moment, I felt profoundly connected with them and their teachers.

I wasn't expecting any more "deep" questions, until another young girl raised her hand and asked: "Can I be a quantum physicist, or is it only for the boys?" The ferocity of my answer caught me by surprise: "Of course you can! You can do anything you set your mind to, and anyone who tells you otherwise, be it your teachers, your friends or even your parents, they are just wrong! In fact, you have the potential to leave all the boys in the class behind!" The applause and laughter from all the girls sounded even louder amid the thunderous silence from the boys. Which is when I realized my mistake and added: "You boys can be superheroes too! Just make sure not to underestimate the girls. For your own sake."

Why did I feel so strongly about this issue of women in science? Caltech has a notoriously bad reputation when it comes to the representation of women among our faculty and postdocs (graduate students too?) in areas such as Physics and Mathematics. IQIM has over a dozen male faculty members in its roster and only one woman: Prof. Nai-Chang Yeh. Anyone who meets Prof. Yeh quickly realizes that she is an intellectual powerhouse with boundless energy split among her research, her many students and requests for talks, conference organization and mentoring. Which is why, invariably, every one of the faculty members at IQIM feels really strongly about finding a balance and creating a more inclusive environment for women in science. This is a complex issue that requires a lot of introspection and creative ideas from all sides over the long term, but in the meantime, I just really wanted to tell the girls that I was counting on them to help with understanding our world, as much as I was counting on the boys. Quantum mechanics? They got it. Abstract math? No problem.*

It was of course inevitable that they would want to know why we created the Minecraft mod, a collaborative work between Google, MinecraftEDU and IQIM – after all, when I asked them if they had played Minecraft before, all hands shot up. Both IQIM and Google think it is important to educate younger generations about quantum computers and the complex ideas behind quantum physics; and more importantly, to meet kids where they play, in this case, inside the Minecraft game. I explained to the kids that the game was a place where they could experiment with concepts from quantum mechanics and that we were developing other resources to make sure they had a place to go to if they wanted to know more (see our animations with Jorge Cham at http://phdcomics.com/quantum).

As for the hardest problem I have ever solved? I described it in my first blog post here, An Intellectual Tornado. The kids sat listening in some sort of trance as I described the nearly perilous journey through the lands of "agony" and "self-doubt" and into the valley of "grace", the place you reach when you learn to walk next to your worst fears, as understanding replaces fear, and respect for a far superior opponent teaches true humility and instills in you a sense of adventure. By that time, I thought I was in the clear – as far as fielding difficult questions from 10-year-olds goes – but one little devil decided to ask me this simple question: "Can you explain in 2 minutes what quantum physics is?" Sure! You see kids, emptiness, what we call the quantum vacuum, underlies the emergence of spacetime through the build-up of correlations between disjoint degrees of freedom, which we like to call entangled subsystems. The uniqueness of the Schmidt decomposition over generic quantum states, coupled with concentration-of-measure estimates over unequal bipartite decompositions, gives rise to Schrodinger's evolution and the concept of unitarity – which itself only emerges in the thermodynamic limit. In the remaining minute, let's discuss the different interpretations of the following postulates of quantum mechanics: Let's start with measurements…

Reaching out to elementary school kids is just one way we can make science come alive, and many of us here at IQIM look forward to sharing with kids of any age our love for adventuring far and wide to understand the world around us. In case you are an expert in anything, or just passionate about something, I highly recommend engaging the next generation through visits to classrooms and Skype sessions across state lines. Because, sometimes, you get something like this from their teacher:

Hello Dr. Michalakis,

My class was lucky enough to be able to participate in one of the Skype chats you did with Medina Elementary this morning. My students returned to the classroom with so many questions, wonderings, concerns, and ideas that we could spend the remainder of the year discussing them all.

Your ability to thoughtfully answer EVERY single question posed to you was amazing. I was so impressed and inspired by your responses that I am tempted to actually spend the remainder of the year discussing quantum mechanics. :)

I particularly appreciated your point that our efforts should focus on trying to “understand the world” rather than “save” the world. I work each day to try and inspire curiosity and wonder in my students. You accomplished more towards my goal in about 40 minutes than I probably have all year. For that I am grateful.

All the best,
A.T.

* Several of my female classmates at MIT (where I did my undergraduate degree in Math with Computer Science) had a clarity of thought and a sense of perseverance that SEAL Team Six would be envious of. So I would go to them for help with my hardest homework.

Tsar Nikita and His Scientists

Once upon a time, a Russian tsar named Nikita had forty daughters:

                Every one from top to toe
                Was a captivating creature,
                Perfect—but for one lost feature.

 
So wrote Alexander Pushkin, the 19th-century Shakespeare who revolutionized Russian literature. In a rhyme, Pushkin imagined forty princesses born without “that bit” “[b]etween their legs.” A courier scours the countryside for a witch who can help. By summoning the devil in the woods, she conjures what the princesses lack into a casket. The tsar parcels out the casket’s contents, and everyone rejoices.

“[N]onsense,” Pushkin calls the tale in its penultimate line. A “joke.”

The joke has, nearly two centuries later, become reality. Researchers have grown vaginas in a lab and implanted them into teenage girls. Because of a genetic defect, the girls suffered from Mayer-Rokitansky-Küster-Hauser (MRKH) syndrome: their vaginas and uteruses had failed to grow to maturity or at all. A team at Wake Forest and in Mexico City took samples of the girls' cells, grew more cells, and combined their harvest with vagina-shaped scaffolds. Early in the 2000s, surgeons implanted the artificial organs into the girls. The patients, the researchers reported in the journal The Lancet last week, function normally.

I don’t usually write about reproductive machinery. But the implants’ resonance with “Tsar Nikita” floored me. Scientists have implanted much of Pushkin’s plot into labs. The sexually deficient girls, the craftsperson, the replacement organs—all appear in “Tsar Nikita” as in The Lancet. In poetry as in science fiction, we read the future.

Though threads of Pushkin’s plot survive, society’s view of the specialist has progressed. “Deep [in] the dark woods” lives Pushkin’s witch. Upon summoning the devil, she locks her cure in a casket. Today’s vagina-implanters star in headlines. The Wall Street Journal highlighted the implants in its front section. Unless the patients’ health degrades, the researchers will likely list last week’s paper high on their CVs and websites.


Much as Dr. Atlántida Raya-Rivera, the paper's lead author, differs from Pushkin's witch, the visage of Pushkin's magic wears the nose and eyebrows of science. When tsars or millennials need medical help, they seek knowledge-keepers: specialists, a fringe of society. Before summoning the devil, the witch "[l]ocked her door . . . Three days passed." I hide away to calculate and study (though days alone might render me more like the protagonist in another Russian story, Chekhov's "The Bet"). Just as the witch "stocked up coal," some students stockpile Red Bull before hitting the library. Some habits, like the archetype of the wise woman, refuse to die.

From a Russian rhyme, the bones of “Tsar Nikita” have evolved into cutting-edge science. Pushkin and the implants highlight how attitudes toward knowledge have changed, offering a lens onto science in culture and onto science culture. No wonder readers call Pushkin “timeless.”

But what would he have rhymed with “Mayer-Rokitansky-Küster-Hauser”?

 

 

 

“Tsar Nikita” has many nuances—messages about censorship, for example—that I didn’t discuss. To the intrigued, I recommend The Queen of Spades: And selected works, translated by Anthony Briggs and published by Pushkin Press.

 

Defending against high-frequency attacks

It was the summer of 2008. I was 22 years old, and it was my second week working in the crude oil and natural gas options pit at the New York Mercantile Exchange (NYMEX). My head was throbbing after two consecutive weeks of disorientation. It was like being born into a new world, but without the neuroplasticity of a young human. And then the crowd erupted. "Yeeeehawwww. YeEEEeeHaaaWWWWW. Go get 'em cowboy."

It seemed that everyone on the sprawling trading floor had started playing Wild Wild West, and I had no idea why. After at least thirty seconds, the hollers started to move across the trading floor, away 100 meters or so and then doubling back towards me. They were tracking a young trader as he walked; after a few more meters, he finally got it, and I'm sure he learned a life lesson. Don't be the biggest jerk in a room filled with traders, and especially, never wear triple-popped pastel-colored Lacoste shirts. This young aspiring trader had been "spurred."

In other words, someone had made paper spurs out of trading receipts and taped them to his shoes. Go get ’em cowboy.

I was one academic quarter away from finishing a master's degree in statistics at Stanford University, and I had accepted a full-time job in the algorithmic trading group at DRW Trading. I was doing a summer internship before finishing my degree, and after three months of working in the algorithmic trading group in Chicago, I had volunteered to work at the NYMEX. Most 'algo' traders didn't want this job, because it was far removed from our mental mathematical monasteries, but I knew I would learn a tremendous amount, so I jumped at the opportunity. And by learn, I mean, get ripped calves and triceps, because my job was to stand in place for seven straight hours updating our mathematical models on a bulky tablet PC as trades occurred.

I have no vested interests in the world of high-frequency trading (HFT). I'm currently a PhD student in the quantum information group at Caltech, and I have no intention of returning to finance. I found the work enjoyable, but not as thrilling as thinking about the beginning of the universe (what else is?). However, I do feel that the current discussion about HFT is lopsided, and I'm hoping that I can broaden the perspective by telling a few short stories.

What are the main attacks against HFT? Three of them include the evilness of: front-running markets, making money out of nothing, and instability. It’s easy to point to extreme examples of algorithmic traders abusing markets, and they regularly do, but my argument is that HFT has simply computerized age-old tactics. In this process, these tactics have become more benign and markets more stable.

Front-running markets: large oil producing nations, such as Mexico, often want to hedge their exposure to changing market prices. They do this by purchasing options. This allows them to lock in a minimum sale price, for a fee of a few dollars per barrel. During my time at the NYMEX, I distinctly remember a broker shouting into the pit: “what’s the price on DEC9 puts.” A trader doesn’t want to give away whether they want to buy or sell, because if the other traders know, then they can artificially move the price. In this particular case, this broker was known to sometimes implement parts of Mexico’s oil hedge. The other traders in the pit suspected this was a trade for Mexico because of his anxious tone, some recent geopolitical news, and the expiration date of these options.

Some confident traders took a risk and faded the market. They ended up making between $1 million and $2 million from these trades, relative to what the fair price was at that moment. I say relative to the fair price because Mexico ultimately received the better end of this trade. The price of oil dropped in 2009, and Mexico exercised its options, enabling it to sell its oil at a higher-than-market price. Mexico spent $1.5 billion to hedge its oil exposure in 2009.

This was an example of humans anticipating the direction of a trade and capturing millions of dollars in profit as a result. It really is profit as long as the traders can redistribute their exposure at the 'fair' market price before markets move too far. The analogous strategy in HFT is called "front-running the market", which was highlighted in the New York Times' recent article "The Wolf Hunters of Wall Street". The HFT version involves analyzing the prices on dozens of exchanges simultaneously, and once an order is published in the order book of one exchange, using this demand to adjust one's orders on the other exchanges. This needs to be done within a few microseconds in order to be successful. It is the computerized version of anticipating demand and fading prices accordingly. These tactics as I described them are in a grey area, but variants of them rapidly become illegal.

Making money from nothing: arbitrage opportunities have existed for as long as humans have been trading. I'm sure an ancient trader received quite the rush when he realized for the first time that he could buy gold in one marketplace and then sell it in another, for a profit. This is only worth the trader's efforts if he makes a profit after all expenses have been taken into consideration. One of the simplest modern examples is called triangle arbitrage, and it usually involves three pairs of currencies. Currency pairs are ratios, such as USD/AUD, which tells you how many Australian dollars you receive for one US dollar. Imagine that there is a moment in time when the product of ratios \frac{USD}{AUD}\frac{AUD}{CAD}\frac{CAD}{USD} is 1.01. Then a trader can take her USD, buy AUD, use her AUD to buy CAD, and then use her CAD to buy USD. As long as the underlying prices didn't change while she carried out these three trades, she would capture one cent of profit per dollar cycled.
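
In code, the check is almost trivial. Here is a toy example with made-up rates (not market data):

    # Toy triangle-arbitrage check with illustrative, made-up rates.
    usd_aud = 1.52        # AUD received per USD
    aud_cad = 0.91        # CAD received per AUD
    cad_usd = 0.7305      # USD received per CAD

    cycle = usd_aud * aud_cad * cad_usd   # USD you end with per USD you start with
    print(f"product of ratios: {cycle:.4f}")
    if cycle > 1.0:
        print(f"arbitrage profit per USD cycled: {cycle - 1.0:.4f}")

The real difficulty is not spotting the loop but executing all three legs before the prices move, which is where the fastest machines win.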

After a few trades like this, the prices will equilibrate and the ratio will be restored to one. This is an example of “making money out of nothing.” Clever people have been trading on arbitrage since ancient times and it is a fundamental source of liquidity. It guarantees that the price you pay in Sydney is the same as the price you pay in New York. It also means that if you’re willing to overpay by a penny per share, then you’re guaranteed a computer will find this opportunity and your order will be filled immediately. The main difference now is that once a computer has been programmed to look for a certain type of arbitrage, then the human mind can no longer compete. This is one of the original arenas where the term “high-frequency” was used. Whoever has the fastest machines, is the one who will capture the profit.

Instability: I believe that the arguments against HFT of this type have the most credibility. The concern here is that exceptional leverage creates opportunity for catastrophe. Imaginations ran wild after the Flash Crash of 2010, and even if imaginations outstripped reality, we learned much about the potential instabilities of HFT. A few questions were posed, and we are still debating the answers. What happens if market makers stop trading in unison? What happens if a programming error leads to billions of dollars in mistaken trades? Do feedback loops between algo strategies lead to artificial prices? These are reasonable questions, which are grounded in examples, and future regulation coupled with monitoring should add stability where it’s feasible.

The culture in wealth driven industries today is appalling. However, it’s no worse in HFT than in finance more broadly and many other industries. It’s important that we dissociate our disgust in a broad culture of greed from debates about the merit of HFT. Black boxes are easy targets for blame because they don’t defend themselves. But that doesn’t mean they aren’t useful when implemented properly.

Are we better off with HFT? I'd argue a resounding yes. The primary function of markets is to allocate capital efficiently. Three of the strongest measures of the efficacy of markets lie in "bid-ask" spreads, volume and volatility. If spreads are low and volume is high, then participants are essentially guaranteed access to capital at as close to the "fair price" as possible. There is a huge academic literature on how HFT has impacted spreads and volume, and the majority of it indicates that spreads have narrowed and volume has increased. However, as alluded to above, all of these points are subtle; in my opinion, it's clear that HFT has increased the efficiency of markets (it turns out that computers can sometimes be helpful). Estimates of HFT's impact on volatility haven't been nearly as favorable, but I'd also argue those studies are more debatable. Basically, correlation is not causation, and it just so happens that our rapidly developing world is probably more volatile than the pre-HFT world of the last millennium.

We could regulate away HFT, but we wouldn't be able to get rid of the underlying problems people point to unless we got rid of markets altogether. As with any new industry, there are aspects of HFT that should be better monitored and regulated, but we should keep level heads and diverse data points as we continue this discussion. As with most important problems, I believe the ultimate solution here lies in educating the public. Or in other words, this is my plug for Python classes for all children!!

I promise that I’ll repent by writing something that involves actual quantum things within the next two weeks!

IQIM Presents … "My Father"

Debaleena Nandi at Caltech

Following the IQIM teaser, which was made with the intent of offering a wider perspective on the scientist, highlighting the normalcy behind the perception of brilliance, and celebrating the common human struggles on the way to achieving greatness, we decided to do individual vignettes of some of the characters you saw in the video.

We start with Debaleena Nandi, a grad student in Prof. Jim Eisenstein's lab, whose journey from Jadavpur University in West Bengal, India, to the graduate school and research facility at the Indian Institute of Science, Bangalore, to Caltech has seen many obstacles. We focus on the essentials of an environment needed to manifest the quest for "the truth", as Debaleena says. We start with her days as a child, when her father, who worked double shifts, sat by her through the days and nights as she pursued her homework.

She highlights what she feels is the only way to grow: working on what is lacking, developing that missing tool in your skill set, that asset that others might have by birth but that you need to cultivate through hard work.

Debaleena’s motto: to realize and face your shortcomings is the only way to achievement.

As we build Debaleena up, we also build up the identity of Caltech through its breathtaking architecture, which oscillates from Spanish to Gothic to modern. Both Debaleena and Caltech are revealed slowly, bit by bit.

This series is about dissecting high achievers, seeing the day-to-day steps, the bit by bit that adds up to the often overwhelming, impressive presence of Caltech's science. We attempt to break it down into smaller vignettes that help us appreciate the amount of discipline, intent and passion that goes into making cutting-edge researchers.

Presenting the emotional alongside the rational is something this series aspires to achieve. It honors and celebrates human limitations surrounding limitless boundaries, discoveries and possibilities.

Stay tuned for more vignettes in the IQIM Presents “My _______” Series.

But for now, here is the video. Watch, like and share!

(C) Parveen Shah Production 2014

 

Inflation on the back of an envelope

Last Monday was an exciting day!

After following the BICEP2 announcement via Twitter, I had to board a transcontinental flight, so I had 5 uninterrupted hours to think about what it all meant. Without Internet access or references, and having not thought seriously about inflation for decades, I wanted to reconstruct a few scraps of knowledge needed to interpret the implications of r ~ 0.2.

I did what any physicist would have done … I derived the basic equations without worrying about niceties such as factors of 3 or 2 \pi. None of what I derived was at all original —  the theory has been known for 30 years — but I’ve decided to turn my in-flight notes into a blog post. Experts may cringe at the crude approximations and overlooked conceptual nuances, not to mention the missing references. But some mathematically literate readers who are curious about the implications of the BICEP2 findings may find these notes helpful. I should emphasize that I am not an expert on this stuff (anymore), and if there are serious errors I hope better informed readers will point them out.

By tradition, careless estimates like these are called “back-of-the-envelope” calculations. There have been times when I have made notes on the back of an envelope, or a napkin or place mat. But in this case I had the presence of mind to bring a notepad with me.

Notes from a plane ride

According to inflation theory, a nearly homogeneous scalar field called the inflaton (denoted by \phi)  filled the very early universe. The value of \phi varied with time, as determined by a potential function V(\phi). The inflaton rolled slowly for a while, while the dark energy stored in V(\phi) caused the universe to expand exponentially. This rapid cosmic inflation lasted long enough that previously existing inhomogeneities in our currently visible universe were nearly smoothed out. What inhomogeneities remained arose from quantum fluctuations in the inflaton and the spacetime geometry occurring during the inflationary period.

Gradually, the rolling inflaton picked up speed. When its kinetic energy became comparable to its potential energy, inflation ended, and the universe “reheated” — the energy previously stored in the potential V(\phi) was converted to hot radiation, instigating a “hot big bang”. As the universe continued to expand, the radiation cooled. Eventually, the energy density in the universe came to be dominated by cold matter, and the relic fluctuations of the inflaton became perturbations in the matter density. Regions that were more dense than average grew even more dense due to their gravitational pull, eventually collapsing into the galaxies and clusters of galaxies that fill the universe today. Relic fluctuations in the geometry became gravitational waves, which BICEP2 seems to have detected.

Both the density perturbations and the gravitational waves have been detected via their influence on the inhomogeneities in the cosmic microwave background. The 2.726 K photons left over from the big bang have a nearly uniform temperature as we scan across the sky, but there are small deviations from perfect uniformity that have been precisely measured. We won’t worry about the details of how the size of the perturbations is inferred from the data. Our goal is to achieve a crude understanding of how the density perturbations and gravitational waves are related, which is what the BICEP2 results are telling us about. We also won’t worry about the details of the shape of the potential function V(\phi), though it’s very interesting that we might learn a lot about that from the data.

Exponential expansion

Einstein’s field equations tell us how the rate at which the universe expands during inflation is related to the energy density stored in the scalar field potential. If a(t) is the “scale factor” which describes how lengths grow with time, then roughly

\left(\frac{\dot a}{a}\right)^2 \sim \frac{V}{m_P^2}.

Here \dot a means the time derivative of the scale factor, and m_P = 1/\sqrt{8 \pi G} \approx 2.4 \times 10^{18} GeV is the Planck scale associated with quantum gravity. (G is Newton’s gravitational constant.) I’ve left out a factor of 3 on purpose, and I used the symbol ~ rather than = to emphasize that we are just trying to get a feel for the order of magnitude of things. I’m using units in which Planck’s constant \hbar and the speed of light c are set to one, so mass, energy, and inverse length (or inverse time) all have the same dimensions. 1 GeV means one billion electron volts, about the mass of a proton.

(To persuade yourself that this is at least roughly the right equation, you should note that a similar equation applies to an expanding spherical ball of radius a(t) with uniform mass density V. But in the case of the ball, the mass density would decrease as the ball expands. The universe is different — it can expand without diluting its mass density, so the rate of expansion \dot a / a does not slow down as the expansion proceeds.)

During inflation, the scalar field \phi and therefore the potential energy V(\phi) were changing slowly; it’s a good approximation to assume V is constant. Then the solution is

a(t) \sim a(0) e^{Ht},

where H, the Hubble constant during inflation, is

H \sim \frac{\sqrt{V}}{m_P}.

To explain the smoothness of the observed universe, we require at least 50 “e-foldings” of inflation before the universe reheated — that is, inflation should have lasted for a time at least 50 H^{-1}.
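Just to put a number on that (a quick arithmetic aside): 50 e-foldings stretches every length in the universe by a factor

e^{50} \approx 5 \times 10^{21},

more than 21 orders of magnitude.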

Slow rolling

During inflation the inflaton \phi rolls slowly, so slowly that friction dominates inertia — this friction results from the cosmic expansion. The speed of rolling \dot \phi is determined by

H \dot \phi \sim -V'(\phi).

Here V'(\phi) is the slope of the potential, so the right-hand side is the force exerted by the potential, which matches the frictional force on the left-hand side. The coefficient of \dot \phi has to be H on dimensional grounds. (Here I have blown another factor of 3, but let’s not worry about that.)
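For reference, with the factor of 3 restored (as in the added notes at the end), the full equation of motion is the standard one for a scalar field in an expanding universe; slow rolling just means the inertial term \ddot \phi is negligible:

\ddot \phi + 3 H \dot \phi = -V'(\phi), \qquad |\ddot \phi| \ll 3 H |\dot \phi|.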

Density perturbations

The trickiest thing we need to understand is how inflation produced the density perturbations which later seeded the formation of galaxies. There are several steps to the argument.

Quantum fluctuations of the inflaton

As the universe inflates, the inflaton field is subject to quantum fluctuations, where the size of the fluctuation depends on its wavelength. Due to inflation, the wavelength increases rapidly, like e^{Ht}, and once the wavelength gets large compared to H^{-1}, there isn’t enough time for the fluctuation to wiggle — it gets “frozen in.” Much later, long after the reheating of the universe, the oscillation period of the wave becomes comparable to the age of the universe, and then it can wiggle again. (We say that the fluctuations “cross the horizon” at that stage.) Observations of the anisotropy of the microwave background have determined how big the fluctuations are at the time of horizon crossing. What does inflation theory say about that?

Well, first of all, how big are the fluctuations when they leave the horizon during inflation? Then the wavelength is H^{-1} and the universe is expanding at the rate H, so H is the only thing the magnitude of the fluctuations could depend on. Since the field \phi has the same dimensions as H, we conclude that fluctuations have magnitude

\delta \phi \sim H.

From inflaton fluctuations to density perturbations

Reheating occurs abruptly when the inflaton field reaches a particular value. Because of the quantum fluctuations, some horizon volumes have larger than average values of \phi and some have smaller than average values; hence different regions reheat at slightly different times. The energy density in regions that reheat earlier starts to be reduced by expansion (“red shifted”) earlier, so these regions have a smaller than average energy density. Likewise, regions that reheat later start to red shift later, and wind up having larger than average density.

When we compare different regions of comparable size, we can find the typical (root-mean-square) fluctuations \delta t in the reheating time, knowing the fluctuations in \phi and the rolling speed \dot \phi:

\delta t \sim \frac{\delta \phi}{\dot \phi} \sim \frac{H}{\dot\phi}.

Small fractional fluctuations in the scale factor a right after reheating produce comparable small fractional fluctuations in the energy density \rho. The expansion rate right after reheating roughly matches the expansion rate H right before reheating, and so we find that the characteristic size of the density perturbations is

\delta_S\equiv\left(\frac{\delta \rho}{\rho}\right)_{hor} \sim \frac{\delta a}{a} \sim \frac{\dot a}{a} \delta t\sim \frac{H^2}{\dot \phi}.

The subscript hor serves to remind us that this is the size of density perturbations as they cross the horizon, before they get a chance to grow due to gravitational instabilities. We have found our first important conclusion: The density perturbations have a size determined by the Hubble constant H and the rolling speed \dot \phi of the inflaton, up to a factor of order one which we have not tried to keep track of. Insofar as the Hubble constant and rolling speed change slowly during inflation, these density perturbations have a strength which is nearly independent of the length scale of the perturbation. From here on we will denote this dimensionless scale of the fluctuations by \delta_S, where the subscript S stands for “scalar”.

Perturbations in terms of the potential

Putting together \dot \phi \sim -V' / H and H^2 \sim V/{m_P}^2 with our expression for \delta_S, we find

\delta_S^2 \sim \frac{H^4}{\dot\phi^2}\sim \frac{H^6}{V'^2} \sim \frac{1}{{m_P}^6}\frac{V^3}{V'^2}.

The observed density perturbations are telling us something interesting about the scalar field potential during inflation.
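To make that concrete, here is the same estimate evaluated for the quadratic potential V = \frac{1}{2} m^2 \phi^2 that reappears in the added notes at the end (an illustrative choice, not something the data demands):

\delta_S^2 \sim \frac{1}{{m_P}^6} \frac{(m^2 \phi^2/2)^3}{(m^2 \phi)^2} = \frac{m^2 \phi^4}{8\, {m_P}^6},

so for this potential the observed \delta_S pins down the combination m \phi^2 / {m_P}^3 at the time our perturbations left the horizon.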

Gravitational waves and the meaning of r

The gravitational field as well as the inflaton field is subject to quantum fluctuations during inflation. We call these tensor fluctuations to distinguish them from the scalar fluctuations in the energy density. The tensor fluctuations have an effect on the microwave anisotropy which can be distinguished in principle from the scalar fluctuations. We’ll just take that for granted here, without worrying about the details of how it’s done.

While a scalar field fluctuation with wavelength \lambda and strength \delta \phi carries energy density \sim \delta\phi^2 / \lambda^2, a fluctuation of the dimensionless gravitation field h with wavelength \lambda and strength \delta h carries energy density \sim m_P^2 \delta h^2 / \lambda^2. Applying the same dimensional analysis we used to estimate \delta \phi at horizon crossing to the rescaled field m_P h, we estimate the strength \delta_T of the tensor fluctuations (the fluctuations of h) as

\delta_T^2 \sim \frac{H^2}{m_P^2}\sim \frac{V}{m_P^4}.

From observations of the CMB anisotropy we know that \delta_S\sim 10^{-5}, and now BICEP2 claims that the ratio

r = \frac{\delta_T^2}{\delta_S^2}

is about r\sim 0.2 at an angular scale on the sky of about one degree. The conclusion (being a little more careful about the O(1) factors this time) is

V^{1/4} \sim 2 \times 10^{16}~GeV \left(\frac{r}{0.2}\right)^{1/4}.

This is our second important conclusion: The energy density during inflation defines a mass scale, which turns out to be 2 \times 10^{16}~GeV for the observed value of r. This is a very interesting finding because this mass scale is not so far below the Planck scale, where quantum gravity kicks in, and is in fact pretty close to theoretical estimates of the unification scale in supersymmetric grand unified theories. If this mass scale were a factor of 2 smaller, then r would be smaller by a factor of 16, and hence much harder to detect.
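If you want to check the arithmetic, here is the estimate as a small Python sketch (my addition, not part of the in-flight notes), keeping the crude conventions of this post with every O(1) factor dropped; the more careful treatment moves the answer up to the 2 \times 10^{16} GeV quoted above:

# Crude estimate of the inflation energy scale V^(1/4),
# using delta_T^2 ~ V / m_P^4 and r = delta_T^2 / delta_S^2,
# with every O(1) factor dropped, as in the text.
m_P = 2.4e18      # reduced Planck mass in GeV
delta_S = 1e-5    # scalar perturbation amplitude at horizon crossing
r = 0.2           # tensor-to-scalar ratio reported by BICEP2
V = r * delta_S**2 * m_P**4          # energy density in GeV^4
print("V^(1/4) ~ %.1e GeV" % (V ** 0.25))   # ~5e15 GeV, an O(1) factor below 2e16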

Rolling, rolling, rolling, …

Using \delta_S^2 \sim H^4/\dot\phi^2, we can express r as

r = \frac{\delta_T^2}{\delta_S^2}\sim \frac{\dot\phi^2}{m_P^2 H^2}.

It is convenient to measure time in units of the number N = H t of e-foldings of inflation, in terms of which we find

\frac{1}{m_P^2} \left(\frac{d\phi}{dN}\right)^2\sim r.

Now, we know that for inflation to explain the smoothness of the universe we need N larger than 50, and if we assume that the inflaton rolls at a roughly constant rate during N e-foldings, we conclude that, while rolling, the change in the inflaton field is

\frac{\Delta \phi}{m_P} \sim N \sqrt{r}.

This is our third important conclusion — the inflaton field had to roll a long, long way during inflation — it changed by much more than the Planck scale! Putting in the O(1) factors we have left out reduces the required amount of rolling by about a factor of 3, but we still conclude that the rolling was super-Planckian if r\sim 0.2. That’s curious, because when the scalar field strength is super-Planckian, we expect the kind of effective field theory we have been implicitly using to be a poor approximation because quantum gravity corrections are large. One possible way out is that the inflaton might have rolled round and round in a circle instead of in a straight line, so the field strength stayed sub-Planckian even though the distance traveled was super-Planckian.
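Putting in the numbers from above, N \sim 50 and r \sim 0.2 give

\frac{\Delta \phi}{m_P} \sim 50 \sqrt{0.2} \approx 22,

or roughly 7 after the factor-of-3 reduction just mentioned; either way, comfortably super-Planckian.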

Spectral tilt

As the inflaton rolls, the potential energy, and hence also the Hubble constant H, change during inflation. That means that both the scalar and tensor fluctuations have a strength which is not quite independent of length scale. We can parametrize the scale dependence in terms of how the fluctuations change per e-folding of inflation, which is equivalent to the change per logarithmic length scale and is called the “spectral tilt.”

To keep things simple, let’s suppose that the rate of rolling is constant during inflation, at least over the length scales for which we have data. Using \delta_S^2 \sim H^4/\dot\phi^2, and assuming \dot\phi is constant, we estimate the scalar spectral tilt as

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim - \frac{4 \dot H}{H^2}.

Using \delta_T^2 \sim H^2/m_P^2, we conclude that the tensor spectral tilt is half as big.
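Explicitly, since \delta_T^2 \sim H^2/m_P^2,

-\frac{1}{\delta_T^2}\frac{d\delta_T^2}{d N} \sim -\frac{2 \dot H}{H^2},

which is half the scalar tilt found above.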

From H^2 \sim V/m_P^2, we find

\dot H \sim \frac{1}{2} \dot \phi \frac{V'}{V} H,

and using \dot \phi \sim -V'/H we find

-\frac{1}{\delta_S^2}\frac{d\delta_S^2}{d N} \sim \frac{V'^2}{H^2V}\sim m_P^2\left(\frac{V'}{V}\right)^2\sim \left(\frac{V}{m_P^4}\right)\left(\frac{m_P^6 V'^2}{V^3}\right)\sim \delta_T^2 \delta_S^{-2}\sim r.

Putting in the numbers more carefully we find a scalar spectral tilt of r/4 and a tensor spectral tilt of r/8.

This is our last important conclusion: A relatively large value of r means a significant spectral tilt. In fact, even before the BICEP2 results, the CMB anisotropy data already supported a scalar spectral tilt of about 0.04, which suggested something like r \sim 0.16. The BICEP2 detection of the tensor fluctuations (if correct) has confirmed that suspicion.

Summing up

If you have stuck with me this far, and you haven’t seen this stuff before, I hope you’re impressed. Of course, everything I’ve described can be done much more carefully. I’ve tried to convey, though, that the emerging story seems to hold together pretty well. Compared to last week, we have stronger evidence now that inflation occurred, that the mass scale of inflation is high, and that the scalar and tensor fluctuations produced during inflation have been detected. One prediction is that the tensor fluctuations, like the scalar ones, should have a notable spectral tilt, though a lot more data will be needed to pin that down.

I apologize to the experts again, for the sloppiness of these arguments. I hope that I have at least faithfully conveyed some of the spirit of inflation theory in a way that seems somewhat accessible to the uninitiated. And I’m sorry there are no references, but I wasn’t sure which ones to include (and I was too lazy to track them down).

It should also be clear that much can be done to sharpen the confrontation between theory and experiment. A whole lot of fun lies ahead.

Added notes (3/25/2014):

Okay, here’s a good reference, a useful review article by Baumann. (I found out about it on Twitter!)

From Baumann’s lectures I learned a convenient notation. The rolling of the inflaton can be characterized by two “potential slow-roll parameters” defined by

\epsilon = \frac{m_P^2}{2}\left(\frac{V'}{V}\right)^2,\quad \eta = m_P^2\left(\frac{V''}{V}\right).

Both parameters are small during slow rolling, but the relationship between them depends on the shape of the potential. My crude approximation (\epsilon = \eta) would hold for a quadratic potential.
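The check takes one line: for V = \frac{1}{2} m^2 \phi^2 we have V'/V = 2/\phi and V''/V = 2/\phi^2, hence

\epsilon = \frac{m_P^2}{2}\left(\frac{2}{\phi}\right)^2 = \frac{2 m_P^2}{\phi^2} = \eta,

and both parameters are small precisely when the field is super-Planckian, \phi \gg m_P, consistent with the rolling estimate earlier in the post.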

We can express the spectral tilt (as I defined it) in terms of these parameters, finding 2\epsilon for the tensor tilt, and 6 \epsilon - 2\eta for the scalar tilt. To derive these formulas it suffices to know that \delta_S^2 is proportional to V^3/V'^2, and that \delta_T^2 is proportional to H^2; we also use

3H\dot \phi = -V', \quad 3H^2 = V/m_P^2,

keeping factors of 3 that I left out before. (As a homework exercise, check these formulas for the tensor and scalar tilt.)

It is also easy to see that r is proportional to \epsilon; it turns out that r = 16 \epsilon. To get that factor of 16 we need more detailed information about the relative size of the tensor and scalar fluctuations than I explained in the post; I can’t think of a handwaving way to derive it.

We see, though, that the conclusion that the tensor tilt is r/8 does not depend on the details of the potential, while the relation between the scalar tilt and r does depend on the details. Nevertheless, it seems fair to claim (as I did) that, already before we knew the BICEP2 results, the measured nonzero scalar spectral tilt indicated a reasonably large value of r.

Once again, we’re lucky. On the one hand, it’s good to have a robust prediction (for the tensor tilt). On the other hand, it’s good to have a handle (the scalar tilt) for distinguishing among different inflationary models.

One last point is worth mentioning. We have set Planck’s constant \hbar equal to one so far, but it is easy to put the powers of \hbar back in using dimensional analysis (we’ll continue to assume the speed of light c is one). Since Newton’s constant G has the dimensions of length/energy, and the potential V has the dimensions of energy/volume, while \hbar has the dimensions of energy times length, we see that

\delta_T^2 \sim \hbar G^2V.
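As a quick dimensional check, multiply the dimensions listed above:

[\hbar]\,[G]^2\,[V] = (\text{energy} \cdot \text{length}) \left(\frac{\text{length}}{\text{energy}}\right)^2 \frac{\text{energy}}{\text{length}^3} = 1,

so \delta_T^2 is dimensionless and vanishes linearly with \hbar.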

Thus the production of gravitational waves during inflation is a quantum effect, which would disappear in the limit \hbar \to 0. Likewise, the scalar fluctuation strength \delta_S^2 is also O(\hbar), and hence also a quantum effect.

Therefore the detection of primordial gravitational waves by BICEP2, if correct, confirms that gravity is quantized just like the other fundamental forces. That shouldn’t be a surprise, but it’s nice to know.

My 10 biggest thrills

Wow!

Evidence for gravitational waves produced during cosmic inflation. BICEP2 results for the ratio r of gravitational wave perturbations to density perturbations, and the density perturbation spectral tilt n.

Like many physicists, I have been reflecting a lot the past few days about the BICEP2 results, trying to put them in context. Other bloggers have been telling you all about it (here, here, and here, for example); what can I possibly add?

The hoopla this week reminds me of other times I have been really excited about scientific advances. And I recall some wise advice I received from Sean Carroll: blog readers like lists.  So here are (in chronological order)…

My 10 biggest thrills (in science)

This is a very personal list — your results may vary. I’m not saying these are necessarily the most important discoveries of my lifetime (there are conspicuous omissions), just that, as best I can recall, these are the developments that really started my heart pounding at the time.

1) The J/Psi from below (1974)

I was a senior at Princeton during the November Revolution. I was too young to appreciate fully what it was all about — having just learned about the Weinberg-Salam model, I thought at first that the Z boson had been discovered. But by stalking the third floor of Jadwin I picked up the buzz. No, it was charm! The discovery of a very narrow charmonium resonance meant we were on the right track in two ways — charm itself confirmed ideas about the electroweak gauge theory, and the narrowness of the resonance fit in with the then recent idea of asymptotic freedom. Theory triumphant!

2) A magnetic monopole in Palo Alto (1982)

By 1982 I had been thinking about the magnetic monopoles in grand unified theories for a few years. We thought we understood why no monopoles seemed to be around. Sure, monopoles would be copiously produced in the very early universe, but then cosmic inflation would blow them away, diluting their density to a hopelessly undetectable value. Then somebody saw one … a magnetic monopole obediently passed through Blas Cabrera’s loop of superconducting wire, producing a sudden jump in the persistent current. On Valentine’s Day!

According to then-current theory, the monopole mass was expected to be about 10^16 GeV (10 million billion times heavier than a proton). Had Nature really been so kind as to bless us with this spectacular message from a staggeringly high energy scale? It seemed too good to be true.

It was. Blas never detected another monopole. As far as I know he never understood what glitch had caused the aberrant signal in his device.

3) “They’re green!” High-temperature superconductivity (1987)

High-temperature superconductors were discovered in 1986 by Bednorz and Mueller, but I did not pay much attention until Paul Chu found one in early 1987 with a critical temperature above 77 K, the boiling point of liquid nitrogen. Then for a while the critical temperature seemed to be creeping higher and higher on an almost daily basis, eventually topping 130 K … one wondered whether it might go up, up, up forever.

It didn’t. Today 138 K still seems to be the record.

My most vivid memory is that David Politzer stormed into my office one day with a big grin. “They’re green!” he squealed. David did not mean that high-temperature superconductors would be good for the environment. He was passing on information he had just learned from Phil Anderson, who happened to be visiting Caltech: Chu’s samples were copper oxides.

4) “Now I have mine” Supernova 1987A (1987)

What was most remarkable and satisfying about the 1987 supernova in the nearby Large Magellanic Cloud was that the neutrinos released in a ten second burst during the stellar core collapse were detected here on earth, by gigantic water Cerenkov detectors that had been built to test grand unified theories by looking for proton decay! Not a truly fundamental discovery, but very cool nonetheless.

Soon after it happened some of us were loafing in the Lauritsen seminar room, relishing the good luck that had made the detection possible. Then Feynman piped up: “Tycho Brahe had his supernova, Kepler had his, … and now I have mine!” We were all silent for a few seconds, and then everyone burst out laughing, with Feynman laughing the hardest. It was funny because Feynman was making fun of his own gargantuan ego. Feynman knew a good gag, and I heard him use this line at a few other opportune times thereafter.

5) Science by press conference: Cold fusion (1989)

The New York Times was my source for the news that two chemists claimed to have produced nuclear fusion in heavy water using an electrochemical cell on a tabletop. I was interested enough to consult that day with our local nuclear experts Charlie Barnes, Bob McKeown, and Steve Koonin, none of whom believed it. Still, could it be true?

I decided to spend a quiet day in my office, trying to imagine ways to induce nuclear fusion by stuffing deuterium into a palladium electrode. I came up empty.

My interest dimmed when I heard that they had done a “control” experiment using ordinary water, had observed the same excess heat as with heavy water, and remained just as convinced as before that they were observing fusion. Later, Caltech chemist Nate Lewis gave a clear and convincing talk to the campus community debunking the original experiment.

6) “The face of God” COBE (1992)

I’m often too skeptical. When I first heard in the early 1980s about proposals to detect the anisotropy in the cosmic microwave background, I doubted it would be possible. The signal is so small! It will be blurred by reionization of the universe! What about the galaxy! What about the dust! Blah, blah, blah, …

The COBE DMR instrument showed it could be done, at least at large angular scales, and set the stage for the spectacular advances in observational cosmology we’ve witnessed over the past 20 years. George Smoot infamously declared that he had glimpsed “the face of God.” Overly dramatic, perhaps, but he was excited! And so was I.

7) “83 SNU” Gallex solar neutrinos (1992)

Until 1992 the only neutrinos from the sun ever detected were the relatively high energy neutrinos produced by nuclear reactions involving boron and beryllium — these account for just a tiny fraction of all neutrinos emitted. Fewer than expected were seen, a puzzle that could be resolved if neutrinos have mass and oscillate to another flavor before reaching earth. But it made me uncomfortable that the evidence for solar neutrino oscillations was based on the boron-beryllium side show, and might conceivably be explained just by tweaking the astrophysics of the sun’s core.

The Gallex experiment was the first to detect the lower energy pp neutrinos, the predominant type coming from the sun. The results seemed to confirm that we really did understand the sun and that solar neutrinos really oscillate. (More compelling evidence, from SNO, came later.) I stayed up late the night I heard about the Gallex result, and gave a talk the next day to our particle theory group explaining its significance. The talk title was “83 SNU” — that was the initially reported neutrino flux in Solar Neutrino Units, later revised downward somewhat.

8) Awestruck: Shor’s algorithm (1994)

I’ve written before about how Peter Shor’s discovery of an efficient quantum algorithm for factoring numbers changed my life. This came at a pivotal time for me, as the SSC had been cancelled six months earlier, and I was growing pessimistic about the future of particle physics. I realized that observational cosmology would have a bright future, but I sensed that theoretical cosmology would be dominated by data analysis, where I would have little comparative advantage. So I became a quantum informationist, and have not regretted it.

9) The Higgs boson at last (2012)

The discovery of the Higgs boson was exciting because we had been waiting soooo long for it to happen. Unable to stream the live feed of the announcement, I followed developments via Twitter. That was the first time I appreciated the potential value of Twitter for scientific communication, and soon after I started to tweet.

10) A lucky universe: BICEP2 (2014)

Many past experiences prepared me to appreciate the BICEP2 announcement this past Monday.

I first came to admire Alan Guth’s distinctive clarity of thought in the fall of 1973 when he was the instructor for my classical mechanics course at Princeton (one of the best classes I ever took). I got to know him better in the summer of 1979 when I was a graduate student, and Alan invited me to visit Cornell because we were both interested in magnetic monopole production in the very early universe. Months later Alan realized that cosmic inflation could explain the isotropy and flatness of the universe, as well as the dearth of magnetic monopoles. I recall his first seminar at Harvard explaining his discovery. Steve Weinberg had to leave before the seminar was over, and Alan called as Steve walked out, “I was hoping to hear your reaction.” Steve replied, “My reaction is applause.” We all felt that way.

I was at a wonderful workshop in Cambridge during the summer of 1982, where Alan and others made great progress in understanding the origin of primordial density perturbations produced from quantum fluctuations during inflation (Bardeen, Steinhardt, Turner, Starobinsky, and Hawking were also working on that problem, and they all reached a consensus by the end of the three-week workshop … meanwhile I was thinking about the cosmological implications of axions).

I also met Andrei Linde at that same workshop, my first encounter with his mischievous grin and deadpan wit. (There was a delegation of Russians, who split their time between Xeroxing papers and watching the World Cup on TV.) When Andrei visited Caltech in 1987, I took him to Disneyland, and he had even more fun than my two-year-old daughter.

During my first year at Caltech in 1984, Mark Wise and Larry Abbott told me about their calculations of the gravitational waves produced during inflation, which they used to derive a bound on the characteristic energy scale driving inflation, a few times 10^16 GeV. We mused about whether the signal might turn out to be detectable someday. Would Nature really be so kind as to place that mass scale below the Abbott-Wise bound, yet high enough (above 10^16 GeV) to be detectable? It seemed unlikely.

Last week I caught up with the rumors about the BICEP2 results by scanning my Twitter feed on my iPad, while still lying in bed during the early morning. I immediately leapt up and stumbled around the house in the dark, mumbling to myself over and over again, “Holy Shit! … Holy Shit! …” The dog cast a curious glance my way, then went back to sleep.

Like millions of others, I was frustrated Monday morning, trying to follow the live feed of the discovery announcement broadcast from the hopelessly overtaxed Center for Astrophysics website. I was able to join in the moment, though, by following on Twitter, and I indulged in a few breathless tweets of my own.

Many of Andrew Lange’s friends have been thinking a lot about him these past few days. Andrew had been the leader of the BICEP team (current senior team members John Kovac and Chao-Lin Kuo were Caltech postdocs under him in the mid-2000s). One day in September 2007 he sent me an unexpected email, with the subject heading “the bard of cosmology.” Having discovered on the Internet a poem I had written to introduce a seminar by Craig Hogan, Andrew wrote:

“John,

just came across this – I must have been out of town for the event.

l love it.

it will be posted prominently in our lab today (with “LISA” replaced by “BICEP”, and remain our rallying cry till we detect the B-mode.

have you set it to music yet?

a”

I lifted a couplet from that poem for one of my tweets (while rumors were swirling prior to the official announcement):

We’ll finally know how the cosmos behaves
If we can detect gravitational waves.

Assuming the BICEP2 measurement r ~ 0.2 is really a detection of primordial gravitational waves, we have learned that the characteristic mass scale during inflation is an astonishingly high 2 × 10^16 GeV. Were it a factor of 2 smaller, the signal would have been far too small to detect in current experiments. This time, Nature really is on our side, eagerly revealing secrets about physics at a scale far, far beyond what we will ever explore using particle accelerators. We feel lucky.

We physicists can never quite believe that the equations we scrawl on a notepad actually have something to do with the real universe. You would think we’d be used to that by now, but we’re not — when it happens we’re amazed. In my case, never more so than this time.

The BICEP2 paper, a historic document (if the result holds up), ends just the way it should:

“We dedicate this paper to the memory of Andrew Lange, whom we sorely miss.”