About shaunmaguire

I'm a PhD student working in quantum information at Caltech. It's astonishing that they gave the keys to this blog to hooligans like myself.

The singularity is not near: the human brain as a Boson sampler?

Ever since the movie Transcendence came out, it seems like the idea of the ‘technological singularity’ has been in the air. Maybe it’s because I run in an unorthodox circle of deep thinkers, but over the past couple of months, I’ve been roped into three conversations related to this topic. The conversations usually end with some version of “ah shucks, machine learning is developing at a fast rate, so we are all doomed. And have you seen those deep learning videos? Computers are learning to play 35-year-old video games?! Put this on an exponential trend and we are D00M3d!”


Computers are now learning the rules of this game, from visual input only, and then playing it optimally. Are we all doomed?

So what is the technological singularity? My personal translation is: are we on the verge of narcissistic flesh-eating robots stealing our lunch money while we commute to the ‘special school for slow sapiens’?

This is an especially hyperbolic view, and I want to be careful to distinguish ‘machine learning’ from ‘artificial consciousness.’ The former seems poised for explosive growth but the latter seems to require breakthroughs in our understanding of the fundamental science. The two concepts are often equated when defining the singularity, or even artificial intelligence, but I think it’s important to keep them apart. Without distinguishing them, people sometimes make the faulty association: machine_learning_progress=>AI_progress=>artificial_consciousness_progress.

I’m generally an optimistic person, but on this topic, I’m especially optimistic about humanity’s status as machine overlords for at least the next ~100 years. Why am I so optimistic? Quantum information (QI) theory has a secret weapon. And that secret weapon is obviously Scott Aaronson (and his brilliant friends+colleagues+sidekicks; especially Alex Arkhipov in this case.) Over the past few years they have done absolutely stunning work related to understanding the computational complexity of linear optics. They colloquially call this work Boson sampling.

What I’m about to say is probably extremely obvious to most people in the QI community, but I’ve had conversations with exquisitely well educated people–including a Nobel Laureate–and very few people outside of QI seem to be aware of Aaronson and Arkhipov’s (AA’s) results. Here’s a thought experiment: does a computer have all the hardware required to simulate the human brain? For a long time, many people thought yes, and they even created a more general hypothesis called the “extended Church-Turing hypothesis.”

An interdisciplinary group of scientists has long speculated that quantum mechanics may stand as an obstruction towards this hypothesis. In particular, it’s believed that quantum computers would be able to efficiently solve some problems that are hard for a classical computer. These results led people, possibly Roger Penrose most notably, to speculate that consciousness may leverage these quantum effects. However, for many years, there was a huge gap between quantum experiments and the biology of the human brain. If I ever broached this topic at a dinner party, my biologist friends would retort: “but the brain is warm and wet, good luck managing decoherence.” And this seems to be a valid argument against the brain as a universal quantum computer. However, one of AA’s many breakthroughs is that they paved the way towards showing that a rather elementary physical system can gain speed-ups on certain classes of problems over classical computers. Maybe the human brain has a Boson sampling module?

More specifically, AA’s physical setup involves being able to: generate identical photons; send them through a network of beamsplitters, phase shifters and mirrors; and then count the number of photons in each mode through ‘nonadaptive’ measurements. The probability of each measurement outcome in this setup is given by the squared magnitude of the permanent of a matrix, and permanents are known to be hard to compute classically. AA showed that if there exists a polynomial-time classical algorithm which samples from the same probability distribution, then the polynomial hierarchy would collapse to the third level (this last statement would be very bad for theoretical computer science and therefore for humans; ergo probably not true.) I should also mention that when I learned the details of these results, during Scott’s lectures this past January at the Israeli Institute of Advanced Studies’ Winter School in Theoretical Physics, there was one step in the proof which was not rigorous. Namely, they rely on a conjecture in random matrix theory–but at least they have simulations indicating the conjecture should be true.
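To make the role of the permanent a bit more concrete, here is a minimal Python sketch (my own illustration, not AA's construction). It computes a permanent straight from the definition, which takes on the order of n! steps, and applies it to the smallest interesting case: two identical photons entering a 50:50 beamsplitter, one per input port. The function name `permanent` and the particular beamsplitter matrix are assumptions chosen for the example.

```python
from itertools import permutations
import math


def permanent(matrix):
    """Permanent from the definition: like a determinant, but with no minus signs.

    This brute-force loop visits all n! permutations, so it is only usable
    for very small matrices.
    """
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


# Assumed example: a 50:50 beamsplitter acting on two optical modes.
s = 1 / math.sqrt(2)
beamsplitter = [[s, s], [s, -s]]

# One photon in each input port; probability of seeing one photon in each
# output port is |permanent|^2 of the transfer matrix.
coincidence_probability = abs(permanent(beamsplitter)) ** 2
print(coincidence_probability)
```

The two terms of the permanent cancel, so the two photons never leave through different ports. For a network with n photons the same recipe needs the permanent of an n-by-n matrix, and that is where the classical cost blows up.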

Nitty gritty details aside, I find the possibility that this simple system is gaining a speed-up over classical computers compelling in the conversation about consciousness. Especially considering that finding permanents is actually useful for some combinatorics problems. When you combine this with Nature’s mischievous manner of finding ways to use the tools available to it, it seems plausible to me that the brain is using something like Boson sampling for at least one non-trivial task towards consciousness. If not Boson sampling, then maybe ‘Fermion smashing’ or ‘minimal surface finding’ or some other crackpottery words I’m coming up with on the fly. The point is, this result opens a can of worms.

AA’s results have breathed new life into my optimism towards humanity’s ability to rule the lands and interwebs for at least the next few decades. Or until some brilliant computer scientist proves that human consciousness is in P. If nothing else, it’s a fun topic for wild dinner party speculation.

Ten reasons why black holes exist

I spent the past two weeks profoundly confused. I’ve been trying to get up to speed on this firewall business and I wanted to understand the picture below.

Much confuse. Such lost. [Is doge out of fashion now? I wouldn't know because I've been trapped in a black hole!]

[Technical paragraph that you can skip.] I’ve been trying to understand why the picture on the left is correct, even though my intuition said the middle picture should be (intuition should never be trusted when thinking about quantum gravity.) The details of these pictures are technical and tangential to this post, but the brief explanation is that these pictures are called Penrose diagrams and they provide an intuitive way to think about the time dynamics of black holes. The two diagrams on the left represent the same physics as the schematic diagram on the right. I wanted to understand why during Hawking radiation, the radial momentum for partner modes is in the same direction. John Preskill gave me the initial reasoning, that “radial momentum is not an isometry of Schwarzschild or Rindler geometries,” then I struggled for a few weeks to unpack this, and then Dan Harlow rescued me with some beautiful derivations that make it crystal clear that the picture on the left is indeed correct. I wanted to understand this because if the central picture is correct, then it would be hard to catch up to an infalling Hawking mode and therefore to verify firewall weirdness. The images above are simple enough, but maybe the image below will give you a sense for how much of an uphill battle this was!


This pretty much sums up my last two weeks (with the caveat that each of these scratch sheets is double sided!) Or in case you wanted to know what a theoretical physicist does all day.

After four or five hours of maxing out my brain, it starts to throb. For the past couple of weeks, after breaking my brain with firewalls each day, I’ve been switching gears and reading about black hole astronomy (real-life honest-to-goodness science with data!) Beyond wanting to know the experimental state-of-the-art related to the fancy math I’ve been thinking about, I also had the selfish motivation that I wanted to do some PR maintenance after Nature’s headline: “Stephen Hawking: ‘There are no black holes’.” I found this headline infuriating when Nature posted it back in January. When taken out of context, this quote makes it seem like Stephen Hawking was saying “hey guys, my bad, we’ve been completely wrong all this time. Turn off the telescopes.” When in reality what he was saying was more like: “hey guys, I think this really hard modern firewall paradox is telling us that we’ve misunderstood an extremely subtle detail and we need to make corrections on the order of a few Planck lengths, but it matters!” When you combine this sensationalism with Nature’s lofty credibility, the result is that even a few of my intelligent scientist peers have interpreted this as the non-existence of astrophysical black holes. Not to mention that it opens a crack for the news media to say things like: ‘if even Stephen Hawking has been wrong all this time, then how can we possibly trust the rest of this scientist lot, especially related to climate change?’ So brain throbbing + sensationalism => learning black hole astronomy + PR maintenance.

Before presenting the evidence, I should wave my hands about what we’re looking for. You have all heard about black holes. They are objects where so much mass gets concentrated in such a small volume that Einstein’s general theory of relativity predicts that once an object passes beyond a certain distance (called the event horizon), then said object will never be able to escape, and must proceed to the center of the black hole. Even photons cannot escape once they pass beyond the event horizon (except when you start taking quantum mechanics into account, but this is a small correction which we won’t focus on here.) All of our current telescopes collect photons, and as I just mentioned, when photons get close to a black hole, they fall in, so this means a detection with current technology will only be indirect. What are these indirect detections we have made? Well, general relativity makes numerous predictions about black holes. After we confirm enough of these predictions to a high enough precision, and without a viable alternative theory, we can safely conclude that we have detected black holes. This is similar to how many other areas of science work, like particle physics finding new particles through detecting a particle’s decay products, for example.

Without further ado, I hope the following experimental evidence will convince you that black holes permeate our universe (and if not black holes, then something even weirder and more interesting!)

1. Sgr A*: There is overwhelming evidence that there is a supermassive black hole at the center of our galaxy, the Milky Way. As a quick note, most of the black holes we have detected are broken into two categories, solar mass, where they are only a few times more massive than our sun (5-30 solar masses), or supermassive, where the mass is about 10^5-10^{10} solar masses. Some of the most convincing evidence comes from the picture below. Andrea Ghez and others tracked the orbits of several stars around the center of the Milky Way for over twenty years. We have learned that these stars orbit around a point-like object with a mass on the order of 4\times 10^6 solar masses. Measurements in the radio spectrum show that there is a radio source located in the same location which we call Sagittarius A* (Sgr A*). Sgr A* is moving at less than 1km/s and has a mass of at least 10^5 solar masses. These bounds make it pretty clear that Sgr A* is the same object as what is at the focus of these orbits. A radio source is exactly what you would expect for this system because as dust particles get pulled towards the black hole, they collide and friction causes them to heat up, and hot objects radiate photons. These arguments together make it pretty clear that Sgr A* is a supermassive black hole at the center of the Milky Way!


What are you looking at? This plot shows the orbits of a few stars around the center of our galaxy, tracked over 17 years!

2. Orbit of S2: During a recent talk that Andrea Ghez gave at Caltech, she said that S2 is “her favorite star.” S2 is a 15 solar mass star located near the black hole at the center of our galaxy. S2’s distance from this black hole is only about four times the distance from Neptune to the Sun (at the closest point in its orbit), and its orbital period is only 15 years. The Keck telescopes on Mauna Kea have followed almost two complete orbits of S2. This piece of evidence is redundant compared to point 1, but it’s such an amazing technological feat that I couldn’t resist including it.


We’ve followed S2’s complete orbit. Is it orbiting around nothing? Something exotic that we have no idea about? Or much more likely around a black hole.

3. Numerical studies: astrophysicists have done numerous numerical simulations which provide a different flavor of test. Christian Ott at Caltech is pretty famous for these types of studies.


Image from a numerical simulation that Christian Ott and his student Evan O’Connor performed.

4. Cyg A: Cygnus A is a galaxy located in the Cygnus constellation. It is an exceptionally bright radio source. As I mentioned in point 1, as dust falls towards a black hole, friction causes it to heat up and then hot objects radiate away photons. The image below demonstrates this. We are able to use the Eddington limit to convert luminosity measurements into estimates of the mass of Cyg A. Not necessarily in the case of Cyg A, but in the case of its cousins Active Galactic Nuclei (AGNs) and Quasars, we are also able to put bounds on their sizes. These two things together show that there is a huge amount of mass trapped in a small volume, which is therefore probably a black hole (alternative models can usually be ruled out.)


There is a supermassive black hole at the center of this image which powers the rest of this action! The black hole is spinning and it emits relativistic jets along its axis of rotation. The blobs come from the jets colliding with the intergalactic medium.

5. AGNs and Quasars: these are bright sources which are powered by supermassive black holes. Arguments similar to those used for Cyg A make us confident that they really are powered by black holes and not some alternative.

6. X-ray binaries: astronomers have detected ~20 stellar mass black holes by finding pairs consisting of a star and a black hole, where the star is close enough that the black hole is sucking in its mass. This leads to accretion which leads to the emission of X-Rays which we detect on Earth. Cygnus X-1 is a famous example of this.

7. Water masers: Messier 106 is the quintessential example.

8. Gamma ray bursts: most gamma ray bursts occur when a rapidly spinning high mass star goes supernova (or hypernova) and leaves a neutron star or black hole in its wake. However, it is believed that some of the “long” duration gamma ray bursts are powered by accretion around rapidly spinning black holes.
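Two of the mass estimates in the list above (the stellar orbits in points 1 and 2, and the Eddington limit in point 4) boil down to short formulas, so here is a rough Python sketch of both. The numbers fed in are assumptions for illustration only: the post quotes S2's 15-year period but not its semi-major axis, so the 1000 AU below is a round figure I'm assuming, and the luminosity is a made-up placeholder rather than a measurement of Cyg A. The function names are hypothetical, and the physical constants are rounded.

```python
import math

# Rounded physical constants in SI units.
G = 6.674e-11          # gravitational constant
M_P = 1.6726e-27       # proton mass, kg
C = 2.998e8            # speed of light, m/s
SIGMA_T = 6.6524e-29   # Thomson cross-section, m^2
M_SUN = 1.989e30       # solar mass, kg


def enclosed_mass_solar(semi_major_axis_au, period_years):
    """Kepler's third law in units of AU, years and solar masses: M = a^3 / T^2.

    Assumes the orbiting star's own mass is negligible next to the central mass.
    """
    return semi_major_axis_au ** 3 / period_years ** 2


def eddington_luminosity_watts(mass_solar):
    """Luminosity at which radiation pressure on ionized hydrogen balances gravity."""
    return 4 * math.pi * G * (mass_solar * M_SUN) * M_P * C / SIGMA_T


def min_mass_from_luminosity(luminosity_watts):
    """Smallest mass (in solar masses) that can sustain a given luminosity."""
    return luminosity_watts / eddington_luminosity_watts(1.0)


# Assumed inputs: 1000 AU is a round illustrative figure, 15 years is from the post.
s2_mass = enclosed_mass_solar(1000.0, 15.0)

# Hypothetical luminosity of 1e39 W, NOT a measured value for any real source.
example_min_mass = min_mass_from_luminosity(1e39)

print(f"mass enclosed by the orbit: {s2_mass:.2e} solar masses")
print(f"Eddington lower bound on mass: {example_min_mass:.2e} solar masses")
```

With these assumed inputs the orbital estimate comes out at roughly 4.4 million solar masses, which is in line with the figure of about 4\times 10^6 quoted in point 1.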

That’s only eight reasons but I hope you’re convinced that black holes really exist! To round out this list to include ten things, here are two interesting open questions related to black holes:

1. Firewalls: I mentioned this paradox at the beginning of this post. This is the cutting edge of quantum gravity which is causing hundreds of physicists to pull their hair out!

2. Feedback: there is an extremely strong correlation between the size of a galaxy’s supermassive black hole and many of the other properties in the galaxy. This connection was only realized about a decade ago and trying to understand how the black hole (which has a mass much smaller than the total mass of the galaxy) affects galaxy formation is an active area of research in astrophysics.

In addition to everything mentioned above, I want to emphasize that most of these results are only from the past decade. Not to mention that we seem to be close to the dawn of gravitational wave astronomy which will allow us to probe black holes more directly. There are also exciting instruments that have recently come online, such as NuSTAR. In other words, this is an extremely exciting time to be thinking about black holes–both from observational and theoretical perspectives–we have data and a paradox! In conclusion, black holes exist. They really do. And let’s all make a pact to read critically in the 21st century!

Cool resource from Sky and Telescope.

[* I want to thank my buddy Kaes Van't Hof for letting me crash on his couch in NYC last week, which is where I did most of this work. ** I also want to thank Dan Harlow for saving me months of confusion by sharing a draft of his notes from his course on firewalls at the Israeli Institute for Advanced Study's winter school in theoretical physics.]

Hacking nature: loopholes in the laws of physics

I spent my childhood hacking computers. When I was seven, my cousin showed up for Thanksgiving with a box filled with computer parts and we built my first computer. I got into competitive computer gaming around age eleven, and hacking was a natural extension of these activities. Then when I was sixteen, after doing poorly at a Counterstrike tournament, I decided that I should probably apply myself to other things. Needless to say, my parents were thrilled. So that’s when I bought my first computer (instead of building my own), which for deliberate but now antediluvian reasons was a Mac. A few years later, when I was taking CS 106 at Stanford, I was the first student in the course’s history whose reason for buying a Mac was “so that I couldn’t play computer games!” And now you know the story of my childhood.

The hacker mentality is quite different than the norm and my childhood trained me to look at absolutist laws as opportunities to find loopholes (of course only when legal and socially responsible!) I’ve applied this same mentality as I’ve been doing physics and I’d like to share with you some of the loopholes that I’ve gathered.


Scharnhorst effect enables light to travel faster than in vacuum (c=299,792,458 m/s): this is about the granddaddy of all laws, that nothing can travel faster than light in a vacuum! This effect is the most controversial on my list, because it hasn’t yet been experimentally verified, but it seems obvious with the right picture in mind. Most people’s mental model for light traveling in a vacuum is of little particles/waves called photons traveling through empty space. However, the vacuum is not empty! It is filled with pairs of virtual particles which momentarily flit into existence. Interactions with these virtual particles create a small amount of ‘resistance’ as photons zoom through the vacuum (photons get absorbed into virtual electron-positron pairs and then spit back out as photons ad infinitum.) Thus, if we could somehow reduce the rate at which virtual particles are created, photons would interact less strongly with the vacuum, and would be able to travel marginally faster than c. But this is exactly what leads to the Casimir effect: the experimentally verified fact that if you take two mirrors and put them ~10 nanometers apart, then they will attract each other because there are more virtual particles created outside the cavity than inside [low momenta virtual modes are inaccessible because the uncertainty principle requires \Delta x \cdot \Delta p= 10nm\cdot\Delta p \geq \hbar/2.] This effect is extremely small, only predicting that light would travel one part in 10^{36} faster than c. However, it should remind us all to deeply question assumptions.

This first loophole used quantum effects to beat a relativistic bound, but the next few loopholes are purely quantum, and are mainly related to that most quantum of all limits, the Heisenberg uncertainty principle.

Smashing the standard quantum limit (SQL) with squeezed measurements: the Heisenberg uncertainty principle tells us that there is a fundamental tradeoff in nature: the more precise your information about an object’s position, the less precise your knowledge about its momentum. Or vice versa, or replace x and p with E and t, or any other conjugate variables. This uncertainty principle is oftentimes written as \Delta x\cdot \Delta p \geq \hbar/2. For a variety of reasons, in the early days of quantum mechanics, it was hard enough to imagine creating a state with \Delta x \cdot \Delta p = \hbar/2, but there was some hope because this is obtained in the ground state of a quantum harmonic oscillator. In this case, we have \Delta x = \Delta p = \sqrt{\hbar/2}. However, it was harder still to imagine creating states with \Delta x < \sqrt{\hbar/2}; these states would be said to ‘go beyond the standard quantum limit’ (SQL). Over the intervening years, not only have we figured out how to go beyond the SQL using squeezed coherent states, but this is actually essential in some of our most exciting current experiments, like LIGO.
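For readers who want the squeezing trade-off in symbols, here is a short worked statement. It assumes units in which the oscillator's mass and frequency are both one (which is what makes \Delta x = \Delta p = \sqrt{\hbar/2} for the ground state), and it introduces a squeeze parameter r, a symbol that does not appear elsewhere in this post. For a vacuum state squeezed along x:

```latex
\Delta x = \sqrt{\frac{\hbar}{2}}\, e^{-r}, \qquad
\Delta p = \sqrt{\frac{\hbar}{2}}\, e^{+r}, \qquad
\Delta x \cdot \Delta p = \frac{\hbar}{2}.
```

For any r > 0 the position uncertainty drops below \sqrt{\hbar/2} while the product still saturates the Heisenberg bound. The uncertainty principle is never violated; the noise is just pushed into the variable you care less about.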

LIGO is an incredibly ambitious experiment which has been talked about multiple times on this blog. It is trying to usher in a new era of astronomy–moving beyond detecting photons–to detecting gravitational waves, ripples in spacetime which are generated as exceptionally massive objects merge, such as when two black holes collide. The effects of these waves on our local spacetime as they travel past Earth are minuscule, on the scale of 10^{-18}m, which is about one thousand times shorter than the ‘diameter’ of a proton, and is the same order of magnitude as \sqrt{\hbar/2}. Remarkably, LIGO has exploited squeezed light to demonstrate sensitivities beyond the SQL. LIGO expects to start detecting gravitational waves on a frequent basis as its upgrades, dubbed ‘advanced LIGO,’ are completed over the next few years.

Compressed sensing beats Nyquist-Shannon: let’s play a game. Imagine I’m sending you a radio signal. How often do you need to measure the signal in order to be able to reconstruct it perfectly? The Nyquist-Shannon sampling theorem is a path-breaking result which Claude Shannon proved in 1949. If you measure at least twice as often as the highest frequency, then you are guaranteed perfect recovery of the signal. This incredibly profound result laid the foundation for modern communications. Also, it is important to realize that your signal can be much more general than simply radio waves, such as with a signal of images. This theorem is a sufficient condition for reconstruction, but is it necessary? Not even close. And it took us over 50 years to understand this in generality.
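Here is a small Python sketch of what goes wrong when you sample too slowly, which is the flip side of the sampling theorem. The frequencies and sample rate are arbitrary values I picked for illustration, and `sample_sine` is a hypothetical helper name. A 7 Hz sine sampled at 10 Hz (so above the rate/2 = 5 Hz limit) produces exactly the same samples as a -3 Hz sine, so the two signals cannot be told apart from the measurements.

```python
import math


def sample_sine(freq_hz, rate_hz, n_samples):
    """Sample sin(2*pi*f*t) at times t = k / rate_hz for k = 0 .. n_samples-1."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n_samples)]


rate = 10.0                                # samples per second (assumed)
too_fast = sample_sine(7.0, rate, 20)      # 7 Hz exceeds rate / 2 = 5 Hz
alias = sample_sine(7.0 - rate, rate, 20)  # the alias at 7 - 10 = -3 Hz

# Largest disagreement between the two sampled sequences.
max_gap = max(abs(a - b) for a, b in zip(too_fast, alias))
print(max_gap)
```

The gap is zero up to floating point rounding. Sampling at more than twice the highest frequency removes this ambiguity, which is what the theorem guarantees.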

Compressed sensing was proposed between 2004 and 2006 by Emmanuel Candes, David Donoho and Terry Tao with important early contributions by Justin Romberg. I should note that Candes and Romberg were at Caltech during this period. The Nyquist-Shannon theorem told us that with a small amount of knowledge (a bound on the highest frequency) we could reconstruct a signal perfectly by only measuring at a rate of twice the highest frequency–instead of needing to measure continuously. Compressed sensing says that with one extra assumption, that only a sparse few of your frequencies are being used (call it 10 out of 1000), you can recover your signal with high accuracy using dramatically fewer measurements. And it turns out that this assumption is valid for a huge range of applications: enabling real-time MRIs using conventional technology or, more relevant to this blog, increasing our ability to distinguish quantum states via tomography.

Unlike the other topics in this blog post, I have never worked with compressed sensing, but my intuition goes like this: instead of measuring in the basis in which you are sparse (frequency for example), measure in a different basis. With high probability each of these measurements will pick up a little piece from each of the occupied modes. Then, to reconstruct your signal, you want to use the L0-“norm” to interpolate in such a way that you use the fewest frequency components possible. Minimizing the L0-“norm” is not efficient, so one of the major breakthroughs of compressed sensing was showing that with high probability minimizing the L1-norm recovers the L0 solution, and all of this can be done using a highly efficient linear program. However, I really shouldn’t be speculating because I’ve never invested much time into mastering this new tool, and I’m friends with a couple of the quantum state tomography authors, so maybe they’ll chime in?
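To give the L1 idea some shape, here is a toy Python sketch. Note the swap: it does not solve the linear program mentioned above. It uses a different and simpler L1 method, iterative soft-thresholding (ISTA) on an L1-penalized least-squares problem. Everything in it is an assumption made for illustration: the problem sizes, the random Gaussian measurement matrix, the penalty `lam`, the iteration count, and all function names. Treat it as a sketch of the flavor of the technique, not a tuned solver.

```python
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t; this is the step that promotes sparsity."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    residual = [p - yi for p, yi in zip(matvec(A, x), y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xi) for xi in x)


def ista(A, y, lam, iterations):
    """Iterative soft-thresholding for the L1-penalized least-squares problem."""
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so this (conservative) step size keeps the objective from increasing.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        grad = rmatvec(A, [p - yi for p, yi in zip(matvec(A, x), y)])
        x = [soft_threshold(xi - step * g, step * lam) for xi, g in zip(x, grad)]
    return x


random.seed(0)
m, n = 20, 40  # 20 measurements of a length-40 signal (assumed sizes)
A = [[random.gauss(0, 1) / m ** 0.5 for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[17] = 1.0, -1.0  # only 2 of 40 components are in use
y = matvec(A, x_true)

lam = 0.01
x_hat = ista(A, y, lam, iterations=1000)
print(sorted(range(n), key=lambda j: -abs(x_hat[j]))[:2])  # two largest entries
```

The hope is that the two largest entries of `x_hat` land on the components that were actually used, even though there are only half as many measurements as unknowns. Whether that happens for a given matrix and penalty is exactly what the compressed sensing theorems are about.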


Brahms is a cool dude. Brahms as a height map where cliffs=Gibbs phenomena=oh no! First three levels of Brahms as a Haar wavelet.

Wavelets as the mother of all bases: I previously wrote a post about the importance of choosing a convenient basis. Imagine you have an image which has a bunch of sharp contrasts, such as the outline of a person, or a horizon, or a table, basically anything. How do you store it efficiently? Due to the Gibbs phenomenon, the Fourier basis is pretty awful for these applications. Here’s another motivating problem: imagine someone plays one note on an instrument. The sound is localized in both time and frequency. The Fourier basis is also pretty awful at storing/detecting this. Wavelets to the rescue! The theory of wavelets uses some beautiful math to solve the longstanding problem of finding a basis which is localized in both position and momentum space (or very close to it.) Wavelets have profound applications, and some of my favorites include: modern image compression (JPEG 2000 onwards) is based on wavelets; Ingrid Daubechies and her colleagues used wavelets to detect forged paintings; recovering previously unrecoverable recordings of Brahms at the piano (I heard about this from Barry Simon, of Reed-Simon fame, who is currently teaching his last class ever); and even the FBI uses wavelets to compress images of fingerprints, obtaining a compression ratio of 20:1.
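The simplest wavelet, the Haar wavelet, fits in a few lines of Python, so here is a minimal sketch of why wavelets handle sharp edges so well. It uses the plain averaging convention (pairwise averages and half-differences) rather than the orthonormal one, assumes the signal length is a power of two, and the function names are my own. The test signal is a single sharp step, the kind of feature that gives the Fourier basis trouble.

```python
def haar_forward(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    current = list(signal)
    coeffs = []
    while len(current) > 1:
        averages = [(current[i] + current[i + 1]) / 2 for i in range(0, len(current), 2)]
        details = [(current[i] - current[i + 1]) / 2 for i in range(0, len(current), 2)]
        coeffs = details + coeffs  # coarser details go in front of finer ones
        current = averages
    return current + coeffs


def haar_inverse(coeffs):
    """Undo haar_forward by rebuilding each level from averages and details."""
    current = coeffs[:1]
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        rebuilt = []
        for average, detail in zip(current, details):
            rebuilt.extend([average + detail, average - detail])
        pos += len(current)
        current = rebuilt
    return current


step = [4, 4, 4, 4, 0, 0, 0, 0]  # one sharp edge in the middle
coeffs = haar_forward(step)
print(coeffs)                    # [2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(haar_inverse(coeffs))      # recovers the original step
```

Eight samples are captured by just two nonzero coefficients, and the edge is reproduced exactly with no ringing. Storing only the large coefficients is the basic idea behind wavelet compression.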

Postselection enables quantum cloning: the no-cloning theorem is well known in the field of quantum information. It says that you cannot find a machine (unitary operation U) which takes an arbitrary input state |\psi\rangle, and a known state |0\rangle, such that the machine maps |\psi\rangle \otimes |0\rangle to |\psi\rangle \otimes |\psi\rangle, thereby cloning |\psi \rangle. This is very easy to prove using the linearity of quantum mechanics. However, there are loopholes. One of the most trivial loopholes is realizing that one can take the state |\psi\rangle and perform something called unambiguous state discrimination, which either spits out exactly which state |\psi \rangle is with some probability, or otherwise spits out “I don’t know which state.” You can postselect on the unambiguous state discrimination having succeeded and prepare a unitary which clones the relevant states. Peter Shor has a comment on physics stackexchange describing this. Seth Lloyd and John Preskill outlined a less trivial version of this in their recent paper which tries to circumvent firewalls by using postselected quantum teleportation.
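For completeness, here is one standard way to spell out the linearity argument for a single qubit. The amplitudes a and b are symbols introduced just for this sketch. Suppose a machine U clones both basis states, and then feed it a superposition:

```latex
\begin{align*}
U\big(|0\rangle \otimes |0\rangle\big) &= |0\rangle \otimes |0\rangle, \qquad
U\big(|1\rangle \otimes |0\rangle\big) = |1\rangle \otimes |1\rangle, \\
U\big((a|0\rangle + b|1\rangle) \otimes |0\rangle\big)
  &= a\,|0\rangle \otimes |0\rangle + b\,|1\rangle \otimes |1\rangle
  \quad \text{(by linearity)}, \\
(a|0\rangle + b|1\rangle) \otimes (a|0\rangle + b|1\rangle)
  &= a^2\,|00\rangle + ab\,|01\rangle + ab\,|10\rangle + b^2\,|11\rangle
  \quad \text{(what cloning demands)}.
\end{align*}
```

The last two lines can agree only if ab = 0, so a machine that clones |0\rangle and |1\rangle necessarily fails on every genuine superposition of them. The loopholes described above work by giving up on the "arbitrary input, every time" part of the theorem.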

In this blog post, I’ve only described a tiny fraction of the quantum loopholes that have been discovered. If I had more space/time, the next two examples I would describe are beating classical correlations with quantum entanglement, in order to win at CHSH games, and weak measurements, along with some of the peculiar things they lead to. Beyond that, I would probably refer you to Yakir Aharonov’s amazingly fun book about quantum paradoxes.

After reading this, I hope that the next time you encounter an inviolable law of nature, you’ll apply the hacker mentality and attempt to strip it down to its essence, isolate assumptions, and potentially find a loophole. But while you’re doing this, remember that you should never argue with your mother, or with mathematics!

Defending against high-frequency attacks

It was the summer of 2008. I was 22 years old, and it was my second week working in the crude oil and natural gas options pit at the New York Mercantile Exchange (NYMEX.) My head was throbbing after two consecutive weeks of disorientation. It was like being born into a new world, but without the neuroplasticity of a young human. And then the crowd erupted. “Yeeeehawwww. YeEEEeeHaaaWWWWW. Go get ‘em cowboy.”

It seemed that everyone on the sprawling trading floor had started playing Wild Wild West and I had no idea why. After at least thirty seconds, the hollers started to move across the trading floor. They moved away 100 meters or so and then doubled back towards me. After a few meters, he finally got it, and I’m sure he learned a life lesson. Don’t be the biggest jerk in a room filled with traders, and especially, never wear triple-popped pastel-colored Lacoste shirts. This young aspiring trader had been “spurred.”

In other words, someone had made paper spurs out of trading receipts and taped them to his shoes. Go get ‘em cowboy.

I was one academic quarter away from finishing a master’s degree in statistics at Stanford University and I had accepted a full time job working in the algorithmic trading group at DRW Trading. I was doing a summer internship before finishing my degree, and after three months of working in the algorithmic trading group in Chicago, I had volunteered to work at the NYMEX. Most ‘algo’ traders didn’t want this job, because it was far-removed from our mental mathematical monasteries, but I knew I would learn a tremendous amount, so I jumped at the opportunity. And by learn, I mean, get ripped calves and triceps, because my job was to stand in place for seven straight hours updating our mathematical models on a bulky tablet PC as trades occurred.

I have no vested interests in the world of high-frequency trading (HFT). I’m currently a PhD student in the quantum information group at Caltech and I have no intentions of returning to finance. I found the work enjoyable, but not as thrilling as thinking about the beginning of the universe (what else is?) However, I do feel like the current discussion about HFT is lop-sided and I’m hoping that I can broaden the perspective by telling a few short stories.

What are the main attacks against HFT? Three of them include the evilness of: front-running markets, making money out of nothing, and instability. It’s easy to point to extreme examples of algorithmic traders abusing markets, and they regularly do, but my argument is that HFT has simply computerized age-old tactics. In this process, these tactics have become more benign and markets more stable.

Front-running markets: large oil producing nations, such as Mexico, often want to hedge their exposure to changing market prices. They do this by purchasing options. This allows them to lock in a minimum sale price, for a fee of a few dollars per barrel. During my time at the NYMEX, I distinctly remember a broker shouting into the pit: “what’s the price on DEC9 puts.” A trader doesn’t want to give away whether they want to buy or sell, because if the other traders know, then they can artificially move the price. In this particular case, this broker was known to sometimes implement parts of Mexico’s oil hedge. The other traders in the pit suspected this was a trade for Mexico because of his anxious tone, some recent geopolitical news, and the expiration date of these options.

Some confident traders took a risk and faded the market. They ended up making between $1 million and $2 million from these trades, relative to what the fair price was at that moment. I mention relative to the fair price, because Mexico ultimately received the better end of this trade. The price of oil dropped in 2009, and Mexico exercised its options, enabling it to sell its oil at a higher-than-market price. Mexico spent $1.5 billion to hedge its oil exposure in 2009.

This was an example of humans anticipating the direction of a trade and capturing millions of dollars in profit as a result. It really is profit as long as the traders can redistribute their exposure at the ‘fair’ market price before markets move too far. The analogous strategy in HFT is called “front-running the market” which was highlighted in the New York Times’ recent article “the wolf hunters of Wall Street.” The HFT version involves analyzing the prices on dozens of exchanges simultaneously, and once an order is published in the order book of one exchange, then using this demand to adjust its orders on the other exchanges. This needs to be done within a few microseconds in order to be successful. This is the computerized version of anticipating demand and fading prices accordingly. These tactics as I described them are in a grey area, but they rapidly become illegal.

Making money from nothing: arbitrage opportunities have existed for as long as humans have been trading. I’m sure an ancient trader received quite the rush when he realized for the first time that he could buy gold in one marketplace and then sell it in another, for a profit. This is only worth the trader’s efforts if he makes a profit after all expenses have been taken into consideration. One of the simplest examples in modern terms is called triangle arbitrage, and it usually involves three pairs of currencies. Currency pairs are ratios, such as USD/AUD, which tells you how many Australian dollars you receive for one US dollar. Imagine that there is a moment in time when the product of ratios \frac{USD}{AUD}\frac{AUD}{CAD}\frac{CAD}{USD} is 1.01. Then, a trader can take her USD, buy AUD, then use her AUD to buy CAD, and then use her CAD to buy USD. As long as the underlying prices didn’t change while she carried out these three trades, she would capture one cent of profit for every dollar she sent around the triangle.

After a few trades like this, the prices will equilibrate and the ratio will be restored to one. This is an example of “making money out of nothing.” Clever people have been trading on arbitrage since ancient times and it is a fundamental source of liquidity. It guarantees that the price you pay in Sydney is the same as the price you pay in New York. It also means that if you’re willing to overpay by a penny per share, then you’re guaranteed a computer will find this opportunity and your order will be filled immediately. The main difference now is that once a computer has been programmed to look for a certain type of arbitrage, then the human mind can no longer compete. This is one of the original arenas where the term “high-frequency” was used. Whoever has the fastest machines is the one who will capture the profit.
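For the programmers in the audience, here is a minimal sketch of the triangle arbitrage round trip described above. The exchange rates and the per-trade fee are made-up numbers, chosen only so that the product of the three ratios is 1.01; they are not market data.

```python
def round_trip(start_usd, aud_per_usd, cad_per_aud, usd_per_cad, fee_rate=0.0):
    """Send USD around the triangle USD -> AUD -> CAD -> USD.

    fee_rate is a hypothetical proportional cost charged on each of the
    three trades (0.0 means frictionless trading).
    """
    aud = start_usd * aud_per_usd * (1 - fee_rate)
    cad = aud * cad_per_aud * (1 - fee_rate)
    usd = cad * usd_per_cad * (1 - fee_rate)
    return usd


# Illustrative rates only: the third rate is picked so the product is 1.01.
aud_per_usd = 1.10
cad_per_aud = 0.95
usd_per_cad = 1.01 / (aud_per_usd * cad_per_aud)

# Frictionless case: one cent of profit per dollar sent around the triangle.
no_fee = round_trip(100.0, aud_per_usd, cad_per_aud, usd_per_cad)

# With a 0.5% cost per trade the same loop loses money.
with_fee = round_trip(100.0, aud_per_usd, cad_per_aud, usd_per_cad, fee_rate=0.005)

print(f"frictionless: {no_fee:.2f} USD, with fees: {with_fee:.2f} USD")
```

The second call is the "after all expenses" caveat in action: a 1% mispricing disappears once each leg of the trade costs half a percent.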

Instability: I believe that the arguments against HFT of this type have the most credibility. The concern here is that exceptional leverage creates opportunity for catastrophe. Imaginations ran wild after the Flash Crash of 2010, and even if imaginations outstripped reality, we learned much about the potential instabilities of HFT. A few questions were posed, and we are still debating the answers. What happens if market makers stop trading in unison? What happens if a programming error leads to billions of dollars in mistaken trades? Do feedback loops between algo strategies lead to artificial prices? These are reasonable questions, which are grounded in examples, and future regulation coupled with monitoring should add stability where it’s feasible.

The culture in wealth-driven industries today is appalling. However, it’s no worse in HFT than in finance more broadly and many other industries. It’s important that we dissociate our disgust in a broad culture of greed from debates about the merit of HFT. Black boxes are easy targets for blame because they don’t defend themselves. But that doesn’t mean they aren’t useful when implemented properly.

Are we better off with HFT? I’d argue a resounding yes. The primary function of markets is to allocate capital efficiently. Three of the strongest measures of the efficacy of markets lie in “bid-ask” spreads, volume and volatility. If spreads are low and volume is high, then participants are essentially guaranteed access to capital at as close to the “fair price” as possible. There is a huge academic literature on how HFT has impacted spreads and volume, and the majority of it indicates that spreads have narrowed and volume has increased. However, as alluded to above, all of these points are subtle, but in my opinion, it’s clear that HFT has increased the efficiency of markets (it turns out that computers can sometimes be helpful.) Estimates of HFT’s impact on volatility haven’t been nearly as favorable but I’d also argue these studies are more debatable. Basically, correlation is not causation, and it just so happens that our rapidly developing world is probably more volatile than the pre-HFT world of the last millennium.

We could regulate away HFT, but we wouldn’t be able to get rid of the underlying problems people point to unless we got rid of markets altogether. As with any new industry, there are aspects of HFT that should be better monitored and regulated, but we should have level heads and diverse data points as we continue this discussion. As with most important problems, I believe the ultimate solution here lies in educating the public. Or in other words, this is my plug for Python classes for all children!!

I promise that I’ll repent by writing something that involves actual quantum things within the next two weeks!

Reporting from the ‘Frontiers of Quantum Information Science’

What am I referring to with this title? It is similar to the name of this blog, but that’s not where this particular title comes from, although there is a common denominator. Frontiers of Quantum Information Science was the theme for the 31st Jerusalem winter school in theoretical physics, which takes place annually at the Israel Institute for Advanced Studies located on the Givat Ram campus of the Hebrew University of Jerusalem. The school took place from December 30, 2013 through January 9, 2014, but some of the attendees are still trickling back to their home institutions. The common denominator is that our very own John Preskill was the director of this school, co-directed by Michael Ben-Or and Patrick Hayden. John mentioned during a previous post and reiterated during his opening remarks that this is the first time the IIAS has chosen quantum information to be the topic for its prestigious advanced school, another sign of quantum information’s emergence as an important sub-field of physics. In this blog post, I’m going to do my best to recount these festivities while John protects his home from forest fires, prepares a talk for the Simons Institute’s workshop on Hamiltonian complexity, teaches his quantum information course and celebrates his 60+1 birthday.

The school was mainly targeted at physicists, but it was diversely represented. Proof of the value of this diversity came in an interaction between a computer scientist and a physicist, which led to one of the school’s most memorable moments. Both of my most memorable moments started with the talent show (I was surprised that so many talents were on display at a physics conference…) Anyways, towards the end of the show, Mateus Araújo Santos, a PhD student in Vienna, entered the stage and mentioned that he could channel “the ghost of Feynman” to serve as an oracle for NP-complete decision problems. After making this claim, people obviously turned to Scott Aaronson, hoping that he’d be able to break the oracle. However, in order for this to happen, we had to wait until Scott’s third lecture about linear optics and boson sampling the next day. You can watch Scott bombard the oracle with decision problems from 1:00-2:15 during the video from his third lecture.


Scott Aaronson grilling the oracle with a string of NP-complete decision problems! From 1:00-2:15 during this video.

The other most memorable moment was when John briefly danced Gangnam style during Soonwon Choi‘s talent show performance. Unfortunately, I thought I had this on video, but the video didn’t record. If anyone has video evidence of this, then please share!

The 10 biggest breakthroughs in physics over the past 25 years, according to us.

Making your way to the cutting edge of any field is a daunting challenge. But especially when the edge of the field is expanding; and even harder still when the rate of expansion is accelerating. John recently helped Physics World create a special 25th anniversary issue where they identified the five biggest breakthroughs in physics over the past 25 years, and also the five biggest open questions. In pure John fashion, at his group meeting on Wednesday night, he made us work before revealing the answers. The photo below shows our guesses, where the asterisks denote Physics World‘s selections. This is the blog post I wish I had when I was a fifteen year-old aspiring physicist–this is an attempt to survey and provide a tiny toehold on the edge (from my biased, incredibly naive, and still developing perspective.)

The IQI’s quantum information-biased guesses of Physics World’s 5 biggest breakthroughs over the past 25 years, and 5 biggest open problems. X’s denote Physics World’s selections. Somehow we ended up with 10 selections in each category…

The biggest breakthroughs of the past 25 years:

*Neutrino Mass: surprisingly, neutrinos have a nonzero mass, which provides a window into particle physics beyond the standard model. THE STANDARD MODEL has been getting a lot of attention recently. This is well deserved in my opinion, considering that the vast majority of its predictions have come true, most of which were made by the end of the 1960s. Last year’s discovery of the Higgs boson is the feather in its cap. However, it’s boring when things work too perfectly, because then we don’t know what path to continue on. That’s where the neutrino mass comes in. First, what are neutrinos? Neutrinos are fundamental particles with the special property that they barely interact with other particles. There are four fundamental forces in nature: electromagnetism, gravity, strong (holds quarks together to create neutrons and protons), and weak (responsible for radioactivity and nuclear fusion.) We can design experiments which allow us to observe neutrinos. We have learned that they are electrically neutral, so they aren’t affected by electromagnetism. They don’t feel the strong force either. They have an extremely small mass, so gravity acts on them only subtly. The main way in which they interact with their environment is through the weak force. Here’s the amazing thing: only really clunky versions of the standard model can allow for a nonzero neutrino mass! Hence, when a small but nonzero mass was experimentally established in 1998, we gained one of our first toeholds into particle physics beyond the standard model. This is particularly important today, because to the best of my knowledge, the LHC hasn’t yet discovered any other new physics beyond the standard model. The mechanism behind the neutrino mass is not yet understood. Moreover, neutrinos have a bunch of other bizarre properties which we understand empirically, without understanding their theoretical origins. The strangest of these goes by the name neutrino oscillations. 
In one sentence: there are three different kinds of neutrinos, and they can spontaneously transmute themselves from one type to another. This happens because physics is formulated in the language of mathematics, and the math says that the eigenstates corresponding to ‘flavors’ are not the same as the eigenstates corresponding to ‘mass.’ Words, words, words. Maybe the Caltech particle theory people should have a blog?
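To turn some of those words into numbers, here is a minimal sketch of the textbook two-flavor approximation, where the flavor states are a rotation of the mass states by a mixing angle theta and the oscillation probability in vacuum is sin^2(2*theta) * sin^2(1.267 * dm^2 * L / E), with dm^2 in eV^2, L in km and E in GeV. The real story involves three flavors; the mixing angle and mass splitting below are illustrative round numbers, not fitted values.

```python
import math


def oscillation_probability(theta, delta_m2_ev2, length_km, energy_gev):
    """Two-flavor vacuum probability that a neutrino born as one flavor
    is detected as the other after travelling length_km."""
    phase = 1.267 * delta_m2_ev2 * length_km / energy_gev
    return math.sin(2 * theta) ** 2 * math.sin(phase) ** 2


# Illustrative parameters (assumptions, not a fit to data).
theta = math.pi / 4          # maximal mixing between the two flavors
delta_m2 = 2.5e-3            # mass-squared splitting in eV^2
energy = 1.0                 # neutrino energy in GeV

# Baseline at which the oscillation phase first reaches pi/2.
first_peak_km = (math.pi / 2) / (1.267 * delta_m2 / energy)

for length in (0.0, first_peak_km / 2, first_peak_km):
    p = oscillation_probability(theta, delta_m2, length, energy)
    print(f"L = {length:7.1f} km  ->  P(flavor change) = {p:.3f}")
```

If the two mass eigenstates had the same mass, delta_m2 would vanish and so would the probability, which is why seeing oscillations at all implies a nonzero neutrino mass.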

Shor’s Algorithm: a quantum computer can factor N=1433301577 into 37811*37907 exponentially faster than a classical computer. This result from Peter Shor in 1994 is near and dear to our quantum hearts. It opened the floodgates showing that there are tasks a quantum computer could perform exponentially faster than a classical computer, and therefore that we should get BIG$$$ from the world over in order to advance our field!! The task here is factoring large numbers into their prime factors; the difficulty of which has been the basis for many cryptographic protocols. In one sentence, Shor’s algorithm achieves this exponential speed-up because there is a step in the factoring algorithm (period finding) which can be performed in parallel via the quantum Fourier transform.
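Here is a minimal classical sketch of the number-theory wrapper around period finding, for anyone who wants to see where the period enters. The brute-force `find_order` loop below stands in for the step a quantum computer would do efficiently; everything else is ordinary arithmetic. The function names are my own, and the tiny example N=15 is chosen so the loop finishes instantly.

```python
from math import gcd


def find_order(a, n):
    """Smallest r > 0 with a**r = 1 (mod n), found by brute force.

    Requires gcd(a, n) == 1. This is the period-finding step that
    Shor's algorithm hands to the quantum Fourier transform.
    """
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_from_order(a, n):
    """Try to split n using the order of a modulo n; None means 'pick another a'."""
    shared = gcd(a, n)
    if shared != 1:
        return tuple(sorted((shared, n // shared)))  # lucky guess
    r = find_order(a, n)
    if r % 2 == 1:
        return None                                  # odd period: no help
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None                                  # trivial square root of 1
    return tuple(sorted((gcd(half - 1, n), gcd(half + 1, n))))


print(factor_from_order(7, 15))       # period of 7 mod 15 is 4
print(37811 * 37907 == 1433301577)    # sanity check on the number quoted above
```

Running the brute-force loop on N=1433301577 itself is possible but slow in pure Python, and for cryptographic-size numbers it is hopeless, which is the whole point.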

On the importance of choosing a convenient basis

The benefits of Caltech’s proximity to Hollywood don’t usually trickle down to measly grad students like myself, except on the rare occasions when we befriend the industry’s technical contingent. One of my friends is a computer animator for Disney, which means that she designs algorithms enabling luxuriously flowing hair or trees with realistic lighting or feathers that have gorgeous texture, for movies like Wreck-it Ralph. Empowering computers to efficiently render scenes with these complicated details is trickier than you’d think and it requires sophisticated new mathematics. Fascinating conversations are one of the perks of having friends like this. But so are free trips to Disneyland! A couple nights ago, while standing in line for The Tower of Terror, I asked her what she’s currently working on. She’s very smart, as can be evidenced by her BS/MS in Computer Science/Mathematics from MIT, but she asked me if I “know about spherical harmonics.” Asking this to an aspiring quantum mechanic is like asking an auto mechanic if they know how to use a monkey wrench. She didn’t know what she was getting herself into!

IQIM, LIGO, Disney

Along with this spherical harmonics conversation, I had a few other incidents last week that hammered home the importance of choosing a convenient basis when solving a scientific problem. First, my girlfriend works on LIGO and she’s currently writing her thesis. LIGO is a huge collaboration involving hundreds of scientists, and naturally, nobody there knows the detailed inner-workings of every subsystem. However, when it comes to writing the overview section of one’s thesis, you need to at least make a good faith attempt to understand the whole behemoth. Anyways, my girlfriend recently asked if I know how the wavelet transform works. This is another example of a convenient basis, one that is particularly suited for analyzing abrupt changes, such as detecting the gravitational waves that would be emitted during the final few seconds of two black holes merging (ring-down). Finally, for the past couple weeks, I’ve been trying to understand entanglement entropy in quantum field theories. Most of the calculations that can be carried out explicitly are for the special subclass of quantum field theories called “conformal field theories,” which in two dimensions have a very convenient ‘basis’, the Virasoro algebra.
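To make "suited for analyzing abrupt changes" concrete, here is a minimal sketch using the Haar wavelet, the simplest wavelet there is. This is my own toy illustration, not anything resembling LIGO's actual analysis pipeline: a signal that jumps from 0 to 1 is transformed, and we count how many coefficients are nonzero.

```python
import math


def haar_transform(signal):
    """Orthonormal Haar wavelet transform of a list whose length is a power of 2.

    Returns [overall average term, coarsest detail, ..., finest details].
    """
    details = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences capture local changes; sums carry the smooth part onward.
        level_detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = level_detail + details
    return approx + details


# A toy "abrupt change": 16 samples that jump from 0 to 1 at index 5.
signal = [0.0] * 5 + [1.0] * 11
coeffs = haar_transform(signal)
nonzero = sum(1 for c in coeffs if abs(c) > 1e-12)

print(f"{nonzero} of {len(coeffs)} Haar coefficients are nonzero")
```

Only the coefficients whose wavelet straddles the jump survive (one per scale, plus the overall average), so the number of nonzero coefficients grows like the logarithm of the signal length rather than the length itself.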

So why does a Disney animator care about spherical harmonics? It turns out that every frame that goes into one of Disney’s movies needs to be digitally rendered using a powerful computing cluster. The animated film industry has traded the painstaking process of hand-animators drawing every single frame, for the almost equally time-consuming process of computer clusters generating every frame. It doesn’t look like strong AI will be available in our immediate future, and in the meantime, humans are still much better than computers at detecting patterns and making intuitive judgements about the ‘physical correctness of an image.’ One of the primary advantages of computer animation is that an animator shouldn’t need to shade in every pixel of every frame; some of this burden should fall on computers. Let’s imagine a thought experiment. An animator wants to get the lighting correct for a nighttime indoor shot. They should be able to simply place the moon somewhere out of the shot, so that its glow can penetrate through the windows. They should also be able to choose from a drop down menu and tell the computer that a hand drawn lightbulb is a ‘light source.’ The computer should then figure out how to make all of the shadows and brightness appear physically correct. Another example of a hard problem is that an animator should be able to draw a character, then tell the computer that the hair they drew is ‘hair’, so that as the character moves through scenes, the physics of the hair makes sense. Programming computers to do these things autonomously is harder than it sounds.

In the lighting example, imagine you want to get the lighting correct in a forest shot with complicated pine trees and leaf structures. The computer would need to do the ray-tracing for all of the photons emanating from the different light sources, and then the second-order effects as these photons reflect, and then third-order effects, etc. It’s a tall order to make the scene look accurate to the human eyeball/brain. Instead of doing all of this ray-tracing, it’s helpful to choose a convenient basis in order to dramatically speed up the processing. Instead of the complicated forest example, let’s imagine you are working with a tree from Super Mario Bros. Imagine drawing a sphere somewhere in the middle of this and then defining a ‘height function’, which outputs the ‘elevation’ of the tree foliage over each point on the sphere. I tried to use suggestive language, so that you’d draw an analogy to thinking of Earth’s ‘height function’ as the elevation of mountains and the depths of trenches over the sphere, with sea-level as a baseline. An example of how you could digitize this problem for a tree or for the earth is by breaking up the sphere into a certain number of pixels, maybe one per square meter for the earth (5*10^14 square meters gives approximately 2^49 pixels), and then associating an integer height value in the range [-2^15,2^15) to each pixel, which takes 16 bits. This would effectively digitize the height map of the earth, in this case keeping track of the elevation to approximately the meter level. But this leaves us with a huge amount of information that we need to store, and then process. We’d have to keep track of the 16-bit height value for each pixel, giving us approximately 2^49*2^4=2^53 bits, or roughly a petabyte, that we’d have to keep track of. And this is for an easy static problem with only meter resolution! We can store this information much more efficiently using spherical harmonics.

There are many ways to think about spherical harmonics. Basically, they’re functions which map points on the sphere to real numbers Y_l^m: (\theta,\phi) \mapsto Y_l^m(\theta,\phi)\in\mathbb{R}, such that they satisfy a few special properties. They are orthogonal, meaning that if you multiply two different spherical harmonics together and then integrate over the sphere, then you get zero. If you square one of the functions and then integrate over the sphere, you get a finite, nonzero value, so each one can be normalized. They also span the space of all height functions that one could define over the sphere. This means that for a planet with an arbitrarily complicated topography, you would be able to find some weighted combination of different spherical harmonics which perfectly describes that planet’s topography. These are the key properties which make a set of functions a basis: they span and are orthogonal (this is only a heuristic). There is also a natural way to think about the light that hits the tree. We can use the same sphere and simply calculate the light rays as they would hit the ideal sphere. With these two different ‘height functions’, it’s easy to calculate the shadows and brightness inside the tree. You simply convolve the two functions, which is a fast operation on a computer. It also means that if the breeze slightly changes the shape of the tree, or if the sun moves a little bit, then it’s very easy to update the shading. Implicit in what I just said is that using spherical harmonics allows us to efficiently store this height map. I haven’t calculated this on a computer, but it doesn’t seem totally crazy to think that we’d be able to store the topography of the earth to a reasonable accuracy, with 100 nonzero coefficients of the spherical harmonics to 64 bits of precision: 2^7*2^6=2^13 bits << 2^53 bits. Where do these cost savings come from? It comes from the fact that the spherical harmonics are a convenient basis, which naturally encode the types of correlations we see in Earth’s topography: if you’re standing at an elevation of 2000m, the area within ten meters is probably at a similar elevation. Cliffs are what break this basis, but they are exactly what the wavelet basis was designed to handle.
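Here is a minimal numerical sketch of the two properties above, orthogonality and "a few coefficients describe the whole height map", using only the four lowest real spherical harmonics written out by hand. The toy height function is made up for illustration, and the crude midpoint-rule integration is my own choice; a production renderer would do something far more sophisticated.

```python
import math

# Real spherical harmonics for l = 0 and l = 1.
# theta is the polar angle in [0, pi], phi is the azimuth in [0, 2*pi).
NORM = math.sqrt(3 / (4 * math.pi))


def y00(theta, phi):
    return 0.5 / math.sqrt(math.pi)


def y1m1(theta, phi):
    return NORM * math.sin(theta) * math.sin(phi)


def y10(theta, phi):
    return NORM * math.cos(theta)


def y11(theta, phi):
    return NORM * math.sin(theta) * math.cos(phi)


BASIS = [y00, y1m1, y10, y11]


def integrate_over_sphere(f, n_theta=100, n_phi=200):
    """Midpoint-rule integral of f(theta, phi) * sin(theta) over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += f(theta, phi) * math.sin(theta)
    return total * d_theta * d_phi


def inner(f, g):
    """Multiply two functions together and integrate over the sphere."""
    return integrate_over_sphere(lambda t, p: f(t, p) * g(t, p))


def toy_height(theta, phi):
    """A made-up smooth 'planet': a constant radius plus a gentle tilt."""
    return 2.0 * y00(theta, phi) - 0.5 * y10(theta, phi) + 0.25 * y11(theta, phi)


# Orthonormality: this matrix should be close to the 4x4 identity.
gram = [[inner(f, g) for g in BASIS] for f in BASIS]

# Projection: four numbers recover the whole toy height map.
coeffs = [inner(toy_height, y) for y in BASIS]
print([round(c, 3) for c in coeffs])
```

The printed coefficients are the weights of the "weighted combination" mentioned above; for a real planet you would keep going to higher l until the approximation is good enough.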

I’ve only described a couple bases in this post and I’ve neglected to mention some of the most famous examples! This includes the Fourier basis, which was designed to encode periodic signals, such as music and radio waves. I also have not gone into any detail about the Virasoro algebra, which I mentioned at the beginning of this post, and I’ve been using it heavily for the past few weeks. For the sake of diversity, I’ll spend a few sentences whetting your appetite. Complex analysis is primarily the study of analytic functions. In two dimensions, these analytic functions “preserve angles.” This means that if you have two curves which intersect at a point with angle \theta, then after using an analytic function to map these curves to their image, also in the complex plane, then the angle between the curves will still be \theta. An especially convenient basis for the analytic functions in two dimensions (\{f: \mathbb{C} \to \mathbb{C}\}, where f(z) = \sum_{n=0}^{\infty} a_nz^n) is given by the set of functions \{l_n = -z^{n+1}\partial_z\}. As always, I’m not being exactly precise, but this is a ‘basis’ because we can encode all the information describing an infinitesimal two-dimensional angle-preserving map using these elements. It turns out to have incredibly special properties, including that its quantum cousin yields something called the “central charge” which has deep ramifications in physics, such as being related to the c-theorem. Conformal field theories are fascinating because they describe the physics of phase transitions. Having a convenient basis in two dimensions is a large part of why we’ve been able to make progress in our understanding of two-dimensional phase transitions (more important is that the 2d conformal symmetry group is infinite-dimensional, but that’s outside the scope of this post.) Convenient bases are also important for detecting gravitational waves, making incredible movies and striking up nerdy conversations in long lines at Disneyland!
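For the curious, here is a minimal sketch checking one of those special properties by brute force. The generators l_n = -z^{n+1}\partial_z satisfy [l_m, l_n] = (m-n) l_{m+n}, which is the classical (Witt) algebra whose quantum cousin, the Virasoro algebra, picks up the extra central-charge term. The dictionary representation of polynomials and the helper names below are my own choices for illustration.

```python
def apply_l(n, poly):
    """Apply l_n = -z^(n+1) d/dz to a Laurent polynomial.

    poly maps integer exponents to integer coefficients,
    e.g. {3: 2, -1: 5} stands for 2*z^3 + 5/z.
    """
    result = {}
    for k, c in poly.items():
        if k != 0:  # d/dz kills the constant term
            result[n + k] = result.get(n + k, 0) - k * c
    return result


def commutator(m, n, poly):
    """Compute [l_m, l_n] acting on poly, dropping zero coefficients."""
    first = apply_l(m, apply_l(n, poly))
    second = apply_l(n, apply_l(m, poly))
    out = dict(first)
    for k, c in second.items():
        out[k] = out.get(k, 0) - c
    return {k: c for k, c in out.items() if c != 0}


def scaled(factor, poly):
    """Multiply every coefficient by factor, dropping zero coefficients."""
    return {k: factor * c for k, c in poly.items() if factor * c != 0}


# An arbitrary test function: 2*z^3 + 7 + 5/z.
test_poly = {3: 2, 0: 7, -1: 5}

all_match = all(
    commutator(m, n, test_poly) == scaled(m - n, apply_l(m + n, test_poly))
    for m in range(-3, 4)
    for n in range(-3, 4)
)
print("[l_m, l_n] = (m - n) l_{m+n} holds on the test function:", all_match)
```

A check on one test function is not a proof, but the one-line calculation on a single monomial z^k goes through the same way for every k.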