The benefits of Caltech’s proximity to Hollywood don’t usually trickle down to measly grad students like myself, except in the rare occasions when we befriend the industry’s technical contingent. One of my friends is a computer animator for Disney, which means that she designs algorithms enabling luxuriously flowing hair or trees with realistic lighting or feathers that have gorgeous texture, for movies like Wreck-it Ralph. Empowering computers to efficiently render scenes with these complicated details is trickier than you’d think and it requires sophisticated new mathematics. Fascinating conversations are one of the perks of having friends like this. But so are free trips to Disneyland! A couple nights ago, while standing in line for The Tower of Terror, I asked her what’s she’s currently working on. She’s very smart, as can be evidenced by her BS/MS in Computer Science/Mathematics from MIT, but she asked me if I “know about spherical harmonics.” Asking this to an aspiring quantum mechanic is like asking an auto mechanic if they know how to use a monkey wrench. She didn’t know what she was getting herself into!
Along with this spherical harmonics conversation, I had a few other incidents last week that hammered home the importance of choosing a convenient basis when solving a scientific problem. First, my girlfriend works on LIGO and she’s currently writing her thesis. LIGO is a huge collaboration involving hundreds of scientists, and naturally, nobody there knows the detailed inner-workings of every subsystem. However, when it comes to writing the overview section of ones thesis, you need to at least make a good faith attempt to understand the whole behemoth. Anyways, my girlfriend recently asked if I know how the wavelet transform works. This is another example of a convenient basis, one that is particularly suited for analyzing abrupt changes, such as detecting the gravitational waves that would be emitted during the final few seconds of two black holes merging (ring-down). Finally, for the past couple weeks, I’ve been trying to understand entanglement entropy in quantum field theories. Most of the calculations that can be carried out explicitly are for the special subclass of quantum field theories called “conformal field theories,” which in two dimensions have a very convenient ‘basis’, the Virasoro algebra.
So why does a Disney animator care about spherical harmonics? It turns out that every frame that goes into one of Disney’s movies needs to be digitally rendered using a powerful computing cluster. The animated film industry has traded the painstaking process of hand-animators drawing every single frame, for the almost equally time-consuming process of computer clusters generating every frame. It doesn’t look like strong AI will be available in our immediate future, and in the meantime, humans are still much better than computers at detecting patterns and making intuitive judgements about the ‘physical correctness of an image.’ One of the primary advantages of computer animation is that an animator shouldn’t need to shade in every pixel of every frame — some of this burden should fall on computers. Let’s imagine a thought experiment. An animator wants to get the lighting correct for a nighttime indoor shot. They should be able to simply place the moon somewhere out of the shot, so that its glow can penetrate through the windows. They should also be able to choose from a drop down menu and tell the computer that a hand drawn lightbulb is a ‘light source.’ The computer should then figure out how to make all of the shadows and brightness appear physically correct. Another example of a hard problem is that an animator should be able to draw a character, then tell the computer that the hair they drew is ‘hair’, so that as the character moves through scenes, the physics of the hair makes sense. Programming computers do these things autonomously is harder than it sounds.
In the lighting example, imagine you want to get the lighting correct in a forest shot with complicated pine trees and leaf structures. The computer would need to do the ray-tracing for all of the photons emanating from the different light sources, and then the second-order effects as these photons reflect, and then third-order effects, etc. It’s a tall order to make the scene look accurate to the human eyeball/brain. Instead of doing all of this ray-tracing, it’s helpful to choose a convenient basis in order to dramatically speed up the processing. Instead of the complicated forest example, let’s imagine you are working with a tree from Super Mario Bros. Imagine drawing a sphere somewhere in the middle of this and then defining a ‘height function’, which outputs the ‘elevation’ of the tree foliage over each point on the sphere. I tried to use suggestive language, so that you’d draw an analogy to thinking of Earth’s ‘height function’ as the elevation of mountains and the depths of trenches over the sphere, with sea-level as a baseline. An example of how you could digitize this problem for a tree or for the earth is by breaking up the sphere into a certain number of pixels, maybe one per square meter for the earth (5*10^14 square meters gives approximately 2^49 pixels), and then associating an integer height value between [-2^15,2^15] to each pixel. This would effectively digitize the height map of the earth. In this case, keeping track of the elevation to approximately the meter level. But this leaves us with a huge amount of information that we need to store, and then process. We’d have to keep track of the height value for each pixel, giving us approximately 2^49*2^16=2^65 bits=4 exabytes that we’d have to keep track of. And this is for an easy static problem with only meter resolution! We can store this information much more efficiently using spherical harmonics.
There are many ways to think about spherical harmonics. Basically, they’re functions which map points on the sphere to real numbers , such that they satisfy a few special properties. They are orthogonal, meaning that if you multiply two different spherical harmonics together and then integrate over the sphere, then you get zero. If you square one of the functions and then integrate over the sphere, you get a finite, nonzero value. This means that they are orthogonal functions. They also span the space of all height functions that one could define over the sphere. This means that for a planet with an arbitrarily complicated topography, you would be able to find some weighted combination of different spherical harmonics which perfectly describes that planet’s topography. These are the key properties which make a set of functions a basis: they span and are orthogonal (this is only a heuristic). There is also a natural way to think about the light that hits the tree. We can use the same sphere and simply calculate the light rays as they would hit the ideal sphere. With these two different ‘height functions’, it’s easy to calculate the shadows and brightness inside the tree. You simply convolve the two functions, which is a fast operation on a computer. It also means that if the breeze slightly changes the shape of the tree, or if the sun moves a little bit, then it’s very easy to update the shading. Implicit in what I just said, using spherical harmonics allows us to efficiently store this height map. I haven’t calculated this on a computer, but it doesn’t seem totally crazy to think that we’d be able to store the topography of the earth to a reasonable accuracy, with 100 nonzero coefficients of the spherical harmonics to 64 bits of precision, 2^7*2^6= 2^13 << 2^65. Where does this cost savings come from? It comes from the fact that the spherical harmonics are a convenient basis, which naturally encode the types of correlations we see in Earth’s topography — if you’re standing at an elevation of 2000m, the area within ten meters is probably at a similar elevation. Cliffs are what break this basis — but are what the wavelet basis was designed to handle.
I’ve only described a couple bases in this post and I’ve neglected to mention some of the most famous examples! This includes the Fourier basis, which was designed to encode periodic signals, such as music and radio waves. I also have not gone into any detail about the Virasoro algebra, which I mentioned at the beginning of this post, and I’ve been using it heavily for the past few weeks. For the sake of diversity, I’ll spend a few sentences whetting your apetite. Complex analysis is primarily the study of analytic functions. In two dimensions, these analytic functions “preserve angles.” This means that if you have two curves which intersect at a point with angle , then after using an analytic function to map these curves to their image, also in the complex plane, then the angle between the curves will still be An especially convenient basis for the analytic functions in two-dimensions (, where ) is given by the set of functions . As always, I’m not being exactly precise, but this is a ‘basis’ because we can encode all the information describing an infinitesimal two-dimensional angle-preserving map using these elements. It turns out to have incredibly special properties, including that its quantum cousin yields something called the “central charge” which has deep ramifications in physics, such as being related to the c-theorem. Conformal field theories are fascinating because they describe the physics of phase transitions. Having a convenient basis in two-dimensions is a large part of why we’ve been able to make progress in our understanding of two-dimensional phase transitions (more important is that the 2d conformal symmetry group is infinite-dimensional, but that’s outside the scope of this post.) Convenient bases are also important for detecting gravitational waves, making incredible movies and striking up nerdy conversations in long lines at Disneyland!