Greg Kuperberg’s calculus problem

“How good are you at calculus?”

This was the opening sentence of Greg Kuperberg’s Facebook status on July 4th, 2016.

“I have a joint paper (on isoperimetric inequalities in differential geometry) in which we need to know that

(\sin\theta)^3 xy + ((\cos\theta)^3 -3\cos\theta +2) (x+y) - (\sin\theta)^3-6\sin\theta -6\theta + 6\pi \\ \\- 6\arctan(x) +2x/(1+x^2) -6\arctan(y) +2y/(1+y^2)

is non-negative for x and y non-negative and \theta between 0 and \pi. Also, the minimum only occurs for x=y=1/(\tan(\theta/2).”

Let’s take a moment to appreciate the complexity of the mathematical statement above. It is a non-linear inequality in three variables, mixing trigonometry with algebra and throwing in some arc-tangents for good measure. Greg, continued:

“We proved it, but only with the aid of symbolic algebra to factor an algebraic variety into irreducible components. The human part of our proof is also not really a cake walk.

A simpler proof would be way cool.”

I was hooked. The cubic terms looked a little intimidating, but if I converted x and y into \tan(\theta_x) and \tan(\theta_y), respectively, as one of the comments on Facebook promptly suggested, I could at least get rid of the annoying arc-tangents and then calculus and trigonometry would take me the rest of the way. Greg replied to my initial comment outlining a quick route to the proof: “Let me just caution that we found the problem unyielding.” Hmm… Then, Greg revealed that the paper containing the original proof was over three years old (had he been thinking about this since then? that’s what true love must be like.) Titled “The Cartan-Hadamard Conjecture and The Little Prince“, the above inequality makes its appearance as Lemma 7.1 on page 45 (of 63). To quote the paper: “Although the lemma is evident from contour plots, the authors found it surprisingly tricky to prove rigorously.”

As I filled pages of calculations and memorized every trigonometric identity known to man, I realized that Greg was right: the problem was highly intractable. The quick solution that was supposed to take me two to three days turned into two weeks of hell, until I decided to drop the original approach and stick to doing calculus with the known unknowns, x and y. The next week led me to a set of three non-linear equations mixing trigonometric functions with fourth powers of x and y, at which point I thought of giving up. I knew what I needed to do to finish the proof, but it looked freaking insane. Still, like the masochist that I am, I continued calculating away until my brain was mush. And then, yesterday, during a moment of clarity, I decided to go back to one of the three equations and rewrite it in a different way. That is when I noticed the error. I had solved for \cos\theta in terms of x and y, but I had made a mistake that had cost me 10 days of intense work with no end in sight. Once I found the mistake, the whole proof came together within about an hour. At that moment, I felt a mix of happiness (duh), but also sadness, as if someone I had grown fond of no longer had a reason to spend time with me and, at the same time, I had ran out of made-up reasons to hang out with them. But, yeah, I mostly felt happiness.

Greg Kuperberg pondering about the universe of mathematics.

Greg Kuperberg pondering about the universe of mathematics.

Before I present the proof below, I want to take a moment to say a few words about Greg, whom I consider to be the John Preskill of mathematics: a lodestar of sanity in a sea of hyperbole (to paraphrase Scott Aaronson). When I started grad school at UC Davis back in 2003, quantum information theory and quantum computing were becoming “a thing” among some of the top universities around the US. So, I went to several of the mathematics faculty in the department asking if there was a course on quantum information theory I could take. The answer was to “read Nielsen and Chuang and then go talk to Professor Kuperberg”. Being a foolish young man, I skipped the first part and went straight to Greg to ask him to teach me (and four other brave souls) quantum “stuff”. Greg obliged with a course on… quantum probability and quantum groups. Not what I had in mind. This guy was hardcore. Needless to say, the five brave souls taking the class (mostly fourth year graduate students and me, the noob) quickly became three, then two gluttons for punishment (the other masochist became one of my best friends in grad school). I could not drop the class, not because I had asked Greg to do this as a favor to me, but because I knew that I was in the presence of greatness (or maybe it was Stockholm syndrome). My goal then, as an aspiring mathematician, became to one day have a conversation with Greg where, for some brief moment, I would not sound stupid. A man of incredible intelligence, Greg is that rare individual whose character matches his intellect. Much like the anti-heroes portrayed by Humphrey Bogart in Casablanca and the Maltese Falcon, Greg keeps a low-profile, seems almost cynical at times, but in the end, he works harder than everyone else to help those in need. For example, on MathOverflow, a question and answer website for professional mathematicians around the world, Greg is listed as one of the top contributors of all time.

But, back to the problem. The past four weeks thinking about it have oscillated between phases of “this is the most fun I’ve had in years!” to “this is Greg’s way of telling me I should drop math and become a go-go dancer”. Now that the ordeal is over, I can confidently say that the problem is anything but “dull” (which is how Greg felt others on MathOverflow would perceive it, so he never posted it there). In fact, if I ever have to teach Calculus, I will subject my students to the step-by-step proof of this problem. OK, here is the proof. This one is for you Greg. Thanks for being such a great role model. Sorry I didn’t get to tell you until now. And you are right not to offer a “bounty” for the solution. The journey (more like, a trip to Mordor and back) was all the money.

The proof: The first thing to note (and if I had read Greg’s paper earlier than today, I would have known as much weeks ago) is that the following equality holds (which can be verified quickly by differentiating both sides):

4 x - 6\arctan(x) +2x/(1+x^2) = 4 \int_0^x \frac{s^4}{(1+s^2)^2} ds.

Using the above equality (and the equivalent one for y), we get:

F(\theta,x,y) = (\sin\theta)^3 xy + ((\cos\theta)^3 -3\cos\theta -2) (x+y) - (\sin\theta)^3-6\sin\theta -6\theta + 6\pi \\ \\4 \int_0^x \frac{s^4}{(1+s^2)^2} ds+4 \int_0^y \frac{s^4}{(1+s^2)^2} ds.

Now comes the fun part. We differentiate with respect to \theta, x and y, and set to zero to find all the maxima and minima of F(\theta,x,y) (though we are only interested in the global minimum, which is supposed to be at x=y=\tan^{-1}(\theta/2)). Some high-school level calculus yields:

\partial_\theta F(\theta,x,y) = 0 \implies \sin^2(\theta) (\cos(\theta) xy + \sin(\theta)(x+y)) = \\ \\ 2 (1+\cos(\theta))+\sin^2(\theta)\cos(\theta).

At this point, the most well-known trigonometric identity of all time, \sin^2(\theta)+\cos^2(\theta)=1, can be used to show that the right-hand-side can be re-written as:

2(1+\cos(\theta))+\sin^2(\theta)\cos(\theta) = \sin^2(\theta) (\cos\theta \tan^{-2}(\theta/2) + 2\sin\theta \tan^{-1}(\theta/2)),

where I used (my now favorite) trigonometric identity: \tan^{-1}(\theta/2) = (1+\cos\theta)/\sin(\theta) (note to the reader: \tan^{-1}(\theta) = \cot(\theta)). Putting it all together, we now have the very suggestive condition:

\sin^2(\theta) (\cos(\theta) (xy-\tan^{-2}(\theta/2)) + \sin(\theta)(x+y-2\tan^{-1}(\theta/2))) = 0,

noting that, despite appearances, \theta = 0 is not a solution (as can be checked from the original form of this equality, unless x and y are infinite, in which case the expression is clearly non-negative, as we show towards the end of this post). This leaves us with \theta = \pi and

\cos(\theta) (\tan^{-2}(\theta/2)-xy) = \sin(\theta)(x+y-2\tan^{-1}(\theta/2)),

as candidates for where the minimum may be. A quick check shows that:

F(\pi,x,y) = 4 \int_0^x \frac{s^4}{(1+s^2)^2} ds+4 \int_0^y \frac{s^4}{(1+s^2)^2} ds \ge 0,

since x and y are non-negative. The following obvious substitution becomes our greatest ally for the rest of the proof:

x= \alpha \tan^{-1}(\theta/2), \, y = \beta \tan^{-1}(\theta/2).

Substituting the above in the remaining condition for \partial_\theta F(\theta,x,y) = 0, and using again that \tan^{-1}(\theta/2) = (1+\cos\theta)/\sin\theta, we get:

\cos\theta (1-\alpha\beta) = (1-\cos\theta) ((\alpha-1) + (\beta-1)),

which can be further simplified to (if you are paying attention to minus signs and don’t waste a week on a wild-goose chase like I did):

\cos\theta = \frac{1}{1-\beta}+\frac{1}{1-\alpha}.

As Greg loves to say, we are finally cooking with gas. Note that the expression is symmetric in \alpha and \beta, which should be obvious from the symmetry of F(\theta,x,y) in x and y. That observation will come in handy when we take derivatives with respect to x and y now. Factoring (\cos\theta)^3 -3\cos\theta -2 = - (1+\cos\theta)^2(2-\cos\theta), we get:

\partial_x F(\theta,x,y) = 0 \implies \sin^3(\theta) y + 4\frac{x^4}{(1+x^2)^2} = (1+\cos\theta)^2 + \sin^2\theta (1+\cos\theta).

Substituting x and y with \alpha \tan^{-1}(\theta/2), \beta \tan^{-1}(\theta/2), respectively and using the identities \tan^{-1}(\theta/2) = (1+\cos\theta)/\sin\theta and \tan^{-2}(\theta/2) = (1+\cos\theta)/(1-\cos\theta), the above expression simplifies significantly to the following expression:

4\alpha^4 =\left((\alpha^2-1)\cos\theta+\alpha^2+1\right)^2 \left(1 + (1-\beta)(1-\cos\theta)\right).

Using \cos\theta = \frac{1}{1-\beta}+\frac{1}{1-\alpha}, which we derived earlier by looking at the extrema of F(\theta,x,y) with respect to \theta, and noting that the global minimum would have to be an extremum with respect to all three variables, we get:

4\alpha^4 (1-\beta) = \alpha (\alpha-1) (1+\alpha + \alpha(1-\beta))^2,

where we used 1 + (1-\beta)(1-\cos\theta) = \alpha (1-\beta) (\alpha-1)^{-1} and

(\alpha^2-1)\cos\theta+\alpha^2+1 = (\alpha+1)((\alpha-1)\cos\theta+1)+\alpha(\alpha-1) = \\ (\alpha-1)(1-\beta)^{-1} (2\alpha + 1-\alpha\beta).

We may assume, without loss of generality, that x \ge y. If \alpha = 0, then \alpha = \beta = 0, which leads to the contradiction \cos\theta = 2, unless the other condition, \theta = \pi, holds, which leads to F(\pi,0,0) = 0. Dividing through by \alpha and re-writing 4\alpha^3(1-\beta) = 4\alpha(1+\alpha)(\alpha-1)(1-\beta) + 4\alpha(1-\beta), yields:

4\alpha (1-\beta) = (\alpha-1) (1+\alpha - \alpha(1-\beta))^2 = (\alpha-1)(1+\alpha\beta)^2,

which can be further modified to:

4\alpha +(1-\alpha\beta)^2 = \alpha (1+\alpha\beta)^2,

and, similarly for \beta (due to symmetry):

4\beta +(1-\alpha\beta)^2 = \beta (1+\alpha\beta)^2.

Subtracting the two equations from each other, we get:

4(\alpha-\beta) = (\alpha-\beta)(1+\alpha\beta)^2,

which implies that \alpha = \beta and/or \alpha\beta =1. The first leads to 4\alpha (1-\alpha) = (\alpha-1)(1+\alpha^2)^2, which immediately implies \alpha = 1 = \beta (since the left and right side of the equality have opposite signs otherwise). The second one implies that either \alpha+\beta =2, or \cos\theta =1, which follows from the earlier equation \cos\theta (1-\alpha\beta) = (1-\cos\theta) ((\alpha-1) + (\beta-1)). If \alpha+\beta =2 and 1 = \alpha\beta, it is easy to see that \alpha=\beta=1 is the only solution by expanding (\sqrt{\alpha}-\sqrt{\beta})^2=0. If, on the other hand, \cos\theta = 1, then looking at the original form of F(\theta,x,y), we see that F(0,x,y) = 6\pi - 6\arctan(x) +2x/(1+x^2) -6\arctan(y) +2y/(1+y^2) \ge 0, since x,y \ge 0 \implies \arctan(x)+\arctan(y) \le \pi.

And that concludes the proof, since the only cases for which all three conditions are met lead to \alpha = \beta = 1 and, hence, x=y=\tan^{-1}(\theta/2). The minimum of F(\theta, x,y) at these values is always zero. That’s right, all this work to end up with “nothing”. But, at least, the last four weeks have been anything but dull.

Update: Greg offered Lemma 7.4 from the same paper as another challenge (the sines, cosines and tangents are now transformed into hyperbolic trigonometric functions, with a few other changes, mostly in signs, thrown in there). This is a more hardcore-looking inequality, but the proof turns out to follow the steps of Lemma 7.1 almost identically. In particular, all the conditions for extrema are exactly the same, with the only difference being that cosine becomes hyperbolic cosine. It is an awesome exercise in calculus to check this for yourself. Do it. Unless you have something better to do.

8 thoughts on “Greg Kuperberg’s calculus problem

  1. Pingback: Greg Kuperberg’s calculus problem | Φ...

  2. Let me say that the mathematics beyond “the most well-known trigonometric identity of all time” is mostly intractable to me but still this was an enjoyable, very readable and engaging narrative.
    Thanks for sharing your story – it highlights the human aspect in problem solving.

    • Thanks Terry. Indeed, the math used here may seem intractable to most non-practicing mathematicians, but nothing is beyond high school level algebra, trigonometry and basic calculus. The hard part is the “how did he know to do that substitution, or re-write the expression in that other form”? The cases where this was not obvious in my writing are the cases that required two weeks of looking at trigonometric identities until they were my best friends.

    • What a rude and spiteful comment: Spyridon emphatically says he worked out this problem just for fun, with no pretensions to grandeur. You know what the mark of a small, pathetic mind is? Trivializing someone else’s work solely in order to self-aggrandize.

      • It’s not that bad. Perhaps Jack doesn’t have a lot of experience reading about the weeks often spent “barking up the wrong tree” in research, and just wanted to apprise us of another interesting anecdote.

Your thoughts here.

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s