Sam Zhang

An illusion of predictability in scientific results

2022-05-11T12:00:00+00:00

Even experts think / that well-estimated means / are certain outcomes. (Haiku summary due to MSR poets Dan and Jake). Click the thumbnail for a mini-explainer.

Elliptic curves and traveling wave solutions to the KdV equations

2021-12-07T12:00:00+00:00

View traveling wave solutions to the KdV equations associated with (points of) elliptic curves.

Bayesian inference for COVID-19 testing

2021-10-12T12:00:00+00:00

My brief foray into epidemiology at the start of COVID-19. Bayesian methods and interactive tools for informing COVID-19 surveillance testing strategies.

Ridge unfolding polytopes

2021-10-12T12:00:00+00:00

See the full screen page.

Curve shortening

2020-10-29T01:44:00+00:00

Full screen version.

Suppose you had a smooth curve $C(s, t)$ parameterized by arclength and time, and you decided to continuously deform $C$ through time analogously to the heat equation. That is, parts of the curve with high curvature are very “hot” and would like to flow toward the “cold” parts that have negative curvature. We set

\[\frac{\partial C}{\partial t} = \frac{\partial^2 C}{\partial s^2} = \kappa n\]

where $\kappa$ is the curvature and $n$ is the unit normal vector to the curve, so that the curve is shortened, and parts of the curve with larger curvature are shortened more. This is called curve shortening, and it leads to deep areas of geometric analysis. One major result is the Gage-Hamilton-Grayson theorem, which states that under the curve shortening flow, simple closed curves remain smoothly embedded without self-intersections until they become convex, after which they stay convex, before converging to a circle as the curve shrinks to a point.

But how do we implement curve shortening on a computer? Many discretizations have been proposed, and here we implement a particular algorithm proposed by Chow and Glickenstein (2007). The procedure is analogous to the continuous case, except since we have a polygon instead of a smooth curve, we do not have a “curvature” at any point. Instead, we approximate the normal vector for the vertex $x_{i, t}$ by setting

\[n_{i, t} = (x_{i+1, t} - x_{i, t}) + (x_{i-1, t} - x_{i, t})\]

and performing the discrete update step

\[x_{i, t+1} = x_{i, t} + \delta n_{i, t}\]

for some step size $\delta > 0$.
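The update above can be sketched in a few lines of Python. This is only a minimal sketch of the two formulas as written here, not the code behind the visualization; the function names `shorten_step` and `perimeter` are made up for illustration, and the polygon is assumed to be a closed list of $(x, y)$ tuples.

```python
import math


def shorten_step(points, delta):
    """Apply x_i <- x_i + delta * n_i once to every vertex of a closed polygon."""
    m = len(points)
    new_points = []
    for i in range(m):
        x, y = points[i]
        prev_x, prev_y = points[i - 1]          # index -1 wraps to the last vertex
        next_x, next_y = points[(i + 1) % m]    # wrap around at the end
        # n_i = (x_{i+1} - x_i) + (x_{i-1} - x_i)
        n_x = next_x + prev_x - 2 * x
        n_y = next_y + prev_y - 2 * y
        new_points.append((x + delta * n_x, y + delta * n_y))
    return new_points


def perimeter(points):
    """Total edge length of the closed polygon."""
    m = len(points)
    return sum(math.dist(points[i], points[(i + 1) % m]) for i in range(m))


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shrunk = shorten_step(square, delta=0.1)
print(perimeter(square), perimeter(shrunk))
```

Since the $n_{i, t}$ sum to zero around a closed polygon, the centroid stays put while the polygon contracts toward it.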

Do we have the equivalent of the Gage-Hamilton-Grayson theorem for this discrete curve shortening flow? Not in its entirety. Some results are known, such as that the curve shortens to a point, and it is asymptotically an affine transformation of a regular polygon. But a key piece still unknown is whether a curve that starts without self-intersections will remain without self-intersections for each time step, for sufficiently small $\delta$.

In the visualization above, there are parameters to control $\delta$, the number of iterations, and the number of points on the polygon. I did not worry at all about numerical issues, as you might discover! The points on the polygon can of course be moved. The visualization on the right is actually an interactive three-dimensional figure shown with a top-down view. The temporal aspect of the curve-shortening flow is portrayed in color, but also in height, since one can consider the flow as slices in time of a higher dimensional object. I reparameterized the height to be spaced out in the earlier steps since the beginning steps of the algorithm are some of the more interesting. As for all the 3D stuff on this blog, the full screen version is much more satisfying than the tiny figures above.

I was tempted into making this thanks to the beautiful pictures in Discrete and Computational Geometry, by Satyan Devadoss and Joseph O’Rourke.

A catastrophe machine

2020-01-20T12:00:00+00:00

Catastrophe theory was a popular branch of mathematics concerned with qualitative characterizations of ways that singularities relate to potential energy functions. The name “catastrophe theory” comes from how a continuous change in a parameter of a potential function can introduce a discontinuous jump in observed phenomena, and this can be catastrophic in certain cases, like with a buckling beam. I say the field “was” popular, rather than “is”, because it was a classic example of a field of applied mathematics that was overhyped, applied too widely without adequate attention toward scientific fundamentals, and the backlash all but killed it. However, there is some talk about it recently becoming popular again for chemical applications. (Hence this blog post!)

Plus, the mathematics is indisputably elegant – that was never in doubt, even during the controversy around it – and one can develop a basic intuition for it using simple physical examples that are well-suited for a website like this. I’ve created an interactive “catastrophe machine” that exhibits the basic form of the “cusp” catastrophe. Imagine a wedge shaped like a parabola balancing on a table (see thick blue outline in left figure below). The parameter that we will alter continuously is its center of mass (green dot in figure; drag it around!), and the discontinuity comes in the wedge’s equilibrium resting position.

Move the center of mass to the center-top of the parabola, within the dotted cusp (called the bifurcation set). Notice that two local minima emerge on the potential function. As the center of mass crosses the bifurcation set, one of the local minima disappears. If that was the one that the parabola was resting in, then the parabola undergoes a dramatic change in behavior: a “catastrophe” as it’s been termed.

Note that in the visualization, the line representing the ground moves when the center of mass moves, rather than the parabola itself. I only drew the point corresponding to the global minimum - perhaps it would have been more accurate to include both. Also note that the right hand figure is interactive.

Full screen version.

The right hand figure is the so-called catastrophe surface for this machine. For each possible center of gravity $(x, y)$ in the parabola, we give it z-coordinates for the critical points of the energy function, in other words, the zeros of the derivative of the potential energy function. That turns out to be a cubic polynomial in this case. The roots of a cubic polynomial vary continuously with the coefficients, and there can be one, two, or three roots. The catastrophe occurs when the point must “jump” across the fold on the catastrophe surface.
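To see the root count change, here is a small sketch. The exact energy function of the parabola machine is not written out in this post, so the code uses the textbook cusp normal form $V(x) = \frac{x^4}{4} + \frac{a x^2}{2} + b x$ as a stand-in (that substitution is my assumption). Its critical points solve $x^3 + ax + b = 0$, and the number of real solutions flips as $(a, b)$ crosses the curve $4a^3 + 27b^2 = 0$, which plays the role of the bifurcation set.

```python
def cusp_discriminant(a, b):
    """Discriminant of x^3 + a*x + b; positive means three distinct real roots."""
    return -(4 * a**3 + 27 * b**2)


def count_critical_points(a, b, tol=1e-12):
    """Number of distinct real critical points of V(x) = x^4/4 + a*x^2/2 + b*x.

    Assumption: this is the standard cusp normal form, not the machine's
    actual potential energy.
    """
    d = cusp_discriminant(a, b)
    if d > tol:
        return 3          # inside the cusp: two minima and one maximum
    if d < -tol:
        return 1          # outside the cusp: a single minimum
    if abs(a) <= tol and abs(b) <= tol:
        return 1          # the cusp point itself: one triple root
    return 2              # on the fold curve: a double root plus a simple root


# Sweep b across the cusp at fixed a = -3; the fold curve sits at b = -2 and b = 2.
for b in (-3, -2, 0, 2, 3):
    print(b, count_critical_points(-3, b))
```

The jump happens at the fold, where the minimum the system was resting in merges with the maximum and vanishes.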

Two accessible introductions to the subject for anyone who has taken multivariable calculus are Curves and Singularities by J.W. Bruce and P.J. Giblin, and Catastrophe Theory and its Applications by Tim Poston and Ian Stewart. Bruce and Giblin analyze this particular catastrophe in detail in Chapter 1, which they call the Poston catastrophe machine. This critique of catastrophe theory is among the more famous in mathematics. I think it still makes a great read, for the types of issues one encounters while doing modeling.

Linear regression, the old fashioned way

2019-06-21T12:00:00+00:00

Suppose you want to fit an ordinary least squares model to a set of points in $\mathbb{R}^2$, but you zoned out during statistics class and now you’re stuck on a desert island. In fact, all you have are a wooden board, a hammer, a bunch of nails, some good old zero-length springs, frictionless cloth loops, and a frictionless rod.

Pretend your wooden board is the plane, and decide on a grid system. Then nail each of your data points into the board. Attach to each nail a zero-length spring that can only move up or down. On the other end of each spring place a little cloth loop, for hanging something.

Now imagine taking the long frictionless rod, and threading it through each of these loops. The equilibrium state it reaches is the line of best fit!

I made an interactive demonstration of this, but it only works in full screen (click to open):

Full screen

You can drag on the rod, but you have to be very accurate with your mouse. Refreshing the page gives a new random set of points.

One can derive the optimality of the equilibrium state using the fact that the potential energy of a zero-length spring extended by distance $u$ is $\frac{k}{2}u^2$, where $k$ is Hooke’s constant. It doesn’t matter what Hooke’s constant is, as long as it’s positive, so we can set it to $k=2$ to eliminate the fraction. One ends up with the total potential energy of the system $\sum_{i=0}^n (Ax_i - y_i)^2$, where $(x_i, y_i)$ are the points (nails), and $A$ is the function that tells the $y$ value of the rod at point $x_i$. We know $A$ is linear since the rod is straight. This happens to be the loss function for ordinary least squares!

Squaring is convex, and $Ax_i -y_i$ is affine. Composing convex functions with affine ones gives convex functions, and adding convex functions gives convex functions. So the whole loss function is convex. If the data points have at least two distinct $x$ values, it is strongly convex, and the only minimum in the system is the unique global minimum. In other words, the resting point for the rod attached to the springs is the (unique) line of best fit. Hooray!
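Here is a small numerical sanity check of that argument, separate from the planck.js demo. It writes the rod as $y = \text{slope} \cdot x + \text{intercept}$, computes the spring energy with $k = 2$, and compares the closed-form least squares line against nearby rods. The function names and the sample nails are made up for illustration.

```python
def spring_energy(slope, intercept, points):
    """Total energy of vertical zero-length springs with k = 2."""
    return sum((slope * x + intercept - y) ** 2 for x, y in points)


def rod_equilibrium(points):
    """Closed-form minimizer of spring_energy: the ordinary least squares line."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    if sxx == 0:
        # All nails share one x value, so the minimizer is not unique.
        raise ValueError("need at least two distinct x values")
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


nails = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
slope, intercept = rod_equilibrium(nails)
best = spring_energy(slope, intercept, nails)

# Nudging the rod in any direction should only raise the energy.
for d_slope, d_intercept in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert spring_energy(slope + d_slope, intercept + d_intercept, nails) > best
```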

Why do the springs have to go straight down? Well, they don’t have to! If you loosen that restriction, then you end up with a Total Least Squares regression instead. A TLS model allows for error not just in the $y$ axis, like OLS, but also the $x$ axis. For example, suppose you took noisy x-y plane GPS measurements along a straight trail, and you wanted to estimate a line through the actual trail. Since there is error on both $x$ and $y$, one can use TLS. (If we assume the variance on $x$ and $y$ here is the same, then one actually drops into a subcase of TLS called orthogonal regression.)
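For the equal-variance case, the orthogonal regression line can be sketched directly: it passes through the centroid and points along the principal axis of the data's scatter matrix. The sketch below uses the standard principal-axis angle formula $\theta = \frac{1}{2}\operatorname{atan2}(2 S_{xy}, S_{xx} - S_{yy})$; the function name is made up for illustration.

```python
import math


def orthogonal_regression(points):
    """Return (centroid, unit direction) of the line minimizing the sum of
    squared perpendicular distances to the points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    # Angle of the principal (largest variance) axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (x_bar, y_bar), (math.cos(theta), math.sin(theta))


# A perfectly vertical trail: OLS has no finite slope here, but this still works.
centroid, direction = orthogonal_regression([(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)])
print(centroid, direction)
```

Returning a direction vector rather than a slope is a deliberate choice, since vertical lines are perfectly reasonable answers once error is allowed in $x$.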

The example was made using planck.js, which is a nice javascript port of Box2D, although it was sorely lacking in documentation. For instance, I couldn’t find an easy way to run my demo outside of fullscreen.

This setup came from the book, The Mathematical Mechanic, which is full of perversities like this.

A moduli space for triangles

2019-02-12T12:00:00+00:00

Charles Dodgson, better known as Lewis Carroll, once posed the following mathematical question: what is the probability that a random triangle is obtuse?

The answer he gave was incorrect, and several subsequent attempts to correct the answer also fell short. In a 2017 paper from Jason Cantarella, Tom Needham, Clayton Shonkwiler, and Gavin Stewart, a nice resolution is presented to this problem.

The key issue with Dodgson’s original solution was that in order to say what probability a random triangle is obtuse, he had to first come up with a space that triangles lived in, and find what part of that space corresponded to obtuse triangles. This turns out to be a subtle issue. Cantarella et al. solve this by associating each n-gon with a point in a so-called Grassmannian of 2-planes in $\mathbb{R}^n$. This blog post is a demonstration of their parameterization, omitting all of the details :), using triangles and $\mathbb{R}^3$.

In short, there is a way to send any n-gon with a fixed perimeter to a pair of orthonormal vectors in $\mathbb{R}^n$ up to translation (technically, $2^n$ pairs, but they are all identified with each other). The space of all pairs of orthonormal vectors is a Stiefel manifold, and that is the space we would operate in if we cared about the orientation (as in rotations) of the polygon. But what if we don’t care about the orientation? It turns out we get lucky and rotating the polygon sends these pairs of orthonormal vectors to the same spanning plane. This is why the Grassmannian of 2-planes serves as the moduli space for polygons when we don’t care about rotations.
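For the curious, here is a rough sketch of that map as I understand it; it is not the code behind the visualization. Treat each edge of the polygon as a complex number, rescale to perimeter 2, and take a complex square root of each edge. The real parts and the imaginary parts of those roots then give the two vectors. The function names are made up, and `cmath.sqrt` picks just one of the $2^n$ sign choices.

```python
import cmath
import math


def frame_from_polygon(vertices):
    """Map a closed polygon (list of (x, y) tuples) to a pair (a, b) of
    orthonormal vectors in R^n, using one particular choice of square roots."""
    n = len(vertices)
    edges = [complex(*vertices[(j + 1) % n]) - complex(*vertices[j])
             for j in range(n)]
    scale = 2 / sum(abs(e) for e in edges)      # normalize the perimeter to 2
    roots = [cmath.sqrt(scale * e) for e in edges]
    a = [z.real for z in roots]
    b = [z.imag for z in roots]
    return a, b


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    """Cross product in R^3: the normal to the plane spanned by u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def rotate(vertices, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in vertices]


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
a, b = frame_from_polygon(triangle)
print(dot(a, a), dot(b, b), dot(a, b))          # close to 1, 1, 0

# Rotating the triangle keeps the normal fixed up to coordinate sign flips,
# which come from the branch cut of the square root.
a2, b2 = frame_from_polygon(rotate(triangle, 0.7))
print(cross(a, b), cross(a2, b2))
```

Orthonormality falls out of the polygon closing up: the squared roots sum to zero, which forces the two vectors to have equal length and zero dot product.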

Since we’re in $\mathbb{R}^3$ here, we have an identification between the Grassmannian of 2-planes and the real projective plane $\mathbb{R}P^2$ (through the normal vector to the 2-planes). Recall that the real projective plane can be visualized as lines through a sphere: that is what you are looking at in the top pane.

Full screen version.

The top pane (the sphere) is the moduli space, with both the yellow and green triangles from the middle panes displayed on it as yellow and green lines. The moduli space is eight-fold covered by the sphere, which is why there are eight lines of both colors. All of the lines of the same color are quotiented together. For appearance’s sake, I highlighted one of the cosets in blue and thickened the representative vectors through that sector.

The yellow line connecting the two distinguished representatives is a geodesic. Thus in some sense, it represents an optimal way to transform the green triangle into the yellow one, or vice versa. That transformation is what you see in the bottom pane.

You can move the vertices of the triangles around in the middle two panes and watch it move in the moduli space. I normalize the perimeters and make some arbitrary choices when it comes to orientation.

Hat tip to Rob Hines, who first told me about this.

The cross-section of a cylinder is an ellipse

2019-01-26T12:00:00+00:00

Here’s an interactive visual diagram to accompany the proof of the elementary fact that the cross-section of a cylinder is an ellipse:

Full screen version.

I was inspired to make this when I saw this illustration in Geometry and the Imagination.

Given a cylinder with radius $r$, and the plane slicing through it, the proof goes as follows. Imagine the cylinder is hollow, with two spheres placed snugly (that is, of radius $r$) into it, so that they touch against the intersecting plane. The claim is that these points where the spheres touch the cross-section are the foci of the ellipse. We’ll show this is an ellipse by proving that the sum of the distances from the foci to any point on the boundary is constant.

Take an arbitrary point on the ellipse, $p$. Then let’s call the lengths of the two red lines $r_{v, p}$ and $r_{h, p}$ for the vertical red line and the “horizontal” (ellipse-bound) red line from $p$. The interactive visualization above loops over the values of $p$. Likewise, the lengths of the blue lines will be denoted $b_{v, p}$ and $b_{h, p}$.

Then notice that the two red lines in the visualization are both tangent to the sphere: the wall of the cylinder is tangent to the sphere at all points where they make contact, and the intersecting plane is tangent to the sphere.

Now recall (or convince yourself) that two tangent segments drawn to a sphere from the same external point are of the same length. That is, $r_{v, p} = r_{h, p}$, and $b_{v, p} = b_{h, p}$ for all $p$.

Moreover, the two vertical lines always sum up to a constant: $r_{v, p} + b_{v, p} = K$ for all $p$.

Combining those two equations gives $r_{h, p} + b_{h, p} = K$ for all $p$, which shows that the cross-section is an ellipse.
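The argument can also be checked numerically. The sketch below assumes a particular coordinate setup that is not in the original: the cylinder is $x^2 + y^2 = r^2$ along the $z$-axis, and the slicing plane is $z = mx$. The two snug spheres then sit at $(0, 0, \pm r\sqrt{1 + m^2})$, and the constant $K$ is the distance between the two circles where they touch the cylinder wall, $2r\sqrt{1 + m^2}$. The function names are made up for illustration.

```python
import math


def dandelin_foci(r, m):
    """Points where the two snug spheres of radius r touch the plane z = m*x."""
    s = math.sqrt(1 + m * m)
    c = r * s                                   # sphere centers are (0, 0, +c) and (0, 0, -c)
    # Foot of the perpendicular from (0, 0, c) onto the plane z = m*x.
    f1 = (c * m / s**2, 0.0, c * m * m / s**2)
    f2 = (-f1[0], 0.0, -f1[2])
    return f1, f2


def focal_distance_sum(r, m, theta):
    """Sum of distances from a point on the cross-section to both foci."""
    p = (r * math.cos(theta), r * math.sin(theta), m * r * math.cos(theta))
    f1, f2 = dandelin_foci(r, m)
    return math.dist(p, f1) + math.dist(p, f2)


r, m = 1.5, 0.8
K = 2 * r * math.sqrt(1 + m * m)
for k in range(8):
    theta = 2 * math.pi * k / 8
    print(round(focal_distance_sum(r, m, theta) - K, 12))
```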

I think it’s a testament to good mathematical illustration that the drawing from Geometry and the Imagination is easier to follow (for me) than the interactive 3d one. But there is something very neat about seeing these old diagrams “come to life”.

I made this using the library mathbox.

Whitney’s Umbrella

2019-01-24T12:00:00+00:00

I’ve been test driving the excellent mathbox WebGL library. For starters, here’s a pan-and-zoomable Whitney’s umbrella:

Full screen version.

I have been thinking about how to organize the contents of this blog more visually as well. More posts with 3d visualizations to come.

My main inspirations for beautiful mathematical illustrations have been Geometry and the Imagination and A Topological Picturebook. As I become more comfortable with mathbox, and (painfully) shake the rust off of my d3 skills (while learning d3 v4), I hope more of the images from those books make their way to this blog.