Curve shortening

Suppose you had a smooth curve $C(s, t)$ parameterized by arclength and time, and you decided to continuously deform $C$ through time analogously to the heat equation. That is, parts of the curve with high curvature are very “hot” and would like to flow toward the “cold” parts with low or negative curvature. We set

$\frac{\partial C}{\partial t} = \frac{\partial^2 C}{\partial s^2} = \kappa n$

where $\kappa$ is the curvature and $n$ is the unit normal vector to the curve, so that the curve is shortened, and parts of the curve with larger curvature are shortened more. This is called curve shortening, and it leads to deep areas of geometric analysis. One major result is the Gage-Hamilton-Grayson theorem, which states that under the curve shortening flow, a simple closed curve remains smoothly embedded without self-intersections, eventually becomes convex and stays convex, and finally converges to a circle as it shrinks to a point.

But how do we implement curve shortening on a computer? Many discretizations have been proposed, and here we implement a particular algorithm proposed by Chow and Glickenstein (2007). The procedure is analogous to the continuous case, except since we have a polygon instead of a smooth curve, we do not have a “curvature” at any point. Instead, we approximate the normal vector for the vertex $x_{i, t}$ by setting

$n_{i, t} = (x_{i+1, t} - x_{i, t}) + (x_{i-1, t} - x_{i, t})$

and performing the discrete update step

$x_{i, t+1} = x_{i, t} + \delta n_{i, t}$

for some step size $\delta > 0$.
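The update above takes only a few lines of code. Here is a minimal Python sketch of one step of the discrete flow (the function name and the square example are my own, not from Chow and Glickenstein's paper):

```python
import math

def shorten_step(points, delta):
    """One discrete curve-shortening update on a closed polygon.

    points: list of (x, y) vertices; delta: step size > 0.
    Each vertex moves along the discrete normal
    n_i = (x_{i+1} - x_i) + (x_{i-1} - x_i).
    """
    n = len(points)
    new_points = []
    for i, (x, y) in enumerate(points):
        xn, yn = points[(i + 1) % n]   # next vertex (wraps around)
        xp, yp = points[(i - 1) % n]   # previous vertex
        nx = (xn - x) + (xp - x)
        ny = (yn - y) + (yp - y)
        new_points.append((x + delta * nx, y + delta * ny))
    return new_points

# A square shrinks uniformly under the flow; more irregular polygons
# round out as they shrink.
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
curve = square
for _ in range(10):
    curve = shorten_step(curve, 0.1)
```

Iterating this with a small enough $\delta$ reproduces the qualitative behavior in the visualization: the polygon shrinks while its perimeter decreases each step.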

Do we have an equivalent of the Gage-Hamilton-Grayson theorem for this discrete curve shortening flow? Not in its entirety. Some results are known, such as that the curve shortens to a point and is asymptotically an affine transformation of a regular polygon. But a key piece is still unknown: whether a curve that starts without self-intersections remains without self-intersections at every time step, for sufficiently small $\delta$.

In the visualization above, there are parameters to control $\delta$, the number of iterations, and the number of points on the polygon. I did not worry at all about numerical issues, as you might discover! The points on the polygon can of course be moved. The visualization on the right is actually an interactive three-dimensional figure shown with a top-down view. The temporal aspect of the curve-shortening flow is portrayed in color, but also in height, since one can consider the flow as slices in time of a higher dimensional object. I reparameterized the height to be spaced out in the earlier steps, since the beginning steps of the algorithm are among the more interesting. As for all the 3D stuff on this blog, the full screen version is much more satisfying than the tiny figures above.

I was tempted into making this thanks to the beautiful pictures in Discrete and Computational Geometry, by Satyan Devadoss and Joseph O’Rourke.

A catastrophe machine

Catastrophe theory was a popular branch of mathematics concerned with qualitative characterizations of the ways that singularities relate to potential energy functions. The name “catastrophe theory” comes from how a continuous change in a parameter of a potential function can introduce a discontinuous jump in observed phenomena, and this can be catastrophic in certain cases, like with a buckling beam. I say the field “was” popular, rather than “is”, because it was a classic example of a field of applied mathematics that was overhyped, applied too widely without adequate attention to scientific fundamentals, and the backlash all but killed it. However, there is some talk about it recently becoming popular again for chemical applications. (Hence this blog post!)

Plus, the mathematics is indisputably elegant – that was never in doubt, even during the controversy around it – and one can develop a basic intuition for it using simple physical examples that are well-suited for a website like this. I’ve created an interactive “catastrophe machine” that exhibits the basic form of the “cusp” catastrophe. Imagine a wedge shaped like a parabola balancing on a table (see thick blue outline in left figure below). The parameter that we will alter continuously is its center of mass (green dot in figure; drag it around!), and the discontinuity comes in the wedge’s equilibrium resting position.

Move the center of mass to the center-top of the parabola, within the dotted cusp (called the bifurcation set). Notice that two local minima emerge on the potential function. As the center of mass crosses the bifurcation set, one of the local minima disappears. If that was the minimum the parabola was resting in, then the parabola undergoes a dramatic change in behavior: a “catastrophe,” as it’s been termed.

Note that in the visualization, the line representing the ground moves when the center of mass moves, rather than the parabola itself. I only drew the point corresponding to the global minimum; perhaps it would have been more accurate to include both. Also note that the right hand figure is interactive.

The right hand figure is the so-called catastrophe surface for this machine. For each possible center of gravity $(x, y)$ in the parabola, we give it $z$-coordinates at the critical points of the energy function, in other words, the zeros of the derivative of the potential energy function. That derivative turns out to be a cubic polynomial in this case. The roots of a cubic polynomial vary continuously with the coefficients, and there can be one, two, or three real roots. The catastrophe occurs when the point must “jump” across the fold on the catastrophe surface.
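The root-counting can be made concrete with the standard cusp normal form $V(z) = z^4/4 + a z^2/2 + b z$ (a simplification of the machine's actual potential, which depends on the wedge geometry; the function name is mine). Its critical points are the real roots of $V'(z) = z^3 + a z + b$, and the discriminant of that cubic tells us how many there are:

```python
def num_critical_points(a, b, tol=1e-12):
    """Count real critical points of V(z) = z**4/4 + a*z**2/2 + b*z.

    V'(z) = z**3 + a*z + b, whose number of real roots is governed
    by the cubic discriminant -4*a**3 - 27*b**2.
    """
    disc = -4 * a**3 - 27 * b**2
    if disc > tol:
        return 3   # inside the cusp: two minima and one maximum
    if disc < -tol:
        return 1   # outside the cusp: a single minimum
    return 2       # on the bifurcation set (a degenerate root)

# Crossing the cusp changes the count discontinuously:
# num_critical_points(-3, 0) == 3, while num_critical_points(3, 0) == 1.
```

The curve $-4a^3 - 27b^2 = 0$ in the $(a, b)$ parameter plane is exactly the cusp-shaped bifurcation set: inside it two minima coexist, and crossing it is where a resting state can vanish.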

Two accessible introductions to the subject for anyone who has taken multivariable calculus are Curves and Singularities by J.W. Bruce and P.J. Giblin, and Catastrophe Theory and its Applications by Tim Poston and Ian Stewart. Bruce and Giblin analyze this particular catastrophe in detail in Chapter 1, which they call the Poston catastrophe machine. This critique of catastrophe theory is among the more famous in mathematics. I think it still makes a great read on the types of issues one encounters while doing modeling.

Linear regression, the old fashioned way

Suppose you want to fit an ordinary least squares model to a set of points in $\mathbb{R}^2$, but you zoned out during statistics class and now you’re stuck on a desert island. In fact, all you have are a wooden board, a hammer, a bunch of nails, some good old zero-length springs, frictionless cloth loops, and a frictionless rod.

Pretend your wooden board is the plane, and decide on a grid system. Then nail each of your data points into the board. Attach to each nail a zero-length spring that can only move up or down. On the other end of each spring place a little cloth loop, for hanging something.

Now imagine taking the long frictionless rod, and threading it through each of these loops. The equilibrium state it reaches is the line of best fit!

I made an interactive demonstration of this, but it only works in full screen (click to open): Full screen

You can drag on the rod, but you have to be very accurate with your mouse. Refreshing the page gives a new random set of points.

One can derive the optimality of the equilibrium state using the fact that the potential energy of a zero-length spring extended by distance $u$ is $\frac{k}{2}u^2$, where $k$ is Hooke’s constant. It doesn’t matter what Hooke’s constant is, as long as it’s positive, so we can set it to $k=2$ to eliminate the fraction. One ends up with the total potential energy of the system $\sum_{i=0}^n (Ax_i - y_i)^2$, where $(x_i, y_i)$ are the points (nails), and $A$ is the function that gives the $y$ value of the rod at the point $x_i$. We know $A$ is affine since the rod is straight. This happens to be the loss function for ordinary least squares!

Squaring is convex, and $Ax_i - y_i$ is affine in the parameters of the line. Composing a convex function with an affine one gives a convex function, and adding convex functions gives a convex function. So the whole loss function is convex. If at least two data points have distinct $x$-coordinates, it is strictly convex, and the only minimum of the system is the unique global minimum. In other words, the resting point for the rod attached to the springs is the (unique) line of best fit. Hooray!
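Since the energy is a convex quadratic in the rod's slope and intercept, setting its gradient to zero gives the equilibrium in closed form. Here is a minimal Python sketch (the function name is mine; this is the textbook normal-equations solution, not the physics demo's code):

```python
def best_fit_line(points):
    """Slope a and intercept b minimizing sum((a*x + b - y)**2),
    i.e. the total spring energy of the rod-and-springs machine."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    # The denominator is nonzero exactly when two x-values differ,
    # which is the strict-convexity condition above.
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b
```

For points that already lie on a line, the rod settles exactly on it; for noisy points, it settles on the OLS fit.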

Why do the springs have to go straight down? Well, they don’t have to! If you loosen that restriction, then you end up with a total least squares (TLS) regression instead. A TLS model allows for error not just in the $y$ axis, like OLS, but also in the $x$ axis. For example, suppose you took noisy x-y plane GPS measurements along a straight trail, and you wanted to estimate a line through the actual trail. Since there is error on both $x$ and $y$, one can use TLS. (If we assume the variance on $x$ and $y$ is the same, then we actually drop into a subcase of TLS called orthogonal regression.)
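The orthogonal-regression subcase also has a closed form: the best line passes through the centroid along the principal axis of the centered scatter. A small Python sketch (the function name is mine, and this is the standard principal-axis formula rather than anything from the demo):

```python
import math

def orthogonal_fit(points):
    """Line minimizing the summed squared *perpendicular* distances
    (orthogonal regression, the equal-variance subcase of TLS).

    Returns the centroid (mx, my) and the angle theta of the line
    through it.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Principal axis of the 2x2 covariance matrix:
    # tan(2*theta) = 2*sxy / (sxx - syy).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), theta
```

Compare this with OLS on the same data: when the scatter is tilted, the vertical-spring fit and the perpendicular-spring fit give different lines, and only the latter is symmetric in $x$ and $y$.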

The example was made using planck.js, which is a nice JavaScript port of Box2D, although it is sorely lacking in documentation. For instance, I couldn’t find an easy way to run my demo outside of fullscreen.

This setup came from the book The Mathematical Mechanic, which is full of perversities like this.

This is Sam Zhang's blog of mostly mathematical visualizations.

You can try to read this through an Atom-compatible feed reader, but the posts often rely on embedded HTML and Javascript. For the adventurous, here is the feed.xml.