Fibers of functions


Something that is easy to miss in early calculus classes is that the inverse of a function is typically not a function.  We go through this whole confusing notion first with the square root (because while it is true that if x^2 = 16 then x = \pm 4, we all know that we like +4 better), then with trig functions.  I would argue that it is helpful to think about the inverse of a function as a set, and then point out the wonderful fact that if all the inverses of individual points have only one or zero members, then there is a function g so that g(f(x)) = x.

Typically though, inverse images will have more than one point.  Indeed, for a map f: \mathbb{R}^m \to \mathbb{R}^n, you will expect f^{-1}(y) to be m-n dimensional, if m is bigger than n, and a point otherwise.  Intuitively, this is because we have m equations and n unknowns, leaving us with m-n free variables.  This suggests a way of visualizing functions that I have actually never seen used (references to where it has been used are welcome).

What I have in mind is that, if you have a function f: U \subset \mathbb{R}^m \to \mathbb{R}^n, and it so happens that f(U) can be isometrically embedded back into U by choosing the well from the sets f^{-1}(y), then we may plot the inverse images of f on the same graph as we draw the domain of f.

That last paragraph was confusing, so let me give an example right away.  We will look at the function f which maps from the solid torus (donut) to the real numbers, so

The map of the torus that gives the radius of a point. The line in red is the range of the map. Notice it intersects with every shell exactly once.

that f(x) is the distance of x from the center of the solid torus.  Hence f^{-1}(r) will be the (not solid) torus of radius r. I have made the graph I describe above for this map.  Notice that the image of the torus under f, a circle, is indicated in blue in the left of the graph.

This picture has a nice intuition: each surface will map down to one point (so our intuition earlier holds up- f maps a three dimensional object down onto one dimension, so the inverse images are all two dimensional), so we can easily look at this and see the domain, range and action of f on the domain.  Notice also that to plot this in a traditional manner it would take either 4 dimensions as a graph, or 1 overloaded dimension as a parametric plot.  This particular example *could* be displayed using a movie, though again we would be displaying fibers of the map.

The last image of this sort is where we instead map a torus (again, non-solid) to a circle.  Notice that now the map is from a 2-D surface to a 1-D curve, so we expect (and see that) the fibers to be 1 dimensional.

The inverse images of the torus-radius map, as the radius goes from 0 to 1.

Inverse images of a projection-of-sorts of the torus onto a circle.

Hilbert Spaces and Riesz Representation Theorem

I though about doing this as a regular feature precisely long enough to produce this image.

The Riesz Representation Theorem: There are a lot of theorems which go by this name, but I want to deal today with one particular version.  First, I’ll (briefly) define a Hilbert space, which has been defined here before.  It is a (1) vector space equipped with an (2) inner product which is (3) complete with respect to that inner product.

I’m going to assume some familiarity with item (2)- a vector space being a collection of vectors and scalars, so that you can add vectors and multiply by scalars according to some rules.

Also, those who have seen calculus are likely familiar with (3), an inner product, though often the words “dot product” are used.  Abstractly, an inner product is a function that acts on two vectors and returns a real (or complex, though I’ll ignore that here) number.  We often denote the inner product with brackets, so \langle (1,2),(4,5) \rangle = 1\cdot 4 + 2 \cdot 5 = 14 would be the familiar calculus inner product on \mathbb{R}^2.

An abstract inner product must satisfy symmetry, \langle v,w \rangle = \langle w,v \rangle for vectors v,w; linearity, \langle av+bw,z \rangle = a \langle v,z \rangle + b \langle w,z \rangle for scalars a,b and vectors v,w,z (note that by symmetry, I only need to require linearity for the first coordinate, and I automatically get it in the second); and positive definiteness, that is \langle v,v \rangle > 0 (or 0 if is the zero vector).

Part of an orthogonal basis for (square integrable) functions. See Leonid Kovalev's post for a discussion of a different one.

An inner product allows us to talk about the geometry of a space, in particular angles between vectors, following the famous test for orthogonality (right-angledness) in Euclidean space: and are orthogonal iff v \cdot w = 0.  An important inner product is on the space of square integrable (real valued) functions, denoted L^2.  Then we can define an inner product between two functions (the vectors in this vector space) by \langle f,v \rangle = \int f(x)g(x)~dx.

A Cauchy sequence

Finally, (3), being complete means that any Cauchy sequence converges, and a Cauchy sequence is one so that, roughly, eventually all the terms are close together.   The real numbers are complete, but perhaps more helpfully, the rational numbers are not.  Specifically, the sequence, 3, 3.1, 3.14, 3.141, 3.1415,… will get closer and closer to \pi, but since pi is not a rational number, the sequence will not converge in the rationals.

The Theorem!

Given a Hilbert space H, certainly if we fix a vector v, then the inner product against v defines a (continuous) linear function on H, \phi_v(w) = \langle v, w \rangle.  The surprising fact (theorem!) is that given a continuous linear function f on H, there is a unique vector v so that f(w) = \langle v, w \rangle for all w in H.  Put another way: not only does the inner product on H provide a supply of examples of continuous linear functionals, it provides all of them.

A not Cauchy sequence

One great application of this is that every continuous linear functional defined on square integrable functions is an integral!  F[u] = \int f(x)u(x)~dx for some function f.  It becomes a little less surprising when you realize that part of the trick in the theorem is requiring continuity, which is defined in terms of the inner product.  But hey!  Still pretty cool!

More format changes, a short note

In order to provide more regular, updates, I’m moving to a system where I will look to update each M-F at midnight, so loyal reader(s) can have something to look at over coffee.  This will also give me weekends to get a bit of a backlog of updates, so the rigors of getting through a week don’t mess up updating as much.

Also! So as not to disappoint those who came looking for a snippet of math, next Monday’s post is on a result of Frigyes Riesz (“Did I spell his name right?” “Frig, yes!”), whose teaching style is one to be admired.  From Wikipedia:

He had an uncommon method of giving lectures: he entered the lecture hall with an assistant and a docent. The docent then began reading the proper passages from Riesz’s handbook and the assistant inscribed the appropriate equations on the blackboard—while Riesz himself stood aside, nodding occasionally.

Good work if you can get it.

From Saturday Morning Breakfast Cereal.


How to solve a PDE

I briefly considered trying to make the case that smoke was governed by PDEs. Really, I just like this photo.

So here’s an attempt to summarize how to prove that a solution exists for linear, second order partial differential equations.  This is some of my favorite mathematics, though at first I was really turned off by all this talk about Sobolev spaces and functional analysis, instead of just solving the darn thing.  So I encourage those who have never seen any PDE theory yet to dig in (though maybe some ODE familiarity would be helpful).  Unfortunately, I could not come up with any helpful figures/pictures for today (suggestions welcome!), so you get a potpourri of shots.

First, let me define a second order linear differential operator L, but let me do so by describing what it does to a function u: \mathbb{R}^n \to \mathbb{R} (since a second order linear operator is just something that eats functions and gives back functions, in a linear manner):

Lu = -\sum_{j,k = 1}^n (a^{j,k}(x)u_{x_j}(x))_{x_k} + \sum_{j = 1}^n b^j(x)u_{x_j}(x) + c(x)u(x).

The photos on the page are mine. The dog, unfortunately, is not.

One famous example of such an operator is the Laplacian, denoted \Delta, for which a^{j,k} is 1 when j = k, and 0 otherwise, and b^js and c are zero.  Put more simply, the Laplacian is the sum of the pure second derivatives.  Now we provide first an outline, and then a few details of a proof of existence of solutions.

How to solve Lu = f:

1. Associate to the operator L a bilinear form B.  That is, a function B that eats two functions and gives a number which is linear in each coordinate.
2. Remember from functional analysis that, under certain conditions, there exists a unique u so that  B[u,v] = \lambda(v) for any v,  where \lambda is a linear functional (eats functions, gives a number).
3. Define \lambda(v) := \int f(x) v(x)~dx.
4. Pat yourself on the back for defining B in a clever way so that the u produced actually solves Lu = f in a weak sense.

Again, with slightly more detail:

1. We’ll define
B[u,v]:=\int\left(\sum_{j,k = 1}^na^{j,k}u_{x_j}v_{x_k}+\sum_{j = 1}^nb^ju_{x_j}v+cuv\right)dx,

It's a fish!

where everything is a function of x, but that really clutters up the equation.  A reason you might have thought to do this yourself is that it takes a little pressure off of u.  That is, we only need to require that u has one derivative, and that derivative only has to be defined under an integral sign.  If u happened to have two derivatives, we could integrate the first term by parts, and then we would have B[u,v]=\int Lu\cdot v~dx.  In this beautiful, integrable, differentiable world, if we could show that \int Lu \cdot v~dx = \int f \cdot v~dx for all functions v, then it would be reasonable to conclude that Lu = f.
2. The Lax-Milgram theorem says, roughly, that if H is a Hilbert space, then for a bilinear form B: H \times H \to \mathbb{R} and a (continuous) linear functional \lambda: H \to \mathbb{R}, there is a unique u \in H

Again, I could make the case that gravity is a differential equation, whose solutions are parabolas...

so that B[u,v]=\lambda(v) for each v \in H.  There are some other non-trivial hypotheses on B, which may be translated into hypotheses on L, but let’s revel in success now.
3.-4. These steps were actually described pretty well above.

So we need functional analysis to use this great Lax-Milgram theorem.  We need Sobolev spaces because our favorite function spaces from calculus- C^m, the m-times differentiable functions- are not Hilbert spaces.  Certain Sobolev spaces are, though, and they provide precisely the integrability conditions we need to get precise conditions to guarantee solutions for large classes of PDE.

The 100 words/picture is tough sometimes, but other times I get to put artsy photos in, so everything balances out.

Non-traditional computers

Just saw an article describing an unusual computer that reminded me of two previous articles I’ve seen in the past.  Not many pictures today, but there were a lot the past few days, and the links have plenty of nice pictures.

1. The first article is from a landscape architecture blog, and describes a hydraulic computer, which is specifically used to solve partial differential equations:

“Built in 1936, this machine was “the world’s first computer for solving [partial] differential equations,” which “for half a century has been the only means of calculations of a wide range of problems in mathematical physics.” Absolutely its most amazing aspect is that solving such complex mathematical equations meant playing around with a series of interconnected, water-filled glass tubes. You “calculated” with plumbing.”

I would venture a guess that the system was built around the idea that there aren’t that many really important PDE (in particular, Lawrence Evans’ classic 660page PDE text lists 32), and that by applying the right sort of pressure to water, you can exhibit many of these PDE.

This would be sort of a neat trick: we describe scientific phenomena (i.e., how water moves) using math, but then calculate values of a different phenomena using the first.  It would be like noticing that your car is making a noise that sounds like a distressed cat, and then looking at what distresses your cat to diagnose your car.  Only maybe adding “…in a mathematically rigorous manner” after the words “distressed cat”.  And then realizing that in fact most car troubles can be mimicked by distressed cats.  This analogy has gone too far.


Xiaoji Chen at MIT (now Microsoft, apparently) designed a mechanical computer, and the videos are definitely worth a watch.  I guess really all you need is a way of linking elements together to get a certain amount of logic, but I really liked this project.

3. Wang Tiles are a way of computing, again by just putting together enough logic, but they work in an interesting way.  In particular, you start with a set of square-ish tiles, but some marking on each edge.  Then by placing a certain number of tiles in the first row, there will be a unique way of filling out the tiling, and the pattern may bounce back in a way that conveys some information.  Branko Grunbaum and G.C. Shephard have a fantastic text on general tilings, and how to, for example, add two numbers with Wang tiles.

Below is an example of a calculation.  The 16 Wang tiles are shown top left, and instead of an edge marking, there is an edge coloring.  I don’t know what the top center and top right images refer to, but the bottom image shows the calculation of 5+9=14.  It is not hard to convince yourself that if you require the “outside” of the tiling to be black, then once you place the two pieces with the black circles in the center (at the positions 5 and 9), the rest of the tiling is determined.  The tile with the green circle in the center is the answer.  There are similar ways of multiplying, etc.

Computing with Wang tiles. From

Hausdorff dimension II

I left off yesterday asking how to measure a surface (something sort of 2-dimensional) in 3-space, and noting that Lebesgue measure would give this set 0 area.  This is where Hausdorff measure comes in.  There is in fact a continuum of Hausdorff measures, one for each real number.  First it is helpful to define the diameter of a set, which is just the farthest distance apart of two points in the set.  For a circle, the diameter is the usual definition.  Also, we’ll denote the diameter of a set U by |U|.

The construction of the measure is typically described in two steps: Suppose we are trying to find the m-dimensional Hausdorff measure of a set U.  Then in the first step, we cover U by sets no larger than some number, d, and look at the sum of the diameters of these sets raised to the mth power. The smallest (well, infimum) of these coverings will be the number we associate with the parameter d.  In the second step, we let d go to zero.  The Hausdorff measure is the limit, if it exists.  In symbols,

Two sets with (approximately) the same diameter, indicated with the red line.

H^m_d(U) = inf \sum_{k = 1}^{\infty} |U_k|^m, where |U_k| < d for each k.  Then H^m(U) = \lim_{d \to 0} H^m_d(U).  There is an easy result (see, for example, Falconer’s “Geometry of Fractal Sets“) that says that for any set U, there is a unique value, called the Hausdorff dimension of U so that H^m(U) = \infty \text{ if } 0 \leq d < \dim U and H^d(E) = 0 \text{ if } \dim U< s < \infty.  Note that the Hausdorff measure might be 0 or infinite, but this number will still exist.

We should also note that:

  1. Nowhere in the definition did I use anything but distance, making all these concepts valid in metric spaces, as well as in Euclidean (vector) spaces.
  2. Nowhere in the definition did I require the Hausdorff dimension to be an integer.

Expanding on this second point brings us into discussing fractals, which I have discussed a bit in the past week, but have not defined.  One reason for this is that I have not seen a great definition.  Self similarity is important, as is having a Hausdorff dimension different from its topological dimension.  At the very least, any self similar set whose Hausdorff dimension is not an integer would be given “fractal” status, though this would miss many good examples.  In any case, for any self similar set, the Hausdorff dimension is log(m)/log(s), where m is the number of copies at each step, and s is the (linear) scaling factor.

So if you view the real plane as a self similar set taken by starting with a square, then dividing it into four squares, and then each of those into four squares, and so on, then the number of copies is 4 and the scaling factor is 2 (Remember, it is the linear scaling factor! Each side is half the length it used to be.), so the Hausdorff dimension is log(4)/log(2), which is precisely 2, since \log{4}/\log{2} = \log{2^2}/\log{2} = 2\log{2}/\log{2}.

I have printed below two more examples of fractals and their dimensions.  Thanks to Leonid Kovalev for pointing out how to generate the Sierpinski carpet.  Also see this great wikipedia post for a much more complete list of fractals and their dimensions.

Dimension log(3)/log(2).
Has dimension log(8)/log(3) = 3*log(2)/log(3).

Introduction to Hausdorff Measure

It was not until my third year of graduate school that I thought to ask why geometric measure theory was called geometric measure theory.  It just always seemed like the answer should have been fairly obvious- the questions were very geometric, and everyone likes measures, right?  Well the prototypical problem in GMT is Plateau’s problem, which is roughly: given a closed curve in three space, what surface has the smallest area, and the given curve as a boundary.

A curve in 3-space, and a minimizing surface.

In order to study this problem for a “large enough” collection of sets to guarantee a solution, mathematicians could not rely on nice parameterizations existing over which surface integrals would be taken.  This can be thought of in analogy with the fact that “every nth degree polynomial has n roots” is not true over the real numbers, but it is true over the complex numbers.  But the notion of the area of such a “rectifiable set” is not immediately clear.

Covering a set by squares

More precisely, in mathematics, a “measure” is a function which eats sets, and returns a number that is, intuitively, the “size” of that set.  Perhaps the most commonly used measure- Lebesgue measure- is constructed exactly in this intuitive manner.  We first cover the set with squares (or lines in dimension 1, or cubes in dimension 3, or hypercubes in dimension 4, etc) with side length 1, then squares of side length 0.5, then 0.25, and so on.  By adding up the area (the area of an n-cube we define to be the side length to the nth power) of the squares in each case, we get a sequence, and if (when) that sequence converges, we call the number the (Lebesgue) measure of the set.

The problem with this approach is that if I have a reasonable line sitting in the plane, the Lebesgue measure of that line will be 0.  Similarly for a surface sitting in 3-space.  This is, in some sense, good.  The “length” of a point should be 0, and the “area” of a line should also be 0.  But this is a bad thing in the sense that maybe someone (like Joseph Plateau) is interested in the area of a surface in 3-space.

Covering the same set with smaller squares.

Tomorrow will be more on finding the area of such a surface (hint: the answer is Hausdorff measure)!  And fractals!  Then maybe the discussion will come back to “why is it called geometric measure theory?”