interactivemathsoftware-engineering

February 5th, 2018

In this post, I want to look at some straightforward examples of a seemingly-paradoxical truth:

It can be **easier** to prove a **more abstract** statement!

Since people often seem inclined to think `abstract == hard`

, this may be a bit of a surprising
claim. How could a more abstract statement be *easier* to prove than a concrete one?

Let's think about a few cases where abstract beats concrete. We'll start in arithmetic, then work our way over to geometry and, finally, software engineering.

Here is a claim about numbers that I would like you to prove:

$$ 865423174 \cdot 4824239984 = 4824239984 \cdot 865423174 $$A very concrete theorem, if there ever was one! How should we prove it? Concretely? That's easy enough: just do the long multiplications on each side, and check that the results are the same.

Is this proof really satisfactory? First of all, you had to do a bunch of nitty-gritty work to complete the proof, calculating two long products (I hope you didn't make any mistakes along the way!) Second, if I ask you the same question with two new numbers, you have to start your work all over again. And finally, your proof (while perfectly sound) really does not give much insight about what is going on.

But I bet you didn't actually compute that product. I *bet* you reasoned "I know that
multiplication of numbers is commutative, so \(x \cdot y = y \cdot x\) for *any* numbers
\(x\) and \(y\). This equation is just a special case!"

And in fact, the *proof* that multiplication of (natural) numbers is commutative is
easy enough that you could work it out in *less time* than it would take to actually perform
the multiplication by hand!

So here we have an example of two theorems, a concrete one and an abstract one, where the
abstract theorem implies the concrete one *and* is easier to prove. Besides that, it is
more informative: the abstract formulation helps explain *why* the equation holds, not just
*that* it holds.

So this particular abstract statement happened to be easier to prove. Is this a general phenomenon, or did I just pick a loaded example?

In some sense, we should not be surprised that more abstract statements are easier to prove. We don't have a good, quantifiable metric for "easiness" of a proof, so let's try this approximation: we could say statement \(A\) is "no harder to prove" than statement \(B\) if every proof of \(B\) can be automatically converted to a proof of \(A\).

There may be many different ways to prove the abstract statement \(\forall x, y.~ x \cdot y = y \cdot x\). And each one of those proofs can *also* be used to prove that \(865423174 \cdot 4824239984 = 4824239984 \cdot 865423174\). On the other hand, there may be *additional* ways to prove \(865423174 \cdot 4824239984 = 4824239984 \cdot 865423174\) which will not generalize to proofs for all \(x\) and \(y\). That is, the abstract theorem is "no harder to prove" than the concrete one (and is, perhaps, "easier" too)

So there is a definite sense in which proving a more abstract statement may be simpler than proving a concrete statement: there may just be fewer ways of proving the abstract statement! In the concrete case, you may have a million ways that a proof could start. But in the abstract case, you're often "on rails" to some extent, intentionally limiting the possible next steps your proof could take.

By abstracting away irrelevant details, we're left with a smaller conceptual space to go searching for a proof.

The arithmetic example is a bit simplistic, so let's do something more fun.

The idea for this post actually came about while I was reading The Pythagorean Theorem for Babies with my kids. The author clearly did some good thinking about how to express the concepts behind the Pythagorean theorem in a simple way; for example, I love his definition of a right triangle:

There are many ways to make a triangle, but this book is about a special kind of triangle.
If you cut a rectangle in half like this [...], you get a shape called a right triangle.
A right triangle has three sides, just like any other triangle, but it's special because
a square fits perfectly into one of the corners.

I thought the *definition* of a right triangle as half a rectangle was pretty clever.
Anyway, the book goes on to prove the Pythagorean theorem using a pretty standard
dissection proof.
I've always found these a little unsatisfactory, since they involve the squares you care
about, plus some extra junk, and you have to do a bit of light accounting to show that the
extra junk is balanced.

All this got me thinking about how to present my favorite proof of the
Pythagorean theorem to my kids. And since *that* proof is an instance of "abstract is simple",
here we are! So, on to the proof.

First, we'll be making use of a certain non-trivial scale-invariance property of Euclidean geometry. The property we care about is this:

Depending on your perspective, this statement may be any of

obviously true by geometric arguments (Theorem 8.4 here plus decomposition methods, roughly),

obviously true by dimensional analysis (as in this very nice survey of scaling methods by Amitabha Ghosh),

not obvious at all (but nevertheless true for Euclidean geometry), or

We'll gloss over these (very interesting!) details and simply assert the scaling lemma as an axiom of Euclidean geometry, at least for "nice enough" figures.

Recall the Pythagorean theorem that we all know and love (or perhaps just tolerate):

We're going to collaborate to create a simple proof of a *generalized* Pythagorean theorem,
from which the traditional Pythagorean theorem will fall out immediately. Specifically, we'll
prove this:

Actually, even though I said "we're going to collaborate", *you* have to do the work
down below here. It's a bit of a puzzle; tinker around with the diagram below until the proof
is complete!

Drag triangle to move, or drag corners to rotate and resize. Click the edges to reflect the attached shape. Use the menu to experiment with different shapes.

Remember what you're trying to prove: that the area of the figure on the long side is equal to the sum of the areas of the figures on the other two sides. Once you've found the right configuration, the proof will appear out of thin air!

**Generalized Pythagorean Theorem:** Suppose a right triangle has vertices \(A,B,C\), and a figure is
drawn on the edge \(\overline{AB}\). Copy the figure to the other two edges using similarity
transformations that move \(\overline{AB}\) to \(\overline{AC}\) and \(\overline{BC}\).
Then the area of the figure on the largest edge is equal to the sum of the areas of the two other
figures.

**Proof:** By the scaling lemma, it is enough to prove this
for any *specific* figure of our choosing. Why? Suppose we can prove the Generalized Pythagorean Theorem for *some specific* shape \(A\), so we know $$ \textsf{Area}(A) = \textsf{Area}(A') + \textsf{Area}(A'')$$ or, equivalently,
$$ 1 = \frac{\textsf{Area}(A')}{\textsf{Area}(A)} + \frac{\textsf{Area}(A'')}{\textsf{Area}(A)}$$
Now apply the scaling lemma, using *any other* shape \(B\), to get
$$ 1 = \frac{\textsf{Area}(B')}{\textsf{Area}(B)} + \frac{\textsf{Area}(B'')}{\textsf{Area}(B)}$$
or, equivalently,
$$ \textsf{Area}(B) = \textsf{Area}(B') + \textsf{Area}(B'')$$

This means we only need to prove the theorem for *one* shape, *of our choosing*, and we get all other shapes for free.

Let's try to prove it for...

**Corollary (Classical Pythagorean Theorem):** If \(C\) is the length of the
longest side of a right triangle, and \(A\), \(B\) are the lengths of the other two sides, then
$$C^2 = A^2 + B^2$$

**Proof:** This is a specific case of the Generalized
Pythagorean Theorem, when the figures are squares.

A similar effect appears in software engineering, where *increasing* the amount of polymorphism in
a function's type *decreases* the number of possible implementations it may have. For example,
there are 16 functions of type \(\texttt{Bool} \times \texttt{Bool} \to \texttt{Bool}\), but only *two*
functions of type \(\forall T.~ T \times T \to T\).

In programming languages that support parametric polymorphism, you can use this effect to create lightweight unit tests that still offer guarantees about their normal, heavyweight behavior.

For example, imagine that we have a program that carries out some algorithms on certain graphs. Let's suppose these graphs are quite complex and heavyweight; they might take a lot of space on disk, or have very expensive-to-compute neighbor functions, or maybe they just are the result of a very long-running computation. Perhaps they are just difficult to construct because of unwieldy APIs.

Now suppose that you have written a depth-first search algorithm for these graphs. It might have a type something like this:

```
dfs :: ExpensiveGraph -> List ExpensiveVertex
```

To write unit tests for `dfs`

, you'll need to construct a bunch of `Graph`

s to use as inputs.
Where will you get them? No option is really excellent; you first might build somes graphs as part of the test setup code, but they could be very expensive to compute on demand. So maybe you'll
set up some kind of cache or database, but you'll need to teach your test harness about it
(and hope it stays synchronized with the main code as versions change!). Building the graphs
by hand might be unpleasant if the APIs are difficult to use. And if the graphs are derived from
production data somehow, you'll need a test version of the database to work with.

All this to test a depth-first search algorithm? Why does this seem like overkill? Because all of
the testing artifacts are only there to get us some `ExpensiveGraph`

s to work with; none of it
actually helps test the *implementation* of `dfs`

at all!

Instead, let's follow the mantra of "abstract is simple", and *parameterize* `dfs`

by the type
of graph we are working with! We could imagine introducing a generic graph type `Graph v`

,
where `v`

is the type of vertices in this graph. Then we would write something like

```
data Graph v = ...
type ExpensiveGraph = Graph ExpensiveVertex
dfs :: Graph v -> List v
```

Now in our testing code, we can ignore `ExpensiveGraph`

altogether, just like we ignored
squares to prove the Pythagorean theorem, and ignored numbers to prove that a certain product commuted! Instead, we can test `dfs`

on some easily-constructed graph type. Parametric polymorphism
ensures that the behavior of `dfs`

cannot vary for different choices of `v`

, so we are free to
test `dfs`

using whatever graph type we prefer!

Abstraction sometimes gets cast in a negative light, as an academic tool that is
disengaged from concrete real-world domains like applied mathematics or software
engineering. But abstraction is a powerful tool exactly *because* it lets us deal with
concrete problems, giving us powerful tools for reasoning and understanding. Hone your
powers of abstraction while keeping one eye on the real-world problems you care about,
and reap the benefits!