# Gracious Living

The Axiom of Choice
November 5, 2010, 06:33
Filed under: Math, Set Theory | Tags: , , , ,

This post is going to be a bit more technical than most, so feel free to skim it if you find it difficult.  You should know the statements of the theorems and the definition of well-ordering, though.

When I stated the ZFC set axioms, I put choice last for a reason.  As I said there, it’s a bit like the parallel postulate in Euclidean geometry (if you don’t know this story, you should read about it — it’s fascinating).  Unlike the other axioms, it’s non-constructive, and it looks complicated enough that you should be able to get it from the other axioms, though it’s in fact independent of them (meaning they can’t prove it or disprove it).  For these reasons, many mathematicians in the early half of the century used choice sparingly, and made it very clear what pieces of their work required it.  But as we’ve seen, we needed it to show that every infinite set contains a sequence, and likewise, many of its consequences are so useful that it’s common nowadays to use it without reservation.  (And yes, Munroe, the Banach-Tarski paradox is a good thing.)

Besides important consequences, there are about four important equivalent statements to the axiom of choice, and that’s what I’ll be talking about today.  Hopefully this will give you an idea of what a choice function is and how powerful it is.  We’ll be showing the following statements are equivalent:

1. The axiom of choice: every set $S$ has a choice function $c:\mathcal{P}(S)\rightarrow S$ such that $c(A)\in A$ for all $A\subset S$.
2. Every Cartesian product of non-empty sets is non-empty.
3. Zorn’s Lemma: Every partially ordered set, in which every totally ordered subset has an upper bound, has a maximal element.
4. Hausdorff Maximum Principle: Every partially ordered set has a maximal totally ordered subset.
5. Well-Ordering Theorem: Every set can be well-ordered.

Below the fold, definitions of these terms and proofs of equivalence.

In the post on the ZFC axioms, I asked you to prove that (1) was equivalent to (2). If you didn’t figure it out, here’s the proof. For the record, we know that finite products of nonempty sets are nonempty by induction. The infinite case requires choice, though. Let $\{A_i\}$ be a set of sets indexed by $i\in I$. An element of $\prod_{i\in I}A_i$ is a ginormous ordered tuple, indexed by $I$, whose $i$th coordinate is an element of $A_i$; that is, it’s a simultaneous choice of an element out of $A_i$ for every $i$. But if we know the axiom of choice, we can simply create a choice function $c$ on $\bigcup_{i\in I} A_i$, and then the tuple whose $i$th coordinate is $c(A_i)$ is an element of $\prod_{i\in I}A_i$. Conversely, if every infinite product of non-empty sets has an element, then given a set $B$, find an element of $\prod_{P\in\mathcal{P}(B)-\{\emptyset\}}P$; the function sending $P$ to the $P$th coordinate of this element is a choice function for $B$.

For the latter three statements, we need some more terminology from order theory. Recall that a partially ordered set (or poset) is a set with a relation $\le$ that is reflexive, antisymmetric, and transitive. This doesn’t mean the relation compares every pair of elements: if it does (so for any two elements, one is less than the other) we call the set a totally ordered set (I advocate the term “toset”). The upper bound of a subset $T\subset P$ is an element $s\in P$ such that $s\ge t$ for all $t\in T$. This isn’t necessarily an element of $T$. A maximal element is an element $m\in P$ such that there is no element $p\in P$ with $p>m$. In the Hausdorff Maximum Principle, the word “maximal” is being used differently: a maximal totally ordered subset is a subset such that, if you added any other element to it, it would no longer be totally ordered. Finally, a well-ordered set is a toset such that every subset $T\subset P$ has a least element, an element $t\in T$ such that $t\le u$ for every $u\in T$.

The remaining equivalences can be proved using a cyclic proof: a proof that (1) implies (4), which implies (3), which implies (5), which implies (1). First assume the axiom of choice: we’re going to prove the Hausdorff Maximum Principle. This is the longest and hardest part. We start with a lemma:

Lemma. Let $X$ be a set and $P\subset\mathcal{P}(X)$. $P$ is a poset under inclusion. Assume that $\emptyset\in P$, that if $A\subset B\in P$ then $A\in P$ as well, and that $P$ contains the union (which is the least upper bound) of each of its totally ordered subsets. Then $P$ has a maximal element (in both senses).

Proof. Let $c$ be a choice function for $X$. We’ll first define a “growing map” that sends each element of $P$ to a larger element, if possible. Define $f:P\rightarrow\mathcal{P}(X)$ by $f(A)=\{x\in X:A\cup\{x\}\in P\}$. So $f(A)\supset A$ in general, and $f(A)=A$ iff $A$ is maximal. But we don’t know that $f(A)$ is in $P$:
$f(A)$ could have many more elements than $A$, and we only know that adding any one of them gives a set in $P$. So define $g:P\rightarrow\mathcal{P}(X)$ by $g(A)=A\cup\{c(f(A)-A)\}$ whenever $f(A)\ne A$, and $g(A)=A$ if $f(A)=A$. Again, $g(A)\supset A$, and $g(A)=A$ iff $A$ is maximal. But this time, $g(A)$ only has one more element, at most, than $A$.

We want to look at the things you can get from applying $g$ a ludicrous number of times. Let a tower be a subset $T$ of $P$ such that $\emptyset\in T$, $T$ contains the union of each of its totally ordered subsets, and $g(A)\in T$ for all $A\in T$. $P$ itself is a tower, so we know that a tower exists. Let $R$ be a totally ordered tower, and let $U=\bigcup R$. Then $g(U)\subset U$, but $U\subset g(U)$ by definition, so $g(U)=U$. Thus $U$ is maximal. So we just need to find a totally ordered tower.

A common thing to do in these situations is to check the condition for the smallest possible thing. So I claim that the intersection of the set of towers, which I’ll call $I$, is a totally ordered tower. It’s easy to check that it satisfies the tower definition. For the ordering condition, define $C\subset S$ as the set of elements of $S$ that are comparable (either supersets or subsets) with every element of $I$. Then $I$ is totally ordered iff $I\subset C$, because then every element of $I$ is comparable with every other. Since $I$ is a subset of every tower, we just have to show that $C$ is a tower.

Since the empty set is a subset of everything, $\emptyset\in C$. Let $D$ be a totally ordered subset of $C$; then if $B\in I$ is not a subset of $\bigcup D$, it cannot be a subset of any element of $D$, so it is a superset of every element of $D$, and thus it is a superset of $\bigcup D$. So every set $B$ is comparable with $\bigcup D$, so $\bigcup D\in C$. Finally, we need to show that for every $A\in C$, we have $g(A)\in C$, which means that every element of $I$ is comparable with $g(A)$. Since $A\subset g(A)$, we can weaken this to $f(A)\subset B$ or $B\subset A$ for every $B\in I$; if this holds, then $C$ is a tower, so $I$ is a toset, and the theorem is true.

Brace yourself — the easiest way to prove this is to use yet another tower! But we’re near the end. Fix a set $A\in C$, and define $E_A$ to be the set of elements $B\in I$ such that $g(A)\subset B$ or $B\subset A$. Again, if $E_A$ is a tower for an arbitrary $A$, then $E_A\subset I$, so $E_A=I$, and since $A$ was arbitrary, we’re done. It’s obvious that $\emptyset\in E_A$; likewise, if $D$ is a totally ordered subset of $E_A$, then $\bigcup D\subset I$ since $I$ is a tower, and if $g(A)\not\subset\bigcup D$ then $g(A)$ is not a subset of any element of $D$, so they all contain $A$ instead, so $A\subset D$. Finally, we have to check that $E_A$ is closed under the action of $g$. Let $B\in E_A$, so $g(B)\in I$. If $g(A)\subset B$, then $g(A)\subset g(B)$, so $g(B)\in E_A$. If $B=A$, then $g(A)=g(B)$, so $g(B)\in E_A$. If $B\subsetneq A$, then by definition of $C$, either $g(B)\subset A$ or $A\subset g(B)$. But if $A\subsetneq g(B)$, then since $B\subsetneq A$, we’ve added at least two elements to get from $B$ to $g(B)$. So $g(B)\subset A$, and then $B\in E_A$.

To summarize: we proved that $E_A=I$; since $A$ was arbitrary, then $E_A=I$ for every $A$; this proved the second part of the tower condition for $C$; since $C$ was a tower, it contained $I$, and it was defined so that this meant $I$ was a toset; since $I$ was a tower toset, $\bigcup I$ was a fixed point of $g$; thus, $\bigcup I$ was the maximal totally ordered subset we wanted. This proves the lemma. $\square$

Armed with our faithful lemma, the Hausdorff Maximum Principle is just one step further.

Proof.  Given a poset $X$, let $C$ be the set of totally ordered subsets of $X$.  Clearly $\emptyset$ is totally ordered; subsets of totally ordered sets are still totally ordered; and given a toset under inclusion $D\subset C$ (a toset of tosets in $X$), if $A,B\in\bigcup D$, then there are elements $E,F\in D$ such that $A\in E,B\in F$, and either $E\subset F$ or vice versa, but either way, $A$ and $B$ are comparable.  So $C$ fits the lemma, and thus it has a maximal element. $\square$

Theorem. The Hausdorff Maximum Principle implies Zorn’s Lemma.

Proof.  Let $P$ be a poset in which every totally ordered subset has an upper bound.  Let $M$ be a maximal totally ordered subset by the HMP, and let $m$ be an upper bound for $M$.  If any element $a$ has $a>m$, then $a$ is greater than all elements of $M$ by transitivity, so $M\cup\{a\}$ is a totally ordered subset.  But $M$ was supposed to be maximal, so no $a$ has $a>m$.  Thus, $m$ is a maximal element of $P$.

Theorem.  Zorn’s Lemma implies the Well-Ordering Theorem.

Proof.  Let $X$ be a set and let $\mathcal{A}$ be the set of pairs $(A,\le)$ where $A\subset X$ and $\le$ is a well-ordering of $A$.  We can define a partial ordering $\prec$ on $\mathcal{A}$ by saying $(A,\le)\prec(A^\prime,\le^\prime)$ whenever $A\subset A^\prime$, $\le$ agrees with $\le^\prime$ on $A$, and if $x\in A$, then $y\in A$ for all $y\in A^\prime$ with $y\le^\prime x$.  (We say that $A$ is an initial segment of $A^\prime$.)  It’s easy to see that this is a partial order on $\mathcal{A}$.  If $\mathcal{T}=\{(T,\le_T)\}$ is a totally ordered subset, then $(\bigcup T,\bigcup\le_T)$ is an upper bound (the union of the relations means that $x\left(\bigcup\le_T\right)y$ if $x\le_T y$ for some $T\ni x,y$).  By Zorn’s lemma, $\mathcal{A}$ has a maximal element $(M,\le_M)$ for $\prec$.  If $x\in X-M$, then we can define $(M^\prime,\le_{M^\prime})$ by $M^\prime=M\cup\{x\}$ and $\le_{M^\prime}$ restricting to $\le_M$ on $M$, but with $x$ larger than all members of $M$.  Then $(M,\le_M)\prec (M^\prime,\le_{M^\prime})$, which contradicts maximality.  Thus, $M=X$ and $\le_M$ well-orders $X$.  $\square$

Theorem. The Well-Ordering Theorem implies the Axiom of Choice.

Proof.  Given a set $X$, well-order it.  The choice function just sends each subset to its least element! $\square$

That’s all, folks!