Skip to main content

Proving Another "Real Theorem" with Topos Theory

25 Mar 2024

Another day, another post that starts with “So I was on mse…”, lol. Somebody asked whether maximizing over a compact set is a continuous thing to do. That is, given a continuous function $f : K \times X \to \mathbb{R}$ is the function $x \mapsto \max_{k \in K} f(k,x)$ continuous?

If you’re me, this looks an awful lot like the usual extreme value theorem, but where “everything in sight depends continuously on a parameter from $X$”. Indeed, say we had a family of compact sets $K_x$ depending on a parameter from $X$, and a family of functions $f_x : K_x \to \mathbb{R}$ which is continuous both in the choice of $x$ and in the choice of $k \in K_x$. Then the usual extreme value theorem tells us that we have pointwise maxima $m_x = \max_{k \in K_x} f_x(k)$, and it’s natural to ask whether these maxima also vary continuously in the parameter $x$.

Depending on your experience with bundles, you might be aware that the usual way to encode a “family of sets $A_x$ continuously depending on $x \in X$” is with a single space $A$ (usually called the total space) equipped with a map $\pi : A \to X$. Then the fibres of $\pi$ give us the desired family of sets $A_x = \pi^{-1}(x)$1. As a super concrete example, say we want to represent the constant family $A_x = \mathbb{R}$. Then we should use the trivial bundle $\mathbb{R} \times X$ with the obvious projection map $\pi : \mathbb{R} \times X \to X$2.

I bet you weren’t expecting a cute exercise this early in the post!

Can you show that a family of functions $f_x : A_x \to B_x$ over $X$ is the same thing as a single function $f : A \to B$ making the following diagram commute:

Recall we’re thinking of the sets $A_x$ and $B_x$ as being the fibres $\pi_A^{-1}(x)$ and $\pi_B^{-1}(x)$.

To go from the family $f_x$ to a single function $f$ on the total spaces, it might be worth remembering that the total space $A$ is just the disjoint union of the fibres3 $\sum_{x \in X} A_x$ (suitably topologized).

So the constant family $K_x = K$ of compact sets is represented by the trivial bundle $K \times X$, and (by the above exercise) a family of functions $f_x : K_x \to \mathbb{R}$ is represented by a function $f : K \times X \to \mathbb{R} \times X$ with $\pi_2 (f(k,x)) = x$. Since we know what the second component must be, this is the same data as a function $K \times X \to \mathbb{R}$! This tells us that, if we can prove some “$X$-parameterized” version of the extreme value theorem (where the resulting maxima also depend continuously on a parameter $x \in X$), we can answer the OP’s question by looking at the special case where $K_x$ is a constant family!

If you’ve been exposed to some ideas in topos theory, you should be screaming something to do with the “internal logic” right now! Indeed, a theorem inside the topos of sheaves on $X$ externalizes to give a version of that theorem where everything is automatically continuous in a parameter from $X$. So if we have access to the extreme value theorem inside $\mathsf{Sh}(X)$ (that is, if we have a constructive proof of the extreme value theorem) we would immediately get the parameterized version for free!

So then, we want to get our hands on a constructive proof of the extreme value theorem. Typically constructive theorems which assert the existence of something (in this case, a maximum) require some ~bonus assumptions~ which are invisible in the classical case. This is because existence is a much stronger phenomenon constructively4.

To start, we probably want to be working with locales instead of topological spaces. Since the person on mse was happy with $K$ and $X$ metrizable, they’ll certainly be ok with sobriety since hausdorff-ness already implies it! Since sober spaces and locales (with enough points) are the same thing, we lose nothing by moving to the world of locales. There’s a good notion of compactness for locales, and people have spent a lot of time proving classical theorems constructively in this setting! So, optimistically, we can google "extreme value theorem"+"locale" to see if someone has already done what we’re looking for, and our optimism is quickly rewarded! We find Graham Manuell’s slides from TACL 2022 which give a constructive proof of the following:

If $K$ is a compact, overt, positive locale, and $f : K \to \mathbb{R}$ is continuous, then $f$ has a maximum $\max_K f$ given by a dedekind real.

Let’s say a few words about what all these weird adjectives mean!

First, let’s handle the easiest one: Compact.

The most familiar definition of compactness is that every open cover has a finite subcover. Precisely, if $\mathcal{U}$ is a family of open subsets of $X$ with $X = \bigvee \mathcal{U}$, then there should be a (kuratowski-)finite subfamily $\mathcal{U}_0 \subseteq \mathcal{U}$ so that actually $X = \bigvee \mathcal{U}_0$.

However, constructively it’s better to avoid saying the word “finite” if you can. The notion of finiteness splits into different inequivalent notions (kuratowski-finite, bishop-finite, dedekind-finite, etc) which makes the whole thing a bit of a mess5. For this reason, there’s a different, equivalent definition of compactness which we often prefer:

We say $X$ is called compact if for every directed open cover $\mathcal{U}$6 with $X = \bigvee \mathcal{U}$, we must actually have $X \in \mathcal{U}$!

As a cute exercise, prove these two definitions are equivalent! Try your hardest to make sure your proof is constructive!

There’s one last angle on compactness that I want to emphasize here as well: We say a locale is compact if universal quantification is open. That is, if $U \subseteq K \times X$ is open, we want to know if \(\{ x \mid \forall k . (k,x) \in U \} \subseteq X\) is open. This turns out to be true (for every $X$ and every $U$) if and only if $K$ is compact7!

The next adjective on the list is Overt. This is a kind of tricky notion to understand at first, since it’s invisible classically. We say a locale $O$ is overt iff for any locale $X$ and open subset $U \subseteq O \times X$, the projection \(\{ x \mid \exists o \in O . (o,x) \} \subseteq X\) is open. Note how this is dual to the universal quantifier present for compact sets. Also note that this set is exactly the image of $U$ under $\pi_X : O \times X \to X$. Since, classically, this projection map is always open, we learn that classically every locale is overt8.

There’s an equivalent characterization of overtness, which is often useful in practice. We want to check that the unique map $X \overset{!}{\to} 1$ is open. Since open maps are preserved by pullbacks, this gives the original definition for free. It also shows how this definition is classically invisible, since every subset of the terminal locale is open when the terminal locale is \(\{\star\}\).

However, recall that in a sheaf topos $\text{Sh}(X)$ the terminal locale is $X$! Then a locale in $\text{Sh}(X)$ is just an external locale with a map into $X$, and overtness of the internal locale corresponds to openness of the external map! So there’s obviously no reason to expect every locale to be overt constructively, since it’s easy to find continuous functions that aren’t open.

Lastly, we say that a locale is Positive iff every open cover must be inhabited. That is, if $X = \bigvee \mathcal{U}$ for $\mathcal{U}$ a family of open sets, the family $\mathcal{U}$ must be inhabited.

Recall we say $A$ is inhabited exactly when $\exists a \in A$ is true. In $\text{Sh}(X)$, this means that there’s an open cover of $X$ so that $A$ has a section on every element of the cover. That is, exactly when the structure map $A \to X$ is surjective.

Now, let’s outline the proof from Manuell’s slides9. Recall a Dedekind Real is a pair of cuts $(L,U)$ where

If you haven’t seen this before, we think of $r = (L,U)$ as pinning down a real number by saying how it compares to every rational. If $q \in L$, we say $q \lt r$, and if $q \in U$ we say $q \gt r$. See the nlab page for more. Recall also10 that, a real number in $\text{Sh}(X)$ is a continuous function $X \to \mathbb{R}$. The lower/upper cuts are also useful separately, and externalize to lower/upper semicontinuous functions on $X$.

Now, finally, for the proof idea:

$\ulcorner$ Say $f : K \to \mathbb{R}$ is a map from a positive, overt, compact locale $K$. First we’ll use positive overtness to build

\[L = \{ q \in \mathbb{Q} \mid \exists k \in K . q \lt k \}.\]

Then, we’ll use positive compactness to build

\[U = \{ q \in \mathbb{Q} \mid \forall k \in K . k \lt q \}.\]

Lastly, we need to show that the $L$ and $U$ we just built are compatible. It’s easy to show they don’t overlap, but showing they “have no gap” requires a small argument (which you can find on Manuell’s slide 8). $\lrcorner$

Note how (after externalizing) this is really the same argument that Sangchul Lee gave in the accepted answer on mse!

First we show that $g(x) = \max_K f(k,x)$ is lower semicontinuous (that is, we show that it’s a lower real). This doesn’t use any facts about $K$, since it’s only using positive overtness of $K \times X$, which is always true!

Then we show that $g(x)$ is upper semicontinuous (that is, we show it’s an upper real). This is where we crucially use compactness, both externally and internally.

The compatibility conditions basically amount to checking your upper and lower semicontinuous functions are the same, but since Sangchul’s answer is working with a single function the whole time there’s no need to verify that.

So now we have this constructive theorem. What does it actually tell us after we externalize?

Well, you can chase through the definitions of compact, overt, and positive inside (say) a sheaf topos $\text{Sh}(X)$ and see what you get!

First, as we alluded to in the introduction to this post, a locale $K$ internal to $\text{Sh}(X)$ is exactly a locale map11 $\pi : \Gamma(K) \to X$ where $\Gamma$ is the usual global sections map.

From here, it’s not (too) hard to see that $K$ is compact if and only if $\pi$ is proper, that $K$ is overt if and only if $\pi$ is open, and that $K$ is positive if and only if $\pi$ is surjective12.

So altogether we learn that

If $\pi : A \to X$ is a proper, open, surjection and we have $f : A \to \mathbb{R}$, then

\[x \mapsto \max_{k \in \pi^{-1}(x)} f(k)\]

is continuous.

In particular, we answer the OP’s question! Taking $\pi : K \times X \to X$ to be the usual projection map (which is open and surjective, plus proper since $K$ is compact) we learn that

\[x \mapsto \max_{(k,x) \in \pi^{-1}(x)} f(k,x)\]

is continuous. As desired!

Ok! Thanks for reading, all! This felt like it took forever for what a short post it was, but I had a great time writing it. I’m flying home from the AMS Sectional today, where I gave a talk at the special session on Homotopy Theory and Category Theory in Interaction. I had a great time, and met a lot of really awesome people. It was a small group, which means we had lots of time to hang out and get to know each other.

I’ll write up some stuff about the conference (and my talk) soon, but for now I need to get ready to go to the airport! Stay safe ^_^.

  1. I really don’t like Goldblatt’s Topoi as a book, but his section on the relationship between bundles over $X$ and sets continuously “indexed” by $X$ is super good. It’s chapter 4 section 5, and it has some really helpful pictures and examples! 

  2. Note that it’s possible to have constant fibres, without being the trivial bundle! In this case, even though each fibre is the same as every other, we glue them together in an interesting way! To be the trivial bundle, you need to know the fibres are all the same, and that we glued them together in the most naive way possible. For instance, the trivial $\mathbb{R}$-bundle over the circle $S^1$ looks like a cylinder, since we glue the copies of $\mathbb{R}$ together in the simplest possible way. However, we could slowly “twist” our copies of $\mathbb{R}$ as we move around the circle as we glue them together to get a moebius strip! This is still a continuously varying family of copies of $\mathbb{R}$, but now the bundle is nontrivial! Here’s a picture from wikipedia:

    As a cute exercise, can you come up with two bundles over $S^1$ where each fibre has two elements? One should be a trivial bundle, and the other shouldn’t be. 

  3. Any type theorists in the room are probably screaming Dependent Sum right around now! That’s extremely fair, and I almost put something in the main post about this, but I ended up editing it out. 

  4. As witnessed (pun intended) by the fact that, interpreted in suitable topoi, a constructive existence proof gives rise to free theorems saying there’s an algorithm that produces the desired object, or that the desired object can be defined continuously in a parameter, etc. This is exactly the phenomenon we’re trying to exploit, and we have to do work somewhere! 

  5. I still haven’t taken the time to really familiarize myself with the various notions of finiteness, and how they “feel”. One day soon I want to, though. 

  6. Recall a partial order $(D,\leq)$ is called directed if it’s inhabited and for any $x,y \in D$ their join $x \lor y$ is in $D$ too Edit: There’s a mutual uppoer bound $x,y \leq z$ also in $D$. Thanks to Tom de Jong for this correction!

    This is really hiding kuratowski-finiteness again, since directedness guarantees that for any kuratowski-finite subset $F \subseteq D$ that $\bigvee F \in D$. This is the usual argument that binary joins imply finite (nonempty) joins. 

  7. I was curious about this, and Pedro pointed out that one direction is basically the classical tube lemma. I don’t actually see how to do the other direction (at least quickly) and I don’t really have time to think about it right now. If someone figures it out I’d love to hear about it in the comments! 

  8. Constructively it’s still true that every locale with enough points is automatically overt (see the nlab). It’s a very mild condition, see the discussion here, for instance, or Paul Taylor’s Overt Subspaces of $\mathbb{R}^n$

  9. He, in turn, mentions was based on the treatment in Paul Taylor’s A Lambda-Calculus for Real Analysis. I actually have this paper saved, but it’s long and I wasn’t sure how easy it would be to translate Taylor’s results into language I’m more familiar with. Now that I’ve seen Manuell do it, though, I have plans to read this paper pretty soon! 

  10. I’ve always gotten slightly annoyed, or at least laughed quietly to myself, when authors ask the reader to “recall” some fact that they quite possibly don’t know.

    I really want this post to be done, though (I’ve been working on it for almost a month) and if I have to explain why upper/lower reals correspond to upper/lower semicontinuous functions I’ll never finish… I know, I tried.

    You can find an extremely detailed treatment in Mac Lane and Moerdijk’s Sheaves in Geometry and Logic Chapter VI.8. Understanding this well is probably enough to work out that upper/lower reals correspond to upper/lower semicontinuous functions yourself (which will make a great exercise!). Depending on your experience, you might be helped by my old post on externalizing formulas inside a topos.

    There’s also a great treatment in Johnstone’s Sketches of an Elephant, Chapter D4.7. This works out the case of lower reals and lower semicontinuous functions explicitly, but does so quite quickly. 

  11. In fact we can say exactly which map it is! We want a locale map from $\Gamma(K) \to X$, which means we want a frame homomorphism from $\mathcal{O}(X) \to \mathcal{O}(\Gamma(K)) = \Gamma(\mathcal{O}(K))$.

    But given an open set $U \subseteq X$, we can look at the local sections $\Gamma(\mathcal{O}(K),U)$. This is again a frame, and the restriction map $\rho^X_U : \Gamma(\mathcal{O}(K)) \to \Gamma(\mathcal{O}(K),U)$ has a left adjoint $\sigma_U : \Gamma(\mathcal{O}(K),U) \to \Gamma(\mathcal{O}(K))$.

    The desired map from $\mathcal{O}(X) \to \mathcal{O}(\Gamma(K))$ sends $U \subseteq X$ to $\sigma_U(\top)$, where $\top$ is the top element of $\Gamma(\mathcal{O}(K),U)$. 

  12. I feel kind of bad not proving these facts, since they’re not so hard? But I really am trying to finish this post quickly.

    You can find a lot of this in Johnstone’s Elephant. In particular,

    • Getting a locale map $\pi : \Gamma(K) \to X$ from an internal locale is proposition chapter C1.6.2
    • Compactness of $K$ agrees with properness of $\pi$ is theorem C3.2.8
    • Overtness of $K$ agrees with openness of $\pi$ is lemma C3.1.17
    • We’ve already talked about why positivity means $\pi$ is surjective