Chris Grossack's Blog
https://grossack.site
Embedding Dihedral Groups in Vanishingly Small Symmetric Groups<p>After the long and arduous process of writing my previous posts on
homotopy theories and $\infty$-categories, it’s nice to go back to a
relaxed post based on an <a href="https://math.stackexchange.com/questions/4491025/the-smallest-number-for-faithful-operation/4491030#4491030">mse question</a> I answered the other day.
Nature is healing ^_^.</p>
<p>This post was asking about the smallest set on which certain groups can faithfully
act. I actually wrote about this same idea almost exactly a year ago
in a blog post <a href="/2021/08/16/embedding-dihedral-groups-efficiently.html">here</a>, and while answering this question I remembered
that I wanted to follow up on that post. I mentioned some asymptotic result
at the end of it that’s super hard to use,
and I was sure I could get a more legible result if I wanted to<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>
<p>Well, when answering this question I was reminded that I want to!</p>
<hr />
<p>First, let’s recall what question we want to answer:</p>
<p>In the <a href="/2021/08/16/embedding-dihedral-groups-efficiently.html">last post</a>, we showed that the dihedral group of an $n$-gon’s symmetries
\(D_{2n}\) can embed into the symmetric group \(\mathfrak{S}_m\) even if $m \lt n$.
For instance, \(D_{2 \cdot 6} \hookrightarrow \mathfrak{S}_5\).</p>
<p>A perfectly natural question, then, is <em>how much smaller</em> can we make this?</p>
<p>We showed that if $n = \prod p_i^{k_i}$ is the prime factorization of $n$,
then $D_{2n} \hookrightarrow \mathfrak{S}_{\sum p_i^{k_i}}$.
For instance, since $6 = 2 \cdot 3$, we have
\(D_{2 \cdot 6} \hookrightarrow \mathfrak{S}_{2+3} = \mathfrak{S}_5\).</p>
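<p>As a quick sanity check, we can witness this embedding concretely. The sketch below (plain Python; the specific generators are my own choice, not from the original posts) sends a rotation of the hexagon to the permutation $(1\,2)(3\,4\,5)$ and a reflection to $(4\,5)$, then verifies that these generate a subgroup of $\mathfrak{S}_5$ of order $12$ satisfying the dihedral relation $srs = r^{-1}$.</p>

```python
# A sketch verifying D_{2*6} -> S_5.  Permutations on {0,...,4} are
# tuples where perm[i] is the image of the point i.
r = (1, 0, 3, 4, 2)   # (0 1)(2 3 4): order 6, our "rotation"
s = (0, 1, 2, 4, 3)   # (3 4): order 2, our "reflection"
e = (0, 1, 2, 3, 4)   # the identity

def compose(a, b):
    """(a after b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(b)))

# Close {e, r, s} under multiplication by the generators.
group = {e, r, s}
while True:
    bigger = group | {compose(a, g) for a in group for g in (r, s)}
    if bigger == group:
        break
    group = bigger

print(len(group))   # 12, the order of D_{2*6}

# The dihedral relation s r s = r^{-1} (here r^{-1} = r^5):
r_inv = compose(compose(r, r), compose(r, compose(r, r)))
print(compose(compose(s, r), s) == r_inv)   # True
```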
<p>With this in mind, it seems entirely believable that we can get the ratio to
be <em>very</em> small.</p>
<p>As is usually the case, before trying to prove this, I wanted to compute
some values and see how quickly things are decreasing. This wasn’t really
necessary in hindsight, but it did make for some pretty pictures!</p>
<hr />
<p>First, here’s a plot of the minimal $m$ so that $D_{2n} \hookrightarrow \mathfrak{S}_m$.</p>
<div class="auto">
<script type="text/x-sage">
N = 100
data = [(n, sum(a^b for (a,b) in list(factor(n)))) for n in range(2,N)]
scatter_plot(data).show(axes_labels=['$n$', '$m$'])
</script>
</div>
<p>You can see that the maximal value $m=n$ occurs exactly at the
prime powers. However, you can <em>also</em> see that the minimal values can be
substantially smaller.</p>
<p>To get a handle on just how much smaller, let’s plot the ratios $m/n$ instead.</p>
<div class="auto">
<script type="text/x-sage">
N = 100
data = [(n, sum(a^b for (a,b) in list(factor(n)))/n) for n in range(2,N)]
scatter_plot(data).show(axes_labels=['$n$', '$m/n$'])
</script>
</div>
<div class="boxed">
<p>As a (very) quick exercise, for which $n$ do we hit the ratio $1$? How often
do these occur?</p>
<p>As a less quick exercise, set $N = 1000$ in the above code and notice how
we stratify into multiple limiting ratios. Can you tell what some of these
ratios are?</p>
</div>
<p>Lastly, let’s take the minimal ratio we’ve seen so far</p>
<div class="auto">
<script type="text/x-sage">
N = 100
rmin = 1
data = []
for n in range(2,N):
    r = sum(a^b for (a,b) in list(factor(n)))/n
    if r < rmin:
        rmin = r
    data += [(n,rmin)]
scatter_plot(data).show(axes_labels=['$n$', '$\\min_{k \\leq n}\\ m(k)/k$'])
</script>
</div>
<p>I would love to find a nice curve upper bounding this last scatter plot,
but it seems like a possibly tricky problem in number theory. Formally:</p>
<div class="boxed">
<p>Define</p>
\[f(n) \triangleq \displaystyle \min_{\prod p_i^{k_i} \leq n} \frac{\sum p_i^{k_i}}{\prod p_i^{k_i}}\]
<p>Can you find asymptotics for $f$?</p>
</div>
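<p>For anyone who wants to experiment, here is a plain-Python sketch of $f$ (using trial-division factoring; the helper names are my own, and exact fractions avoid any floating-point fuzz):</p>

```python
from fractions import Fraction

def prime_power_sum(n):
    """m(n): the sum of the prime-power components p_i^{k_i} of n."""
    total, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            pk = 1
            while n % p == 0:
                n //= p
                pk *= p
            total += pk
        p += 1
    if n > 1:          # a leftover prime factor
        total += n
    return total

def f(n):
    """min over 2 <= k <= n of m(k)/k, as an exact fraction."""
    return min(Fraction(prime_power_sum(k), k) for k in range(2, n + 1))

print(f(6))    # 5/6, first achieved at n = 6 = 2*3
print(f(100))  # 1/6, achieved at n = 84 = 2^2 * 3 * 7
```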
<p>If anyone wants to take a stab at this
(or any other problems related to this sequence<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">2</a></sup>),
I would love to hear what you find ^_^.</p>
<hr />
<p>Whatever the asymptotics are, it’s clear that $f(n) \to 0$. That is,
for any $\epsilon$ you like, there’s a dihedral group $D_{2n}$ embedding into
a symmetric group $\mathfrak{S}_m$ with $m \lt \epsilon n$. I don’t know why, but this
feels remarkable to me. It says that somehow we can get away
with practically no objects at all in order to faithfully represent a dihedral
group action. Said another way, there’s some $n$-gon whose symmetries can be
faithfully represented in the symmetries of only $\frac{n}{1000000}$ many points.</p>
<p>For completeness, let’s give a quick proof of this fact. It’s fairly easy,
so I encourage you to try it yourself! In fact, it’s quite easy to prove
various strengthenings of this fact. For instance, the proof we’re about
to give shows that we can take $n$ to be a product of $2$ primes, and is
easily tweaked to allow us to put lots of bonus conditions on the prime
factorization of $n$.</p>
<p>$\ulcorner$
Let $\epsilon \gt 0$.</p>
<p>Pick two primes $p,q \gt \frac{2}{\epsilon}$. Then, by the results in
the <a href="/2021/08/16/embedding-dihedral-groups-efficiently.html">last post</a>, the dihedral group $D_{2pq}$ embeds into the symmetric group
$\mathfrak{S}_{p+q}$. But now</p>
\[\frac{m}{n} = \frac{p+q}{pq} = \frac{1}{q} + \frac{1}{p} \lt \epsilon\]
<p>That is, $f(pq) \lt \epsilon$, so that $f(n) \to 0$.
<span style="float:right">$\lrcorner$</span></p>
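<p>To see the bound in action (a sketch; the particular primes are my own choice), take $\epsilon = \frac{1}{100}$, so that we need primes exceeding $2/\epsilon = 200$:</p>

```python
from fractions import Fraction

eps = Fraction(1, 100)
p, q = 211, 223            # two primes, both > 2/eps = 200

m, n = p + q, p * q        # D_{2n} embeds into S_m
ratio = Fraction(m, n)     # = 1/p + 1/q
print(ratio < eps)         # True: 434/47053 < 1/100
```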
<hr />
<p>It’s nice to write up a quick relaxing post for once. I thought about trying
to answer some of the number theoretic questions that I posed throughout
(as well as a few that I didn’t), but I really wanted to keep this quick.
Plus, I suspect a lot of these questions will be somewhat simple, and I might
keep them in my back pocket in case a younger undergraduate comes to me
looking for a project.</p>
<p>Also there’s an <em>extra</em> reason to be excited about this post:
it gave me an excuse to submit another sequence to the OEIS!
The inputs for which $f(n)$ changes are
interesting, and were not in the OEIS already, so I submitted them
while I was writing this up! If all goes well, they should be available as
<a href="https://oeis.org/A354424">A354424</a> in the near future<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>.</p>
<p>It’s also nice to go back to an older post and give it the closure it really
deserves. The asymptotic result I cited there is borderline unusable, and
this post answers the question that I really <em>wanted</em> to ask in that post<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup>.</p>
<p>Take care, and stay safe all ^_^. Talk Soon!</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>While writing that post all those months ago, I certainly did <em>not</em>
want to. If I remember right, I finished writing that post at like
3 or 4 in the morning… <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>For instance, how long can the wait time be before we see a better ratio?</p>
<p>Precisely, let $a_n$ be the sequence of values so that $f(a_n) \neq f(a_n - 1)$.
How far apart can $a_n$ be from $a_{n+1}$?</p>
<p>Said another way, how large are the gaps in <a href="https://oeis.org/A354424">A354424</a>? <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>I don’t know why, but submitting to the OEIS always makes me kind of
giddy with excitement! <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>It also brought up a whole host of other questions! This is a huge part
of the fun of math for me – there’s always more to explore. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Sat, 16 Jul 2022 00:00:00 +0000
https://grossack.site/2022/07/16/efficient-embedding-followup.html
https://grossack.site/2022/07/16/efficient-embedding-followup.htmlWhy Care about the "Homotopy Theory of Homotopy Theories"? (Homotopy Theories pt 4/4)<p>It’s time for the last post of the series! Ironically, this is the post that
I meant to write from the start. But as I realized how much background
knowledge I needed to provide (and also internalize myself), various sections
got long enough to warrant their own posts.
Well, three posts and around $8000$ words later, it’s finally time!
The point of this post will be to explain what people mean when they
talk about the “homotopy theory of homotopy theories”, as well as to
explain why we might care about such an object.
After all – it seems incredibly abstract!</p>
<p>Let’s get to it!</p>
<hr />
<p>Let’s take a second to recap what we’ve been talking about over the course
of these posts.</p>
<p>We started with relative categories. These are categories $\mathcal{C}$
equipped with a set of arrows $\mathcal{W}$ (called <em>weak equivalences</em>)
which we think of as morally being isomorphisms, even if they aren’t
<em>actually</em> isos in $\mathcal{C}$. The classical examples are topological
spaces up to homotopy equivalence, and chains of $R$-modules up to
quasiisomorphism.</p>
<p>In the first post, we defined the <em>localization</em> (or the <em>homotopy category</em>)
$\mathcal{C}[\mathcal{W}^{-1}]$, which we get by freely inverting the arrows
in $\mathcal{W}$. We say that a <em>homotopy theory</em> is a category of the form
$\mathcal{C}[\mathcal{W}^{-1}]$ up to equivalence.</p>
<p>Unfortunately, homotopy categories (to use a technical term)
suck. So we introduce <a href="https://en.wikipedia.org/wiki/Model_category">model structures</a> on $(\mathcal{C}, \mathcal{W})$,
which let us do computations in $\mathcal{C}[\mathcal{W}^{-1}]$ using the
data in $\mathcal{C}$. Model structures also give us a notion of
<a href="https://ncatlab.org/nlab/show/Quillen+equivalence">quillen equivalence</a>, which allows us to quickly guarantee that two
relative categories present the same homotopy theory (that is, they have
equivalent localizations)<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>
<p>Unfortunately again, model categories have problems of their own. While
they’re great tools for computation, they don’t have the kinds of nice
“formal properties” that we would like. Most disturbingly, there’s no good
notion of a functor between two model categories.</p>
<p>We tackled this problem by defining <em>simplicial categories</em>, which are
categories that have a <em>space</em> worth of arrows between any two objects,
rather than just a set. We call simplicial categories
(up to equivalence) $\infty$-categories.</p>
<p>Now, we know how to associate to each relative category $(\mathcal{C}, \mathcal{W})$
an $\infty$-category via <em>hammock localization</em>.
Surprisingly, (up to size issues), <em>every</em> $\infty$-category arises from a
pair $(\mathcal{C}, \mathcal{W})$ in this way.
With this in mind, and knowing how nice the world of $\infty$-categories is,
we might want to say a “homotopy theory” is an $\infty$-category rather than
a relative category.
Intuitively, the facts
in the previous paragraph tell us that we shouldn’t lose any information by
doing this… But the correspondence isn’t <em>actually</em> one-to-one.
Is there any way to remedy this, and put our intuition on solid ground?</p>
<p>Also, in the <a href="quasicategories">previous post</a> we gave a second definition of
$\infty$-category, based on <a href="https://en.wikipedia.org/wiki/Quasi-category">quasicategories</a>!
These have some pros and some cons compared to the simplicial category approach,
but we now have <em>three different definitions</em> for “homotopy theory”
floating around. Is there any way to find our way out of this situation?</p>
<hr />
<p>To start, recall that we might want to consider two relative categories
“the same” if they present the same homotopy theory. With our new,
more subtle tool, that’s asking if</p>
\[L^H(\mathcal{C}_1, \mathcal{W}_1) \simeq L^H(\mathcal{C}_2, \mathcal{W}_2)\]
<p>but wait… There’s an obvious category $\mathsf{RelCat}$ whose objects
are relative categories and arrows
\((\mathcal{C}_1, \mathcal{W}_1) \to (\mathcal{C}_2, \mathcal{W}_2)\) are functors
\(\mathcal{C}_1 \to \mathcal{C}_2\) sending each arrow in \(\mathcal{W}_1\) to
an arrow in \(\mathcal{W}_2\).</p>
<p>Then this category has objects which are <em>morally</em> isomorphic
(since they have equivalent hammock localizations), but which are not
<em>actually</em> isomorphic…</p>
<p>Are you thinking what I’m thinking!?</p>
<p>$\mathsf{RelCat}$ <em>itself</em> forms a relative category, and
in this sense, $\mathsf{RelCat}$ becomes itself a homotopy theory whose
objects are (smaller) homotopy theories!</p>
<p>We can do the same thing with simplicial categories (resp. quasicategories)
to get a relative category of $\infty$-categories. In fact, all three of
these categories admit model structures, and are quillen equivalent!</p>
<p>This makes precise the idea that relative categories and $\infty$-categories
are really carrying the same information<sup id="fnref:25" role="doc-noteref"><a href="#fn:25" class="footnote" rel="footnote">2</a></sup>!</p>
<p>In fact, there’s a <em>zoo</em> of relative categories which all have the
same homotopy category as $\mathsf{RelCat}$. We say that these are
models of the “homotopy theory of homotopy theories”, or equivalently,
that these are models of $\infty$-categories<sup id="fnref:27" role="doc-noteref"><a href="#fn:27" class="footnote" rel="footnote">3</a></sup>.</p>
<div class="boxed">
<p>If you remember earlier, we only gave a tentative definition of a
homotopy theory. Well now we’re in a place to give a proper definition!</p>
<p>A <span class="defn">Homotopy Theory</span>
(equivalently, an <span class="defn">$\infty$-category</span>) is an object
in any (and therefore every) one of the localizations of the categories we’ve just discussed.</p>
</div>
<p>Perhaps unsurprisingly, we can do the same simplicial localization
maneuver to one of these relative categories
in order to get an $\infty$-category of $\infty$-categories!</p>
<p>But why care about all this?</p>
<p>It tells us that (in the abstract) we can make computations with either
simplicial categories or quasicategories – whichever is more convenient
for the task at hand. But are there any more concrete reasons to care?</p>
<hr />
<p>Remember how, all those words ago in the <a href="model-categories">first post</a> of this series,
I mentioned that hammock localization <em>works</em>, but feels somewhat
unmotivated. Foreshadowing with about as much grace as a young
fanfiction author, I asked if there were some more conceptual way to
understand the hammock construction, which shows us “what’s really going on”.</p>
<p>Well what’s the simplest example of a localization? Think of the category
$\Delta^1$ with two objects and one arrow:</p>
\[0 \longrightarrow 1\]
<p>Inverting this arrow gives us a category with two objects and an isomorphism
between them, but of course this is equivalent to the terminal category
$\Delta^0$ (which has one object and only the identity arrow).</p>
<p>So then how should we invert all of the arrows in $\mathcal{W}$? It’s not
hard to see that this pushout, intuitively, does the job:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/oo-pushout.png" width="33%" />
</p>
<p>Here the top functor sends each copy of $\Delta^1$ to its corresponding
arrow in $\mathcal{W}$, and the left functor sends each copy of $\Delta^1$
to a copy of $\Delta^0$. Then the pushout should be $\mathcal{C}$, only we’ve
identified all the arrows in $\mathcal{W}$ with the points $\Delta^0$.
This is exactly what we expect the (simplicial) localization to be, and
it turns out that in the $\infty$-category of $\infty$-categories, this
pushout really does the job!</p>
<p>For more about this, I really can’t recommend the youtube series
<a href="https://www.youtube.com/watch?v=3IjAy0gHRyY&list=PLsmqTkj4MGTDenpj574aSvIRBROwCugoB"><em>Higher Algebra</em></a> by Homotopy Theory Münster highly enough.
Their goal is to give the viewer an idea of how we <em>compute</em> with
$\infty$-categories, and what problems they solve, without getting
bogged down in the foundational details justifying exactly why these
computational tools work.</p>
<p>Personally, that’s <em>exactly</em> what I’m looking for when I’m first learning
a topic, and I really appreciated their clarity and insight!</p>
<hr />
<p>With that last example, we’re <em>finally</em> done! This is easily the most
involved (series of) posts I’ve ever written, so thanks for sticking through
it!</p>
<p>I learned a <em>ton</em> about model categories and $\infty$-categories while
researching for this post, and I’m glad to finally have a decent idea of
what’s going on. Hopefully this will be helpful for other people too ^_^.</p>
<p>Stay safe, all, and I’ll see you soon!</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Note, however, that while most examples of two model categories with the
same homotopy theory come from quillen equivalences, this does not have
to be the case. See <a href="https://mathoverflow.net/questions/135717/examples-of-non-quillen-equivalent-model-categories-having-equivalent-homotopy-c">here</a> for an example. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:25" role="doc-endnote">
<p>When I was originally conceiving of this post, I wanted this to be the
punchline.</p>
<p>The “homotopy theory of homotopy theories” is obviously <em>cool</em>, but it
wasn’t clear to me what it actually <em>did</em>. I was initially writing up this
post in order to explain that I’ve found a new reason to care about heavy
duty machinery: Even if it doesn’t directly solve problems, it can allow
us to make certain analogies precise, which we can maybe only see from a
high-abstraction vantage point.</p>
<p>Fortunately for me, but unfortunately for my original outline for this post,
while writing this I’ve found <em>lots</em> of other, more direct, reasons to care about
this theory! So I’ve relegated this original plan to the
footnote you’re reading… <a href="https://youtu.be/YFX7SOLoi1Y?t=246">right. now</a>. <a href="#fnref:25" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:27" role="doc-endnote">
<p>See Julia Bergner’s <em>A Survey of $(\infty,1)$-Categories</em>
(available <a href="https://arxiv.org/pdf/math/0610239.pdf">here</a>) for more. <a href="#fnref:27" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Mon, 11 Jul 2022 04:00:00 +0000
https://grossack.site/2022/07/11/homotopy-of-homotopies.html
https://grossack.site/2022/07/11/homotopy-of-homotopies.htmlAn Interlude -- Quasicategories (Homotopy Theories pt 3/4)<p>In the <a href="infinity-categories">previous post</a>, we defined $\infty$-categories as being
categories “enriched in simplicial sets”. These are nice, and
fairly quick to introduce, but if you start reading the
$\infty$-category theoretic literature, you’ll quickly run into
another definition: <span class="defn">Quasicategories</span>.
In this post, we’ll give a quick introduction to quasicategories,
and talk about how they’re related to the $\mathcal{S}$-enriched
categories we’ve come to know and love.</p>
<hr />
<p>First, then, a reminder:</p>
<p>A <span class="defn">Simplicial Set</span> is a (contravariant)
functor from the <a href="https://en.wikipedia.org/wiki/Simplex_category">simplex category</a> to $\mathsf{Set}$.</p>
<p>This seems strange at first, but it turns out that we can model
many different kinds of behavior using simplicial sets.</p>
<p>Most notably, simplicial sets are a common generalization of both
topological spaces (up to homotopy) and categories!</p>
<p>We’ve already seen how simplicial sets can represent topological spaces up to
homotopy (this is given by a quillen equivalence of model structures on
$\mathsf{Top}$ and $s\mathsf{Set}$). To a simplicial set, we associate its
<a href="https://ncatlab.org/nlab/show/geometric+realization#OfSimplicialSets">geometric realization</a>, and to a topological space we associate its
(singular) simplicial set $\text{Sing}_X$ where $\text{Sing}_X(\Delta^n)$
is the set of continuous maps from the topological $n$-simplex into $X$.</p>
<p>We know that $X$ is determined up to (weak) homotopy equivalence by
$\text{Sing}_X$, so we previously defined a
<span class="defn">Simplicial Category</span> to be a category enriched in
simplicial sets. We wanted to intuitively view this as a category with a
geometric space of arrows between any two objects, as this came up in our
study of $\infty$-categories. This turns out to be
a great definition for intuition, but it can be a bit difficult to work with.</p>
<p>I mentioned earlier that simplicial sets can <em>directly</em> generalize categories…
Maybe there’s some way to define $\infty$-categories directly as simplicial
sets as well?</p>
<p>The answer will be “yes”, but let’s start small. Is there a way for us to
recognize when a simplicial set is $\text{Sing}_X$ for some topological
space $X$?</p>
<hr />
<div class="boxed">
<p>The <span class="defn">$i$th Horn</span> of $\Delta^n$ is (geometrically)
the space we get by removing the interior of $\Delta^n$, along with the
face opposite the $i$th vertex<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. We denote it $\Lambda^n_i$.</p>
</div>
<p>For instance, let’s look at $\Delta^2$. This is a solid triangle, and it
has three horns – one for each vertex.</p>
<p>Concretely, $\Delta^2$ is given by</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/delta2.png" width="50%" />
</p>
<p>Then the $0$th horn of $\Delta^2$, denoted $\Lambda^2_0$, is what we get by
removing the interior 2-cell, along with the 1-cell opposite $0$:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-2-0.png" width="50%" />
</p>
<p>Analogously, we get $\Lambda^2_1$ by removing the interior 2-cell and the
1-cell opposite $1$:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-2-1.png" width="50%" />
</p>
<p>and we get $\Lambda^2_2$ by removing the interior 2-cell and the 1-cell
opposite $2$:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-2-2.png" width="50%" />
</p>
<p>What about the horns of $\Delta^3$? Well now, we remove the interior 3-cell
(the “volume” of the simplex) as well as the 2-cell opposite your favorite
vertex. Concretely, we see $\Lambda^3_0$ is given by<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">2</a></sup></p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-3-0.png" width="50%" />
</p>
<p>Similarly, $\Lambda^3_1$ is</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-3-1.png" width="50%" />
</p>
<p>$\Lambda^3_2$ is</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-3-2.png" width="50%" />
</p>
<p>and $\Lambda^3_3$ is</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/horn-3-3.png" width="50%" />
</p>
<div class="boxed">
<p>As a (quick?) exercise, you should try to write down a definition of
$\Lambda^n_i$ as a simplicial set.</p>
<p>Remember that, by the yoneda lemma, it (more or less) suffices to say what the
$k$-cells are for each $k$<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup>.</p>
</div>
<hr />
<p>Now then, we come to an important definition</p>
<div class="boxed">
<p>A simplicial set $X$ is called a
<span class="defn">Kan Complex</span>
if every horn $\Lambda^n_i$ in $X$ can be
“filled” by a $\Delta^n$.</p>
</div>
<p>In a commutative diagram, we ask that the following dotted
morphism should always exist:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/filling-horns.png" width="33%" />
</p>
<p>Why care about this? Because of the following major theorem:</p>
<div class="boxed">
<p>For every topological space $X$, $\text{Sing}_X$ is a kan complex.</p>
<p>Moreover, (up to weak equivalence), every kan complex arises in this way.</p>
</div>
<p>So we see that we can completely recover the notion of topological space
(up to homotopy) by looking at special simplicial sets… But wasn’t this
all supposed to have something to do with category theory?</p>
<hr />
<p>Just like every topological space $X$ defines a simplicial set
\(\text{Sing}_X\), every category <em>also</em> defines a simplicial set,
called the <span class="defn">Nerve</span> of the category $\mathcal{C}$.</p>
<p>In general, the $n$-cells in the nerve $\mathcal{N}(\mathcal{C})$ will be
given by the “paths” of length $n$ made of arrows in $\mathcal{C}$. That is</p>
<ul>
<li>The 0-cells will be objects of $\mathcal{C}$</li>
<li>The 1-cells will be the arrows, $C_0 \to C_1$</li>
<li>The 2-cells will be the paths of length 2, $C_0 \to C_1 \to C_2$</li>
<li>The 3-cells will be the paths of length 3, $C_0 \to C_1 \to C_2 \to C_3$</li>
<li>etc.</li>
</ul>
<p>Concretely, let’s look at the following category (where $k = hf = hg$):</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/cat.png" width="50%" />
</p>
<p>Then its nerve should have three 0-cells ($A$, $B$, and $C$),
plus 1-cells for $f$, $g$, $h$, and the composite $k$
(notice this is only <em>one</em> 1-cell, since it’s a single arrow in $\mathcal{C}$).
However, we add <em>two</em> 2-cells:</p>
\[A \overset{f}{\to} B \overset{h}{\to} C\]
\[A \overset{g}{\to} B \overset{h}{\to} C\]
<p>since $k$ arises as a composite in <em>two</em> ways: $hf$ and $hg$.</p>
<p>Thus, the nerve of $\mathcal{C}$ is a <em>cone</em></p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/nerve.png" width="50%" />
</p>
<p>Perhaps a better way to visualize this is as a <em>disk</em> instead:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/nerve2.png" width="50%" />
</p>
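<p>We can sketch this count in code (plain Python; the encoding of the category is my own): the nondegenerate 2-cells of the nerve are exactly the composable pairs of non-identity arrows.</p>

```python
from itertools import product

# The category from the picture: objects A, B, C; non-identity arrows
# f, g : A -> B and h : B -> C, plus the composite k = hf = hg : A -> C.
arrows = {          # name -> (source, target)
    "f": ("A", "B"),
    "g": ("A", "B"),
    "h": ("B", "C"),
    "k": ("A", "C"),
}

# Nondegenerate 2-cells of the nerve: composable pairs of arrows.
two_cells = sorted(
    (a, b)
    for a, b in product(arrows, arrows)
    if arrows[a][1] == arrows[b][0]    # target of a = source of b
)

print(two_cells)   # [('f', 'h'), ('g', 'h')] -- exactly two 2-cells
```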
<p>Of course, it’s easy to guess the next question. Can we tell
<em>which</em> simplicial sets arise as the nerve of some category?</p>
<p>Again, the answer is <em>yes</em>, and the answer will look shockingly similar
to the case of topological spaces!</p>
<div class="boxed">
<p>A simplicial set is called a
<span class="defn">Quasicategory</span>
if every “inner horn” has a fill.</p>
<p>That is, every horn $\Lambda^n_i$ should have a fill, except
when $i=0$ or $i=n$.</p>
</div>
<p>This should make sense as a definition, since in a category <em>composition</em>
tells us that we can fill inner horns!</p>
<p>Indeed, consider the inner horn $\Lambda^2_1$:</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/inner-horn.png" width="50%" />
</p>
<p>If this diagram lives inside the nerve of a category $\mathcal{N}(\mathcal{C})$,
then we can always fill the horn! Indeed, we have a 1-cell from $0 \to 2$
given by $gf$. We also have a 2-cell filling this triangle given by
the path $0 \overset{f}{\to} 1 \overset{g}{\to} 2$.</p>
<p style="text-align:center;">
<img src="/assets/images/quasicategories/filled-inner-horn.png" width="50%" />
</p>
<div class="boxed">
<p>As a cute exercise, you should check that the two inner horns
$\Lambda^3_1$ and $\Lambda^3_2$ have fills in the nerve of a category.</p>
</div>
<hr />
<p>This all brings us to another major theorem:</p>
<div class="boxed">
<p>For every category, the nerve $\mathcal{N}(\mathcal{C})$ is a
quasicategory.</p>
<p>Moreover, if $X$ is a quasicategory where each inner horn has a
<em>unique</em> fill, then $X$ is isomorphic to the nerve of some category.</p>
<p>Moreover again, the nerve construction embeds the category of categories
fully and faithfully into the category of quasicategories.</p>
</div>
<p>Importantly, this means that <em>every kan complex is a quasicategory</em>! This tells us
that quasicategories allow us to treat spaces and categories on equal footing<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">4</a></sup>!</p>
<p>In particular, quasicategories give us a setting where we can
“do homotopy theory” to categories, and if you remember back to
the main post about $\infty$-categories, and how they solve the
formal issues with model categories, that’s exactly what we were looking for!</p>
<div class="boxed">
<p>Here’s another tentative definition. If this reminds you of the tentative
definition of a “homotopy theory” from the last post, you have good instincts.</p>
<p>An <span class="defn">$\infty$-category</span> is a quasicategory, where we
say two quasicategories <em>present the same $\infty$-category</em> if they are
<a href="https://ncatlab.org/nlab/show/model+structure+on+simplicial+sets#joyals_model_structure">homotopy equivalent</a> as simplicial sets.</p>
</div>
<p>This is great because it gives us a super concrete way of working with
$\infty$-categories.</p>
<p>In this framework, our categories literally are geometric objects! The
category of simplicial sets is a topos, so it has as many nice constructions
as we could want, and many of these preserve kan complexes and quasicategories.</p>
<p>For instance, now functors are just simplicial maps, many limits and colimits
of quasicategories can be computed as with geometric objects, exponentials
give us functor quasicategories, and all of these work as well as we could hope.</p>
<p>Because of this concreteness,
Jacob Lurie’s tomes on $\infty$-categories are primarily based on the
language of quasicategories. But in the main post, we defined an
$\infty$-category to be a category enriched in spaces…</p>
<p>How can we reconcile these viewpoints? Is there a way for us to apply
the machinery proven in Lurie’s books to the hammock localization of
a model category? Is there some way for us to
use these nice geometric definitions for computations involving quasicategories,
and leverage them to understand the simplicial categories we’ve been talking
about? Why have we given two seemingly unrelated definitions of an
$\infty$-category in the first place?</p>
<p>For answers to these questions and more, read on to the <a href="homotopy-of-homotopies">last post</a> in this
series<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">5</a></sup>!</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Super concretely, given $n+1$ vertices</p>
\[0 \lt 1 \lt \ldots \lt n-1 \lt n\]
<p>the $i$th horn $\Lambda^n_i$ is $\Delta^n$ minus two cells:</p>
<ul>
<li>the unique $n$-cell</li>
<li>the unique $n-1$ cell which doesn’t contain $i$</li>
</ul>
<p><a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Sorry if these are hard to understand. Drawing 3d pictures is hard, haha.
Each image is made up of 1-cells (colored in black) as well as 2-cells
(shaded in blue). Moreover, in each picture we’ve omitted exactly one
2-cell from the boundary of the tetrahedron. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>If it’s not clear what role yoneda plays in this situation, see
my answer <a href="https://math.stackexchange.com/questions/4475159/conceptualizing-presheaves-as-generalized-spaces/4475219#4475219">here</a>.</p>
<p>It’s also definitely worth reading Friedman’s
<em>An Elementary Illustrated Introduction to Simplicial Sets</em>
(available <a href="http://arxiv.org/abs/0809.4221">here</a>). <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>I originally put this in the main body, but I ended up deciding it
ruined the flow of the post too much. That said, I still think it’s
a fun (and enlightening) example, so I wanted to include it as a
~bonus exercise~ here:</p>
<div class="boxed">
<p>Show that two isomorphic categories give rise to homeomorphic nerves
(after taking <a href="https://ncatlab.org/nlab/show/geometric+realization">geometric realizations</a>, of course).</p>
<p>Then, show that two <em>equivalent</em> categories give rise to homotopy equivalent
nerves.</p>
</div>
<p><a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Thank goodness! I have been working on these for <em>so</em> long.
I really didn’t realize how big a project I was in for when I decided
to make a post clarifying the relationship between model categories
and $\infty$-categories…</p>
<p>I love how it’s turning out, but I’m so ready for these posts to be
behind me, haha. I also didn’t want to post any of them until they
were all done. In part because I spent a lot of time moving various
bits back and forth between posts, and once one is public I would want
to consider it (mostly) set in stone. But also because I wanted to make
sure I actually finished them all.</p>
<p>I have a bad habit of starting a series and leaving things unfinished
(rest in peace <a href="/2021/03/01/cohomology-intuitively.html">cohomology part 1</a>),
but it was important that I not do that to these posts. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Mon, 11 Jul 2022 03:00:00 +0000
https://grossack.site/2022/07/11/quasicategories.html

Why Care About Infinity Categories? (Homotopy Theories pt 2/4)

<p>The title of this post is slightly misleading. It will be almost entirely
about $\infty$-categories, but it <em>will</em> have a focus on how
$\infty$-categories solve some of the formal problems with model categories
that we outlined in <a href="model-categories">part 1</a> of this series.</p>
<p>With the intro out of the way, let’s have a quick recap.</p>
<hr />
<p>The setting we’re interested in is a category $\mathcal{C}$ equipped with
some arrows $\mathcal{W}$ that morally <em>should</em> be isomorphisms, but which
aren’t (these are called <em>weak equivalences</em>, and a pair $(\mathcal{C}, \mathcal{W})$
is called a <em>relative category</em>).
The quintessential examples are the (weak) homotopy equivalences in
$\mathsf{Top}$, and the quasi-isomorphisms of chain complexes.</p>
<p>We want to build a new category
(called the <em>homotopy category</em> or <em>localization</em>)
$\mathcal{C}[\mathcal{W}^{-1}]$ where we freely invert all the arrows in
$\mathcal{W}$<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. The issue is that this category is quite badly behaved.
For instance, even if $\mathcal{C}$ is (co)complete,
$\mathcal{C}[\mathcal{W}^{-1}]$ almost never is.</p>
<p>A <a href="https://en.wikipedia.org/wiki/Model_category">model structure</a> on a pair $(\mathcal{C}, \mathcal{W})$ is a choice of
two new families of arrows, called <em>fibrations</em> and
<em>cofibrations</em>, plus axioms saying how they relate to
each other and to the weak equivalences $\mathcal{W}$.</p>
<p>A model structure on $(\mathcal{C}, \mathcal{W})$ solves many of the
computational problems with $\mathcal{C}[\mathcal{W}^{-1}]$. For instance,
now the homotopy category is guaranteed to be locally small, and we have
effective techniques for computing homotopy (co)limits,
derived functors, and guaranteeing that two relative categories have
equivalent localizations. All of these techniques go through
<em>bifibrant replacement</em>, which is a generalization of the familiar
projective and injective resolutions from homological algebra.</p>
<p>Unfortunately, as we noted at the end of the last post, model categories are
not without flaws. As soon as we want to talk about relationships <em>between</em>
model categories, we start running into trouble. For instance, there’s
no good notion of a functor between model categories.</p>
<p>What, then, are we to do? The answer is to fully accept homotopy theory as
part of our life, and to move to a notion of “category” which is better
equipped to handle the geometry which is implicit in the machinery of
model categories.</p>
<hr />
<p>To start, we need a model of “spaces” that’s more convenient to work with
than topological spaces.</p>
<div class="boxed">
<p>A <span class="defn">Simplicial Set</span> is a functor
$X : \Delta^{\text{op}} \to \mathsf{Set}$.</p>
<p>Here $\Delta$ (the <em>simplex category</em>) is the category whose objects are
the nonempty finite totally ordered sets, with order preserving maps.</p>
</div>
<p>Every object in $\Delta$ is isomorphic to an object of the form</p>
\[0 \lt 1 \lt 2 \lt \ldots \lt n\]
<p>which we call $\Delta^n$.</p>
<p>We should think of $\Delta^n$ as representing the <a href="https://en.wikipedia.org/wiki/Simplex">$n$-simplex</a>,
where we’ve chosen a total order on the $n+1$ vertices. This allows us
to orient all the edges in a consistent way<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
<p>Now, to every topological space $X$ we can associate a simplicial set
$\text{Sing}_X$, where $\text{Sing}_X(\Delta^n)$ is the set of continuous maps
from the (topological) $n$-simplex into $X$. This construction is familiar
from, for instance, <a href="https://en.wikipedia.org/wiki/Singular_homology">singular homology</a>.</p>
<p>It turns out that \(\text{Sing}_X\) perfectly remembers the homotopy type of $X$,
in the sense that we can recover $X$ (up to homotopy equivalence) knowing
only the simplicial set \(\text{Sing}_X\). This is some justification that
simplicial sets really <em>can</em> be used to model homotopy types.</p>
<p>But there’s a natural follow up question: Can we recognize when a
simplicial set is \(\text{Sing}_X\) for some space $X$? The answer,
of course, is <em>yes</em>, but we’re not going to talk about that here.
This topic deserves its own post, which is <a href="quasicategories">part 3</a>.</p>
<p>For now we’ll content ourselves with the knowledge that
simplicial sets, up to homotopy equivalence, represent spaces
in a convenient way. For those following along, this is the
Quillen equivalence between $s\mathsf{Set}$ and $\mathsf{Top}$ that
we mentioned in the previous post!</p>
<p>We’ll write $\mathcal{S}$ for the homotopy category of $s\mathsf{Set}$
(equivalently of $\mathsf{Top}$), and we’ll call the objects of this
category <em>spaces</em>.</p>
<hr />
<div class="boxed">
<p>A <span class="defn">Simplicial Category</span> is a category
<a href="https://en.wikipedia.org/wiki/Enriched_category">enriched</a> in $\mathcal{S}$.</p>
<p>That is, given two objects $A$ and $B$ in our category, we have a
<em>space</em> of morphisms $\text{Map}(A,B)$.</p>
<p>There is a natural notion of equivalence between two simplicial categories,
called <em>Dwyer-Kan equivalence</em>, and we (tentatively) say an
<span class="defn">$\infty$-category</span> is a simplicial category
up to DK-equivalence.</p>
<p>Given an $\infty$-category $\mathcal{C}$, there’s a natural way to get an
ordinary category back from it. We keep the same objects,
but replace the space of arrows $\text{Map}(A,B)$ by its
set of connected components. We call this the
<span class="defn">Homotopy Category</span> of $\mathcal{C}$.</p>
</div>
<p>As an example, we might have two arrows $f,g : A \to B$,
which are connected by homotopies $H$ and $K$. In this case we have a
“circle’s worth” of arrows from $A \to B$:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/circle.png" width="30%" />
</p>
<p>When we take connected components, these two maps $f$ and $g$ are
identified, so we only see one arrow in the homotopy category.</p>
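Concretely, taking connected components here is just a graph computation on the $0$- and $1$-cells of each mapping space. A small python sketch (the encoding is mine) of the example above:

```python
def pi0(vertices, edges):
    """Connected components of the space of arrows:
    vertices are 0-cells (arrows A -> B), edges are 1-cells (homotopies)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for v in vertices:
        if v in seen:
            continue
        component, stack = set(), [v]
        while stack:
            w = stack.pop()
            if w not in component:
                component.add(w)
                stack.extend(adj[w])
        seen |= component
        components.append(frozenset(component))
    return components

# Two arrows f, g : A -> B joined by two homotopies H and K (a circle of arrows):
print(len(pi0(["f", "g"], [("f", "g"), ("f", "g")])))  # 1: one arrow in Ho(C)
```

The homotopy category only sees `len(pi0(...))` many arrows, forgetting the circle entirely.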
<p>We’ll see another, more realistic example, once we know how to build an
$\infty$-category from a relative category $(\mathcal{C}, \mathcal{W})$.</p>
<hr />
<p>So how <em>do</em> we build an $\infty$-category from a relative category?
This is called the
<span class="defn">Hammock Construction</span><sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, and while we won’t spend
<em>too</em> much time on this, it’s worth at least a few words!</p>
<p>We want to build up a space of maps $A \to B$.</p>
<p>The $0$-cells will be zigzags of the form:</p>
<ul>
<li>
\[A \to B\]
</li>
<li>
\[A \to C_0 \overset{\sim}{\leftarrow} B\]
</li>
<li>
\[A \to C_0 \overset{\sim}{\leftarrow} C_1 \to B\]
</li>
<li>
\[A \to C_0 \overset{\sim}{\leftarrow} C_1 \to C_2 \overset{\sim}{\leftarrow} B\]
</li>
<li>etc.<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup></li>
</ul>
<p>where the left-pointing arrows are weak equivalences. Notice that, after we
invert the weak equivalences, the left facing arrows will have inverses,
which will compose with the other right facing arrows to give an
honest-to-goodness arrow $A \to B$. That is, the $0$-cells really are
maps $A \to B$ in $\mathcal{C}[\mathcal{W}^{-1}]$.</p>
<p>Now, a $1$-cell from $f$ to $g$ should be a “proof” that $f$ and $g$ are homotopic.
If $f$ and $g$ have the same length, then we can represent this with a
“hammock”<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/1hammock.png" width="50%" />
</p>
<p>Similarly, a $k$-cell will be a hammock that is $k$ strands “wide”.
Here’s the picture from the original paper:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/wide-hammock.png" width="50%" />
</p>
<p>As before, the vertical arrows should all be weak equivalences, as should all
the backwards facing arrows in the zigzags.</p>
<p>Of course, we need the <a href="https://en.wikipedia.org/wiki/Simplicial_set#Face_and_degeneracy_maps_and_simplicial_identities">face maps</a> between simplices. We get the $i$th face map
by just omitting row $i$ from the hammock. We get the $i$th degeneracy map
by repeating the $i$th row, using the identity map in each column as the weak
equivalence.</p>
<div class="boxed">
<p>The resulting $\infty$-category $L^H(\mathcal{C}, \mathcal{W})$ is called the
(Hammock) Simplicial Localization of $(\mathcal{C}, \mathcal{W})$.</p>
<p>Notice that the homotopy category of $L^H(\mathcal{C}, \mathcal{W})$
is equivalent to the category $\mathcal{C}[\mathcal{W}^{-1}]$, so we can
recover the “classical” localization from the hammock localization
whenever we want.</p>
</div>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/hammock.gif" width="50%" />
</p>
<hr />
<p>Let’s see an example. Consider the category</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/small-cat.png" width="50%" />
</p>
<p>where $hf = hg = k$.</p>
<p>Say we want to invert $h$. After doing this we see
that $f = g$, since we must have $f = h^{-1} k = g$. Thus we have
<em>lost</em> information in the homotopy category that we didn’t necessarily
plan on losing when we inverted $h$. Does the hammock localization
solve this problem?</p>
<p>Let’s compute $\text{Map}(A,B)$.</p>
<p>It should have a $0$-cell for each zigzag. So we’ll have $0$-cells</p>
<ul>
<li>$A \overset{f}{\to} B$</li>
<li>$A \overset{g}{\to} B$</li>
<li>$A \overset{k}{\to} C \overset{h}{\underset{\sim}{\leftarrow}} B$</li>
</ul>
<p>For $1$-cells, we’ll have two small hammocks:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/eg-hammock-f.png" width="50%" />
</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/eg-hammock-g.png" width="50%" />
</p>
<p>So $\text{Map}(A,B)$ is the space:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/map-a-b.png" width="50%" />
</p>
<p>which is, of course, contractible. Crucially, though, this space is
capable of remembering more information about how the maps $A \to B$
in $\mathcal{C}[\mathcal{W}^{-1}]$ were built from maps in $\mathcal{C}$,
and how these relate to each other.</p>
<div class="boxed">
<p>Consider instead the category</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/another-cat.png" width="50%" />
</p>
<p>where $h_1 f = h_1 g = k = h_2 f = h_2 g$.</p>
<p>Show that, after inverting $h_1$ and $h_2$, $\text{Map}(A,B)$ is a circle.</p>
</div>
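If you want to sanity-check your answer combinatorially: assuming (as in the worked example above) that the relevant part of $\text{Map}(A,B)$ is the graph whose $0$-cells are $f$, $g$, and the two zigzags through $h_1$ and $h_2$, and whose $1$-cells are the four analogous hammocks, then a circle should show up as first Betti number $1$. Here's a small python sketch (the helper is my own, using union-find):

```python
def betti_01(vertices, edges):
    """b0 and b1 of a graph (a 1-dimensional space):
    b0 = number of components, b1 = |E| - |V| + b0."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for u, v in edges:
        parent[find(u)] = find(v)
    b0 = len({find(v) for v in vertices})
    return b0, len(edges) - len(vertices) + b0

# 0-cells: f, g, and the zigzags through h1 and h2; 1-cells: the four hammocks.
cells0 = ["f", "g", "k,h1", "k,h2"]
cells1 = [("f", "k,h1"), ("g", "k,h1"), ("f", "k,h2"), ("g", "k,h2")]
print(betti_01(cells0, cells1))  # (1, 1): connected with one loop, i.e. a circle
```

Of course, this only checks the $1$-skeleton; the exercise also asks you to see why no $2$-cells fill in the loop.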
<hr />
<p>This is great, but it’s far from clear how we would do this for more
complicated categories! This is ok, though. Remember that
$\infty$-categories are conceptually nice, but if you have a
<em>specific</em> category in mind, model categories are
where we actually want to do our computations.</p>
<p>This raises the question, though: just how nice <em>are</em> $\infty$-categories?
After all, this abstraction needs to justify itself somehow.</p>
<p>Let’s start with the most obvious issue with model categories: the lack of
functors between them. Once we pass to $\infty$-categories, then we know
exactly what to do! We have a good theory of functors between enriched
categories, and you can probably guess what the correct idea should be:</p>
<div class="boxed">
<p>If $\mathcal{C}$ and $\mathcal{D}$ are simplicial categories, then a
<span class="defn">Simplicial Functor</span> $F : \mathcal{C} \to \mathcal{D}$
is a pair</p>
<ul>
<li>$F_\text{obj}$ sending each object of $\mathcal{C}$ to an object of $\mathcal{D}$</li>
<li>$F_\text{map}$ which is a map of simplicial sets from each
\(\text{Map}_\mathcal{C}(A,B) \to \text{Map}_\mathcal{D}(F_\text{obj}A, F_\text{obj}B)\)</li>
</ul>
</div>
<p>Importantly, we do have one notion of map between model categories,
namely the Quillen equivalences. We would like to know that these are
faithfully preserved when we pass to $\infty$-categories, and indeed they are.
One can show that every Quillen equivalence of model categories gives rise to
an equivalence (in the sense of $\infty$-categories) between the two
$\infty$-categories they present.</p>
<p>Next, not only do we <em>have</em> a notion of functor, but we have a notion of
functor which acts the way we expect it to. For instance between any
$\infty$-categories $\mathcal{I}$ and $\mathcal{C}$, the collection of
$\infty$-functors between them assembles into an $\infty$-category
$\text{Fun}(\mathcal{I}, \mathcal{C})$ with a simplicial analogue
of natural transformations as the arrows<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">6</a></sup>.</p>
<p>From here, developing (co)limits is easy!</p>
<div class="boxed">
<p>If $F : \mathcal{I} \to \mathcal{C}$ is a functor, then a
<span class="defn">Cone</span> over $F$ is an object $c \in \mathcal{C}$
plus a (simplicially enriched) natural transformation from the constant
$c$ functor $\text{const}_c$ to $F$.</p>
<p>We say that a cone on $c$ is a <span class="defn">Limit Cone</span> if for
every object $x \in \mathcal{C}$, the natural map</p>
\[\text{Map}_\mathcal{C}(x,c)
\to
\text{Map}_{\text{Fun}(\mathcal{I},\mathcal{C})}(\text{const}_x, F)\]
<p>is an equivalence of simplicial sets<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">7</a></sup>.</p>
</div>
<p>If this looks like an adjunction between a “limit” functor and
the “const” functor, then you’d be right! We can actually develop
adjunctions in this theory as well, and indeed basically any construction
or theorem you would want to use from classical $1$-category theory exists
in some form in $\infty$-category theory!</p>
<p>So, miraculously, in the passage from model categories to $\infty$-categories,
not only have we solved our quibbles about the formal properties of model
categories, but we’ve done so in the most powerful possible way!
We have access, in the $\infty$-category setting, to all the nice formal
properties that made classical $1$-category theory so effective.</p>
<p>What’s even more remarkable is that this notion of a (co)limit
computes homotopy (co)limits as a special case!</p>
<p>In the <a href="quasicategories">next post</a> about quasicategories, we introduce the
nerve construction, which lets us build an
$\infty$-category from a $1$-category. Well, if $\mathcal{C}$ is a model
category, then a homotopy (co)limit of some functor
$F : \mathcal{I} \to \mathcal{C}$ is the same thing as the
$\infty$-category theoretic colimit of the induced ($\infty$-)functor
$\tilde{F}$ from the nerve of $\mathcal{I}$ to $L^H(\mathcal{C}, \mathcal{W})$.</p>
<p>Lastly, let’s note that this is all good for something.
Functors on $\infty$-categories can be “truncated” in a way
that induces functors on the homotopy categories
(see <a href="https://kerodon.net/tag/005Z">here</a>, for instance).
This means that, at least in the abstract, we can prove theorems using
the language of $\infty$-categories, and at the end of the day we can
take homotopy categories. After doing this any categories or functors
we’ve built will descend nicely. In particular, since the homotopy
category of $L^H(\mathcal{C}, \mathcal{W})$ is $\mathcal{C}[\mathcal{W}^{-1}]$
this gives us a very powerful mode of argument for dealing with localizations!</p>
<hr />
<p>This still leaves open some questions, though. First, it seems like there are
two competing notions of $\infty$-category. We can use simplicial categories
or we can use quasicategories. Since we want to think about relative
categories and $\infty$-categories as both presenting homotopy theories,
we now have <em>three</em> different ways we could define “homotopy theory”!</p>
<p>Moreover, while the hammock localization is nice, it seems really annoying
to work with. Can we play the game one more time, and end up with a
conceptually clean way to see what the hammock localization is
“really doing”?</p>
<p>For answers to both of these questions, I’ll see you in the
<a href="homotopy-of-homotopies">last post</a> of this <del>trilogy</del>… tetralogy?</p>
<p>Maybe it’s best to call it a trilogy with a long quasi-categorical ~<a href="quasicategories">bonus post</a>~.</p>
<p>Take care, all ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Think about $\mathsf{Top}$ where we invert all the (weak) homotopy
equivalences, or chain complexes of modules where we invert all the
quasi-isomorphisms. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Notice this is exactly what we do to make (co)homological computations
with simplicial chains in topology. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>You can read more in the (surprisingly readable!) paper
<em>Calculating Simplicial Localizations</em> by Dwyer and Kan.</p>
<p>Another approach to simplicial localization is explained in another
readable paper by the same authors: <em>Simplicial Localizations of Categories</em>.
It’s a perfectly good definition, and works well in the abstract, but
the “hammock” definition is more amenable to direct computation, since it’s
slightly more explicit.</p>
<p>See <a href="https://ncatlab.org/nlab/show/simplicial+localization">here</a> for more details. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>It’s actually <em>slightly</em> more general. The first arrow is also allowed to
point left, as long as things still alternate. For instance, we allow
a zigzag of the form</p>
\[A \overset{\sim}{\leftarrow} C_0 \to B\]
<p>as well.</p>
<p>For the specifics of the construction, see
<em>Calculating Simplicial Localizations</em> by Dwyer and Kan. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>As before, we also allow the first map to face left instead of right,
as long as our maps strictly alternate. What does this mean for our hammocks?
The horizontal threads of our hammock must be oriented the same way!</p>
<p>For instance, this is a valid hammock:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/legal-hammock-1.png" width="30%" />
</p>
<p>This is a valid hammock:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/legal-hammock-2.png" width="30%" />
</p>
<p>But this is <em>not</em> a valid hammock:</p>
<p style="text-align:center;">
<img src="/assets/images/infinity-categories/illegal-hammock.png" width="30%" />
</p>
<p>Also, I can’t express enough how much I love this naming idea!
It’s quirky and apt in equal measure. 10/10. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>We can actually get by with slightly less. $\mathcal{I}$ is allowed to be
<em>any</em> simplicial set. To see why this is more general, I’ll again
point you to the quasicategory post <a href="quasicategories">here</a>. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>If you want to learn about actually computing with $\infty$-categories,
I can’t recommend the Homotopy Theory Münster videos
<a href="https://www.youtube.com/watch?v=3IjAy0gHRyY&list=PLsmqTkj4MGTDenpj574aSvIRBROwCugoB">here</a> highly enough!</p>
<p>The second video already shows lots of sample computations for limits. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Mon, 11 Jul 2022 02:00:00 +0000
https://grossack.site/2022/07/11/infinity-categories.html

What are Model Categories? (Homotopy Theories pt 1/4)

<p>I’m a TA at the <a href="https://uwo.ca/math/faculty/kapulkin/seminars/hottest_summer_school_2022.html">HoTTEST Summer 2022</a>, a summer school about
Homotopy Type Theory, and while I feel quite comfortable with the
basics of HoTT, there’s a <em>ton</em> of things that I should really know
better, so I’ve been doing a lot of reading to prepare. One thing
that I didn’t know anything about was <a href="https://ncatlab.org/nlab/show/infinity-category">$\infty$-categories</a>,
and the closely related <a href="https://en.wikipedia.org/wiki/Model_category">model categories</a>. I knew they had
something to do with homotopy theory, but I didn’t really know how.
Well, after lots of reading, I’ve finally figured it out, and I
would love to share with you ^_^.</p>
<p>This was originally going to be one post, but it ended up having
a lot of tangentially related stuff all mashed together, and it
felt very disorganized and unfocused<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. So I’ve decided to split it
into <del>three</del> four posts, with each post introducing a new, more abstract
option, and hopefully saying how it solves problems present in
the more concrete settings.</p>
<p>Let’s get to it!</p>
<hr />
<p>First of all, what even <em>is</em> a “homotopy theory”?
Let’s look at the primordial example:</p>
<div class="boxed">
<p>Whatever a “Homotopy Theory” is, it should encompass the category $\mathsf{Top}$
of topological spaces where we identify spaces up to <a href="https://ncatlab.org/nlab/show/weak+homotopy+equivalence">(weak) homotopy equivalence</a>.</p>
</div>
<p>But there’s another motivating example, which we also call “homotopy”:</p>
<div class="boxed">
<p>Whatever a “Homotopy Theory” is, it should encompass the chains of modules,
where we identify two chains up to <a href="https://en.wikipedia.org/wiki/Chain_complex#Chain_homotopy">quasi-isomorphism</a>.</p>
</div>
<p>Obviously these are related – after all, from a topological space we can
get an associated “singular cochain” of $R$-modules.
Then a homotopy of spaces induces a homotopy of cochains, and indeed the
<a href="https://en.wikipedia.org/wiki/Cohomology">cohomology</a> of the cochain complex agrees with the cohomology of the space
we started with.</p>
<p>More abstractly, what links these situations? Well, we have some objects that
we want to consider “the same up to homotopy”, and we capture this
(as the categorically inclined are liable to do) by picking out some special arrows.
These are the “homotopy equivalences” – and they’re maps that we want to
think of as isomorphisms… but which might not <em>actually</em> be.</p>
<p>So, in $\mathsf{Top}$, we have the class of weak homotopy equivalences<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>,
which we want to turn into isomorphisms. And in $\mathsf{Ch}(R\text{-mod})$,
the category of chains of $R$-modules, we want to turn the <a href="https://en.wikipedia.org/wiki/Quasi-isomorphism">quasi-isomorphisms</a>
into isomorphisms.</p>
<p>With these examples in mind, what should a <em>homotopy theory</em> be?</p>
<div class="boxed">
<p>A <span class="defn">Relative Category</span> is a small<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup> category $\mathcal{C}$
equipped with a set of arrows
$\mathcal{W}$ (called the <span class="defn">Weak Equivalences</span>)
that contains all the isomorphisms in $\mathcal{C}$<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>.</p>
</div>
<p>Following our examples, we want to think of the arrows in $\mathcal{W}$ as
being <em>morally</em> isomorphisms, even though they might not <em>actually</em> be
isomorphisms.</p>
<p>Now, a (small) category is an algebraic structure. It’s just a set
with some operations defined on it, and axioms those operations satisfy.
So there’s nothing stopping us from just… adding in new arrows,
plus relations saying that they’re inverses for
the arrows we wanted to be isomorphisms. By analogy with ring theory,
we call this new category the <span class="defn">Localization</span>
$\mathcal{C}[\mathcal{W}^{-1}]$. This is also called the
<span class="defn">Homotopy Category</span> of $(\mathcal{C}, \mathcal{W})$.</p>
<p>For example, if we localize $\mathsf{Top}$ at the weak homotopy equivalences,
we get the classical homotopy category $\mathsf{hTop}$. If we invert the
quasi-isomorphisms of chains of $R$ modules, we get the
derived category<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup> of $R$ modules $\mathcal{D}(R\text{-mod})$.</p>
<div class="boxed">
<p>Tentatively, then, we’ll say that a <em>homotopy theory</em> is a category of the
form $\mathcal{C}[\mathcal{W}^{-1}]$.</p>
<p>Notice that the choice of $(\mathcal{C}, \mathcal{W})$ is far from unique. It’s
entirely possible for two relative categories to have the same homotopy category<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup></p>
\[\mathcal{C_1}[\mathcal{W_1}^{-1}] \simeq \mathcal{C_2}[\mathcal{W_2}^{-1}]\]
<p>In this situation we say that \((\mathcal{C}_1, \mathcal{W}_1)\) and
\((\mathcal{C}_2, \mathcal{W}_2)\) <em>present the same homotopy theory</em>.</p>
</div>
<p>There are many important examples of two relative categories
presenting the same homotopy theory. To start, let’s consider the
category $s\mathsf{Set}$ of <a href="https://en.wikipedia.org/wiki/Simplicial_set">simplicial sets</a>, equipped with a notion
of weak equivalence. It turns out that this presents the same homotopy
theory as $\mathsf{Top}$ with weak homotopy equivalences!</p>
<p>This means that if we have a question about topological spaces up to homotopy,
we can study simplicial sets instead, with <em>no loss of information</em>! This is
fantastic, since simplicial sets are purely combinatorial objects, so
(in addition to other benefits)
it’s much easier to tell a computer how to work with them<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup>!</p>
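To drive that point home, here's a tiny python sketch (entirely my own encoding, and strictly speaking it builds a simplicial <em>complex</em> rather than a simplicial set, though ordering the vertices upgrades it to one). It treats the boundary of a simplex as finite combinatorial data and reads off the Euler characteristic:

```python
from itertools import combinations

def boundary_complex(n):
    """All proper nonempty faces of the n-simplex on vertices {0, ..., n}:
    a finite combinatorial model of the sphere S^{n-1}."""
    verts = range(n + 1)
    return [set(c) for k in range(1, n + 1)   # faces with k <= n vertices
                   for c in combinations(verts, k)]

def euler_characteristic(cx):
    # alternating sum over faces: a face with k+1 vertices has dimension k
    return sum((-1) ** (len(face) - 1) for face in cx)

# The boundary of the 2-simplex is a hollow triangle, i.e. a circle: chi = 0.
print(euler_characteristic(boundary_complex(2)))  # 0
# The boundary of the 3-simplex is a hollow tetrahedron, i.e. S^2: chi = 2.
print(euler_characteristic(boundary_complex(3)))  # 2
```

A homotopy invariant of a space, computed by counting finite sets. That kind of thing is exactly why combinatorial models are so convenient.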
<hr />
<p>This is great, but there’s one hitch…</p>
<p>The homotopy category $\mathcal{C}[\mathcal{W}^{-1}]$ might be
<em>terribly behaved</em>. For instance, even if $\mathcal{C}$ is (co)complete,
the homotopy category almost never is<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">8</a></sup>!
Even worse, it’s possible that we end up with a <em>proper class</em> of arrows between
two objects, even if $\mathcal{C}$ started out locally small. Lastly, it’s
difficult to tell when two relative categories present the same homotopy
theory.</p>
<p>This is all basically because arrows in $\mathcal{C}[\mathcal{W}^{-1}]$ are
<em>really hard</em> to understand! After all, a general arrow from $A$ to $B$ in
$\mathcal{C}[\mathcal{W}^{-1}]$ looks like this:</p>
\[A \overset{\sim}{\leftarrow}
C_0 \to
C_1 \overset{\sim}{\leftarrow}
\cdots
\overset{\sim}{\leftarrow} C_k \to B\]
<p>where all the $\overset{\sim}{\leftarrow}$s are in $\mathcal{W}$.
After we invert $\mathcal{W}$, these arrows all acquire right-facing
inverses, which we can compose in this chain to get an honest arrow
$A \to B$.</p>
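As a toy illustration (entirely my own, and much simpler than a genuine localization): if a backward arrow happens to be a bijection of finite sets, we can literally invert it and compose the zigzag down to a single function.

```python
# A toy zigzag  A <~ C0 -> B  where the maps are functions on finite sets and
# the backward map w happens to be a bijection, so it is genuinely invertible.
w = {0: "x", 1: "y"}                         # w : C0 -> A, the weak equivalence
f = {0: "p", 1: "q"}                         # f : C0 -> B, an ordinary arrow
w_inv = {a: c for c, a in w.items()}         # invert w
composite = {a: f[w_inv[a]] for a in w_inv}  # f . w^{-1} : A -> B
print(composite)  # {'x': 'p', 'y': 'q'}
```

In a real homotopy category the weak equivalences are <em>not</em> invertible before localizing, so no such collapse is available; that is exactly why these zigzags are so hard to tame.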
<p>There’s a <em>zoo</em> of techniques for working with homotopy categories, but
they all come down to trying to tame this unwieldy definition of arrow<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">9</a></sup>.
In this post, we’ll be focusing on a very flexible approach by endowing
$(\mathcal{C}, \mathcal{W})$ with a <span class="defn">Model Structure</span>.</p>
<hr />
<p>Roughly, to put a model structure on $(\mathcal{C}, \mathcal{W})$
(which we now assume to be complete and cocomplete) is to
choose two new subfamilies of arrows: the
<span class="defn">fibrations</span> and the <span class="defn">cofibrations</span>.</p>
<p>From these, we define some “nice” classes of objects:</p>
<ul>
<li>$X$ is called <span class="defn">fibrant</span> if the unique arrow to the
terminal object is a fibration</li>
<li>$X$ is called <span class="defn">cofibrant</span> if the unique arrow from
the initial object is a cofibration</li>
<li>$X$ is called <span class="defn">bifibrant</span> if it is both fibrant and cofibrant</li>
</ul>
<p>Let’s see some examples:</p>
<p>A <em>fibration</em> $f$ is an arrow that’s easy to “lift” into from cofibrant objects $A$:</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/fibration.png" width="30%" />
</p>
<p>For instance, covering spaces and bundles are examples of fibrations in
topology. If we restrict attention to the <a href="https://en.wikipedia.org/wiki/CW_complex">CW-complexes</a> then every
object is cofibrant, and this statement is basically the
<a href="https://ncatlab.org/nlab/show/homotopy+lifting+property">homotopy lifting property</a> for covering spaces.</p>
<p>Algebraically, there’s a model structure where a map of chains of
$R$-modules $f : A_\bullet \to B_\bullet$ is a fibration if each $f_n$ is a surjection.
The cofibrant objects in this model structure are exactly the levelwise
projective complexes, so this lifting property becomes the usual
lifting property for projective modules.</p>
<p>Dually, a <em>cofibration</em> $i$ is an arrow that’s easy to “extend” out of
provided our target $Y$ is fibrant:</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/cofibration.png" width="30%" />
</p>
<p>We should think of a cofibration as being a subspace inclusion
$A \hookrightarrow X$ where $A$ “sits nicely” inside of $X$.</p>
<p>For instance, the inclusion arrow of a “good pair” is a cofibration. Thus
inclusion maps of subcomplexes of a simplicial/CW/etc. complex are cofibrations.
More generally, any subspace inclusion $A \hookrightarrow X$ where $A$ is a
<a href="https://ncatlab.org/nlab/show/neighborhood+retract">neighborhood deformation retract</a> in $X$ will be a cofibration.</p>
<p>Algebraically, a map
$f : A_\bullet \to B_\bullet$ is a cofibration exactly when each $f_n$ is an
injection whose cokernel is projective<sup id="fnref:17" role="doc-noteref"><a href="#fn:17" class="footnote" rel="footnote">10</a></sup>. This is a somewhat subtle condition,
which basically says that each $B_n \cong A_n \oplus P_n$ where $P_n$ is
projective. It should be intuitive that given a map $A_n \to Y_n$, it’s
easy to extend this to a map $B_n \to Y_n$ under these hypotheses.</p>
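<p>To sketch why (under the splitting just described, and glossing over the compatibilities): given $h_n : A_n \to Y_n$, a candidate extension along $B_n \cong A_n \oplus P_n$ is</p>

\[\tilde{h}_n(a, p) = h_n(a) + k_n(p)\]

<p>for some choice of maps $k_n : P_n \to Y_n$. Any choice extends $h_n$ levelwise; projectivity of the $P_n$ is what lets us choose the $k_n$ compatibly with the differentials, so that $\tilde{h}$ is an honest chain map.</p>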
<p><br /></p>
<p>Precisely, these triangles are really special cases of squares. In the
fibration case, the left side of the square is the unique map from the
initial object to $A$. Dually, in the cofibration case the right
hand side of the square should really be the unique map from $Y$ to the
terminal object.</p>
<p>This lets us unify these diagrams into a single axiom, which says that
a square of the form</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/full-square.png" width="30%" />
</p>
<p>has a lift (the dotted map $X \to E$)
whenever $i$ is a cofibration, $f$ is a fibration, and one of $i$
or $f$ is a weak equivalence.</p>
<hr />
<p>The model category axioms<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">11</a></sup> imply that every object is weakly equivalent to a
bifibrant object. Since, after localizing, our
weak equivalences become isomorphisms, this means we can restrict attention to
the bifibrant objects… But why bother?</p>
<p>Well in any model category we have a notion of “homotopy” between maps
$f,g : A \to B$ which is entirely analogous to the topological notion.
Then by studying homotopy-classes of maps, we’ll be able to get a great
handle on the arrows in $\mathcal{C}[\mathcal{W}^{-1}]$!</p>
<p>Precisely, each object $A$ is weakly equivalent to a <a href="https://ncatlab.org/nlab/show/cylinder+object">cylinder object</a>
$A \times I$ which acts like $A \times [0,1]$ in topology<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">12</a></sup>. In particular,
it has two inclusions
$\iota_0 : A \to A \times I$ and $\iota_1 : A \to A \times I$.</p>
<div class="boxed">
<p>⚠ This is <em>not</em> in general an actual product with some element $I$.
It’s purely notational. Some authors use $A \wedge I$ instead, but
I don’t really like that either.</p>
<p>The best notation is probably $\text{Cyl}(A)$ or something similar, but
in this post I want to emphasize the relationship with the classical
topological case, so I’ll stick with $A \times I$.</p>
</div>
<p>Now if $f, g : A \to B$, then a homotopy between $f$ and $g$ is
a map $H : A \times I \to B$ so that the following triangle commutes:</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/homotopy-triangle.png" width="33%" />
</p>
<p>If there is a homotopy between $f$ and $g$, we say that $f$ and $g$ are
<em>homotopic</em> or <em>homotopy equivalent</em>, often written $f \sim g$.</p>
<p>This brings us to the big punchline:</p>
<div class="boxed">
<p>Let $(\mathcal{C}, \mathcal{W})$ be a model category.</p>
<ol>
<li>
<p>If $A$ and $B$ are bifibrant then homotopy equivalence really is
an equivalence relation on $\text{Hom}_\mathcal{C}(A,B)$. Moreover,
composition is well defined on the equivalence classes.</p>
</li>
<li>
<p>$\mathcal{C}[\mathcal{W}^{-1}]$ is equivalent to the category whose
objects are bifibrant objects of $\mathcal{C}$, and whose arrows are
homotopy equivalence classes of arrows in $\mathcal{C}$.</p>
</li>
</ol>
</div>
<p>Thus, a very common way we use model structures to perform computations is by
first replacing the objects we want to compute with by weakly equivalent
bifibrant ones. For instance, we might replace a module by a projective
resolution<sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">13</a></sup>. Then maps in the homotopy category $\mathcal{C}[\mathcal{W}^{-1}]$
are just maps in $\mathcal{C}$ up to homotopy<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">14</a></sup>!</p>
<p>Notice this, off the bat, solves one of the problems with homotopy categories.
Maybe $\mathcal{C}[\mathcal{W}^{-1}]$ isn’t locally small, but it’s
<em>equivalent</em> to something locally small.</p>
<p>Moreover, model structures give us a very flexible way to tell when two
relative categories have the same homotopy theory. Indeed, say we have
a pair of adjoint functors $L \dashv R$ between $\mathcal{C}_1$ and
$\mathcal{C}_2$ that respect the weak equivalences in the sense that
$f : c_1 \to R(c_2)$ is in $\mathcal{W}_1$ if and only if its adjoint
$\tilde{f} : L(c_1) \to c_2$ is in $\mathcal{W}_2$.
Then $\mathcal{C}_1[\mathcal{W}_1^{-1}] \simeq \mathcal{C}_2[\mathcal{W}_2^{-1}]$,
and moreover, this equivalence can be computed from the adjunction $L \dashv R$.</p>
<p>This is called a <a href="https://ncatlab.org/nlab/show/Quillen+equivalence">Quillen Equivalence</a> between \((\mathcal{C}_1, \mathcal{W}_1)\)
and \((\mathcal{C}_2, \mathcal{W}_2)\)<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">15</a></sup>.</p>
<p>Moreover again, even if $\mathcal{C}[\mathcal{W}^{-1}]$ doesn’t have (co)limits,
we <em>can</em> always construct <a href="https://en.wikipedia.org/wiki/Homotopy_colimit">homotopy (co)limits</a>, which we can compute
using the same cylinder objects from before. For instance, the homotopy pushout
of</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/homotopy-pushout.png" width="30%" />
</p>
<p>will be the colimit of the related diagram:</p>
<p style="text-align:center;">
<img src="/assets/images/model-categories/homotopy-pushout-full.png" width="50%" />
</p>
<p>Geometrically, rather than gluing $X$ to $Y$ along $A$ directly,
we’re adding a <em>path</em> from $f(a)$ to $g(a)$. This is the same up to
homotopy, but is better behaved.
We do this by taking disjoint copies of $X$ and $Y$, and then gluing
one side of the cylinder $A \times I$ to $X$ along $f$, and gluing the
other side of $A \times I$ to $Y$ along $g$.</p>
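<p>A standard sanity check (my example, not one from the original post) shows why this matters. Take the span $\ast \leftarrow S^0 \to \ast$. Its ordinary pushout is a single point, but the homotopy pushout glues the two ends of the cylinder $S^0 \times I$ (two disjoint intervals) to the two points:</p>

\[\ast \sqcup_{S^0} (S^0 \times I) \sqcup_{S^0} \ast \cong S^1\]

<p>So the homotopy pushout sees the suspension $S^1$, which the strict pushout collapses away, and homology tells the two constructions apart.</p>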
<p>Something like this will always work, but knowing how to modify our diagram
(and why) can be quite involved<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">16</a></sup>. Thus we’ve succeeded in
computationally solving the (co)limit issue, but it would be nice to have a
more conceptual framework, in which it’s obvious why this is the
“right thing to do”…</p>
<hr />
<p>We’ve seen that a model structure on a relative category helps to make
computations in the localized category $\mathcal{C}[\mathcal{W}^{-1}]$
effective (or even possible). But model categories have a fair number
of problems themselves.</p>
<p>For one, there’s no <em>great</em> notion of a
functor between model categories. In the case that a functor
$F : (\mathcal{C}_1, \mathcal{W}_1) \to (\mathcal{C}_2, \mathcal{W}_2)$
comes from an adjoint pair, then we can <a href="https://ncatlab.org/nlab/show/derived+functor">derive</a> it to get a
functor on the homotopy categories, but this is too restrictive to
be the general notion of functor between model categories.</p>
<p>Related to functors, if $\mathcal{C}$ has a model structure and $\mathcal{I}$
is an indexing category, then $\mathcal{C}^\mathcal{I}$, the category of
$\mathcal{I}$-diagrams in $\mathcal{C}$ (with the pointwise weak equivalences)
may not have a model structure. See <a href="https://ncatlab.org/nlab/show/model+structure+on+functors">here</a>, for instance<sup id="fnref:18" role="doc-noteref"><a href="#fn:18" class="footnote" rel="footnote">17</a></sup>.</p>
<p>With this in mind, we would like to have a version of the theory of model
categories which has better “formal properties”. While working with a
<em>single</em> model category is often quite easy to do, as soon as one looks
for relationships <em>between</em> model categories, we’re frequently out of luck.
In fact, since (co)limits come from functors, as soon as we’re interested in
(co)limits we already run into this problem! This is one philosophical reason
for the complexity of homotopy (co)limits.</p>
<p>The solution lies upwards, in the land of $\infty$-categories. These are categories
with homotopy theoretic structure (in the classical sense of topological spaces)
built in from the start. Miraculously, these will solve all the above problems
and more – while the category of model categories is a terrible place to live
(if we can define it at all…) the category of $\infty$-categories<sup id="fnref:16" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">18</a></sup>
is extremely similar to the category of categories. Then, since every
model category presents an $\infty$-category, we’ll be able to use this
machinery to solve our formal problems with model categories!</p>
<p>How exactly does this work? You’ll have to read more in <a href="infinity-categories">part 2</a>!</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>This is one of the biggest problems with trying to explain things you know.
Especially while you’re still trying to sort them out yourself!</p>
<p>All of this material was scattered in my head with messy interconnections,
but of course words have to be linearly ordered on a page, and it took
a lot of work to figure out how to put these ideas into a fixed order.
Especially one which has a nice narrative.</p>
<p>It’s currently my fifth time restarting this post (now <del>trilogy</del> tetralogy),
but I think I’m finally happy with my outline.
I don’t know why I’m writing this footnote, to be honest. But it felt
like something I wanted to say, so here we are.</p>
<p>On with the show! <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>It turns out there’s <em>also</em> a model category structure whose homotopy
category gives homotopy equivalence, rather than weak homotopy equivalence.</p>
<p>But the model structure on $\mathsf{Top}$ which gives weak homotopy
equivalence is the “standard” one, so that’s what I’m listing as the
motivating example. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>A better word would probably be “strict”. Since we’re going to be
treating $\mathcal{C}$ like an algebraic structure pretty soon, it
should have fixed <em>sets</em> of objects and arrows, which we will manipulate
like we might manipulate the underlying set of a group, ring, etc. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>Be careful, though, I’ve seen a handful of other definitions too!</p>
<p>I like this one because it means the obvious way to turn a
category into a relative category is to take just the isomorphisms.
Then localization from $\mathsf{RelCat} \to \mathsf{Cat}$ is
left adjoint to the functor sending $\mathcal{C}$ to
$(\mathcal{C}, {\text{isos}})$.</p>
<p>I’m not sure if I should require that $\mathcal{W}$ be closed under
composition… This doesn’t <em>feel</em> like it should break anything,
but I’m not 100% sure. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>This was a pleasant surprise for me. I’ve heard a lot of talk about
derived categories, and they always seemed quite scary. It’s been very
exciting to feel like I’m getting a two-for-one deal every time I notice
another concept in this subject start to make sense – after all, it
means I’m learning about both homotopy theories <em>and</em> derived categories!
^_^ <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>As an aside, as a topos theorist, this all feels a bit familiar.</p>
<p>Just like a model structure (which we’ll define later) is some
structure that presents a homotopy
theory in a way that lets us do concrete computation, a <a href="https://mathoverflow.net/questions/10364/categorical-homotopy-colimits">site</a> is
a structure that presents a (grothendieck) topos and lets us do
concrete computations.</p>
<p>Now, in the topos theory world,
Olivia Caramello’s bridges program is based on the idea that we can
find nontrivial relationships between two sites presenting the same
topos… I wonder if there are any theorems that let us relate two
model categories presenting the same homotopy theory. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>See, for instance, the sage documentation <a href="https://doc.sagemath.org/html/en/reference/categories/sage/categories/simplicial_sets.html">here</a> and <a href="https://doc.sagemath.org/html/en/reference/topology/sage/topology/simplicial_complex.html">here</a>. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>Even in the case of $\mathsf{hTop}$, we don’t have all (co)limits!
See <a href="https://mathoverflow.net/questions/10364/categorical-homotopy-colimits">here</a>, for instance! <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>For instance you can only look at families $\mathcal{W}$ satisfying
the <a href="https://ncatlab.org/nlab/show/Ore+condition">Ore Conditions</a>. These say exactly that we can “commute”
weak equivalences past other arrows. Then, up to homotopy, every
arrow in the localization is of the form</p>
\[A \overset{\sim}{\leftarrow} C \to B\]
<p>and these are quite easy to manipulate.</p>
<p>See Sasha Polishchuk’s lectures on Derived Categories
<a href="https://www.youtube.com/watch?v=qFlt1XBNf4k&list=PLCe-H2N8-ny5nIYCQevWaJsO3PP44uz03&index=29">here</a>, for a really nice treatment using this language. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:17" role="doc-endnote">
<p>Really we’re describing the <em>projective</em> model structure here.
There’s a dual model structure with the same weak equivalences
where we work with injectives instead. <a href="#fnref:17" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>Which I still haven’t <em>really</em> told you, haha.</p>
<p>I don’t want to get into the precise details of a model structure here,
but you can (and should!) read more in Dwyer and Spalinski’s excellent
introduction <em>Homotopy Theory and Model Categories</em>, available
<a href="https://math.jhu.edu/~eriehl/616-s16/DwyerSpalinski.pdf">here</a>, for instance.</p>
<p>There’s a lot of good places to get intuition for model structures as well.</p>
<p>For instance, Mazel-Gee’s <em>The Zen of $\infty$-Categories</em>, available
<a href="https://etale.site/writing/zen-of-infty-cats.pdf">here</a>, Kantor’s survey <em>Model Categories: Theory and Applications</em>,
available <a href="http://math.uchicago.edu/~may/REU2016/REUPapers/Kantor.pdf">here</a>, and of course, the <a href="https://en.wikipedia.org/wiki/Model_category">nlab</a>.</p>
<p>While we’re at it, there’s also Goerss and Schemmerhorn’s
<em>Model Categories and Simplicial Methods</em> (<a href="https://sites.math.northwestern.edu/~pgoerss/papers/ucnotes.pdf">here</a>), Hovey’s book
(<a href="https://people.math.rochester.edu/faculty/doug/otherpapers/hovey-model-cats.pdf">here</a>), and you can find a lot of good intuition in the
MO questions <a href="https://mathoverflow.net/questions/361191/applications-of-model-categories">here</a>, <a href="https://mathoverflow.net/questions/84381/computations-in-infty-categories">here</a>, <a href="https://mathoverflow.net/questions/169187/what-non-categorical-applications-are-there-of-homotopical-algebra?noredirect=1&lq=1">here</a>, and
<a href="https://mathoverflow.net/questions/78400/do-we-still-need-model-categories?noredirect=1&lq=1">here</a>. There’s also Ponto and May’s <em>More Concise Algebraic Topology</em>
(<a href="http://www.math.uchicago.edu/~may/TEAK/KateBookFinal.pdf">here</a>)… I could keep going, but I should probably get back to
writing the main body of the post. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:13" role="doc-endnote">
<p>We build $A \times I$ by factoring the codiagonal
$A \coprod A \to A$ as</p>
\[A \coprod A \to A \times I \overset{\sim}{\to} A\]
<p>where $A \coprod A \to A \times I$ is a cofibration and
where $A \times I \overset{\sim}{\to} A$ is both a fibration and a
weak equivalence. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:15" role="doc-endnote">
<p>This is made even more useful by the existence of <em>multiple</em> model structures
on $(\mathcal{C}, \mathcal{W})$. Depending on the computation, we might choose
one model structure over another in order to make our lives as simple as possible.
For instance, we have two model structures on chain complexes, one based on
<a href="https://en.wikipedia.org/wiki/Projective_module">projectives</a> and one based on <a href="https://en.wikipedia.org/wiki/Injective_module">injectives</a>. Then computations
involving these model structures reduce to the classical projective or
injective resolutions which you may recognize from homological algebra! <a href="#fnref:15" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:11" role="doc-endnote">
<p>For an example of this idea in action, see <a href="https://math.stackexchange.com/questions/4461610/maps-in-the-homotopy-category-and-derived-category-to-and-from-concentrated-in/4461656#4461656">this</a> answer of mine. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:12" role="doc-endnote">
<p>For example, the earlier example of $\mathsf{Top}$ and $s\mathsf{Set}$,
which have the same homotopy theory, comes from a quillen equivalence.
See <a href="https://ncatlab.org/nlab/show/classical+model+structure+on+simplicial+sets#quillen_equivalence_with_">here</a>, for instance.</p>
<p>In fact, quillen equivalence is stronger than this, in a way we
don’t currently have the language to describe.
Not only are the localizations (read: homotopy categories) equivalent,
but actually the presented $\infty$-categories are equivalent too! <a href="#fnref:12" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:14" role="doc-endnote">
<p>Even though it’s complicated, this <em>is</em> a solved problem. We understand
how to take a diagram and massage it into a new, “homotopy coherent”
diagram.
The idea is again one of bifibrant replacement!</p>
<p>In many cases we can put a model structure on the category of
functors into a model category $\mathcal{C}$. Then instead of taking the
(co)limit of a diagram $F$, we take the (co)limit of its bifibrant replacement.</p>
<p>See either the notes by Dugger <a href="https://pages.uoregon.edu/ddugger/hocolim.pdf">here</a> or by Hirschhorn <a href="https://math.mit.edu/~psh/notes/hocolim.pdf">here</a>
for specifics.</p>
<p>Also, after reading those, notice that already the best way to organize
this data is with some kind of simplicial object… and keep a pin in that. <a href="#fnref:14" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:18" role="doc-endnote">
<p>Though, thankfully, most model categories that arise in practice are
(quillen equivalent to) a (simplicial) combinatorial model category.</p>
<p>In particular, this means that we can usually put a model structure on
the category of diagrams in $\mathcal{C}$. In fact, this is one way to
effectively compute homotopy (co)limits in practice. We replace our
functor $F : I \to \mathcal{C}$ by a weakly equivalent bifibrant
functor $\tilde{F} : I \to \mathcal{C}$ and then output the
(weak equivalence class of) the (co)limit of $\tilde{F}$.</p>
<p>This is basically the derived functor approach to homotopy (co)limits,
and while it’s effective, it requires us to <em>choose</em> a bifibrant
replacement. Much like choosing coordinates or a basis makes some proofs
more annoying in the setting of differential geometry or linear algebra
(since we then have to prove our results are independent of this choice),
we would like to have a choice-free way of defining homotopy (co)limits. <a href="#fnref:18" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:16" role="doc-endnote">
<p>In fact, it eats its own tail, and we have an $\infty$-category of
$\infty$-categories. But more on that later. <a href="#fnref:16" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Mon, 11 Jul 2022 01:00:00 +0000
https://grossack.site/2022/07/11/model-categories.html
https://grossack.site/2022/07/11/model-categories.html
Chain Homotopies Geometrically<p>The definition of a <a href="https://en.wikipedia.org/wiki/Homotopy_category_of_chain_complexes">Chain Homotopy</a> has always felt a bit weird to me.
Like I know <em>that</em> it works, but nobody made it clear to me <em>why</em> it worked.
Well, the other night I was rereading part of Aluffi’s excellent
<em>Algebra: Chapter 0</em>, and I found a picture that totally changed my life!
In this post, we’ll talk about two ways of looking at chain homotopies that
make them <em>feel</em> more like their topological namesake.</p>
<p>(As an aside, I was reading through Aluffi because I’ve been thinking a
lot about derived categories lately, and I wanted to see how he motivated
them in an introductory algebra book. I’ll be coming out with a blog post<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">1</a></sup>,
hopefully quite soon, where I talk about model categories and their close
friends $\infty$-categories. It’s been easily my hardest ever post to write,
but I think it’s <em>almost</em> ready! So get excited ^_^)</p>
<hr />
<p>I want this to be a genuinely quick post, so I’ll be assuming some
familiarity with algebraic topology, mainly homology and chain complexes.
As a quick refresher, though:</p>
<div class="boxed">
<p>A <span class="defn">Chain Complex</span> of abelian groups is a sequence</p>
\[\cdots \overset{\partial_{n+1}}{\to} A_n \overset{\partial_n}{\to} A_{n-1} \overset{\partial_{n-1}}{\to} A_{n-2} \overset{\partial_{n-2}}{\to} \cdots\]
<p>so that $\partial_{n-1} \circ \partial_n = 0$ for all $n$.</p>
<p>We’ll refer to the entire complex by $A_\bullet$, and we’ll
(abusively) suppress the indices of the “boundary maps” $\partial_n$,
calling them all $\partial$<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>.</p>
<p>In this notation, the chain condition is that $\partial^2 = 0$.</p>
</div>
<p>These first arose in topology. For instance, in order to study the (singular)
<a href="https://en.wikipedia.org/wiki/Homology_(mathematics)">homology</a> of a topological space $X$, we build a chain of abelian groups</p>
\[\cdots \to C_{n+1}(X) \to C_n(X) \to C_{n-1}(X) \to \cdots\]
<p>where $C_n(X)$ is the free abelian group on the set of maps from the
<a href="https://en.wikipedia.org/wiki/Simplex">$n$-simplex</a> into $X$ (and is $0$ for negative $n$), and if
$\sigma : \Delta^{n+1} \to X$, then $\partial \sigma$
is the (alternating) sum of the
$\sigma \restriction \Delta^n_i$, where $\Delta^n_i$ is the $i$th face
of the simplex.</p>
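<p>The identity $\partial^2 = 0$ is easy to check by hand for this alternating-sum boundary, and just as easy to check mechanically. Here's a small sketch, with my own encoding of simplices as tuples of vertices:</p>

```python
def boundary(chain):
    """Boundary of a formal Z-linear combination of simplices.

    A chain is a dict {simplex: coefficient}, each simplex a tuple of
    vertices.  del [v_0,...,v_n] = sum_i (-1)^i [..., omit v_i, ...].
    """
    out = {}
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            out[face] = out.get(face, 0) + (-1) ** i * coeff
    return {s: c for s, c in out.items() if c != 0}

# del of the 2-simplex [0,1,2] is the oriented sum of its edges...
assert boundary({(0, 1, 2): 1}) == {(1, 2): 1, (0, 2): -1, (0, 1): 1}
# ...and del^2 = 0: each codimension-2 face shows up twice,
# with opposite signs.
assert boundary(boundary({(0, 1, 2): 1})) == {}
```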
<p>Then the <span class="defn">Homology Groups</span> of $X$ are defined to be</p>
\[H_n(X) \triangleq \text{ker}(\partial_n) \big / \text{im}(\partial_{n+1})\]
<p>I won’t say any more about this here, but if you’re interested you should
read <a href="/2021/03/01/cohomology-intuitively.html">my old post</a> on cohomology<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>.
What matters is that this “boundary operator” literally comes from the
boundary of a geometric object!</p>
<p>I also won’t try to motivate chain complexes in this post<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup>. It turns out they’re
extremely useful algebraic gadgets, and show up naturally in every branch of
modern geometry, as well as in “pure” algebra such as group theory,
module theory, representation theory, etc. Chain complexes are ubiquitous
in math nowadays, and if you haven’t met them yet, you’ll surely meet them soon!</p>
<p>Motivation aside, what matters is that this construction is
<em>functorial</em>, and so a map $f : X \to Y$ induces a “chain map”
$C_\bullet f : C_\bullet(X) \to C_\bullet(Y)$,
where $C_n f : C_n(X) \to C_n(Y)$ is the map that sends</p>
\[(\sigma : \Delta^n \to X) \mapsto (f \circ \sigma : \Delta^n \to Y)\]
<p>Moreover, it’s not hard to check that this $C_n f$ is well defined on
homology classes, so descends to a map $H_n f : H_n(X) \to H_n(Y)$.</p>
<p>Next, topologically we have a notion of “homotopy”:</p>
<div class="boxed">
<p>We say that two maps
$f, g : X \to Y$ are <span class="defn">Homotopic</span> if there is a map
$H : X \times [0,1] \to Y$ (the <em>homotopy</em> from $f$ to $g$) so that</p>
<ol>
<li>$H(x,0) = f(x)$</li>
<li>$H(x,1) = g(x)$</li>
</ol>
<p>We think of $H(x_0,t)$ as giving a path in $Y$ from $f(x_0)$ to $g(x_0)$,
so that we can “smoothly deform” $f$ into $g$.</p>
</div>
<p>For geometric reasons, we expect that two homotopic maps $f$ and $g$ should
induce the same map on homology<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">5</a></sup>, so we would like a way of translating
the homotopy $H$ into an algebraic object.</p>
<p>Enter the chain homotopy!</p>
<div class="boxed">
<p>Say $f$ and $g$ are two chain maps $A_\bullet \to B_\bullet$.</p>
<p>A <span class="defn">Chain Homotopy</span> $H$ from
$f$ to $g$ is a series of maps $H_n : A_n \to B_{n+1}$
so that<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">6</a></sup></p>
<p>\(f - g = \partial H + H \partial\)</p>
</div>
<p>At this point I’m legally required to show the following diagram:</p>
<p style="text-align:center;">
<img src="/assets/images/chain-homotopies/chain-homotopy.png" width="75%" />
</p>
<p>Now it’s a theorem that two chain homotopic maps induce the same maps
on homology. It’s then pretty easy to show that if $f, g : X \to Y$
are homotopic (witnessed by a homotopy $H$), then the map $C_\bullet f$
and $C_\bullet g$ are chain homotopic (witnessed by a chain homotopy
that we can build from $H$).</p>
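<p>To make the identity concrete, here's a tiny worked instance with numbers of my own choosing: on the complex $0 \to \mathbb{Z} \overset{2}{\to} \mathbb{Z} \to 0$, multiplication by $3$ and multiplication by $1$ are chain homotopic (and, reassuringly, they agree on $H_0 = \mathbb{Z}/2$).</p>

```python
import numpy as np

# The complex 0 -> Z --(2)--> Z -> 0, concentrated in degrees 1 and 0.
d1 = np.array([[2]])                       # the only nonzero boundary map
f1, f0 = np.array([[3]]), np.array([[3]])  # f = multiplication by 3
g1, g0 = np.array([[1]]), np.array([[1]])  # g = multiplication by 1
H0 = np.array([[1]])                       # H_0 : A_0 -> B_1  (H_1 = 0)

# f and g are chain maps: they commute with the boundary.
assert (f0 @ d1 == d1 @ f1).all() and (g0 @ d1 == d1 @ g1).all()

# The chain homotopy identity f - g = dH + Hd, degree by degree:
assert (f1 - g1 == H0 @ d1).all()  # degree 1 (the d_2 H_1 term vanishes)
assert (f0 - g0 == d1 @ H0).all()  # degree 0 (the H_{-1} d_0 term vanishes)
```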
<p>So this all works… but this condition has always seemed a bit odd to me.
Where is this definition of chain homotopy coming from? It must be
some translation from the topological notion of homotopy into the algebraic
world… But how?</p>
<p>Let’s find out!</p>
<hr />
<p>The first approach is the one that inspired me to write this post. You
can read the original on page $611$ of Aluffi’s <em>Algebra: Chapter 0</em>.
In case a new edition comes out, this is Section IX.4.3.</p>
<p>As an homage, let’s steal Aluffi’s diagram:</p>
<p style="text-align:center;">
<img src="/assets/images/chain-homotopies/homotopy.png" width="50%" />
</p>
<p>Let $a$ be an $n$-chain in $X$. Since high dimensions are hard to draw,
we picture it as a $1$-chain. That is, as a path in $X$.</p>
<p>Then $f(a)$ (formally $f \circ a$) is the $1$-chain in $Y$ shown on the
left of the diagram. Similarly $g(a)$ is the $1$-chain on the right of the
diagram. Now $H : X \times [0,1] \to Y$, so $H(a,t)$ is a map from
$\Delta^1 \times [0,1] \to Y$, that is, from the square into $Y$.
Aluffi is calling this $h(a)$, and it’s the “filled in” $2D$ square
that we see<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">7</a></sup>.</p>
<p>Now for the magic! Let’s look at the boundary of $h(a)$. This is
given by the (oriented) sum of the four sides of our square. So if we
read counterclockwise starting at the bottom right, we see</p>
\[\partial h(a) = g(a) - \delta_+ - f(a) + \delta_-\]
<p>But what is $\delta_+ - \delta_-$? Well, a moment’s thought shows that it’s
$h(\partial a)$! After all, since $a$ is a path in $X$, $\partial a$ is
just the endpoints of the path. So $h(\partial a)$ is $h$ applied to the
endpoints of $a$. That is, it’s the paths in $Y$ connecting $f(\partial a)$
to $g(\partial a)$, and these are exactly $\delta_+$ and $\delta_-$.</p>
<p>So then</p>
\[\partial h(a) = g(a) - f(a) - h(\partial a)\]
<p>or</p>
\[g - f = \partial h + h \partial\]
<p>which, up to replacing $h$ with $-h$ (a harmless change of sign convention), is exactly the definition of a chain homotopy!</p>
<hr />
<p>There’s another, more conceptual, way to relate the notion of a chain
homotopy to the classical topological notion of a homotopy. This time,
we’ll go through the machinery of a <a href="https://en.wikipedia.org/wiki/Model_category">model category</a>.</p>
<p>Now, if I didn’t want to motivate <em>homology</em> in this post, I certainly
don’t want to motivate model categories! Or even define them for that matter.
I fully expect this material to be less approachable than what came before,
but that’s not so big a deal<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">8</a></sup>. To quote Eisenbud:</p>
<blockquote>
<p>You should think of [this] as something to return to when more of the
pieces in the vast puzzle of mathematics have fallen into place for you.</p>
</blockquote>
<p>Now, remember what a homotopy was in topology. If $f,g : X \to Y$, then
a homotopy is a map $H : X \times [0,1] \to Y$ so that $H(x,0) = f(x)$
and $H(x,1) = g(x)$.</p>
<p>Is there a way to <em>directly</em> imitate this definition in the category
of chain complexes? Surprisingly, the answer is <em>yes</em>!</p>
<p>What about the interval $[0,1]$ is really useful for this definition?
Well, obviously it has two special points $0$ and $1$. But moreover, it’s
contractible, so it’s homotopy equivalent to a point.
This is best made
precise through the language of model categories (or better, $\infty$-categories)
but we can find an object in the category of chains which plays the
same role!</p>
<p>How do we do it? Well, let’s build the (simplicial) chain complex of the interval!</p>
<p>We have two $0$-cells, so $C_0 = \mathbb{Z}^2$. Then we have one $1$-cell
connecting them, so $C_1 = \mathbb{Z}$. Then a moment’s thought about the
orientation shows that the boundary map should be</p>
\[\begin{align}
\mathbb{Z} &\overset{\partial}{\to} \mathbb{Z}^2 \\
1 &\mapsto (1,-1)
\end{align}\]
<p>So we can think of this chain complex
(with $0$s going off to the left and right) as a kind of
“<a href="https://ncatlab.org/nlab/show/interval+object+in+chain+complexes">interval object</a>” inside the category of chains<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">9</a></sup>. With that in mind,
let’s call it $I_\bullet$.</p>
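<p>As a quick sanity check (my own computation, using the complex just described): $I_\bullet$ has the homology of a point, matching the contractibility of $[0,1]$.</p>

```python
# I_bullet :  0 -> Z --del--> Z^2 -> 0,  with del(n) = (n, -n).
def d(n):
    return (n, -n)

# del is injective, so H_1 = ker(del) = 0:
assert all(d(n) != (0, 0) for n in range(1, 50))

# The sum map (a, b) |-> a + b is a surjection Z^2 ->> Z whose kernel
# {(a, -a)} is exactly im(del), so H_0 = Z^2 / im(del), which is Z:
for a in range(-10, 11):
    for b in range(-10, 11):
        assert (a + b == 0) == ((a, b) == d(a))
```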
<p>Now, if we have two maps $f,g : A_\bullet \to B_\bullet$, what’s the
most natural way to build a homotopy $H$ from $f$ to $g$? Well,
imitating topology, we should look at<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">10</a></sup></p>
\[H : A_\bullet \otimes I_\bullet \to B_\bullet\]
<p>Now here’s a fun (only slightly tricky) exercise<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">11</a></sup>:</p>
<div class="boxed">
<p>$H : A_\bullet \to B_{\bullet + 1}$ is a chain homotopy from $f$ to $g$ if and only if</p>
\[(H,f,g) : A_\bullet \otimes I_\bullet \to B_\bullet\]
<p>makes the following diagram commute</p>
<p style="text-align:center;">
<img src="/assets/images/chain-homotopies/tensor.png" width="50%" />
</p>
<p>Honestly, it’s a good enough exercise to just construct $\iota_0$
and $\iota_1$, plus check that $(H,f,g)$ really does define a map
from $A_\bullet \otimes I_\bullet \to B_\bullet$.</p>
<p>If you get stuck, you can find a proof in proposition 3.2 <a href="https://ncatlab.org/nlab/show/chain+homotopy#RelationToLeft">here</a></p>
</div>
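<p>As a hint for the exercise (using one common sign convention, which may differ from the nlab's): unwinding the tensor product of complexes gives</p>

\[(A_\bullet \otimes I_\bullet)_n \cong A_n \oplus A_n \oplus A_{n-1}\]

<p>with boundary</p>

\[\partial(a, b, c) = \left( \partial a + (-1)^{n-1} c, \ \partial b - (-1)^{n-1} c, \ \partial c \right)\]

<p>So a degreewise map out of $A_\bullet \otimes I_\bullet$ is a triple of maps out of two copies of $A_\bullet$ and one shifted copy, and asking that triple to be a chain map recovers the chain homotopy identity, up to sign.</p>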
<hr />
<p>I’ve been working with chain homotopies for years now, and out of familiarity
I’d stopped wondering what the definition really meant. Pragmatically this
was probably good for me, but I remember in my first algebraic topology class
being horribly confused by the origins of this definition, and I’m glad that
I finally figured out how this definition relates to the underlying geometry!
Hopefully you found this helpful too if you’re still early in your
journey with chain homotopies. To me,
this makes the definition feel much more natural.</p>
<p>Now, though, it’s time for bed. Take care, all, and I’ll see you soon ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:11" role="doc-endnote">
<p>Really a trilogy of blog posts… There’s a reason it’s taking so long. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:1" role="doc-endnote">
<p>Believe me, I sympathize if you’re new to the subject. It gets worse:
we often use $\partial$ for the boundary maps (also called <em>differentials</em>)
of several different complexes at once! Indeed, this is exactly the convention I’ll
take later in this post.</p>
<p>Thankfully, with experience you get used to this, and in the short term
there’s always a unique way to assign each $\partial$ you happen to see
a “type” so that the entire expression <a href="https://en.wikipedia.org/wiki/Type_system#Type_checking">typechecks</a>. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Wow… I sure did say I would write a sequel to this post… Over a year
ago…</p>
<p>I’ll get to it one day, haha. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>After working on the model category and $\infty$-category posts for
<em>so long</em>, I really want to write a quick post, haha. My goal is to
get this whole thing written in less than an hour. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>If $X$ is a manifold, then we think of elements of $H_n(X)$ as
$n$-dimensional submanifolds of $X$ (though this is <a href="https://mathoverflow.net/questions/21171/when-is-a-homology-class-represented-by-a-submanifold">not quite true</a>)
where we identify two submanifolds if we can deform one into the other
inside of $X$.</p>
<p>For this reason, if $f$ and $g$ are the same up to deformation, then
they should do the same thing to submanifolds up to deformation! <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>Again, apologies to those new to the subject. More precisely, for each $n$
we have</p>
\[f_n - g_n = \partial_{n+1}^B \circ H_n + H_{n-1} \circ \partial_n^A\]
<p>Notice that, as promised in an earlier footnote, we’re using $\partial$ for
the boundary maps on $A_\bullet$ and on $B_\bullet$. You really do get
used to it. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>Astute readers will notice that a square is not a simplex, so we can’t
<em>really</em> be reasoning about it using homology.</p>
<p>But anyone who’s made a grilled cheese knows we can divide a square
into two triangles by cutting along a diagonal. Since $C_{2}(X)$ is
made up of <em>formal sums</em> of $2$-simplices (triangles) in $X$, we can
identify this square with the sum of the two triangles inside it
and nothing goes wrong.</p>
<p>The fact that we can always cut up $\Delta^n \times [0,1]$ into
a bunch of $(n+1)$-simplices is called the “prism argument”,
and you can read about it on page $112$ of Hatcher
(to future-proof this reference: it’s in Section $2.1$).</p>
<p>It’s also worth noting that some people sidestep this issue by basing
homology on “cubical chains” in $X$ rather than simplicial chains.
Then we don’t need to go through the prism argument, since the
$n$-cube times an interval is already an $n+1$-cube!</p>
<p>While this idea has gained a lot of traction in the <a href="https://en.wikipedia.org/wiki/Homotopy_type_theory">HoTT</a>
community (since cubical foundations allow univalence to compute),
it’s still somewhat nonstandard in the context of broader
algebraic topology. See <a href="https://mathoverflow.net/questions/3656/cubical-vs-simplicial-singular-homology">here</a> for some discussion. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>You also shouldn’t have to wait <em>too</em> long if you really want to see
me motivate model categories. Like I said at the start of the post,
I’m super close to finishing a trilogy of blog posts, the first of which
is about model categories!</p>
<p>If I remember (or when someone reminds me) I’ll link it here once
the post is up. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>Obviously if we’re working with chains of $R$-modules rather than
abelian groups, we should use $R$ here instead of $\mathbb{Z}$.</p>
<p>Also, I feel like there should be an “algorithmic” way of finding an
interval object inside a model category
(again, really an $\infty$-category)… I know model categories can be
pathological (for instance, depending on your definitions, certain
constructions might not be functorial) but for most model categories that
arise in practice we don’t have these issues.</p>
<p>I want this to be a quick post, so I don’t want to look into it too much
right now, but I would love to know if, say, every combinatorial
model category has an interval object, and if we can reconstruct it
(again, maybe in the $\infty$-category setting) from the “walking interval”,
namely the $\infty$-category with two objects joined by an equivalence.</p>
<p>If someone happens to know, I would love to hear about it ^_^ <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>We can see that tensor products are the right notion of product
(rather than the direct product) in a few ways:</p>
<p>First, the direct product would leave most of our chain complex
unchanged. Contrast this with the tensor product, which successfully
gives us the “off by one” pairs in each dimension that we expect
(since we know we’re eventually going to recover the notion of chain
homotopy).</p>
<p>Second, if you remember the <a href="https://en.wikipedia.org/wiki/K%C3%BCnneth_theorem">Künneth Theorem</a>, then you’ll remember
that $H_n(X \times I)$ is built from the groups $H_p(X) \otimes H_q(I)$ with $p + q = n$.</p>
<p>Third, it seems like there’s a more general approach here that I don’t
really understand. But again, I want this to be a quick post, so I won’t
be doing that research myself (at least not tonight :P).</p>
<p>It looks like it has something to do with the tensor-hom adjunction,
since $[0,1]$ is exponentiable, and we use both $X \times [0,1]$
and $X^{[0,1]}$ when studying the model structure on $\mathsf{Top}$.</p>
<p>Then we should be interested in some kind of monoidal
closed structure on the category of chains, where we use
$A_\bullet \otimes I_\bullet$ and $[I_\bullet, A_\bullet]$ for the
same purpose.</p>
<p>It seems like <em>someone</em> has made this precise
(see remark 2.4 <a href="https://ncatlab.org/nlab/show/homotopy">here</a>, for instance), but I haven’t found any
references myself. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>My instinct is to prove this, but I’m <em>well</em> over my planned one hour
time limit on writing this post, and I have to wake up early tomorrow,
so I should <em>really</em> get to bed.</p>
<p>So instead I’m taking the cop out as old as math itself, and I’m leaving
this as an exercise.</p>
<p>I don’t feel <em>too</em> bad, though, since this is in the section that I
said upfront would be a bit more technical, and the proof really
isn’t hard (it’s a matter of unpacking definitions more than anything else).
Plus there’s a full proof on the nlab that I link to at the bottom
of the exercise. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Fri, 24 Jun 2022 00:00:00 +0000
https://grossack.site/2022/06/24/chain-homotopies.html

Locale Basics

<p>This quarter I’ve been running a reading group for some undergraduates on
<a href="https://en.wikipedia.org/wiki/Pointless_topology">pointless topology</a>. I want to take some time to write down a summary
of the basics of locales (which are the object of study of pointless topology),
both so that my students can have a reference for what we’ve covered, and also
because I think there’s still space for a really elementary presentation of
some of these topics, with a focus on concrete examples.</p>
<p>In particular, it took me a <em>long</em> time to find examples of how
people actually <em>build</em> a locale (that doesn’t obviously come from a
geometric theory). Hopefully this post will help fill that gap! In the
near future, I’d like to write a follow-up post about how we can build
new locales from old ones too, but that’s a problem for another day.</p>
<p>Enough build up, though! Let’s get to it ^_^</p>
<hr />
<p>Our first few weeks were a crash course on the kind of category theory
that I usually assume my readers know already, with a focus on
partial orders and lattices<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">1</a></sup>.
After spending time getting to adjoint functors, we defined a topological
space, and then observed that the lattice of open sets has a certain
algebraic structure:</p>
<ul>
<li>Opens are closed under finite intersection</li>
<li>Opens are closed under arbitrary union</li>
</ul>
<p>We quickly abstracted this into a definition<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>:</p>
<div class="boxed">
<p>A <span class="defn">Frame</span> is a lattice $(F, 0, 1, \land, \lor)$
which is closed under <em>arbitrary</em> joins, which we write as $\bigvee a_\alpha$.</p>
<p>Moreover, we have to explicitly require that</p>
\[a \land \left ( \bigvee b_\alpha \right ) = \bigvee (a \land b_\alpha)\]
<p>(since lattices are not automatically distributive).</p>
<p>A <span class="defn">Homomorphism of Frames</span> is a map $\varphi : F \to G$
which preserves finite meets and arbitrary joins<sup id="fnref:19" role="doc-noteref"><a href="#fn:19" class="footnote" rel="footnote">3</a></sup>.</p>
</div>
<p>Now, inspired by the fact that continuous maps of topological spaces
$(X,\tau) \to (Y,\sigma)$
are in bijection with frame homomorphisms $\sigma \to \tau$, we
<em>define</em> the category of <span class="defn">Locales</span> to be the opposite
of the category of frames.</p>
<p>Of course, there’s no sense in having a definition without examples!</p>
<hr />
<h2 id="geometric-examples">Geometric Examples</h2>
<p><br /></p>
<p>The first important class of examples are the “geometric” ones.
Every topological space<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">4</a></sup> is specified in terms of either a <a href="https://en.wikipedia.org/wiki/Base_(topology)">base</a>
or a <a href="https://en.wikipedia.org/wiki/Subbase">subbase</a>, and we can imitate these constructions exactly in the
world of locales.</p>
<p>Let’s do the real numbers to start. These have a basic open for each interval
$(a,b)$. Classically we can take all such intervals, but constructively it’s
more hygienic to take only the intervals with rational endpoints. Obviously
these generate the same topology, so we’re not losing anything by doing this.</p>
<p>Now, the intervals $(a,b)$ form a <a href="https://en.wikipedia.org/wiki/Semilattice">meet semilattice</a> where we define</p>
\[(a,b) \wedge (a', b') = (\max(a,a'), \min(b,b'))\]
<p>with the convention that $(a,b) = \bot$ whenever $b \lt a$.</p>
<div class="boxed">
<p>As an easy exercise, draw a picture of two intervals $(a,b)$ and $(a’,b’)$
and convince yourself that their intersection is really given by the meet
as defined above.</p>
</div>
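If you want to play with this, here's a quick sketch of the meet in code (with `None` standing in for $\bot$; since our intervals are open I also treat $(a,a)$ as empty):

```python
# ⊥ (the empty interval) is represented by None.
BOT = None

def meet(i, j):
    """(a,b) ∧ (a',b') = (max(a,a'), min(b,b')), collapsing to ⊥ when empty."""
    if i is BOT or j is BOT:
        return BOT
    (a, b), (a2, b2) = i, j
    lo, hi = max(a, a2), min(b, b2)
    return (lo, hi) if lo < hi else BOT

assert meet((0, 2), (1, 3)) == (1, 2)                # overlapping intervals
assert meet((0, 1), (2, 3)) is BOT                   # disjoint ones meet in ⊥
assert meet((0, 2), (1, 3)) == meet((1, 3), (0, 2))  # ∧ is commutative
```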
<p>This is telling us that our open intervals really are a <em>basis</em> in the
strong sense that the intersection of two basic opens is again a basic open.</p>
<p>From here, we want to define a topology by declaring an open set to be an
arbitrary union of the basic opens. In the locale-theoretic world, we
take this meet-semilattice and (freely) close it under arbitrary joins!</p>
<p>To do that, we recall the adjunction</p>
<p style="text-align:center;">
<img src="/assets/images/locale-basics/downset-adjunction.png" width="25%" />
</p>
<p>where (as usual) $U$ is the forgetful functor and $\mathcal{D}$ is the
<em>downset functor</em>, which sends a poset $P$ to the free (sup)lattice
$\mathcal{D} P$ whose elements are the downwards-closed subsets of $P$<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">5</a></sup>.
The unit of this adjunction sends $p \in P$ to
\(\downarrow \! p = \{ x \in P \mid x \leq p \}\).</p>
<p>Then, at the end of the day, we can define the locale $\mathbb{R}$ to be
generated by the basic opens $(a,b)$ with $a \lt b \in \mathbb{Q}$. These
assemble into a meet-semilattice $P$, and the downset lattice
$\mathcal{D} P$ is a frame.</p>
<div class="boxed">
<p>As a nice exercise, you should check that $\mathcal{D}P$ really is a
frame. Moreover, that the inclusion $P \hookrightarrow \mathcal{D}P$
preserves $\land$.</p>
</div>
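To see the downset construction in miniature, here's a sketch with a finite stand-in for our basis: the divisors of $12$ under divisibility form a meet-semilattice (with meet given by $\gcd$), and we can list its downsets and check the exercise directly (the names here are all mine):

```python
from itertools import combinations
from math import gcd

# P = divisors of 12 under divisibility: a finite meet-semilattice
# (with meet = gcd) standing in for a basis of opens.
P = [1, 2, 3, 4, 6, 12]
leq = lambda x, y: y % x == 0        # x ≤ y  iff  x divides y

def is_downset(S):
    return all(x in S for s in S for x in P if leq(x, s))

# D(P): every downward-closed subset of P.  Meets in D(P) are intersections
# and (arbitrary) joins are unions, so frame distributivity is just the
# set-theoretic identity A ∩ ⋃ B_i = ⋃ (A ∩ B_i).
D = [frozenset(S) for k in range(len(P) + 1)
     for S in combinations(P, k) if is_downset(set(S))]

# the unit p |-> ↓p of the adjunction, and the exercise: ↓ preserves ∧
def down(x):
    return frozenset(y for y in P if leq(y, x))

for m in P:
    for n in P:
        assert down(gcd(m, n)) == down(m) & down(n)

print(len(D))  # how many "opens" our toy locale has
```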
<p>Of course, there’s nothing special about $\mathbb{R}$ here!</p>
<p>Given any meet-semilattice of basic open sets<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">6</a></sup> $(M, \wedge)$,
the downsets $\mathcal{D}M$ form a locale where every open is a
union of basic opens (here, as usual, we identify $m \in M$ with
the principal downset \(\downarrow \! m \in \mathcal{D}M\)).</p>
<div class="boxed">
<p>Can you modify the construction of $\mathbb{R}$ to build a locale for
$\mathbb{R}^2$, or more generally $\mathbb{R}^n$?</p>
<p>In the next blog post of the series, we’ll talk about product locales,
and we’ll show that $\mathbb{R} \times \mathbb{R} \cong \mathbb{R}^2$.</p>
</div>
<hr />
<p>For another example of building locales from a geometric space, let’s
build $S^1$ together! We have a few options for how to do this.</p>
<p>In practice, we use the fact that $S^1$ as a topological space is Hausdorff,
thus <a href="https://en.wikipedia.org/wiki/Sober_space">sober</a>. Then we appeal to an important theorem<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">7</a></sup>
that the full subcategory of sober topological
spaces is equivalent to the full subcategory of “spatial locales”.</p>
<p>In particular, the usual topology on $S^1$, viewed as a frame, gives a locale
for $S^1$ with no changes necessary.</p>
<p>If we need to, though, we can also build $S^1$ by hand. After all, we can
describe the open arcs and their intersections as basic opens, then take
downsets just as we did for $\mathbb{R}$.</p>
<p>However, it would be nice if we could somehow define $S^1$ to be
$\mathbb{R} / \mathbb{Z}$. I’ve never actually seen
anybody talk about how to do this, so this blog post seems like a perfect place!</p>
<p>If $\mathbb{Z}$ is acting on $\mathbb{R}$ by translation,
that means we have an action of $\mathbb{Z}$ on the frame of opens of
$\mathbb{R}$, where a basic open $(p,q)$ pulls back to the basic open
$(p-1,q-1)$.</p>
<p>Since locales are dual to frames, a <em>quotient locale</em> (that is, an
epimorphism) should correspond to a <em>subframe</em> (that is, a monomorphism),
and a moment’s meditation on the <a href="https://en.wikipedia.org/wiki/Quotient_space_(topology)">quotient topology</a> shows that
we want to consider the subframe of opens fixed by the $\mathbb{Z}$ action.
So a typical open in this subframe looks like</p>
\[\bigvee_{n \in \mathbb{Z}} (p-n, q-n)\]
<p>Intuitively, we’re identifying an open set of $S^1$ with its preimage in $\mathbb{R}$,
which lets us identify the frame of opens of $S^1$ with a subframe of the
opens of $\mathbb{R}$.</p>
<p style="text-align:center;">
<img src="/assets/images/locale-basics/circle.png" width="75%" />
</p>
<hr />
<p>But what, I hear you asking, if our basic opens are just basic in the
usual sense? For instance, how can we describe a locale associated to a
general metric space?</p>
<p>The answer, dear reader, is <span class="defn">Sites</span><sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">8</a></sup>.</p>
<p>Johnstone (in section II.2.11 of <em>Stone Spaces</em>) requires his sites to be
meet semilattices, but this is still not as general as I would like<sup id="fnref:16" role="doc-noteref"><a href="#fn:16" class="footnote" rel="footnote">9</a></sup>.
After all, the open balls in a metric space are <em>very</em> rarely closed under
intersection! It took me a surprisingly long time to find a satisfying
construction, but Vickers (in <em>Compactness in Locales and in Formal Topology</em>)
gives the following definition, which does exactly what I want:</p>
<div class="boxed">
<p>A <span class="defn">Flat Site</span> is a <a href="https://en.wikipedia.org/wiki/Preorder">preorder</a> $(P, \leq)$ equipped
with a <em>coverage relation</em> $\vartriangleleft$ between $P$ and the powerset
$\mathcal{P}(P)$ which satisfies the following “flat stability” axiom:</p>
<p>If $p \vartriangleleft A$ and $q \leq p$, there should be a
$B \subseteq (\downarrow q) \cap (\downarrow A)$ so that $q \vartriangleleft B$.</p>
<p>We think of the relation $p \vartriangleleft A$ as saying that “$A$ covers $p$”</p>
</div>
<p>Then the frame <em>presented</em> by a flat site is</p>
\[\mathcal{D}P \Bigg /
p \leq \bigvee A \text{ (for every $p \vartriangleleft A$)}\]
<p>That is, we force the coverage relation $p \vartriangleleft A$ to become
an actual covering in the frame<sup id="fnref:17" role="doc-noteref"><a href="#fn:17" class="footnote" rel="footnote">10</a></sup>.</p>
<div class="boxed">
<p>Build a frame associated to a metric space $(X,d)$ by constructing a
flat site associated to $X$.</p>
<p>This might feel slightly tautologous, since metric spaces are Hausdorff,
so their usual topology already gives us a frame.</p>
</div>
<hr />
<h2 id="logical-examples">Logical Examples</h2>
<p><br /></p>
<p>Of course, locale theory (like most of my interests) has a dual nature
as a geometric subject and a logical subject. The previous section was
about locales which arise from geometric objects, but what about locales
arising logically?</p>
<p>Let’s start with the notion of a (propositional) <a href="https://ncatlab.org/nlab/show/geometric+theory">geometric theory</a>.</p>
<div class="boxed">
<p>Fix a family of “primitive propositions” $P_\alpha$. Then
<span class="defn">geometric formulas</span> are described by the
following grammar:</p>
\[\begin{align}
\varphi
::=& \ P_\alpha \\
&| \ \top \\
&| \ \bot \\
&| \ \varphi \land \psi \\
&| \ \varphi \lor \psi \\
&| \ \bigvee \varphi_\alpha
\end{align}\]
<p>Next, a <span class="defn">geometric sequent</span> is an expression of the form</p>
\[\varphi_1, \ldots, \varphi_n \vdash \psi\]
<p>where the $\varphi_k$ and $\psi$ are geometric formulas. We interpret this as
expressing “whenever all of the $\varphi_k$ are true, $\psi$ is true too”.</p>
<p>Finally, a <span class="defn">geometric theory</span> is a set $\mathbb{T}$ of
geometric sequents, which we think of as <em>axioms</em> for $\mathbb{T}$.</p>
</div>
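For the programmers following along, this grammar is just an inductive datatype, and a classical valuation of the primitive propositions lets us evaluate formulas recursively. Here's a minimal sketch (with finite disjunctions only, since a computer can't store an infinitary $\bigvee$; all the names are mine):

```python
from dataclasses import dataclass
from typing import List

# An inductive type of geometric formulas, mirroring the grammar above.
@dataclass
class Prim:
    name: str

@dataclass
class Top:
    pass

@dataclass
class Bot:
    pass

@dataclass
class And:
    left: object
    right: object

@dataclass
class Or:
    left: object
    right: object

@dataclass
class BigVee:
    parts: List[object]

def ev(phi, v):
    """Classically evaluate a formula against a valuation v of the primitives."""
    if isinstance(phi, Prim):
        return v[phi.name]
    if isinstance(phi, Top):
        return True
    if isinstance(phi, Bot):
        return False
    if isinstance(phi, And):
        return ev(phi.left, v) and ev(phi.right, v)
    if isinstance(phi, Or):
        return ev(phi.left, v) or ev(phi.right, v)
    if isinstance(phi, BigVee):
        return any(ev(p, v) for p in phi.parts)
    raise TypeError(phi)

# p ∧ (⊥ ∨ q) holds exactly when p and q both do
phi = And(Prim('p'), BigVee([Bot(), Prim('q')]))
assert ev(phi, {'p': True, 'q': True})
assert not ev(phi, {'p': True, 'q': False})
```

A valuation is exactly a map from the primitives to $\{\bot, \top\}$, so `ev` is computing the unique extension of that map to all formulas, which is the point of the "classical models are frame homomorphisms to $\{\bot,\top\}$" slogan we'll meet shortly.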
<p>There is a <a href="https://en.wikipedia.org/wiki/Sequent_calculus">sequent calculus</a> for geometric logic, which, for those in the
know, is exactly what you expect<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">11</a></sup>. All that we need to know is that it lets
us <em>derive</em> the truth of certain sequents in a way that formally mirrors the
way we prove things as mathematicians. For instance, one can derive the sequent
$\varphi \land \psi \vdash \psi \land \varphi$, which tells us that $\land$
is commutative.</p>
<p>This is important because the language<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">12</a></sup>
\(L(\{ P_\alpha \})\)
is a preorder if we define $\varphi \leq \psi$ by $\varphi \vdash \psi$.
Intuitively, elements “higher up the poset” are “more true”, with $\top$ and
$\bot$ as the top and bottom elements.</p>
<p>If we then quotient by the equivalence relation $\varphi \sim \psi$ whenever
both $\varphi \leq \psi$ and $\psi \leq \varphi$
(so we consider two provably-equivalent formulas to be “the same”) then we get a
frame, where the lattice operations are given by the logical ones<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">13</a></sup>!
In fact, this construction gives the <em>free</em> frame generated by the set
$\{ P_\alpha \}$ of primitive propositions.</p>
<p>Now, given a theory $\mathbb{T}$ with primitive propositions $\{ P_\alpha \}$,
we define the <span class="defn">Lindenbaum Algebra</span> $\mathcal{L}[\mathbb{T}]$
to be the frame of formulas in $\{ P_\alpha \}$ modulo the relations</p>
\[\varphi_1 \wedge \ldots \wedge \varphi_n \leq \psi\]
<p>for each $\big ( \varphi_1, \ldots, \varphi_n \vdash \psi \big ) \in \mathbb{T}$.</p>
<p>This is again a frame, which we should think of as the formulas in \(L(\{P_\alpha\})\)
up to provable equivalence, where now we may use the sequents in $\mathbb{T}$
as axioms in our proofs<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">14</a></sup>! When we view this frame as a locale, we
call it the <span class="defn">Classifying Locale</span> of $\mathbb{T}$.</p>
<p>This was all very abstract, so let’s see some concrete examples:</p>
<hr />
<p>Let $X$ be a set, and consider a theory $\mathbb{T}$ with one primitive proposition
$\sigma_{x,y}$ for each $x,y \in X$, and axioms</p>
\[\begin{align}
\top &\vdash \bigvee_{y \in X} \sigma_{x,y}
&&\text{(for each $x \in X$)} \\
\sigma_{x,y}, \sigma_{x,z} &\vdash \bot
&& \text{(for $y \neq z \in X$)} \\
\sigma_{x,y} &\vdash \sigma_{y,x}
\end{align}\]
<p>What’s the use of this theory? Well, recall that a (classical) model of a
(propositional) theory $\mathbb{T}$ is exactly a frame homomorphism
\(\mathcal{L}[\mathbb{T}] \to \{ \bot, \top \}\).</p>
<p>Now, a model of this theory assigns each $\sigma_{x,y}$ to $\bot$ or $\top$,
thus a model is really picking out a subset of $X \times X$ –
namely those pairs $(x,y)$ so that $\sigma_{x,y} = \top$.</p>
<p>Through this lens, the first two axioms express that for every $x$,
there is <em>exactly one</em> $y$ with $\sigma_{x,y} = \top$, so that our model
is naming a <em>function</em> (which I’ll call $f_\sigma : X \to X$). The last
axiom is asserting that $f_\sigma(x) = y$ implies $f_\sigma(y) = x$. That is,
$f_\sigma$ should be an involution!</p>
<p>So (classical) models of $\mathbb{T}$ are exactly $\mathbb{Z}/2$ actions on $X$.</p>
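We can even check this claim by brute force for a small $X$: enumerate all subsets of $X \times X$, keep the ones satisfying the three axiom schemes, and verify that what's left is exactly the graphs of involutions (this is just my own sanity-check script, nothing canonical):

```python
from itertools import product

# Brute-force the classical models of the theory for a 3-element X: a model
# is a subset S of X × X (the pairs where sigma_{x,y} is assigned ⊤).
X = [0, 1, 2]
pairs = [(x, y) for x in X for y in X]

models = []
for bits in product([False, True], repeat=len(pairs)):
    S = {p for p, keep in zip(pairs, bits) if keep}
    total      = all(any((x, y) in S for y in X) for x in X)
    functional = all((x, y) not in S or (x, z) not in S
                     for x in X for y in X for z in X if y != z)
    symmetric  = all((y, x) in S for (x, y) in S)
    if total and functional and symmetric:
        models.append(S)

# every model is the graph of an involution of X...
for S in models:
    f = dict(S)
    assert all(f[f[x]] == x for x in X)

# ...and there are 4 involutions of a 3-element set:
# the identity and the three transpositions.
assert len(models) == 4
```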
<p>We can push this further, though!</p>
<p>Consider a theory $\mathbb{T}$ with primitive propositions $\theta_{n,x}$
for $n \in \mathbb{N}$ and $x \in \mathbb{R}$, and the axioms</p>
\[\begin{align}
\top &\vdash \bigvee_{x \in \mathbb{R}} \theta_{n,x}
&& \text{(for each $n \in \mathbb{N}$)} \\
\theta_{n,x}, \theta_{n,y} &\vdash \bot
&& \text{(for $x \neq y \in \mathbb{R}$)} \\
\top &\vdash \bigvee_{n \in \mathbb{N}} \theta_{n,x}
&& \text{(for each $x \in \mathbb{R}$)}
\end{align}\]
<div class="boxed">
<p>To get some practice interpreting geometric theories, you should try
to figure out what a (classical) model of this theory looks like.</p>
<p>As before, the first two axioms assert that a model should define a
function $f_\theta : \mathbb{N} \to \mathbb{R}$. But what does the
third axiom assert?</p>
<p>Do you see why this is a nontrivial theory with <em>no</em> classical models?</p>
</div>
<hr />
<p>The locales $\mathcal{L}[\mathbb{T}]$ are surprisingly flexible. For instance,
say $B$ is a seminormed space. Then we can describe its (localic) dual space
by considering a geometric theory with primitive propositions “$\varphi(a) \in (r,s)$”
for each $a \in B$, $r,s \in \mathbb{Q}$. These should satisfy axioms:</p>
\[\begin{align}
\top &\vdash \varphi(0) \in (r,s)
&& \text{(for $r \lt 0 \lt s$)} \\
\varphi(0) \in (r,s) &\vdash \bot
&& \text{(otherwise)} \\
\varphi(a) \in (r,s) &\vdash \varphi(-a) \in (-s,-r) \\
\varphi(a) \in (r,s) &\vdash \varphi(ta) \in (tr, ts)
&& \text{(for $t > 0 \in \mathbb{Q}$)} \\
\varphi(a) \in (r,s), \varphi(a') \in (r',s') &\vdash \varphi(a + a') \in (r+r', s+s') \\
\varphi(a) \in (r,s) &\vdash \varphi(a) \in (r,s') \lor \varphi(a) \in (r', s)
&& \text{(whenever $r \lt r' \lt s' \lt s$)} \\
\top &\vdash \varphi(a) \in (-1,1)
&& \text{(whenever $\lVert a \rVert \lt 1$)} \\
\varphi(a) \in (r,s) &\vdash \bigvee_{r \lt r' \lt s' \lt s} \varphi(a) \in (r',s') \\
\bigvee_{r \lt r' \lt s' \lt s} \varphi(a) \in (r',s') &\vdash \varphi(a) \in (r,s)
\end{align}\]
<p>Then a (classical) model of this theory is exactly the graph
of a norm-decreasing linear functional on $B$! Moreover, if we use this
formulation of the dual space of $B$, we can get a completely constructive
proof of the Hahn-Banach theorem<sup id="fnref:18" role="doc-noteref"><a href="#fn:18" class="footnote" rel="footnote">15</a></sup>!</p>
<hr />
<p>Now, we’ve said that this is a very flexible approach for building locales…
but just how flexible is it?</p>
<p>Well, it turns out that <em>every</em> locale arises in this way!
That is, for each locale $L$, there is a geometric theory which it classifies<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">16</a></sup>.</p>
<p>As our primitive propositions, we’ll take the elements $U_\alpha \in L$
(viewed as a frame). Then our theory will be exactly the “Cayley table” of $L$.
As axioms, we’ll take \(U_1 \wedge U_2 \vdash U_3\) whenever $U_3$ is the
meet of $U_1$ and $U_2$ in $L$. Similarly, we’ll add axioms saying that
\(\bigvee U_\alpha \vdash U^*\) whenever $U^*$ really is the join of the
$U_\alpha$ as computed in $L$.</p>
<p>Now, tautologically, it’s clear that the Lindenbaum algebra for this theory
is $L$, as desired.</p>
<hr />
<h2 id="points">Points</h2>
<p><br /></p>
<p>You’ve probably heard locale theory referred to as “pointless topology”
(especially since I called it that at the start of the post!). But what does that
mean?</p>
<p>Well, in topological spaces, we can identify points of $X$ with maps
\(\{ \star \} \to X\), since these maps are completely determined by the
point $f(\star) \in X$ (and conversely, every point determines a map).</p>
<p>Then we perform a tried and true maneuver, used by mathematicians everywhere:
we take a theorem in the old setting, and turn it into a <em>definition</em> in our
new setting! So for us, a <span class="defn">Point</span> of a locale $L$ is
a continuous map $1 \to L$, where $1$, the terminal locale, is the one point
space (that is, the locale whose frame of opens is \(\{ \bot, \top \}\)).</p>
<div class="boxed">
<p>If it’s not obvious, you should take a moment to think about why the one point
space \(\{ \star \}\) in topology corresponds to the locale whose frame of
opens is \(\{\bot, \top \}\).</p>
</div>
<p>Now we get to one of the most important punchlines in locale theory!</p>
<p>If you recall, the lattice \(\{\bot, \top\}\) showed up before too…
We saw that <em>frame</em> homomorphisms \(\mathcal{L}[\mathbb{T}] \to \{\bot, \top\}\)
were exactly classical models of the theory $\mathbb{T}$!</p>
<p>But turning the arrows around, we see that (classical) models of
$\mathbb{T}$ are nothing but <em>points</em> of $\mathcal{L}[\mathbb{T}]$!</p>
<p>In this way, we can view $\mathcal{L}[\mathbb{T}]$ as the “space of models”
of $\mathbb{T}$. Of course, some theories don’t <em>have</em> any classical models:
for instance, the theory of surjections \(\mathbb{N} \to \mathbb{R}\) that we
built earlier.
In this case, our locale has no points! But nonetheless it has
nontrivial topological structure.</p>
<p>One way to see that this nontrivial structure is interesting is by
considering <em>more general</em> kinds of point. For instance, instead of assigning
each primitive proposition the value $\bot$ or $\top$, you might assign them
values in some other frame $F$. These “$F$-valued models” are interesting
already, but they <em>really</em> shine when we see the analogue in topos theory<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">17</a></sup>.</p>
<p>Again, though, we dualize, and for a locale $X$ we say that the
“$X$-valued points of $L$” (equivalently the $X$-valued models of $\mathbb{T}$)
are exactly the continuous maps $X \to L$ (equivalently $X \to \mathcal{L}[\mathbb{T}]$).</p>
<hr />
<p>So now we have two different ways of thinking about a continuous map
$f : X \to Y$. The first way is just what it is: a continuous map.</p>
<p>But logically, we know that we can think of these as $\mathbb{T}$ models in a
world with more than two truth values. Is there some geometric interpretation
of this idea?</p>
<p>The answer is “yes”!</p>
<p>For each point $p : 1 \to X$, we get a map $fp : 1 \to Y$.
So then we might think of $f$ as giving us points in $Y$ which
<em>vary continuously in $X$</em>. That is, “points” $p_x$ of $Y$ which
depend continuously on a parameter $x$ from $X$.
This seems like a kind of silly thing to do, but it turns out to be a very
useful way of thinking.</p>
<p>But there’s <em>another</em>, third way of viewing a map $f : X \to Y$ –
namely as a <a href="https://en.wikipedia.org/wiki/Bundle_(mathematics)">bundle</a>. For each point $y$ of $Y$ we get a pullback</p>
<p style="text-align:center;">
<img src="/assets/images/locale-basics/fibres.png" width="30%" />
</p>
<p>here $X_y = f^{-1}(y)$ is a locale (since the category of locales is complete),
and we think of $X$ as being a <em>family</em> of locales $X_y$ which varies
continuously in $Y$. This is another very fruitful idea which I’ll probably
talk about in a future blog post<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">18</a></sup>.</p>
<hr />
<p>Ok, this was a <em>long one</em>, but I think there’s a lot to say about frames
and locales. I’m surprised how difficult it is to find concrete examples
of certain calculations on frames, and ways of building them
(though maybe I’m just not sure where to look). Either way, I hope my
students find this a useful (if somewhat fleshed out) review of the localic
material we’ve covered so far, and I hope that future readers find this an
approachable introduction to the basics of locale theory.</p>
<p>Take care, everyone, and I’ll see you soon ^_^</p>
<hr />
<p>PS: I’ve been reading a lot of locale theory lately, particularly while
looking for references that showcase either explicit constructions or
interesting use cases. I usually include references either in the text or
in footnotes, but I’ll actually include a short bibliography here for the
papers that I found particularly insightful:</p>
<ul>
<li>Anel, Mathieu, and André Joyal. “Topo-Logie.” In New Spaces in Mathematics, edited by Mathieu Anel and Gabriel Catren, 1st ed., 155–257. Cambridge University Press, 2021. https://doi.org/10.1017/9781108854429.007.</li>
<li>Banaschewski, Bernhard, and Christopher J. Mulvey. “A Constructive Proof of the Stone-Weierstrass Theorem.” Journal of Pure and Applied Algebra 116, no. 1 (March 28, 1997): 25–40. https://doi.org/10.1016/S0022-4049(96)00160-0.</li>
<li>———. “A Globalisation of the Gelfand Duality Theorem.” Annals of Pure and Applied Logic 137, no. 1 (January 1, 2006): 62–103. https://doi.org/10.1016/j.apal.2005.05.018.</li>
<li>Blechschmidt, Ingo. “Generalized Spaces for Constructive Algebra.” ArXiv:2012.13850 [Math], December 26, 2020. http://arxiv.org/abs/2012.13850.</li>
<li>Borceux, Francis. Categories of Sheaves. Digitally printed version. Handbook of Categorical Algebra / Francis Borceux 3. Cambridge: Cambridge Univ. Press, 2008.</li>
<li>Chen, Xiangdong. “On Binary Coproducts of Frames,” n.d., 14.</li>
<li>Day, B. J. “Locale Geometry.” Pacific Journal of Mathematics 83, no. 2 (August 1, 1979): 333–39. https://doi.org/10.2140/pjm.1979.83.333.</li>
<li>He, Wei, and MaoKang Luo. “Completely Regular Proper Reflection of Locales over a given Locale.” Proceedings of the American Mathematical Society 141, no. 2 (June 5, 2012): 403–8. https://doi.org/10.1090/S0002-9939-2012-11329-2.</li>
<li>Isbell, John. “Product Spaces in Locales.” Proceedings of the American Mathematical Society 81, no. 1 (1981): 116–18. https://doi.org/10.2307/2044000.</li>
<li>———. “Atomless Parts of Spaces.” MATHEMATICA SCANDINAVICA 31 (June 1, 1972): 5. https://doi.org/10.7146/math.scand.a-11409.</li>
<li>Johnstone, Peter T. Stone Spaces. Paperback ed., Repr. Cambridge Studies in Advanced Mathematics 3. Cambridge: Cambridge Univ. Press, 1992.</li>
<li>Pelletier, Joan Wick. “Locales in Functional Analysis.” Journal of Pure and Applied Algebra 70, no. 1 (March 15, 1991): 133–45. https://doi.org/10.1016/0022-4049(91)90013-R.</li>
<li>Picado, Jorge, and Aleš Pultr. Frames and Locales: Topology without Points. Frontiers in Mathematics. Basel: Birkhäuser, 2012.</li>
<li>———. “Notes on the Product of Locales.” Mathematica Slovaca 65, no. 2 (April 1, 2015): 247–64. https://doi.org/10.1515/ms-2015-0020.</li>
<li>Vickers, Steven. “Geometric Logic in Computer Science.” In Theory and Formal Methods 1993, edited by Geoffrey Burn, Simon Gay, and Mark Ryan, 37–54. Workshops in Computing. London: Springer London, 1993. https://doi.org/10.1007/978-1-4471-3503-6_4.</li>
<li>———. “Compactness in Locales and in Formal Topology.” Annals of Pure and Applied Logic 137, no. 1–3 (January 2006): 413–38. https://doi.org/10.1016/j.apal.2005.05.028.</li>
<li>———. “Continuity and Geometric Logic.” Journal of Applied Logic, Logic Categories Semantics, 12, no. 1 (March 1, 2014): 14–27. https://doi.org/10.1016/j.jal.2013.07.004.</li>
</ul>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:13" role="doc-endnote">
<p>I feel like this is particularly economical for a few reasons:</p>
<p>First, posets and lattices are simple combinatorial objects that you can
understand without much background in abstract algebra or topology. We also
get two “levels” of categories out of this, since every poset is a category
in its own right, but we can also consider the category of posets, or any
one of the zoo of categories of lattices.</p>
<p>In fact, the zoo of lattice categories is nice, because it forces you to
acknowledge that it’s the <em>arrows</em> that matter. After all, frames
(which we’ll define soon) and complete Heyting algebras are the same thing!
We can only distinguish between the categories because the homomorphisms are different.</p>
<p>This means that we can see categorical constructions in multiple lights
very quickly: Products in a poset are meets, while products in the <em>category</em>
of posets are given by a poset of tuples. Adjoint functors between posets
are <a href="https://en.wikipedia.org/wiki/Galois_connection">galois connections</a>, while adjoint functors between the categories
of posets and meet semilattices might be a free/forgetful pair, etc. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:1" role="doc-endnote">
<p>The definition given on the <a href="https://ncatlab.org/nlab/show/frame">nlab</a> is nice because it makes the analogy
with topoi incredibly clear. It’s an easy exercise that our definitions
are equivalent. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:19" role="doc-endnote">
<p>As a nice exercise:</p>
<p>Show that every frame is also a <a href="https://en.wikipedia.org/wiki/Heyting_algebra">Heyting Algebra</a>. You can do this
with abstract nonsense (use the <a href="https://ncatlab.org/nlab/show/adjoint+functor+theorem">adjoint functor theorem</a>) or directly
(give an explicit expression for the arrow $a \implies b$).</p>
<p>You should do whichever comes less naturally to you
(or both, if neither comes naturally).</p>
<p>As a part 2, you should show that <em>defining</em> frames to be complete
Heyting algebras is incorrect, since morphisms of Heyting algebras
should preserve $\implies$, whereas frame homomorphisms might not.</p>
<p>(You can see <a href="https://mathoverflow.net/questions/248711/an-example-of-a-frame-homomorphism-which-does-not-preserve-heyting-implication">this</a> mathoverflow question if you need a hint) <a href="#fnref:19" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>That I can think of right now <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>There’s actually a <em>lot</em> to say about this construction! It turns out that
$\mathcal{D}P$ is the same thing as $[P^\text{op},2]$, the monotone maps from
$P^\text{op}$ to the two element poset \(\{0,1\}\) with $0 \leq 1$.</p>
<p>Then $\mathcal{D}P$ is nothing but the “presheaves on $P$” if we enrich
over $2$ instead of $\mathsf{Set}$. This makes it clear that it should be
the free way of adding joins (colimits) to $P$, by analogy to the presheaves on a
category $\mathcal{C}$ being the “free cocompletion” of $\mathcal{C}$.</p>
<p>Again, there’s a <em>lot</em> to say here, since frames/locales are analogous in <em>many</em>
ways to Grothendieck topoi, and this method of getting a locale by looking
at “presheaf posets” is one instance of that analogy! <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>Notice that we’re assuming the intersection of two basic opens is itself
basic open. The usual definition is that the intersection of two basic
opens merely <em>contains</em> a basic open – this seems like a property that
should have a name, but I don’t know of one. Hopefully someone can
leave a comment clarifying this ^_^</p>
<p>Either way, we’ll relax this assumption later on in the post! <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:14" role="doc-endnote">
<p>Which I’d like to prove in its own blog post… hopefully a shorter one,
since the proof really isn’t that bad.</p>
<p>For the impatient reader (or if I never get to it) you can find a proof in
Johnstone’s <em>Stone Spaces</em>, section II.1.7 <a href="#fnref:14" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:15" role="doc-endnote">
<p>This is yet another analogy to topos theory. See <a href="https://ncatlab.org/nlab/show/posite">here</a> for more <a href="#fnref:15" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:16" role="doc-endnote">
<p>Though it <em>does</em> make the construction go slightly more smoothly.
This, again, is analogous to the setting of sites on a category in
topos theory, where assuming your category has pullbacks makes the
construction a bit simpler. <a href="#fnref:16" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:17" role="doc-endnote">
<p>Notice this quotient is taken in the category of suplattices, so it’s not
obvious that we should actually get a frame out the other side.</p>
<p>This is where the flat stability axiom comes into play, since it tells us
(roughly) that we can “pull back” a cover $A$ of $p$ to a cover $B$ of $q$
whenever $q \leq p$. Since for any $p$ and $r$ we’ll have $p \wedge r \leq p$,
this will let us show the distributivity axiom for frames.</p>
<p>For a full proof, see theorem 5 in Vickers’
<em>Compactness in Locales and in Formal Topology</em> <a href="#fnref:17" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>We have the usual <a href="https://en.wikipedia.org/wiki/Structural_rule">structural rules</a>,
and all the connectives work in the traditional way, with the rules for
infinite disjunction being the natural generalization of the usual disjunction
rules. We won’t need to know the intricacies of the system here, so I’ll omit
a full description, but the interested reader can find one in section D1.3
of the elephant. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>That is, the set of geometric formulas with the given primitive propositions <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>So, for instance, given two equivalence classes of formulas
$[\varphi]$ and $[\psi]$, their meet in the frame will be given by
the class of their conjunction $[\varphi \land \psi]$.</p>
<p>Similarly, for (infinitary) joins and disjunction.</p>
<p>This is why we need to quotient by provable equivalence, because
$\varphi \land (\psi \land \theta)$ and $(\varphi \land \psi) \land \theta$
are <em>technically</em> two different formulas. Though they’re provably equivalent,
which is all we need to claim that $\land$ is associative! <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>You’ll often hear logicians talk about formulas
“up to $\mathbb{T}$-provable equivalence”.</p>
<p>This is what we’re talking about! <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:18" role="doc-endnote">
<p>See, for instance, Pelletier’s excellent survey
<em>Locales in Functional Analysis</em>. <a href="#fnref:18" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>Though this sounds much more exciting than it is… Prepare to be underwhelmed :P <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:11" role="doc-endnote">
<p>In the same way that propositional geometric theories all have a classifying
locale, more general <em>predicate</em> geometric theories admit a classifying
<em>topos</em>!</p>
<p>Since every algebraic theory is geometric, this tells us that there’s a
topos $\mathcal{G}$ so that for any topos $\mathcal{E}$, the
“$\mathcal{E}$-points of $\mathcal{G}$” are exactly group objects in $\mathcal{E}$!</p>
<p>This also plays nicely with the intuition that $\mathcal{E}$-points of
$\mathcal{G}$ are “points of $\mathcal{G}$ varying continuously in $\mathcal{E}$”,
since a group object in a sheaf topos (say) is exactly a sheaf of groups.
That is, a family of groups varying continuously in the base space of the
topos! <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:12" role="doc-endnote">
<p>And this is one of the big reasons to favor locales over topological spaces!
Locales behave <em>much</em> better under this interpretation, basically because
their theorems are provable constructively. Since the slice category of
locales over $Y$ is equivalent to the category of locales internal to
$\mathsf{Sh}(Y)$, this means that anything we prove (constructively)
about locales is <em>immediately</em> true for bundles of locales as well! <a href="#fnref:12" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Sun, 22 May 2022 00:00:00 +0000
https://grossack.site/2022/05/22/locale-basics.html

How many symbols can $f'(x)$ have if $f$ has $n$ symbols?

<p>The other day <a href="https://www.smbc-comics.com/">SMBC</a> put up a lovely comic which did a great job
<a href="https://xkcd.com/356/">nerdsniping</a> me. I knew that I wanted to try to solve it as soon
as I saw it, but I didn’t have the time for a little while
(it’s midterm season and I had grading to do). It’s a cute problem, and
I want to share my solution with all of you! First, here’s the comic
that started it all:</p>
<p style="text-align:center;">
<a href="https://www.smbc-comics.com/comic/derivative">
<img src="/assets/images/diff-growth/smbc-derivative.png" width="50%" />
</a>
</p>
<p>Now, my old advisor (Klaus Sutner) used to say that whenever you’re faced with a
problem, you can hack or you can think, but you can’t do both. <del>Today</del>
Multiple weeks ago I was in more
of a hacking mood, so I wrote up some haskell code to just <em>try</em> all the
“reasonable” functions I could think of. By this, of course, I mean the
<a href="https://en.wikipedia.org/wiki/Elementary_function">elementary functions</a>.</p>
<p>There’s an obvious recursive way to build up the elementary functions
(which you should think of as those functions which might show up in a calculus class):</p>
<ul>
<li>$f(x) = x$ should probably be a function, as should the constants</li>
<li>If $f(x)$ has previously been defined, $\sin(f(x))$, etc. should be functions</li>
<li>If $f(x)$ and $g(x)$ have previously been defined, $f + g$, etc. should be functions</li>
</ul>
<p>We can formalize this with a datatype<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>
<div class="no_eval">
<script type="text/x-sage">
data Expr = Const Rational
| X
| Square Expr
| Sqrt Expr
| Sin Expr
| Cos Expr
| Tan Expr
| ASin Expr
| ACos Expr
| ATan Expr
| Sinh Expr
| Cosh Expr
| Tanh Expr
| ASinh Expr
| ACosh Expr
| ATanh Expr
| Exp Expr
| Log Expr
| Add Expr Expr
| Sub Expr Expr
| Mult Expr Expr
| Div Expr Expr
| Pow Expr Expr
deriving (Show, Eq)
</script>
</div>
<p>Obviously this list, while exhausting the elementary functions, is
still somewhat arbitrary.
For instance, $\sec$ is nowhere on this list, but we can build it using
the functions that <em>are</em> on this list<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">2</a></sup>. Conversely, we added a builtin
function for $\tan$, even though we can express it using $\sin$ and $\cos$.
The decision to add squaring and square roots as primitive, while relegating
cubes and cube roots, etc. to a definition using <code class="language-plaintext highlighter-rouge">Const</code> and <code class="language-plaintext highlighter-rouge">Pow</code> was
similarly arbitrary.</p>
<p>I went for this list basically because it’s what the <a href="https://en.wikipedia.org/wiki/Elementary_function">wikipedia article</a>
names explicitly. Later on we’ll show that our solution doesn’t
depend on the exact list chosen, so we don’t need to worry about this.</p>
<p>Next up, we need to tell haskell how to compute the derivative of a function.
Thankfully, derivatives can be computed recursively, so this is quite easy
to code up:</p>
<div class="no_eval">
<script type="text/x-sage">
diff :: Expr -> Expr
diff (Const n) = Const 0
diff X = Const 1
diff (Square e) = Mult (Mult (Const 2) e) (diff e)
diff (Sqrt e) = Div (diff e) (Mult (Const 2) (Sqrt e))
diff (Sin e) = Mult (Cos e) (diff e)
diff (Cos e) = Mult (Const (-1)) (Mult (Sin e) (diff e))
diff (Tan e) = Mult (Add (Const 1) (Square (Tan e))) (diff e)
diff (ASin e) = Div (diff e) (Sqrt (Sub (Const 1) (Square e)))
diff (ACos e) = Div (Mult (Const (-1)) (diff e)) (Sqrt (Sub (Const 1) (Square e)))
diff (ATan e) = Div (diff e) (Add (Const 1) (Square e))
diff (Sinh e) = Mult (Cosh e) (diff e)
diff (Cosh e) = Mult (Sinh e) (diff e)
diff (Tanh e) = Mult (Sub (Const 1) (Square (Tanh e))) (diff e)
diff (ASinh e) = Div (diff e) (Sqrt (Add (Const 1) (Square e)))
diff (ACosh e) = Div (diff e) (Sqrt (Sub (Square e) (Const 1)))
diff (ATanh e) = Div (diff e) (Sub (Const 1) (Square e))
diff (Exp e) = Mult (Exp e) (diff e)
diff (Log e) = Div (diff e) e
diff (Add e1 e2) = Add (diff e1) (diff e2)
diff (Sub e1 e2) = Sub (diff e1) (diff e2)
diff (Mult e1 e2) = Add (Mult (diff e1) e2) (Mult e1 (diff e2))
diff (Div e1 e2) = Div (Sub (Mult e2 (diff e1)) (Mult e1 (diff e2))) (Square e2)
diff (Pow e1 e2) = Mult (Pow e1 e2) (Add (Mult (Log e1) (diff e2)) (Div (Mult e2 (diff e1)) (e1)))
</script>
</div>
<p>This isn’t perfect. For instance, it doesn’t simplify multiplication by $1$, etc.
But I wanted a quick and dirty approximation, and importantly, I didn’t want to
spend too long on this project<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup>.</p>
<div class="boxed">
<p>As a (fun?) exercise, write a <code class="language-plaintext highlighter-rouge">prune</code> function which makes some easy
simplifications after differentiating. Does that change which functions
grow the most in complexity?</p>
</div>
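<p>If you want a nudge on that exercise, here’s a minimal sketch of the flavor of
rewriting <code class="language-plaintext highlighter-rouge">prune</code> might do, on a cut-down expression
type of my own (extending it to the full <code class="language-plaintext highlighter-rouge">Expr</code>
above, and answering the question, is still left to you):</p>

```haskell
-- A cut-down expression type, just big enough to show two pruning rules.
-- (These names are mine; the full exercise is over the Expr type above.)
data E = ConstE Rational | XE | AddE E E | MultE E E
  deriving (Show, Eq)

-- Prune bottom-up: simplify the children first, then look for easy rewrites.
pruneE :: E -> E
pruneE (MultE a b) = case (pruneE a, pruneE b) of
  (ConstE 0, _)  -> ConstE 0       -- 0 * e  ==>  0
  (_, ConstE 0)  -> ConstE 0       -- e * 0  ==>  0
  (ConstE 1, b') -> b'             -- 1 * e  ==>  e
  (a', ConstE 1) -> a'             -- e * 1  ==>  e
  (a', b')       -> MultE a' b'
pruneE (AddE a b) = case (pruneE a, pruneE b) of
  (ConstE 0, b') -> b'             -- 0 + e  ==>  e
  (a', ConstE 0) -> a'             -- e + 0  ==>  e
  (a', b')       -> AddE a' b'
pruneE e = e
```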
<p>Next, we need a way to figure out how many symbols are in a given expression.
This is also easy to implement:</p>
<div class="no_eval">
<script type="text/x-sage">
size :: Expr -> Int
size (Const n) = 1
size X = 1
size (Square e) = 1 + size e
size (Sqrt e) = 1 + size e
size (Sin e) = 1 + size e
size (Cos e) = 1 + size e
size (Tan e) = 1 + size e
size (ASin e) = 1 + size e
size (ACos e) = 1 + size e
size (ATan e) = 1 + size e
size (Sinh e) = 1 + size e
size (Cosh e) = 1 + size e
size (Tanh e) = 1 + size e
size (ASinh e) = 1 + size e
size (ACosh e) = 1 + size e
size (ATanh e) = 1 + size e
size (Exp e) = 1 + size e
size (Log e) = 1 + size e
size (Add e1 e2) = 1 + size e1 + size e2
size (Sub e1 e2) = 1 + size e1 + size e2
size (Mult e1 e2) = 1 + size e1 + size e2
size (Div e1 e2) = 1 + size e1 + size e2
size (Pow e1 e2) = 1 + size e1 + size e2
</script>
</div>
<p>Lastly, we need a way to build up every expression with $n$ symbols. This way
we can differentiate them all, and see which gives us the largest output!</p>
<div class="no_eval">
<script type="text/x-sage">
import Data.List (maximumBy)

build :: Int -> [Expr]
build 1 = [X]
build n =
    [comb e | comb <- unary, e <- build (n-1)] ++
    [comb e1 e2 | comb <- binary, e1 <- build (n-1), e2 <- build (n-1)] ++
    build (n-1)
  where
    unary = [Square, Sqrt,
             Sin, Cos, Tan, ASin, ACos, ATan,
             Sinh, Cosh, Tanh, ASinh, ACosh, ATanh, Exp, Log]
    binary = [Add, Sub, Mult, Div, Pow]

-- compute the largest size of diff e as e ranges over exprs of size n
b :: Int -> [Expr] -> (Int, Expr)
b n = maximumBy cmp . fmap (\e -> (size (diff e), e)) . filter (\e -> size e == n)
  where
    cmp (s1,_) (s2,_) = compare s1 s2
</script>
</div>
<hr />
<p>Now, my laptop can fully exhaust every function with $\leq 4$ symbols,
and we see that our best bets are</p>
<ul>
<li>$x$, whose derivative has $1$ symbol</li>
<li>$\arccos(x)$, whose derivative has $9$ symbols</li>
<li>$\arccos(\arccos(x))$, whose derivative has $18$ symbols</li>
<li>$\arccos(\arccos(\arccos(x)))$, whose derivative has $28$ symbols</li>
</ul>
<p>(note that the innermost $x$ counts as a symbol).</p>
<p>Moreover, it’s pretty easy to see that we’ll never use a unary function
other than $\arccos$. Indeed, if we write $\lvert f \rvert$ for <code class="language-plaintext highlighter-rouge">size f</code>,
then</p>
\[\lvert \arccos(f)' \rvert = \lvert f \rvert + \lvert f' \rvert + 7.\]
<p>More generally, $\lvert \text{blah}(f)’ \rvert = \lvert f \rvert + \lvert f’ \rvert + k$
where $k$ is the number of symbols in the derivative of $\text{blah}(x)$<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">4</a></sup>.
Since this is biggest for $\arccos$, and we’re trying to maximize the size of
$f’$, there’s no reason to use a unary constructor other than $\arccos$.</p>
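<p>In fact, this identity holds <em>exactly</em>, not just on the examples above.
Here’s a standalone check: a stripped-down, renamed copy of the definitions
from earlier (just the constructors that $\arccos$’s derivative needs), together
with the overhead computation:</p>

```haskell
-- A renamed fragment of the Expr/diff/size code above, self-contained,
-- with only the constructors appearing in the derivative of ACos.
data E = ConstE Rational | XE | SquareE E | SqrtE E | ACosE E
       | AddE E E | SubE E E | MultE E E | DivE E E
  deriving (Show, Eq)

diffE :: E -> E
diffE (ConstE _)  = ConstE 0
diffE XE          = ConstE 1
diffE (SquareE e) = MultE (MultE (ConstE 2) e) (diffE e)
diffE (SqrtE e)   = DivE (diffE e) (MultE (ConstE 2) (SqrtE e))
diffE (ACosE e)   = DivE (MultE (ConstE (-1)) (diffE e))
                         (SqrtE (SubE (ConstE 1) (SquareE e)))
diffE (AddE a b)  = AddE (diffE a) (diffE b)
diffE (SubE a b)  = SubE (diffE a) (diffE b)
diffE (MultE a b) = AddE (MultE (diffE a) b) (MultE a (diffE b))
diffE (DivE a b)  = DivE (SubE (MultE b (diffE a)) (MultE a (diffE b)))
                         (SquareE b)

sizeE :: E -> Int
sizeE (ConstE _)  = 1
sizeE XE          = 1
sizeE (SquareE e) = 1 + sizeE e
sizeE (SqrtE e)   = 1 + sizeE e
sizeE (ACosE e)   = 1 + sizeE e
sizeE (AddE a b)  = 1 + sizeE a + sizeE b
sizeE (SubE a b)  = 1 + sizeE a + sizeE b
sizeE (MultE a b) = 1 + sizeE a + sizeE b
sizeE (DivE a b)  = 1 + sizeE a + sizeE b

-- |arccos(f)'| - (|f| + |f'|).  By the shape of the ACosE rule this is
-- always 7, whatever f is.
acosOverhead :: E -> Int
acosOverhead f = sizeE (diffE (ACosE f)) - sizeE f - sizeE (diffE f)
```

<p>The point is that the $\arccos$ rule wraps one copy of $f$ and one copy of
$f'$ in exactly $7$ extra symbols, so the overhead doesn’t depend on $f$ at all.</p>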
<p>This is fairly good evidence that repeatedly composing $\arccos(x)$ with
itself is the winner, and even though we can’t test <em>every</em> function with
$\geq 5$ symbols, we can test a lot of them, and after letting the code run
for just over $24$ hours, iterating $\arccos$ was still the winner.</p>
<p>So, in light of our computational evidence, we might <em>conjecture</em> that
$\lvert f’ \rvert$ is maximized (among functions with $\lvert f \rvert = n$)
for $f = \arccos(\arccos(\ldots(x)\ldots))$.</p>
<p>At this point, it’s time to stop hacking, and start thinking! Let’s try to
<em>prove</em> that this is the best option. Notice we can easily compute
$\lvert \arccos(\arccos(\ldots(x)\ldots))’ \rvert = \frac{n^2}{2} + \frac{13n}{2} - 6$
(either by solving some recurrence, or by checking <a href="https://oeis.org/search?q=1%2C9%2C18%2C28%2C39%2C51%2C64&language=english&go=Search">oeis</a>), so we should
have some simple proof by induction ahead of us.</p>
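<p>For the skeptical: since $\lvert \arccos(f)' \rvert = \lvert f \rvert + \lvert f' \rvert + 7$,
writing $a(n)$ for the derivative size of the $n$-symbol iterate gives the recurrence
$a(n) = a(n-1) + (n-1) + 7$ with $a(1) = 1$, and a two-function sketch
(the names here are mine) confirms the closed form:</p>

```haskell
-- a(n): size of the derivative of the iterate arccos(arccos(...(x)...))
-- with n symbols total, via a(n) = a(n-1) + (n-1) + 7 and a(1) = |x'| = 1.
viaRecurrence :: Int -> Int
viaRecurrence 1 = 1
viaRecurrence n = viaRecurrence (n - 1) + (n - 1) + 7

-- The closed form from the text.  Always an integer: n^2 + 13n = n(n+13)
-- is a product of two numbers of opposite parity.
closedForm :: Int -> Int
closedForm n = (n * n + 13 * n) `div` 2 - 6
```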
<hr />
<div class="boxed">
<p>If $f$ has $n$ symbols in its definition, then $f’$ has at most
$\frac{n^2}{2} + \frac{13n}{2} - 6$ symbols, and moreover this maximum is
attained for $f = \arccos(\arccos(\ldots(x)\ldots))$.</p>
</div>
<p>$\ulcorner$</p>
<p>We’ll induct on $\lvert f \rvert$.</p>
<p>If $\lvert f \rvert = 1,2$ then we’ve already seen that $\lvert f’ \rvert$
satisfies the desired inequality.</p>
<p>If $\lvert f \rvert \geq 3$, then we case on the outermost constructor.</p>
<p>If it’s unary, say $f = g(h)$, where $\lvert h \rvert = n-1$, then we compute</p>
\[\lvert f' \rvert = \lvert g'(h) \cdot h' \rvert = \lvert h' \rvert + (n-1) + k\]
<p>where $k$ is a constant depending on $g$, which is maximized as $k = 7$ when
$g = \arccos(-)$. Then</p>
\[\lvert f' \rvert
\leq \lvert h' \rvert + n + 6
\overset{IH}{\leq} \frac{(n-1)^2}{2} + \frac{13(n-1)}{2} - 6 + n + 6
= \frac{n^2}{2} + \frac{13n}{2} - 6.\]
<p>If instead the outermost constructor of $f$ is binary, then we have</p>
<ul>
<li>$f = g + h$</li>
<li>$f = g - h$</li>
<li>$f = g \cdot h$</li>
<li>$f = g \div h$</li>
<li>$f = g^h$</li>
</ul>
<p>where $\lvert f \rvert = n = \lvert g \rvert + \lvert h \rvert + 1$.</p>
<p>In each of these cases we compute $\lvert f’ \rvert$, and find</p>
<ul>
<li>$\lvert (g+h)’ \rvert = \lvert g’ \rvert + \lvert h’ \rvert + 1$</li>
<li>$\lvert (g-h)’ \rvert = \lvert g’ \rvert + \lvert h’ \rvert + 1$</li>
<li>$\lvert (g \cdot h)’ \rvert = \lvert g \rvert + \lvert h \rvert + \lvert g’ \rvert + \lvert h’ \rvert + 3$</li>
<li>$\lvert (g \div h)’ \rvert = \lvert g \rvert + 2 \lvert h \rvert + \lvert g’ \rvert + \lvert h’ \rvert + 5$</li>
<li>$\lvert (g^h)’ \rvert = 3 \lvert g \rvert + 2 \lvert h \rvert + \lvert g’ \rvert + \lvert h’ \rvert + 7$</li>
</ul>
<p>Clearly these are maximized for $g^h$, so let’s put $\lvert g \rvert = k$
and $\lvert h \rvert = n-1-k$. Then we see</p>
\[\begin{align}
\lvert (g^h)' \rvert
&=
3 \lvert g \rvert + 2 \lvert h \rvert + \lvert g' \rvert + \lvert h' \rvert + 7 \\
&\overset{IH}{\leq}
3k + 2(n-1-k) + \frac{k^2}{2} + \frac{13k}{2} - 6 +
\frac{(n-1-k)^2}{2} + \frac{13(n-1-k)}{2} - 6 + 7 \\
&=
\frac{n^2}{2} + \frac{(15-2k)n}{2} + k^2 + 2k - 13
\end{align}\]
<p>So we want this to be $\leq \frac{n^2}{2} + \frac{13n}{2} - 6$ for
every choice of $1 \leq k \leq n-2$.</p>
<p>Aaaaaand…. ruh roh!</p>
<p>You can see by <a href="https://www.desmos.com/calculator/0eyfyqovj2">this</a> desmos graph that this fails in general.
Indeed, the earliest failure happens when $n=8$ and $k=6$. Of course,
this is <em>outside</em> of the $n \leq 4$ range that I was able to exhaustively test,
and even the $n \leq 6$ range that I had tested a lot of<sup id="fnref:15" role="doc-noteref"><a href="#fn:15" class="footnote" rel="footnote">5</a></sup>.</p>
<p><span style="float:right">$\lrcorner$</span></p>
<hr />
<p>This is a perfect example of Klaus’s “Magic Spiral”, which he shows in the
first CDM lecture every year.</p>
<p style="text-align:center;">
<img src="/assets/images/diff-growth/magic-spiral.png" width="50%" />
</p>
<p>In this particular situation, there wasn’t a ton to visualize, so we jumped
straight from “compute/experiment” to “conjecture”. Indeed, our computations
seemed to suggest that iterating $\arccos$ was the right approach, but when
we tried to prove it we failed.</p>
<p>This is ok, though! Good, even, because our failure is instructive!
We know where our proof failed, and this tells us where we should focus our
computational effort on our next trip around the spiral.</p>
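<p>Concretely, we can let the computer hunt for every place the inductive step
breaks. This is just the inequality from the proof attempt, doubled to keep
everything in integers (the names are mine):</p>

```haskell
-- The inductive step needs
--   n^2/2 + (15-2k)n/2 + k^2 + 2k - 13  <=  n^2/2 + 13n/2 - 6
-- for every 1 <= k <= n-2.  Multiplying by 2 and cancelling n^2:
stepHolds :: Int -> Int -> Bool
stepHolds n k = (15 - 2*k)*n + 2*k*k + 4*k - 26 <= 13*n - 12

-- Every (n, k) pair up to nMax where the step FAILS:
failures :: Int -> [(Int, Int)]
failures nMax = [(n, k) | n <- [3 .. nMax], k <- [1 .. n - 2], not (stepHolds n k)]
```

<p>Running <code class="language-plaintext highlighter-rouge">failures 10</code> gives
$(8,6)$, $(9,7)$, $(10,8)$ — so the very first failure is the $n=8$, $k=6$
case we found above.</p>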
<p>Indeed, knowing that
we want $k = \lvert g \rvert = 6$ and $n = 8$ says we should try something like</p>
\[f = g^h = \arccos(\arccos(\arccos(\arccos(\arccos(x)))))^x\]
<p>and indeed, haskell tells us that $\lvert f’ \rvert = 79$, which is
bigger than the $78$ we would get by iterating $\arccos$.</p>
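<p>We can double-check that $79$ without re-running <code class="language-plaintext highlighter-rouge">diff</code>
at all, just by plugging sizes into the formulas from the proof attempt
(the closed form for iterated $\arccos$, and the $\lvert (g^h)' \rvert$ formula):</p>

```haskell
-- f = g^h with g = arccos(arccos(arccos(arccos(arccos(x))))) and h = x.
sizeG, sizeG' :: Int
sizeG  = 6                           -- 5 arccos's plus the inner x
sizeG' = (6*6 + 13*6) `div` 2 - 6    -- closed form for iterated arccos

-- |(g^h)'| = 3|g| + 2|h| + |g'| + |h'| + 7, with |h| = |h'| = 1:
sizeF' :: Int
sizeF' = 3*sizeG + 2*1 + sizeG' + 1 + 7
```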
<hr />
<p>Now with <code class="language-plaintext highlighter-rouge">Pow</code> <em>and</em> <code class="language-plaintext highlighter-rouge">ACos</code> to worry about, it’s much less clear what the
optimal function will be. After all, we’ll need to balance the two, and I don’t
have the processing power to do an exhaustive search of $n=8,9,10$ (say)
to try and guess at a pattern<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">6</a></sup>.</p>
<p>Thankfully, this problem still admits an <em>asymptotic</em> solution, and our
earlier proof attempt is easily adapted to this setting.</p>
<p>Now, the most important skill a mathematician should learn is how to cover
their tracks<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">7</a></sup>. So when presenting a result like this to journals, we should
never say that we’re presenting an asymptotic solution because we didn’t
have the time to get a closed form.</p>
<p>Instead, we should argue that the choice of constructors for the elementary
functions was arbitrary, and any closed form for the maximal size of
$\lvert f’ \rvert$ <em>necessarily</em> depends on the choice of constructors!
Indeed, there are other conventions one could make, such as deciding to
not count multiplication towards the symbol count, since we often denote
multiplication by juxtaposition, which doesn’t require a “symbol” at all.</p>
<p>Of course, one can show that the <em>asymptotics</em> of $\lvert f’ \rvert$ are
independent of these choices, which makes the asymptotics a better
object of study.</p>
<p>… sounds good, doesn’t it<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">8</a></sup>?</p>
<p>Now let’s prove it!</p>
<div class="boxed">
<p>If $f$ has $n$ symbols in its definition, then $f’$ has $O(n^2)$
symbols in its definition, and this is optimal.</p>
<p>Moreover, our proof shows that this is independent of the choice of
presentation of the elementary functions.</p>
</div>
<p>$\ulcorner$</p>
<p>Again, we induct on $\lvert f \rvert$, the number of symbols in $f$.</p>
<p>Since we’re only interested in asymptotics, there’s nothing interesting to
prove about the base case.</p>
<p>For the inductive case, we case on the outermost constructor of $f$.</p>
<p>If it’s unary, say $f = c(g)$, then we see that</p>
\[\lvert f' \rvert =
\lvert c'(g) \cdot g'(x) \rvert =
\lvert c'(x) \rvert + O \left ( \lvert g \rvert \right ) +
O \left ( \lvert g' \rvert \right ) + O(1)\]
<p>where the $O(1)$ term is independent of $c$ and keeps track of the symbols
involved in representing the multiplication, etc. The big-ohs
around $\lvert g \rvert$ and $\lvert g’ \rvert$ account for the fact that
we might use each of these a constant number of times<sup id="fnref:13" role="doc-noteref"><a href="#fn:13" class="footnote" rel="footnote">9</a></sup>.</p>
<p>Next we see that $\lvert c’(x) \rvert = O(1)$,
since we can uniformly bound these by the size of the largest one,
as we did with $\arccos$ earlier in this post<sup id="fnref:12" role="doc-noteref"><a href="#fn:12" class="footnote" rel="footnote">10</a></sup>. So we see</p>
\[\begin{align}
\lvert f' \rvert
&= \lvert c(g)' \rvert \\
&\leq O \left ( \lvert g \rvert \right ) + O \left ( \lvert g' \rvert \right ) + O(1) \\
&\overset{IH}{\leq} O \left ( n-1 \right ) + O \left ((n-1)^2 \right ) + O(1) \\
&\leq O(n^2)
\end{align}\]
<p>If instead the outermost constructor is binary, say $f = c(g,h)$,
where $c(g,h)$ might be $g+h$, $gh$, $g^h$, etc. then we similarly compute</p>
\[\lvert f' \rvert =
\lvert c(g,h)' \rvert =
O \left ( \lvert g \rvert \right ) + O \left ( \lvert h \rvert \right ) +
O \left ( \lvert g' \rvert \right ) + O \left ( \lvert h' \rvert \right ) + O(1)\]
<p>and since $\lvert g \rvert + \lvert h \rvert = n-1$, we see that this is
bounded by</p>
\[O(n-1) + O \left ( (n-1)^2 \right ) + O(1) = O(n^2)\]
<p>and the claim follows.</p>
<p>As for the tightness of this bound, any presentation of the elementary
functions must have at least one trig function (since we cannot build
the trig functions from the others), say $\sin$. Then the $n$-fold
composition $\sin(\sin(\cdots(\sin(x) \cdots)))$ is easily seen to have
a derivative with quadratically many symbols.</p>
<p><span style="float:right">$\lrcorner$</span></p>
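<p>To make that last step concrete: from
<code class="language-plaintext highlighter-rouge">diff (Sin e) = Mult (Cos e) (diff e)</code>
we get $\lvert \sin(f)' \rvert = \lvert f \rvert + \lvert f' \rvert + 2$, and the
resulting recurrence for the $\sin$ tower is visibly quadratic (a quick sketch,
names mine):</p>

```haskell
-- d(k): size of the derivative of the k-fold composite sin(sin(...(x)...)).
-- The k-fold composite has k+1 symbols, so d(k) = k + d(k-1) + 2, d(0) = 1.
sinDerivSize :: Int -> Int
sinDerivSize 0 = 1
sinDerivSize k = k + sinDerivSize (k - 1) + 2

-- Solving the recurrence: d(k) = k(k+5)/2 + 1, quadratic in k.
sinClosedForm :: Int -> Int
sinClosedForm k = k * (k + 5) `div` 2 + 1
```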
<hr />
<p>So we see that the precise question posed in the comic has no answer!
It asks for the maximal ratio of $\frac{\lvert f’ \rvert}{\lvert f \rvert}$,
but we’ve just shown that this ratio is unbounded. Of course, it’s still a
fun problem, and a natural variant <em>does</em> admit a nice solution
(which we found).</p>
<p>Moreover, this was a good way to showcase the back and forth between
computational experimentation and proof. Sometimes you get things wrong,
and that’s ok! We learn, and we form new conjectures that are more likely
to be correct with every trip around the spiral.</p>
<div class="boxed">
<p>As a cute project idea, while I was writing this one of my friends
(<a href="https://rahulrajkumar.github.io/">Rahul</a>) sent me <a href="https://iagoleal.com/posts/calculus-symbolic/">a blog post</a> where Iago Leal de Freitas built
a calculus evaluator in haskell that does simplification properly!</p>
<p>A better hacker than me can probably modify this code to push things a bit
further (especially with some parallel computation) to try and find
a family of functions $(f_n)$ attaining the maximum ratios
$\frac{\lvert f_n’ \rvert}{\lvert f_n \rvert}$.</p>
<p>This should be a pretty approachable problem for an enthusiastic
combinatorics student, and I would love to see somebody do it ^_^</p>
</div>
<hr />
<p>This was a lot of fun! It’s been in the works for a while now
(since April 28, apparently), but I really only worked on it
for a few days. I’m busy working on a lot of other stuff<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">11</a></sup>, and I’ll
hopefully share some of it soon.</p>
<p>One of the biggest things I’ve been spending time on
(which probably also qualifies as an announcement)
has been the <a href="https://www.uwo.ca/math/faculty/kapulkin/seminars/hottest_summer_school_2022.html">HoTTEST Summer 2022</a>,
where I’ll be TAing this summer. I’m already pretty active answering questions
in the discord, and I’ve been brushing up on my HoTT to get
ready<sup id="fnref:14" role="doc-noteref"><a href="#fn:14" class="footnote" rel="footnote">12</a></sup>. I can <em>not</em> express how excited I am to be working on this, and
if anybody wants to show up, you’re more than welcome! We’re quickly coming
up on 1000 participants (of all experience levels),
and it’s sure to be a great time!</p>
<p>For now, though, I’m off to bed. Goodnight all, and I’ll see you in the next one ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Incidentally, this is why I chose haskell instead of sage. Python really
doesn’t handle algebraic datatypes with any sort of alacrity, and I wanted
to exploit the recursive structure of the problem.</p>
<p>Plus, it’s been a hot second since I got to use haskell, and it’s one of
my favorite languages to work in, so I didn’t spend very long on the decision :P. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>Namely as <code class="language-plaintext highlighter-rouge">Div (Const 1) (Cos X)</code> <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>… and regrettably I failed in that regard. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>Up to an additive constant, at least. If you want to be super precise,
then <code class="language-plaintext highlighter-rouge">size $ diff $ C e = size e + size (diff e) + size (C X) - 2</code> is
true for every unary constructor <code class="language-plaintext highlighter-rouge">C</code>. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:15" role="doc-endnote">
<p>Of course, we could simply <em>remove</em> <code class="language-plaintext highlighter-rouge">Pow</code> as a constructor, since we
can simulate it using <code class="language-plaintext highlighter-rouge">Exp</code> and <code class="language-plaintext highlighter-rouge">Log</code>. It’s not hard to show that the other
binary operations <em>will</em> let this proof go through, so we could have
“covered our tracks” by acting like we never even considered <code class="language-plaintext highlighter-rouge">Pow</code>!</p>
<p>I thought it would make for a better narrative (and it might be more
instructive) to take the asymptotic approach instead. Plus, it really is
more hygienic to prove a result that doesn’t depend on a particular
choice of “basic” constructors. <a href="#fnref:15" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>Looking at the formulas, we can tell that eventually <code class="language-plaintext highlighter-rouge">Pow</code> will win out
over <code class="language-plaintext highlighter-rouge">ACos</code>, and it probably wouldn’t take <em>too</em> much work to sort this out…</p>
<p>Maybe some reader with some free time wants to take this on as a project? <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>I’m only half joking <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:11" role="doc-endnote">
<p>It helps that this is actually a perfectly reasonable thing to do, and
jokes aside my original plan was to get asymptotics for exactly this reason
(also because I anticipated that an exact solution might be hard to get).</p>
<p>I thought we had gotten lucky with the iterated $\arccos$ construction,
and if you <em>can</em> get a closed form, you might as well. But with those
dreams dashed, it’s back to the asymptotics at the end of the day. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:13" role="doc-endnote">
<p>For example, we might choose to represent the derivative of $g^2$ by
$(g + g) g’$, in which case $\lvert (g^2)’ \rvert$ would refer to $\lvert g \rvert$
twice.</p>
<p>I haven’t actually thought much about how badly things break if you do
something silly like this and take it to an extreme (can we find a way to
make it so that there’s <em>no</em> uniform bound on this constant?), but I’m
ok leaving that particular avenue unexplored.</p>
<p>Officially I should probably add some hypotheses explicitly forbidding this –
for instance, it should be enough to ask that we allow at most finitely many
constructors. That said, I think it’s ok to leave this a bit imprecise for
the purposes of a blog post. <a href="#fnref:13" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:12" role="doc-endnote">
<p>You might worry that there <em>is</em> no largest unary constructor. But the
only infinite families of constructors
(at least that are listed on <a href="https://en.wikipedia.org/wiki/Elementary_function">wikipedia</a>)
are the rational powers and the bases for $\exp$ and $\log$.</p>
<p>It’s clear, though, that the contributions of each of these derivatives
can be uniformly bounded as long as we’re counting a constant as a single
symbol. <a href="#fnref:12" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>I’m reading a series of papers on <a href="https://en.wikipedia.org/wiki/Model_category">model categories</a> with
<a href="https://sites.google.com/view/syeakel/">Sarah Yeakel</a> (who recently got a permanent position at UCR!),
as well as continuing my own readings on topos theory (which have filtered
into a reading course on locale theory that I’m teaching some undergrads).
I’m also in a class on riemann surfaces which has been really enlightening
for me. I have a few ideas for blog posts of the “I wish someone had shown
me this example sooner” variety, and hopefully I can get to them soon!</p>
<p>On top of all this, I’ve been talking with <a href="https://sites.google.com/site/patriciogallardomath/">Patricio Gallardo</a> about becoming an
algebraic geometer, and he wants me to start spending a serious amount of
time working through Hartshorne and Vakil’s notes. This makes sense,
of course, and I’m having a ton of fun doing it, but it means I have less
time to work on silly projects like this. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:14" role="doc-endnote">
<p>Plus trying to gain some serious familiarity with model categories and
$\infty$-categories before we start. This lined up quite nicely with my
conversations with Sarah about model categories. Sometimes you just get
lucky! <a href="#fnref:14" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Tue, 10 May 2022 00:00:00 +0000
https://grossack.site/2022/05/10/diff-growth.html
https://grossack.site/2022/05/10/diff-growth.htmlUsing Geometry in Logic<p>One thing that I talk a lot about is the (surprisingly tight) connection
between geometry and logic. I feel like this is something that one usually
gains an appreciation for by seeing lots of examples, and I found a particularly
simple example today <a href="https://math.stackexchange.com/q/4430107/655547">on mse</a>.</p>
<p>For completeness, OP wanted to know how to formally derive</p>
<div class="boxed">
<p>\(B \leftrightarrow A \land B, \ A \leftrightarrow \lnot B \vdash A\)</p>
</div>
<p>and when I first saw this, I thought it looked vaguely <a href="https://en.wikipedia.org/wiki/Law_of_excluded_middle">LEM</a>-y, so my first
question was whether it was true intuitionistically. If it <em>isn’t</em> true
intuitionistically, I would also want to find an intuitionistic model which
invalidates it in order to give a complete answer (since I like to justify my
uses of LEM for problems like this).</p>
<p>But how, you might ask, does geometry come into the picture? Well, <a href="https://en.wikipedia.org/wiki/Intuitionistic_logic#Heyting_algebra_semantics">we know</a> that
a sequent is provable intuitionistically if and only if it’s valid on all
topological spaces with the following semantics<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>:</p>
<ul>
<li>primitive propositions $A,B,C,\ldots$ are open sets</li>
<li>$\varphi \land \psi$ is the intersection of $\varphi$ and $\psi$</li>
<li>$\varphi \lor \psi$ is the union of $\varphi$ and $\psi$</li>
<li>$\lnot \varphi$ is the interior of the complement of $\varphi$</li>
<li>$\varphi$ is “true” exactly when it’s the whole space</li>
<li>$\varphi$ is “false” exactly when it’s the empty set</li>
</ul>
<p>In fact, we can say more: it’s enough<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">2</a></sup> to check when the primitive propositions
are open subsets of $\mathbb{R}$. To summarize this situation, cool kids will
say that the topological semantics of $\mathbb{R}$ are
<span class="defn">complete</span> for intuitionistic logic.</p>
<p>By this completeness theorem,</p>
\[B \leftrightarrow A \land B, \ A \leftrightarrow \lnot B \vdash A\]
<p>is provable intuitionistically if and only if<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">3</a></sup></p>
<ul>
<li>for any two open subsets $A$ and $B$ of $\mathbb{R}$</li>
<li>if $B = A \cap B$ and $A$ is the interior of $B^c$</li>
<li>then $A = \mathbb{R}$</li>
</ul>
<p>But now we see that we can start applying our geometric intuition to this
problem! After all, we know what open subsets of $\mathbb{R}$ look like, and
(at least for me), it’s much faster to show that $A$ must be all of $\mathbb{R}$
in the above example than to look for a formal derivation. Indeed, the
assumptions give $B \subseteq A \subseteq B^c$, so $B = \emptyset$, and thus
$A = \text{int}(\emptyset^c) = \mathbb{R}$.</p>
<p>Of course, to really answer OP’s question, we <em>should</em> provide a derivation.
It’s not enough to argue abstractly that one should exist<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">4</a></sup>, and thankfully
this is also not hard. Since we now know that the claim is true
intuitionistically, we can switch over to a programming interpretation by
<a href="https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence">curry-howard</a>! Since I have a lot of experience with functional programming,
this is <em>also</em> easier for me than working with the logic directly. The idea is
that writing programs is the same thing as writing proofs, and there’s a
totally algorithmic way to convert some code (which I’ll outline below) into
the desired proof tree:</p>
<p>Say we have programs</p>
<ul>
<li>$f_1 : B \to A \times B$</li>
<li>$f_2 : A \times B \to B$</li>
<li>$g_1 : A \to B \to 0$</li>
<li>$g_2 : (B \to 0) \to A$</li>
</ul>
<p>You’ll recognize these as
our assumptions (where I’ve unpacked the $\leftrightarrow$s). We want to build
a program of type $A$.</p>
<p>By $g_2$, if we can build a program of type $B \to 0$, we’ll be done! But
if we’re given a $b:B$, then it’s almost immediate to build the desired term
as follows</p>
\[B
\overset{f_1}{\longrightarrow} A \times B
\overset{g_1 \times \text{id}_B}{\longrightarrow} \lnot B \times B
\longrightarrow 0\]
<div class="boxed">
<p>As a quick exercise, you might try to write down the actual code of type $A$
in your favorite functional programming language, assuming the existence of
these functions $f_1$, $f_2$, $g_1$, and $g_2$.</p>
<p>If you don’t have anything better to do (or if you’ve never done it before)
you might then convert this program into the proof tree that OP asked for.</p>
</div>
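<p>In case you want to compare answers afterwards, here’s one possible Haskell rendering of the composite above (just a sketch, with names of my own choosing; I’m using <code>Void</code> for the empty type $0$, and it turns out <code>f2</code> is never needed):</p>

```haskell
import Data.Void (Void, absurd)

-- The hypotheses f1, g1, g2 from the list above, abstracted as arguments.
-- (f2 : A × B -> B goes unused.)
proofOfA :: (b -> (a, b))       -- f1 : B -> A × B
         -> (a -> (b -> Void))  -- g1 : A -> ¬B
         -> ((b -> Void) -> a)  -- g2 : ¬B -> A
         -> a
proofOfA f1 g1 g2 = g2 notB
  where
    -- given b : B, follow the composite B -> A × B -> ¬B × B -> 0
    notB b = let (a, b') = f1 b in g1 a b'
```

<p>Converting this term into the proof tree OP asked for is then the routine (if tedious) part of curry-howard.</p>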
<hr />
<p>This all worked out quite smoothly, since it turned out that the claim
was actually true intuitionistically. If we got a claim that <em>isn’t</em>
intuitionistically valid, can we use geometry in order to find a model
where it’s false?</p>
<p>The answer, of course, is “yes”!</p>
<p>As an easy example, let’s take double negation elimination</p>
\[\vdash \lnot \lnot A \leftrightarrow A\]
<p>Under our topological interpretation, this says that
“the interior of the complement of the interior of the complement of $A$ is $A$”.
A moment’s thought shows that this is the same thing as
“the interior of the closure of $A$ equals $A$”, and there are well known
open sets which don’t have this property<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">5</a></sup>.</p>
<div class="boxed">
<p>As another quick exercise, find an open set $A$ which is <em>not</em> the
interior of its closure!</p>
<p>Then the heyting algebra of open subsets of $\mathbb{R}$ equipped with this
open set $A$ provides a countermodel for double negation elimination.</p>
</div>
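<p>If you’d like to check a candidate mechanically, here’s a toy Haskell sketch (a simplification throughout: I represent an open set as a sorted list of disjoint open intervals, with $\pm \infty$ allowed as endpoints, and <code>notO</code> is a name I made up for the interpretation of $\lnot$):</p>

```haskell
-- Toy model: an open subset of R as a sorted list of disjoint open intervals,
-- with +-Infinity allowed as endpoints.
type Open = [(Double, Double)]

-- notO A = the interior of the complement of A: the gaps between the
-- intervals of A (degenerate gaps, like a single missing point, vanish).
notO :: Open -> Open
notO xs = filter (\(lo, hi) -> lo < hi)
                 (zip (-inf : map snd xs) (map fst xs ++ [inf]))
  where inf = 1 / 0
```

<p>In particular <code>notO (notO [(-1,0),(0,1)])</code> comes out to <code>[(-1.0,1.0)]</code>: double negation has filled in the puncture, so $(-1,0) \cup (0,1)$ is not regular.</p>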
<hr />
<p>Another quick one tonight! I know I talk a lot about how my interests lie in
the intersection of geometry and logic, but I think it can be tricky to really
understand what that means. When I answered this mse question, I realized it
would make a good expository post to give the general flavor of my interests.
The fact that these things are <em>also</em> related to functional programming and
PL theory is not an accident, and I’m also interested in those fields for
similar reasons!</p>
<p>Obviously the rabbit hole goes much deeper than this. First via locales,
which are geometric objects that “classify” propositional theories, and later
via toposes, which are geometric objects that classify predicate (and higher order)
theories in an analogous way.
For more details about this, see Vickers’s excellent paper
<em>Locales and Toposes as Spaces</em>, available <a href="https://www.cs.bham.ac.uk/~sjv/LocTopSpaces.pdf">here</a>, for instance.</p>
<p>Stay warm, and I’ll see you all soon ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Here, for simplicity, I’m identifying a formula $\varphi$ with its
interpretation. If you want to be less sloppy than me, you should
write \([ \! [ \varphi ] \! ]\), but this is too annoying for me to
type in mathjax – there aren’t enough hours in the day to write</p>
<p><code class="language-plaintext highlighter-rouge">[ \! [ \varphi ] \! ]</code></p>
<p>the number of times that would be required of me. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>It’s also enough to check
$\mathbb{R}^n$ for any fixed $n$ (I often have $\mathbb{R}^2$ in mind),
or $2^\omega$ cantor space, or many other concrete spaces. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>Again we may take, for example, $\mathbb{R}^2$, $2^\omega$, etc. instead
of $\mathbb{R}$ if we prefer. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>I read a great (albeit somewhat aggressive) <a href="https://www.hedonisticlearning.com/posts/the-pedagogy-of-logic-a-rant.html">blog post</a> a while ago which
gave an analogy that now lives in my head rent free. If there are any readers
confused by the (admittedly subtle!) distinction between giving a derivation
and checking semantically that a sequent must be valid,
hopefully this analogy helps!</p>
<div class="boxed">
<p>When asked to derive a sequent $\Gamma \vdash \varphi$, it’s not enough
to just check that it’s valid semantically.</p>
<p>This would be like being asked to compute the inverse of a matrix, and
instead checking that the determinant is nonzero. Yes, this is equivalent
to the <em>existence</em> of an inverse, but finding the inverse itself carries
more information!</p>
</div>
<p><a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Indeed, those open sets which satisfy this property are called
<a href="https://en.wikipedia.org/wiki/Regular_open_set">regular</a>. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Sun, 17 Apr 2022 00:00:00 +0000
https://grossack.site/2022/04/17/geometry-and-logic.html
https://grossack.site/2022/04/17/geometry-and-logic.htmlHow Holomorphic Functions are Just Like Polynomials<p>I took a complex exam <del>last week</del> a while ago, and while I was studying I
realized that a lot of the theorems were saying that holomorphic functions
behave like polynomials. This makes sense, since a holomorphic function,
which locally has a power series, looks like a polynomial of infinite degree,
but there’s actually quite a bit to say here! With that in mind, I decided to
write up some quick thoughts about this, in line with my post from a while ago
talking about banach space theorems generalizing finite dimensional linear
algebra (see <a href="/2021/09/09/banach-spaces.html">here</a>). Now, on with the show!</p>
<hr />
<p>I mentioned this in the introduction, but it’s worth stating the obvious.
A holomorphic function locally looks like a power series</p>
\[a_0 + a_1 (z - \xi) + a_2 (z - \xi)^2 + \ldots\]
<p>that is, a “polynomial of infinite degree”. With this in mind, there’s lots
of well known formulas that work for polynomials that continue to the
holomorphic setting. For instance, we can differentiate and integrate
term by term (provided we stay inside the radius of convergence), and if we
have two series, we can add and multiply them exactly as we would polynomials
(term by term for addition, and via the <a href="https://en.wikipedia.org/wiki/Cauchy_product">cauchy product</a> formula for products)
provided we stay inside the radius of convergence for <em>both</em> series.</p>
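<p>The cauchy product in particular is fun to sketch on coefficient streams (a toy Haskell snippet; <code>cauchy</code> is my name for it, and I’m sweeping all convergence issues under the rug):</p>

```haskell
-- Multiply two power series, given as (possibly infinite) coefficient lists:
-- the n-th output coefficient is the sum of a_i * b_j over i + j = n.
cauchy :: [Double] -> [Double] -> [Double]
cauchy as bs = [ sum [ as !! i * bs !! (n - i) | i <- [0 .. n] ]
               | n <- [0 ..] ]
```

<p>For instance, since $\frac{1}{1-z} = \sum z^n$, squaring it should give $\sum (n+1) z^n$, and indeed <code>take 5 (cauchy (repeat 1) (repeat 1))</code> evaluates to <code>[1.0,2.0,3.0,4.0,5.0]</code>.</p>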
<p>In fact, there are deeper ways in which holomorphic functions act like polynomials.
For a start, polynomials over $\mathbb{C}$ always factor as some constant times
a product of roots. That is, we always have</p>
\[p(z) = c \prod_{\rho} (z - \rho)\]
<p>(where the roots $\rho$ are counted with multiplicity, of course).</p>
<p>This says that, up to <a href="https://en.wikipedia.org/wiki/Unit_(ring_theory)">units</a>, polynomials are in bijection with
finite (multi)sets of points in the complex plane.</p>
<p>The situation for holomorphic functions is more delicate, but only
slightly. The <span class="defn">Weierstrass Factorization Theorem</span>
tells us that every holomorphic function factors as</p>
\[f(z) = e^g z^m \prod_{\rho} E_{n_\rho} \left ( \frac{z}{\rho} \right )\]
<p>Here $e^g$ is nowhere zero, thus is a unit in the ring of holomorphic functions,
and the function $E_{n_\rho} \left ( \frac{z}{\rho} \right )$ is zero only at a
nonzero root $\rho$, so is analogous to the factor $(z - \rho)$, and we also
have $m$ factors of $z$ which correspond to the order of vanishing of $f$ at $0$.</p>
<p>These functions $E_k \left ( \frac{z}{\rho} \right )$, which differ from
$(z - \rho)$ only by units, are cleverly chosen to force the infinite product
to converge, which is an issue we don’t have in the case of polynomials<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">1</a></sup>.</p>
<p>Conversely, we would like to know that to any family of points in the plane,
we can associate a holomorphic function with precisely those points as roots.
This is possible, but the key insight is a hidden assumption in the case of
polynomials: a finite set of points is always <em>discrete</em>! If we want to allow
for infinitely many zeroes, we have to explicitly demand discreteness of the
set of zeroes<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">2</a></sup>.</p>
<p>But this is the only obstruction! Say $(a_n)$ is a discrete set of points
in the plane, and $(r_n)$ is a sequence of natural numbers.
Then there exists a holomorphic function, unique up to units,
which vanishes to order $r_n$ at $a_n$, and nowhere else<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">3</a></sup>.</p>
<div class="boxed">
<p>As a quick exercise, this data is often repackaged by saying that
$(a_n)$ is a <em>sequence</em> of points in the plane with $|a_n| \to \infty$.</p>
<p>Show that this condition is equivalent to ours.</p>
</div>
<p>Next up is the argument principle, which lets us count the number of zeroes
$f$ has in some region.</p>
<p>For polynomials, the key insight is that the <a href="https://en.wikipedia.org/wiki/Logarithmic_derivative">logarithmic derivative</a>
turns products into sums. That is,</p>
\[\frac{(uv)'}{uv} = \frac{u'}{u} + \frac{v'}{v}\]
<p>So if we factor our polynomial as $c \prod (z - \rho)$, we can use this
formula to compute the logarithmic derivative:</p>
\[\frac{p'}{p} =
\frac{c \left ( \prod (z - \rho) \right )'}{c \prod (z-\rho)} =
\sum \frac{(z - \rho)'}{z-\rho} =
\sum \frac{1}{z-\rho}\]
<p>Of course, it’s easy to compute the integral of this sum along a (simple) closed
contour! We pick up a $2\pi i$ if $\rho$ is inside the contour, and a $0$ otherwise.</p>
<p>So integrating both sides, we see that</p>
\[\oint_\gamma \frac{p'}{p}\ dz = 2 \pi i \left ( \# \text{roots inside $\gamma$} \right )\]
<p>The remarkable thing is that this formula goes through entirely unchanged when
we pass to holomorphic functions<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>!</p>
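<p>This is also easy to watch happen numerically. Here’s a quick Haskell sketch (the names and the crude Riemann sum are mine) that discretizes $\frac{1}{2 \pi i} \oint_{|z| = r} \frac{p'}{p} \ dz$ for the polynomial $p(z) = z^3 - z$, whose roots are $-1$, $0$, and $1$:</p>

```haskell
import Data.Complex

-- Count the roots of p(z) = z^3 - z inside the circle |z| = r by
-- discretizing the contour integral of p'/p at n sample points.
rootsInside :: Double -> Double
rootsInside r = realPart (total / (2 * pi * (0 :+ 1)))
  where
    n      = 4096 :: Int
    p  z   = z ^ 3 - z
    p' z   = 3 * z ^ 2 - 1
    pt k   = mkPolar r (2 * pi * fromIntegral k / fromIntegral n)
    total  = sum [ (p' z / p z) * (pt (k + 1) - z)
                 | k <- [0 .. n - 1], let z = pt k ]
```

<p>Running this, <code>rootsInside 2</code> comes out (numerically) to $3$, while <code>rootsInside 0.5</code> comes out to $1$: the circle of radius $2$ sees all three roots, and the circle of radius $\frac{1}{2}$ sees only $z = 0$.</p>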
<hr />
<p>Lastly, let’s give an important property that <em>does</em> change when we move from
the polynomial setting to the holomorphic one: growth rates!</p>
<div class="boxed">
<p>As a preemptive exercise, you should show that any holomorphic function $f$
with $|f| = O(|z|^n)$ for some $n$ must itself be a polynomial.</p>
<p>You’ll want to use <a href="https://en.wikipedia.org/wiki/Liouville%27s_theorem_(complex_analysis)">Liouville’s theorem</a>!</p>
</div>
<p>So a holomorphic function which grows polynomially quickly is itself a polynomial.
Taking contrapositives, we see that any nonpolynomial holomorphic function must
grow faster than any polynomial! This gives rise to an obvious question:</p>
<p>How quickly <em>can</em> holomorphic functions grow?</p>
<p>Well, in the last section we said that for any discrete sequence $(a_n)$, and for
any values $A_n$ we like, we can find a holomorphic function $f$ so that
$f(a_n) = A_n$.</p>
<p>For simplicity, let’s take $a_n = n$ to be integers. Now let’s take
$A_n \triangleq n!$. Or better yet, $A_n = (n!)!$. What the hell, let’s let
$A_n \triangleq \mathtt{Ack}(n,n)$, the diagonal of the <a href="https://en.wikipedia.org/wiki/Ackermann_function">ackermann function</a>!</p>
<p>The theorem from the last section says that there’s a holomorphic function which
grows at least as quickly as $\mathtt{Ack}(n,n)$, but it’s easy to see that
we can make functions which grow as quickly as we like by modifying this argument<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>!</p>
<p>This should be some mild warning that, despite many other similarities, there
is still a marked difference between the behavior of holomorphic (even entire!)
functions and polynomials.</p>
<hr />
<p>Finally a truly quick one! Can you believe a post this short has been in my
drafts for almost a month now? I still have plans for some more exciting posts
coming up, but I’ve been really overwhelmed with work lately.</p>
<p>One fun thing is that I’m running a reading course for some undergraduates on
<a href="https://en.wikipedia.org/wiki/Pointless_topology">locale theory</a>, and I might try to keep a running series where I summarize
what we do on any given week.</p>
<p>So far we’ve mainly been reviewing the definitions of categories, which I don’t
think I need to go into here, but once we start doing more interesting things
I’d like to post my thoughts here as we go. No promises, though!</p>
<p>If nothing else, I have another post which is <em>almost</em> done, and I should
hopefully post it soon! I’ll see you all there ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:3" role="doc-endnote">
<p>More details can be found in Conway’s <em>Functions of One Complex Variable</em>
section $VII.5$, but basically</p>
\[E_k(z) \triangleq
(1 - z)
\exp \left ( z + \frac{z^2}{2} + \frac{z^3}{3} + \ldots + \frac{z^k}{k} \right )\]
<p>so in particular, $E_k \left ( \frac{z}{\rho} \right )$ is zero precisely when
$z = \rho$.</p>
<p>Notice we could just as easily factor a polynomial $p$ as</p>
\[p(z) = c z^m \prod \left ( 1 - \frac{z}{\rho} \right )\]
<p>where the $\rho$ are the nonzero roots of $p$, and this differs from
the usual factorization only by a unit.</p>
<p>Writing $p$ in this way makes the analogy with the weierstrass factorization
theorem much more obvious. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:1" role="doc-endnote">
<p>After all, by uniqueness of analytic continuation, if the zeroes of a
holomorphic function contain a limit point, then that function <em>must</em> be
identically zero!</p>
<p>So if we want to be able to say the zeroes are on our specified points
and <em>nowhere else</em>, then the desired set of zeroes cannot contain a limit
point. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Remarkably <em>much</em> more is true!</p>
<p>We can specify nonzero values at some discrete family of points,
and we can even specify the first finitely many derivatives at those
points!</p>
<p>Formally, if $a_n$ is some discrete set of points, and $k$ is some fixed integer,
then for any values $A_n^0, A_n^1, A_n^2, \ldots, A_n^k$,
there’s a holomorphic function (unique up to units) so that for every
$0 \leq j \leq k$, and for every $n$, we have</p>
\[\left . \frac{d^j f}{dz^j} \right |_{a_n} = A_n^j\]
<p>This is <em>incredible</em>, since it seems to fly in the face of the rigidity
of holomorphic functions. It’s wild to me that there should be <em>such</em> a
wealth of holomorphic functions which we can create to our specifications.
I (unsurprisingly) first heard about this result on <a href="https://math.stackexchange.com/questions/1627388/is-there-an-upper-bound-on-the-growth-rate-of-analytic-functions">mse</a>, and I don’t
actually have a reference besides that… If someone happens to know one,
I would love to hear about it! <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>This is basically because the region bounded by $\gamma$ is compact. Thus
it can contain only finitely many roots (do you see why?) so we can write
$f$ as a <em>finite</em> product of roots inside $\gamma$ (just like our polynomial!)
times some function $g$ which is nonzero inside $\gamma$. Then we apply our
formula for the logarithmic derivative exactly as we did for polynomials,
but we’ll get some final term of the form $\frac{g’}{g}$, whose integral
is also $0$. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>This doesn’t really fit in with the rest of the post, but I want to mention it,
so… footnote :P</p>
<p>It turns out that we can <em>lower</em> bound the growth rate of a holomorphic function
by understanding the density of its zeroes. This amounts to
<a href="https://en.wikipedia.org/wiki/Jensen%27s_formula">jensen’s formula</a>, which you can read about on Terry Tao’s blog
<a href="https://terrytao.wordpress.com/2020/12/23/246b-notes-1-zeroes-poles-and-factorisation-of-meromorphic-functions/">here</a>.</p>
<p>As a cute problem of this format, here’s a homework question from UCR’s
complex analysis class:</p>
<div class="boxed">
<p>Let $t > 0$ be fixed, and let</p>
\[f(z) \triangleq \prod_{n=1}^\infty
\left (
1 - \exp(-2\pi n t) \exp (2 \pi i z)
\right )\]
<p>In particular, $f$ has zeroes at exactly $m - int$ for $m,n \in \mathbb{Z}$.</p>
<p>Show that</p>
<p>\(\max_{|z| < R} |f(z)| =
\Omega \left ( \exp \left ( \frac{\pi R^2}{4t} \right )\right )\)</p>
</div>
<p>In proving this (using jensen’s formula), you’ll want to estimate the
number of zeroes in a circle of radius $R$, which you can (and should)
do geometrically. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Wed, 13 Apr 2022 00:00:00 +0000
https://grossack.site/2022/04/13/hol-poly.html
https://grossack.site/2022/04/13/hol-poly.html