https://grossack.site
Finite Calculus, Stirling Numbers, and Cleverly Changing Basis<p>I’m TAing a linear algebra class right now, and the other day a student came
to my office hours asking about the homework. Somehow during this discussion
I had a flash of inspiration that, if I ever teach a linear algebra class of
my own, I would want to use as an example of changing basis “in the wild”.
When I took linear algebra, all the example applications were to diagonalization
and differential equations – but I’m mainly a discrete mathematician, and I
would have appreciated something a bit closer to my own wheelhouse.</p>
<p>The observation in this post was first pointed out in a combinatorics class
I took with <a href="https://www.math.cmu.edu/~clintonc/">Clinton Conley</a>. I was aware of the theorem, but I hadn’t
thought of it as a change of basis theorem until that point. I remember feeling
like this was incredibly obvious, and simultaneously quite enlightening. I hope
you all feel the same way about it ^_^. At the very least, this will be a nice
change of pace from all the thinking I’ve been doing about power series
(which should be a follow up to <a href="/2021/05/05/initial-polynomial-proofs">my post</a> the other day) as well as a few
other tricky things I’m working on. It’s nice to talk about something
(comparatively) easy for a change!</p>
<hr />
<p>Let’s take a second to talk about <a href="https://en.wikipedia.org/wiki/Finite_difference">finite calculus</a>. That wikipedia link
is only so-so (at least at the time of writing), but there’s a great intro
by David Gleich <a href="https://www.cs.purdue.edu/homes/dgleich/publications/Gleich%202005%20-%20finite%20calculus.pdf">here</a> and you can read more in
Knuth, Graham, and Patashnik’s <em>Concrete Mathematics</em> (Ch 2.6) as well as
the (encyclopedic) <em>Calculus of Finite Differences</em> by Jordan<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>
<p>There’s a lot to say, but the tl;dr is this:
Finite Calculus’s raison d’être is to compute sums with the same facility
we compute integrals (and indeed, with analogous tools). If you’ve ever
been mystified by <a href="https://en.wikipedia.org/wiki/Summation_by_parts">Summation by Parts</a><sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, you’ve already encountered
part of this machinery. I won’t go into much detail in this post, because
I want to keep this short. But I highly encourage you to look into it if
you spend a lot of time computing sums. Nowadays I mainly use
<a href="https://sagemath.org">sage</a>, but it’s nice to know how to do some of these
things by hand.</p>
<p>We start with discrete differentiation:</p>
<div class="boxed">
<p>For a function $f$, we define $\Delta f$
(the <span class="defn">Forward Difference</span> of $f$) to be</p>
\[\Delta f = \frac{f(x+1) - f(x)}{1} = f(x+1) - f(x).\]
<p>Obviously most people write it the second way, but I like to show the
first to emphasize the parallel with the classical derivative.</p>
</div>
<p>This satisfies variants of the nice rules you might want a “derivative” to satisfy:</p>
<div class="boxed">
<p>As an exercise, show the following<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">3</a></sup>:</p>
\[\begin{align}
\text{(Linearity)} && \Delta(\alpha f + \beta g) &= \alpha \Delta f + \beta \Delta g \\
\text{(Leibniz)} && \Delta(f \cdot g) &= (\Delta f) \cdot g + f \cdot (\Delta g) + (\Delta f) \cdot (\Delta g) \\
\end{align}\]
<p>As a tricky challenge, can you find a quotient rule?
As a <em>very</em> tricky challenge, can you find a chain rule<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">4</a></sup>?</p>
</div>
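<p>If you want a sanity check before attempting the proofs, here is a minimal Python sketch. The helper name <code>delta</code> and the two test functions are my own choices for illustration. It only checks the Leibniz rule numerically at a few integer points, which is evidence rather than a proof:</p>

```python
def delta(f):
    """Return the forward difference of f, i.e. the function x -> f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

# Two arbitrary integer-valued test functions (hypothetical, for illustration only).
f = lambda x: x**3 - 2 * x
g = lambda x: 2**x + x
fg = lambda x: f(x) * g(x)

def leibniz_holds(x):
    """Check Delta(fg) = (Delta f) g + f (Delta g) + (Delta f)(Delta g) at the point x."""
    df, dg = delta(f)(x), delta(g)(x)
    return delta(fg)(x) == df * g(x) + f(x) * dg + df * dg

# Nonnegative x keeps 2**x an exact integer.
assert all(leibniz_holds(x) for x in range(0, 10))
```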
<p>We also get a fundamental theorem of calculus (with a <em>much</em> easier proof!):</p>
<div class="boxed">
<p>Theorem (Fundamental Theorem of Finite Calculus):</p>
<p>\(\sum_a^b \Delta f = f(b+1) - f(a)\)</p>
</div>
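<p>The proof is just a telescoping sum, and it is easy to watch it happen. Here is a small Python sketch, where <code>delta</code> and <code>finite_sum</code> are hypothetical helper names and the test function is arbitrary. Note the $b+1$ on the right hand side, which is the usual source of off-by-one errors:</p>

```python
def delta(f):
    """Forward difference: x -> f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def finite_sum(f, a, b):
    """Sum f(x) for x = a, a+1, ..., b (both endpoints included)."""
    return sum(f(x) for x in range(a, b + 1))

# An arbitrary test function, chosen only for illustration.
f = lambda x: x**4 - 3 * x

for a in range(-3, 4):
    for b in range(a, a + 6):
        # Fundamental theorem: the sum of Delta f from a to b is f(b+1) - f(a).
        assert finite_sum(delta(f), a, b) == f(b + 1) - f(a)
```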
<hr />
<p>Of course, these give us ways of <em>combining</em> facts we already know. In a
calculus class we have a toolbox of “basic” functions that we know how to
differentiate. Are there any such functions here? The answer is <em>yes</em>, and
that leads us to the linear algebraic point of this post!</p>
<div class="boxed">
<p>Define the <span class="defn">$n$th falling power</span> to be</p>
\[x^{\underline{n}} = (x-0) (x-1) (x-2) \cdots (x-(n-1))\]
<p>(at least when $n \gt 0$).</p>
</div>
<p>Then we have the following “power rule” for forward differences<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>:</p>
<div class="boxed">
<p>\(\Delta x^{\underline{n}} = n x^{\underline{n-1}}\)</p>
</div>
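<p>Here is a short Python sketch of the falling power together with a numerical check of the power rule. The function name <code>falling</code> is my own, and I am assuming the convention $x^{\underline{0}} = 1$ (the empty product):</p>

```python
def falling(x, n):
    """Falling power x (x-1) ... (x-(n-1)); the empty product (n = 0) is 1."""
    result = 1
    for i in range(n):
        result *= x - i
    return result

# Check Delta x^(underline n) = n x^(underline n-1) at a handful of integers.
for n in range(1, 7):
    for x in range(-4, 8):
        assert falling(x + 1, n) - falling(x, n) == n * falling(x, n - 1)
```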
<p>This plus the fundamental theorem lets us quickly compute
sums of “falling polynomials”. As an example:</p>
\[\begin{align}
\sum_a^b 4 x^\underline{3} - 2 x^\underline{2} + 4
&= \sum_a^b 4 x^\underline{3} - \sum_a^b 2 x^\underline{2} + \sum_a^b 4 \\
&= \sum_a^b \Delta x^\underline{4}
- \frac{2}{3} \sum_a^b \Delta x^\underline{3}
+ 4 \sum_a^b \Delta x^\underline{1} \\
&= \left . x^\underline{4} - \frac{2}{3} x^\underline{3} + 4 x^\underline{1} \right |_a^{b+1} \\
&= \left ( (b+1)^\underline{4} - a^\underline{4} \right )
- \frac{2}{3} \left ((b+1)^\underline{3} - a^\underline{3} \right )
+ 4 \left ( (b+1) - a \right )
\end{align}\]
<p>This is great, but we don’t often see $x^{\underline{k}}$ in the wild.
Most of the time we want to sum “classical” polynomials with terms like $x^k$.
If only we had a way to easily convert back and forth between “classical”
polynomials and “falling” polynomials…</p>
<p>Of course, that’s the punchline! We know the space of polynomials
has a standard basis \(\{x^0, x^1, x^2, x^3, \ldots \}\). But notice the
polynomials \(\{x^\underline{0}, x^\underline{1}, x^\underline{2}, x^\underline{3}, \ldots \}\)
<em>also</em> form a basis!</p>
<div class="boxed">
<p>If this isn’t obvious, you should do it as an easy exercise. As a hint,
what is the degree of each $x^\underline{n}$?</p>
</div>
<p>And now we have a very obvious reason to care about change of basis, which
I think a lot of young mathematicians would appreciate. There’s a lot
of good pedagogy that one can do with this, since the new basis isn’t contrived
(it comes naturally out of a desire to compute sums), and it’s an easy to
understand example. Plus it’s obvious that we’re representing the
<em>same polynomial</em> in multiple ways. In my experience a lot of students struggle
with the idea that changing bases doesn’t actually change the vectors themselves,
only the names we give them (i.e., their coordinates). This gives us an
understandable example of that.</p>
<p>As a sample exercise, we might ask our students to compute
$\sum_{x=1}^n x^2$. Once they know $x^2 = x^\underline{2} + x^\underline{1}$,
(which can be worked out by hand without much effort) they can compute</p>
\[\sum_1^n x^2
= \sum_1^n x^\underline{2} + x^\underline{1}
= \left . \frac{x^\underline{3}}{3} + \frac{x^\underline{2}}{2} \right |_1^{n+1}
= \frac{(n+1)^\underline{3} - 1^\underline{3}}{3} + \frac{(n+1)^\underline{2} - 1^\underline{2}}{2}\]
<p>They can then check (with sage, say) that this agrees with the <a href="https://math.stackexchange.com/questions/48080/sum-of-first-n-squares-equals-fracnn12n16">usual formula</a>.</p>
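<p>If sage is not at hand, plain Python works for this check too. The following is a sketch under my own naming (<code>falling</code>, <code>sum_of_squares</code>), using exact rational arithmetic so that the divisions by $2$ and $3$ do not introduce rounding:</p>

```python
from fractions import Fraction

def falling(x, n):
    """Falling power x (x-1) ... (x-(n-1))."""
    result = 1
    for i in range(n):
        result *= x - i
    return result

def sum_of_squares(n):
    """Evaluate x^(3)/3 + x^(2)/2 (falling powers) between 1 and n+1."""
    return (Fraction(falling(n + 1, 3) - falling(1, 3), 3)
            + Fraction(falling(n + 1, 2) - falling(1, 2), 2))

for n in range(1, 50):
    brute_force = sum(x**2 for x in range(1, n + 1))
    usual_formula = Fraction(n * (n + 1) * (2 * n + 1), 6)
    assert sum_of_squares(n) == brute_force == usual_formula
```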
<hr />
<p>At this point, we’re probably sold on the idea that this alternate basis is
useful for computing these sums. But it’s not yet clear how effective this is.
If I ask you to compute, say, $\sum_a^b x^5$, how would you go about doing it?
We need to know how to actually <em>compute</em> this change of basis<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>.</p>
<p>Enter the <a href="https://en.wikipedia.org/wiki/Stirling_number">stirling numbers</a>. There’s a lot of very pretty combinatorics
here, but let’s focus on what’s relevant for our linear algebra.
We write ${n \brace k}$ for the “stirling numbers of the second kind”, and
it turns out that</p>
\[x^n = \sum_k {n \brace k} x^\underline{k}\]
<p>which is almost usable! All we need now is a way to quickly compute
${n \brace k}$. Thankfully, there’s an analogue of Pascal’s Triangle
that works for these coefficients!</p>
<p>Just like pascal’s triangle, we have $1$s down the outside, and we build
the $n+1$th row by adding the two terms from the previous row which you
sit between.
The only difference is the stirling numbers keep track of what <em>column</em> you’re
in as well. Concretely, the recurrence is</p>
\[{n+1 \brace k} = {n \brace k-1} + k {n \brace k}\]
<p>So you add the number above you and to your left to $k$ times the number
above you and to your right. You increase $k$ with every step. Let’s do some
sample rows together:</p>
<p>Say our previous row was</p>
\[1 \quad 7 \quad 6 \quad 1\]
<p>Then our next row will be</p>
\[{\color{blue}1}
\quad
1 + {\color{blue}2} \times 7
\quad
7 + {\color{blue}3} \times 6
\quad
6 + {\color{blue}4} \times 1
\quad
1\]
<p>which is, of course</p>
\[1 \quad 15 \quad 25 \quad 10 \quad 1.\]
<p>Then the next row will be</p>
\[{\color{blue}1}
\quad
1 + {\color{blue}2} \times 15
\quad
15 + {\color{blue}3} \times 25
\quad
25 + {\color{blue}4} \times 10
\quad
10 + {\color{blue}5} \times 1
\quad
1\]
<p>In the above example you can see the blue multiplier is just increasing by $1$
each time. We’re always combining the two entries above the current one, just
like in pascal’s version.</p>
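<p>The recurrence is short enough to code up directly. Below is a Python sketch, where <code>stirling_rows</code> and <code>falling</code> are names I made up. I am assuming rows are indexed from $n = 1$ and that the $k$th entry of row $n$ (counting from $k = 1$) is ${n \brace k}$. The loop at the end checks $x^n = \sum_k {n \brace k} x^\underline{k}$ at a few integers:</p>

```python
def falling(x, n):
    """Falling power x (x-1) ... (x-(n-1))."""
    result = 1
    for i in range(n):
        result *= x - i
    return result

def stirling_rows(n_max):
    """Rows 1..n_max of the triangle; row n lists {n brace k} for k = 1..n."""
    rows = [[1]]
    for n in range(1, n_max):
        prev = rows[-1]
        new_row = []
        for k in range(1, n + 2):
            up_left = prev[k - 2] if k >= 2 else 0   # {n brace k-1}
            up_right = prev[k - 1] if k <= n else 0  # {n brace k}
            new_row.append(up_left + k * up_right)
        rows.append(new_row)
    return rows

rows = stirling_rows(6)

# Entry k of row n is the coefficient of the k-th falling power in x^n.
for n, row in enumerate(rows, start=1):
    for x in range(-3, 8):
        assert x**n == sum(s * falling(x, k) for k, s in enumerate(row, start=1))
```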
<p><br /></p>
<p>Finally, to be super clear, if we know the $4$th row of our triangle is</p>
\[1 \quad 7 \quad 6 \quad 1\]
<p>that tells us (reading the entries left to right as $k = 1, 2, 3, 4$) that</p>
\[x^4 = x^\underline{1} + 7 x^\underline{2} + 6 x^\underline{3} + x^\underline{4}.\]
<div class="boxed">
<p>There’s no substitute for doing: As an exercise, you should write out the
first $10$ or so rows of the triangle. Use this to compute $\sum_a^b x^5$.</p>
</div>
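<p>Once you have done the exercise by hand, you can check your answer against a program. Here is a Python sketch of the whole pipeline for $n \geq 1$. The names <code>stirling_row</code> and <code>power_sum</code> are hypothetical. It converts $x^n$ to falling powers, takes the antidifference $x^\underline{k+1} / (k+1)$ of each term, and evaluates between $a$ and $b+1$:</p>

```python
from fractions import Fraction

def falling(x, n):
    """Falling power x (x-1) ... (x-(n-1))."""
    result = 1
    for i in range(n):
        result *= x - i
    return result

def stirling_row(n):
    """The list of {n brace k} for k = 1..n, built with the triangle recurrence."""
    row = [1]
    for m in range(1, n):
        row = [(row[k - 2] if k >= 2 else 0) + k * (row[k - 1] if k <= m else 0)
               for k in range(1, m + 2)]
    return row

def power_sum(n, a, b):
    """Sum of x^n for x = a..b (n >= 1), computed in the falling power basis."""
    row = stirling_row(n)

    def antidifference(x):
        return sum(Fraction(s * falling(x, k + 1), k + 1)
                   for k, s in enumerate(row, start=1))

    return antidifference(b + 1) - antidifference(a)

# Compare with brute force for a few exponents and ranges.
for n in range(1, 7):
    for a, b in [(1, 10), (3, 8), (-4, 5)]:
        assert power_sum(n, a, b) == sum(x**n for x in range(a, b + 1))
```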
<hr />
<p>Another good exercise I might give students one day is to
explicitly write down change of basis matrices for, say, polynomials of degree
$4$. This more or less amounts to writing the above triangle as a matrix,
but hopefully it will give students something to play with to better understand
how the change of basis matrices interact with the vectors.</p>
<p>I really think this example has staying power throughout the course as well.
Once we know $\Delta$ is linear, we know it must have a representation as a
matrix. Computing that representation in the falling power basis and in the
standard basis would be another good exercise. One could also introduce
<a href="https://en.wikipedia.org/wiki/Indefinite_sum">indefinite summation</a> (say by picking constant term $0$).
Again, we know what its matrix looks like in the falling powers basis,
but it’s not at all clear what it looks like in the standard basis.
After conjugating by a change of basis matrix, though, we can figure this
out! And the cool thing? Next time you want to compute a sum, you can just
multiply by (a big enough finite part) of this matrix and evaluate at the
endpoints!</p>
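<p>To make this concrete, here is a Python sketch for polynomials of degree at most $4$. All the names (<code>P</code>, <code>S</code>, <code>D_falling</code>, <code>D_standard</code>) are my own, and I am assuming coordinate vectors list the constant term first. It builds both change of basis matrices, checks that they are inverse to each other, and conjugates the matrix of $\Delta$ from the falling power basis into the standard basis:</p>

```python
from math import comb

N = 5  # polynomials of degree <= 4 have 5 coordinates

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def falling_coeffs(k):
    """Standard-basis coefficients of the k-th falling power, constant term first."""
    coeffs = [1]
    for i in range(k):
        times_x = [0] + coeffs                          # x * p(x)
        times_minus_i = [-i * c for c in coeffs] + [0]  # -i * p(x)
        coeffs = [s + t for s, t in zip(times_x, times_minus_i)]
    return coeffs + [0] * (N - len(coeffs))

# Column k of P holds the standard coordinates of the k-th falling power.
P = [[falling_coeffs(k)[i] for k in range(N)] for i in range(N)]

# stir[n][k] is the Stirling number {n brace k}, with {0 brace 0} = 1.
stir = [[0] * N for _ in range(N)]
stir[0][0] = 1
for n in range(N - 1):
    for k in range(1, N):
        stir[n + 1][k] = stir[n][k - 1] + k * stir[n][k]

# Column n of S holds the falling-power coordinates of x^n.
S = [[stir[n][k] for n in range(N)] for k in range(N)]

identity = [[int(i == j) for j in range(N)] for i in range(N)]
assert matmul(P, S) == identity

# Delta in the falling basis: the k-th falling power goes to k times the (k-1)-th.
D_falling = [[k if i == k - 1 else 0 for k in range(N)] for i in range(N)]

# Conjugate by the change of basis to get Delta in the standard basis.
D_standard = matmul(matmul(P, D_falling), S)

# Direct computation: (x+1)^n - x^n is the sum over j < n of C(n, j) x^j.
expected = [[comb(n, j) if j < n else 0 for n in range(N)] for j in range(N)]
assert D_standard == expected
```

<p>The matrix <code>S</code> is exactly the triangle from earlier (with an extra row and column for degree $0$), written as columns.</p>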
<p>If you’re a teacher and end up using this, or maybe you already <em>were</em> using
this, definitely let me know! I’d be excited to hear about this and other
ways that you try to make linear algebra feel more concrete to people
learning it for the first time.</p>
<p>If you’re a student, hopefully this was exciting! I know I get
geekier about this kind of stuff than a lot of people, but I think
finite calculus is a really cool idea. Hopefully this post encourages you
to go looking for other information about this technique, and maybe
shows that linear algebra is never very far away ^_^.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>This book is actually <em>super</em> cool. It’s fairly old, and that shows in the
language (which can be kind of hard to read sometimes). What’s really cool,
though, is that it’s written for working statisticians in a pre-computer
era. So there’s a ton of pages with detailed tables, and a ton <em>more</em>
pages about how to go about making your own tables should you need some
family of constants that isn’t included. Obviously I’ll never have use
for those particular skills, so I haven’t read those parts too closely,
but I find it <em>so</em> interesting to see how things like that used to be done! <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>And who among us <em>wasn’t</em> when we first heard about it? I remember seeing it
in Baby Rudin, at which point I got really excited. Then really confused.
Then (after some deep thinking) really excited again. It took me a long time
to understand some quirks of the formula, though. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>This actually isn’t how you often see the leibniz rule written. Even though
it’s objectively better than the alternative. Almost every reference I’ve
seen writes the leibniz rule as</p>
\[\Delta(f \cdot g) = (\Delta f) \cdot (Eg) + f \cdot (\Delta g)\]
<p>where $(Eg)(x) = g(x+1)$ is the “shift operator”.</p>
<p>I assume this is because summing both sides of this equation gives
the summation by parts formula, but the fact that the left hand side
is symmetric in $f$ and $g$ while the right hand side isn’t is… offputting. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>I’m not sure if there’s a good answer to this one, actually. There’s an
mse question about it <a href="https://math.stackexchange.com/questions/235680/chain-rule-for-discrete-finite-calculus">here</a>, but it’s pretty unsatisfying.</p>
<p>If you’ll indulge me in some philosophical waxing: The
classical chain rule witnesses the functoriality of the derivative
(really functoriality of the <a href="https://en.wikipedia.org/wiki/Tangent_bundle">tangent bundle</a>, but the induced map on
tangent bundles is exactly the derivative). I’m curious if the nonexistence
of a nice chain rule for us comes down to the fact that this isn’t actually
a functorial thing to do… I would think about it more, but I’m trying to
keep this post somewhat low-effort. I would love to hear someone else’s
thoughts on this, though. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>There are other “fundamental” forward differences worth knowing as well.
Here’s a few to have in your pocket:</p>
<ul>
<li>$\Delta 2^x = 2^x$</li>
<li>More generally, $\Delta r^x = (r-1) r^x$</li>
<li>$\Delta \binom{x}{n} = \binom{x}{n-1}$</li>
<li>If we define $x^{\underline{0}} = 1$ and $x^{\underline{-n}} = \frac{1}{(x+1)(x+2)\cdots(x+n)}$, then the power rule continues to work.</li>
<li>$\Delta H_x = x^{\underline{-1}}$, where $H_x$ are the <a href="https://en.wikipedia.org/wiki/Harmonic_number">harmonic numbers</a></li>
</ul>
<p><a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>This is the kind of thing that I would probably just tell my
hypothetical students, but I might post a video or send them a blog
post where I go through it in detail as extra material for anyone
who’s interested. Introducing stirling numbers and proving properties
about them is really the regime of a combinatorics class, but I think
it doesn’t take too much time to show them the analogue of pascal’s
triangle so that they can actually <em>use</em> this technique should the need
arise. <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Thu, 13 May 2021 00:00:00 +0000
https://grossack.site/2021/05/13/stirling-basis-change.html
Reducing to $\mathbb{Z}$ -- Permanence and Concrete Proofs<p>There are lots of ways in which good notation can make results seem obvious.
There are also lots of ways in which “illegally” manipulating expressions can
give a meaningful answer at the end of the day.
It turns out that in many cases our illegal manipulations are actually justified,
and this is codified in the principle of
<span class="defn">Permanence of Identities</span>!
This is one place where category theory and model theory conspire in a
particularly beautiful (and powerful) way.</p>
<p>In this post we’ll talk about how to prove statements in general rings by
proving analogous statements for polynomials with integer coefficients.
This is nice because we often have access to ~bonus tools~ when working in
$\mathbb{Z}$, and it doesn’t matter if we <em>use</em> these bonus tools to prove
the general result!</p>
<p>I think this technique is best shown by example, so I’ll give a smattering
of proofs using this idea. Hopefully by the end you’ll be convinced of its
flexibility ^_^.</p>
<hr />
<h2 id="example--the-binomial-theorem">Example – The Binomial Theorem</h2>
<p>Let’s start simple, and build from here. In any of my favorite rings $R$
(though I should mention all my favorite rings are commutative with $1$)
we can take any two elements $a$ and $b$ and know that</p>
\[(a+b)^n = \sum_k \binom{n}{k} a^k b^{n-k}\]
<p>where we (as usual) interpret scaling by an integer as repeated addition.</p>
<p>Many authors prove this by saying something like
“the usual proof goes through unchanged”, but we can actually be a bit cleverer.</p>
<p>$\ulcorner$
First we prove this identity in $\mathbb{Z}[a,b]$. Then we notice there is
a (unique) ring hom $\varphi : \mathbb{Z}[a,b] \to R$ for each choice of $a$ and $b$
in $R$. This is the category theory at work, since $\mathbb{Z}[a,b]$ is the
free (commutative) ring on two generators. Next, we use model theory:
Homomorphisms preserve truth, and so the true equation $p(a,b) = q(a,b)$ in
$\mathbb{Z}[a,b]$ must stay true after we hit it with $\varphi$!</p>
<p>So since we proved $(a+b)^n = \sum_k \binom{n}{k} a^k b^{n-k}$ in $\mathbb{Z}[a,b]$,
we actually get for <em>free</em> that the equation holds for every pair of elements
in every ring!
<span style="float:right">$\lrcorner$</span></p>
<p>Notice that it doesn’t matter what “the usual proof” is! Maybe you like to
prove the binomial theorem by looking at the taylor series of $(1+x)^n$ and
remembering $\mathbb{Z} \subseteq \mathbb{R}$.
This method is entirely unavailable in a general ring, but because we end up
with a polynomial equality, we can conclude that the theorem is
true in general rings anyways!</p>
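<p>Here is a small Python sketch of the two steps of this argument. The representation of $\mathbb{Z}[a,b]$ as a dictionary of monomials and the helper names <code>poly_mul</code> and <code>evaluate_mod</code> are my own choices. First it checks the identity inside $\mathbb{Z}[a,b]$ for one value of $n$, then it pushes it along every evaluation hom into $\mathbb{Z}/6$, which is not even a domain:</p>

```python
from math import comb

def poly_mul(p, q):
    """Multiply polynomials in Z[a,b], stored as {(i, j): coeff} for a^i b^j."""
    out = {}
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return {mono: c for mono, c in out.items() if c != 0}

def evaluate_mod(p, a, b, m):
    """Image of p under the ring hom Z[a,b] -> Z/m sending the variables to a and b."""
    return sum(c * pow(a, i, m) * pow(b, j, m) for (i, j), c in p.items()) % m

n = 5
a_plus_b = {(1, 0): 1, (0, 1): 1}

# Step 1: expand (a+b)^n in Z[a,b] and compare with the binomial sum.
lhs = {(0, 0): 1}
for _ in range(n):
    lhs = poly_mul(lhs, a_plus_b)
rhs = {(k, n - k): comb(n, k) for k in range(n + 1)}
assert lhs == rhs

# Step 2: the identity survives every evaluation hom into Z/6.
m = 6
for a in range(m):
    for b in range(m):
        assert pow(a + b, n, m) == evaluate_mod(rhs, a, b, m)
```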
<p>There are lots of situations where we run this same kind of argument.</p>
<hr />
<h2 id="example--sylvesters-identity">Example – Sylvester’s Identity</h2>
<p>This one is a favorite example of <a href="https://math.stackexchange.com/users/242/bill-dubuque">Bill Dubuque</a> on mse. I’ve seen him
evangelize it so often I felt obligated to include it. It helps that it’s
such a great example! He actually has a <em>fantastic</em> explanation of this same
principle <a href="https://math.stackexchange.com/a/98365/655547">here</a>, which I definitely recommend reading if you’re interested!</p>
<p>Sylvester’s identity says that
(for $n \times n$ matrices over any ring $R$)</p>
\[\text{det}(1 + AB) = \text{det}(1 + BA)\]
<p>$\ulcorner$
How can we prove this? Well, let’s work in
$\mathbb{Z}[a_{ij}, b_{ij}]$, where we have one variable for each of the $2n^2$
matrix entries. Now we have</p>
\[\text{det}(1 + AB) \text{det}(A) = \text{det}(A + ABA) = \text{det}(A) \text{det}(1+BA)\]
<p>Since the determinant is a polynomial in the entries of a matrix<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">1</a></sup>, this is
really expressing the equality of polynomials with integer coefficients!</p>
<p>So we have a polynomial equation $fh = hg$, and we can happily cancel the
nonzero polynomial $h$ from both sides, since $\mathbb{Z}[a_{ij}, b_{ij}]$
is a domain! Here $h$ is the polynomial $\text{det}(A)$, and we get the claim.
<span style="float:right">$\lrcorner$</span></p>
<p>Notice that we’ve, again, used a special property of integer polynomials in
this proof! We can cancel polynomials with reckless abandon because we’re working
in a domain. Once we prove this polynomial identity, though, the result remains
true after we evaluate! In particular, even if
the specific $\text{det}(A)$ of interest is $0$, or the particular $R$ of
interest is <em>not</em> a domain!</p>
<p>Whatever tools we want to use inside $\mathbb{Z}[a_{ij}, b_{ij}]$ are totally ok,
as long as we end our proof with a polynomial identity.</p>
<div class="boxed">
<p>As a simple exercise, can you extend this result to the case where
$A$ is $n \times m$ and $B$ is $m \times n$?</p>
</div>
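<p>To see the identity survive in a ring with zero divisors, here is a Python sketch over $\mathbb{Z}/6$. The helper names are hypothetical, and the determinant is computed with the Leibniz formula, which only uses ring operations. A random test like this is of course only a spot check:</p>

```python
import random
from itertools import permutations

def det(M):
    """Determinant via the Leibniz formula; it only uses +, - and *."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

def matmul_mod(A, B, m):
    """Product of two square matrices with entries reduced mod m."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % m
             for j in range(n)] for i in range(n)]

def one_plus(M, m):
    """The matrix 1 + M with entries reduced mod m."""
    n = len(M)
    return [[(M[i][j] + (i == j)) % m for j in range(n)] for i in range(n)]

def sylvester_holds(A, B, m):
    """Check det(1 + AB) = det(1 + BA) in Z/m."""
    lhs = det(one_plus(matmul_mod(A, B, m), m)) % m
    rhs = det(one_plus(matmul_mod(B, A, m), m)) % m
    return lhs == rhs

m, n = 6, 3
rng = random.Random(0)  # fixed seed so the run is reproducible
for _ in range(200):
    A = [[rng.randrange(m) for _ in range(n)] for _ in range(n)]
    B = [[rng.randrange(m) for _ in range(n)] for _ in range(n)]
    assert sylvester_holds(A, B, m)
```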
<hr />
<h2 id="example--computing-inverses">Example – Computing Inverses</h2>
<p>I said we would be focusing on commutative rings in this post, and that’s true.
But there’s a really cool noncommutative example that follows the same
principle and is worth showing.
I learned about this on mse (where else?) in a different <a href="https://math.stackexchange.com/a/675128/655547">excellent post</a> by Bill Dubuque.</p>
<div class="boxed">
<p>Even in noncommutative rings<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">2</a></sup>, if $1 - ab$ has an inverse, then
$1 - ba$ does too.</p>
</div>
<p>$\ulcorner$
We want to work in the ring of noncommutative polynomials
$\mathbb{Z} \langle a,b \rangle$, but it’s not quite big enough.
We’re making an assumption that $(1-ab)^{-1}$ exists, but it actually
<em>doesn’t</em> in $\mathbb{Z} \langle a,b \rangle$. That said, we can freely add
such an inverse – let’s work in</p>
\[R = \mathbb{Z} \langle a, b, c \rangle \Bigg / (1 - ab)c = 1 = c(1-ab).\]
<p>Now for the clever trick:
we can embed this into the ring of noncommutative power series
\(\mathbb{Z} \langle \! \langle a,b \rangle \! \rangle\).
We send $a \mapsto a$, $b \mapsto b$, and $c \mapsto (1-ab)^{-1}$.</p>
<p>In \(\mathbb{Z} \langle \! \langle a,b \rangle \! \rangle\), we
can run the following argument
(which, of course, would be nonsensical in other settings):</p>
\[\begin{aligned}
(1-ba)^{-1}
&= 1 + ba + (ba)^2 + (ba)^3 + \ldots \\
&= 1 + ba + baba + bababa + \ldots \\
&= 1 + b(1 + ab + abab + \ldots)a \\
&= 1 + b(1-ab)^{-1}a \\
\end{aligned}\]
<p>But this means in \(\mathbb{Z} \langle \! \langle a, b \rangle \! \rangle\)
we have the identity</p>
\[(1-ba) (1 + b(1-ab)^{-1}a) = 1 = (1 + b(1-ab)^{-1}a) (1-ba)\]
<p>which, under our embedding, gives us the following identity in
$R$:</p>
\[(1-ba) (1+bca) = 1 = (1+bca) (1-ba)\]
<p>But then since this ring is initial
among all noncommutative rings equipped with two chosen elements $a$, $b$ and an inverse $c$
for $(1-ab)$, we find the identity holds in <em>every</em> such (noncommutative) ring!
<span style="float:right">$\lrcorner$</span></p>
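<p>As a concrete instance, we can test the formula in the ring of $2 \times 2$ rational matrices, where there are no power series in sight. This is a Python sketch with matrices I picked arbitrarily (they do not commute) and hypothetical helper names:</p>

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two 2x2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, sign=1):
    """Entrywise A + sign * B."""
    return [[A[i][j] + sign * B[i][j] for j in range(2)] for i in range(2)]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate, using exact rationals."""
    (p, q), (r, s) = M
    d = Fraction(p * s - q * r)
    if d == 0:
        raise ZeroDivisionError("matrix is not invertible")
    return [[s / d, -q / d], [-r / d, p / d]]

one = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]
b = [[0, 1], [5, -2]]
assert matmul(a, b) != matmul(b, a)  # genuinely noncommutative

c = inverse_2x2(add(one, matmul(a, b), sign=-1))     # c = (1 - ab)^(-1)
candidate = add(one, matmul(matmul(b, c), a))        # 1 + bca
one_minus_ba = add(one, matmul(b, a), sign=-1)

# 1 + bca is a two-sided inverse of 1 - ba.
assert matmul(one_minus_ba, candidate) == one
assert matmul(candidate, one_minus_ba) == one
```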
<p>Notice the extra power we got, both by using quotient rings to model
some hypotheses in our theorem and by passing to formal power series.
This is part of what’s so nice about embeddings! They let us prove statements
in some smaller setting by using techniques from a bigger setting<sup id="fnref:8" role="doc-noteref"><a href="#fn:8" class="footnote" rel="footnote">3</a></sup>. We’ve
been implicitly using this idea throughout the post, but I wanted to make it
explicit at least once. After all, once we’re aware of it, we can intentionally
use it in other settings as well<sup id="fnref:9" role="doc-noteref"><a href="#fn:9" class="footnote" rel="footnote">4</a></sup>.</p>
<hr />
<h2 id="a-sentimental-interlude--seven-trees-in-one">A Sentimental Interlude – Seven Trees in One</h2>
<p>The first paper I ever read<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">5</a></sup> opens with the following beautiful passage:</p>
<blockquote>
<p>Consider the following absurd argument concerning planar, binary, rooted,
unlabelled trees. Every such tree is either the trivial tree or consists of
a pair of trees joined together at the root, so the set $T$ of trees is
isomorphic to $1+T^2$. Pretend that $T$ is a complex number and solve the
quadratic $T = 1+T^2$ to find that $T$ is a primitive sixth root of unity
and so $T^6 = 1$. Deduce that $T^6$ is a one-element set; realize immediately
that this is wrong. Notice that $T^7 \cong T$ is, however, not obviously
wrong, and conclude that it is therefore right. In other words, conclude
that there is a bijection $T^7 \cong T$ built up out of copies of the
original bijection $T \cong 1 + T^2$: a tree is the same as seven trees.
The point of this paper is to show that ‘nonsense proofs’ of this kind are,
actually, valid.</p>
</blockquote>
<p>You can see that we’ve “proven” a claim about trees by proving a polynomial
implication in $\mathbb{C}$. That is, the authors show</p>
\[T = 1+T^2 \implies T^7 = T.\]
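<p>The ring-level half of this is easy to check by machine: $T^7 - T$ is divisible by $T^2 - T + 1$ in $\mathbb{Z}[T]$. Here is a Python sketch of that division, where <code>divmod_monic</code> is a name I made up and polynomials are coefficient lists with the highest degree first. This only verifies the implication for rings. Transferring it to rigs (and hence to trees) needs the conditions from the paper:</p>

```python
def divmod_monic(p, d):
    """Divide p by a monic polynomial d (coefficient lists, highest degree first).

    Returns (quotient, remainder); integer coefficients stay integers because
    the leading coefficient of d is 1.
    """
    p = list(p)
    quotient = []
    while len(p) >= len(d):
        lead = p[0]
        quotient.append(lead)
        for i, coeff in enumerate(d):
            p[i] -= lead * coeff
        p.pop(0)  # the leading coefficient is now 0
    return quotient, p

t7_minus_t = [1, 0, 0, 0, 0, 0, -1, 0]  # T^7 - T
relation = [1, -1, 1]                   # T^2 - T + 1, i.e. T = 1 + T^2

quotient, remainder = divmod_monic(t7_minus_t, relation)
assert remainder == [0, 0]
```

<p>The same division shows $T^6 - 1$ is also a multiple of $T^2 - T + 1$, which is exactly the "obviously wrong" conclusion from the quote, so divisibility over $\mathbb{Z}$ alone cannot be the whole story.</p>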
<p>In the paper, the authors show that certain polynomial
<em>implications</em> are also preserved by homomorphisms<sup id="fnref:10" role="doc-noteref"><a href="#fn:10" class="footnote" rel="footnote">6</a></sup> of rigs (that is, rings without negatives).
Here $\mathbb{N}[T]$ plays the role of the initial rig, which embeds in
$\mathbb{C}[T]$. Then we use complex analysis to show the above implication
holds in $\mathbb{C}[T]$ and thus in $\mathbb{N}[T]$.</p>
<p>Since the objects of a category (up to isomorphism) with products and coproducts
form a rig, this tells us there is a hom from $\mathbb{N}[T]$ to the category
of, say, algebraic datatypes (up to isomorphism).</p>
<p>Since this polynomial implication is of the variety that’s preserved, and in
the category of datatypes we have $T \cong 1 + T^2$, we are allowed to conclude
$T \cong T^7$!</p>
<p>This follows the <em>spirit</em> of what we’re doing in this post, but is a bit
more detailed because general homomorphisms <em>don’t</em> preserve all implications.
A model theorist might jump straight to elementary embeddings, but that’s
far too restrictive for our purposes here. The authors of the above paper do a
great job finding (only slightly technical) conditions which make this
argument go through. I’ve included it both to show what’s possible when you
extend the ideas in this post, and also because it was my first paper and I feel
a certain amount of love for it.</p>
<hr />
<h2 id="example--mutliplicative-determinants">Example – Multiplicative Determinants</h2>
<p>Say we want to prove that $\text{det}(AB) = \text{det}(A) \text{det}(B)$.
When we’re working over a field, there are slick basis-dependent arguments
to show this. See, for instance, Knapp’s “Basic Algebra” (ch. II.7)<sup id="fnref:11" role="doc-noteref"><a href="#fn:11" class="footnote" rel="footnote">7</a></sup>.</p>
<p>These arguments don’t go through for a general ring $R$, though, so you might
think we should drink some caffeine, look up the definition of the determinant
(for the second time this blog post… nobody <em>really</em> remembers it, right?),
and just hit this problem with some honest computation.</p>
<p>Of course, once we remember $\mathbb{Z} \subseteq \mathbb{Q}$, we
might think of a better way (especially since we’ve seen the rest of the
post).</p>
<p>$\ulcorner$
We again look at $\mathbb{Z}[a_{ij}, b_{ij}]$.</p>
<p>We first note that the entries of $AB$ are polynomials in the
entries of $A$ and $B$ (if this isn’t clear, you should check it).</p>
<p>But since the determinant is a polynomial in the entries, we see the equation</p>
\[\text{det} \left ( (a_{ij}) (b_{ij}) \right )
= \text{det} \left ( (a_{ij}) \right ) \text{det} \left ( (b_{ij}) \right )\]
<p>is actually a polynomial equation (albeit a complicated one) in the
$a_{ij}$ and $b_{ij}$. So it suffices to show that it’s true in our
polynomial ring.</p>
<p>But $\mathbb{Z}[a_{ij},b_{ij}] \subseteq \mathbb{Q}(a_{ij},b_{ij})$, and we
<em>know</em> the formula is true for matrices over a field! So the formula is true
for us, and the claim follows for all rings.
<span style="float:right">$\lrcorner$</span></p>
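<p>For small matrices we can even watch the polynomial identity hold symbolically. Here is a Python sketch for the $2 \times 2$ case, working in $\mathbb{Z}[a_{ij}, b_{ij}]$ with $8$ variables. The dictionary representation and the helper names <code>var</code>, <code>add</code>, <code>mul</code>, <code>det2</code> are my own, and this brute force expansion is the "honest computation" rather than the slick argument above:</p>

```python
def var(i, nvars=8):
    """The i-th variable, as a polynomial {exponent tuple: coefficient}."""
    exponents = [0] * nvars
    exponents[i] = 1
    return {tuple(exponents): 1}

def add(p, q, sign=1):
    """The polynomial p + sign * q, with zero terms dropped."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {mono: c for mono, c in out.items() if c != 0}

def mul(p, q):
    """The product of two polynomials, with zero terms dropped."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(x + y for x, y in zip(m1, m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {mono: c for mono, c in out.items() if c != 0}

def det2(M):
    """Determinant of a 2x2 matrix of polynomials."""
    return add(mul(M[0][0], M[1][1]), mul(M[0][1], M[1][0]), sign=-1)

# Generic matrices: variables 0..3 are the entries of A, 4..7 those of B.
A = [[var(0), var(1)], [var(2), var(3)]]
B = [[var(4), var(5)], [var(6), var(7)]]
AB = [[add(mul(A[i][0], B[0][j]), mul(A[i][1], B[1][j])) for j in range(2)]
      for i in range(2)]

# Equality of dictionaries is equality of polynomials with integer coefficients.
assert det2(AB) == mul(det2(A), det2(B))
```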
<hr />
<p>This is a fun and powerful technique, and it’s really useful in a lot of
situations! Here are some quick exercises for you to play around with,
but I encourage you to look for your own as well!</p>
<div class="boxed">
<p>Pick your favorite two facts about matrix algebra and see if they’re
true over arbitrary rings. If you stick to facts about determinants,
matrix multiplication, row operations, etc. you should be able to choose
pretty much anything!</p>
</div>
<div class="boxed">
<p>Show that the quadratic formula always works, unless it obviously doesn’t.</p>
<p>As a hint, you’ll want to work in the ring</p>
<p>\(\mathbb{Z}\left [ a,b,c,d,a^{-1}, \frac{1}{2} \right ] \Bigg / d^2 = b^2 - 4ac\)</p>
</div>
<div class="boxed">
<p>Show that $\mathbb{Z}[x_1, \ldots, x_n]$ embeds in</p>
<ul>
<li>$C(\mathbb{R})$ (the ring of continuous functions on $\mathbb{R}$)</li>
<li>$C(\mathbb{C})$ (the ring of continuous functions on $\mathbb{C}$)</li>
<li>$C^\infty(\mathbb{R})$ (the ring of smooth functions on $\mathbb{R}$)</li>
<li>$C^\infty(\mathbb{C})$ (the ring of entire functions on $\mathbb{C}$)</li>
</ul>
<p>so if we prove a polynomial identity using real or complex analysis it
is true in $\mathbb{Z}[x_1, \ldots, x_n]$ (and thus in all rings).</p>
</div>
<div class="boxed">
<p>As another powerful tool in your arsenal, say you want to prove some
polynomial identity in one variable: $p(x) = q(x)$.</p>
<ol>
<li>
<p>Show that there’s some finite number $N$ (depending on $p$ and $q$)
so that $p(x) = q(x)$ if and only if it’s true for $N$ many choices of $x$.</p>
</li>
<li>
<p>Show that \(\binom{x}{3} \binom{3}{2} = \binom{x}{2} \binom{x-2}{1}\)
is a polynomial identity in $x$. Verify it by hand for $4$ choices of $x$.
Argue that this verification proves this identity holds in all rings
where you can divide by $2$.</p>
</li>
</ol>
<p>It’s wild to me that some finite verification like this is enough to prove
an identity (even a simple one like this) for all rings. If you want to see
more of this proof technique you should check out Petkovšek, Wilf, and
Zeilberger’s book <a href="https://www2.math.upenn.edu/~wilf/AeqB.pdf">A = B</a> (section 1.4, to start).</p>
<p>Does this technique work for polynomials with more than one variable?</p>
</div>
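<p>For the binomial identity in the exercise above, the finite verification is only a few lines of Python. This is a sketch with a hypothetical helper <code>binom</code> that treats $\binom{x}{k}$ as the polynomial $x(x-1)\cdots(x-k+1)/k!$. Both sides have degree $3$ in $x$, so I am assuming $4$ sample points are enough, which is the content of the first part of the exercise:</p>

```python
from fractions import Fraction

def binom(x, k):
    """Polynomial binomial coefficient x (x-1) ... (x-k+1) / k!."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(x - i, i + 1)
    return value

def identity_holds(x):
    """Check binom(x,3) binom(3,2) = binom(x,2) binom(x-2,1) at the point x."""
    return binom(x, 3) * binom(3, 2) == binom(x, 2) * binom(x - 2, 1)

# Two cubics that agree at 4 distinct points agree everywhere.
assert all(identity_holds(x) for x in [0, 1, 2, 3])
```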
<hr />
<p>As one last aside, I’m really interested in figuring out when we can do this
with power series. Over a year ago now I asked <a href="https://math.stackexchange.com/q/3500045/655547">a question</a> about this,
though I accepted the answer too quickly
(and in hindsight it isn’t really the kind of answer I was looking for).</p>
<p>With the extra year to think about it, though, I think I have a better idea
how to make it work. I meant this to be a kind of prequel where we work in
the simpler setting of polynomials to get practice before we jump into proving
identities with power series.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:7" role="doc-endnote">
<p>This is the one place where it’s helpful to remember that
<a href="https://en.wikipedia.org/wiki/Leibniz_formula_for_determinants">horrible definition</a> of the determinant in terms of a sum over
permutations of products of the entries. It’s obviously a polynomial
(albeit a gross one). <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>They still have $1$, though. We aren’t animals. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:8" role="doc-endnote">
<p>If you like the model theoretic language, embeddings <em>reflect</em> truth. <a href="#fnref:8" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:9" role="doc-endnote">
<p>We really do use this kind of machinery ALL the time, though.
Whenever we use complex numbers to prove things about
$\mathbb{R}$, $\mathbb{N}$, etc. for instance.</p>
<p>More excitingly, this is part of the power of the
Yoneda lemma – we embed any (small) category into a <a href="https://en.wikipedia.org/wiki/Topos">topos</a>
of presheaves. Then if we can prove some fact
(which doesn’t refer to any topos-y things) using this high powered
machinery, it reflects down to our original category!</p>
<p>This is also why model theorists care so much about
<a href="https://en.wikipedia.org/wiki/Elementary_equivalence#Elementary_embeddings">elementary embeddings</a>, which I’ve given a quick introduction
to <a href="/2020/10/01/elementary-vs-submodel">here</a>. The tl;dr is that embeddings <em>don’t</em> need to preserve
or reflect the truth of formulas involving quantifiers. Elementary
embeddings, on the other hand, do both. <a href="#fnref:9" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>“Objects of Categories as Complex Numbers” by Fiore and Leinster. See
<a href="https://arxiv.org/pdf/math/0212377.pdf">here</a> for an arxiv link. It’s
<em>really</em> excellent, and readable too! <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:10" role="doc-endnote">
<p>In fact, we do this by quotienting by the assumptions of our implication,
just like we did with the $(1-ba)^{-1}$ example. So the relevant rig
for binary trees is $\mathbb{N}[T] \big / T^2 - T + 1$.</p>
<p>The issue is that for some equations
$\mathbb{Z}[x] \big / p=q$ (and also $\mathbb{N}[x] \big / p=q$) might
not <em>embed</em> into $\mathbb{C}$!</p>
<p>For instance, when we quotient by some non-monic identity, we get
something that <a href="https://math.stackexchange.com/q/2230921/655547">isn’t finitely generated</a> as a $\mathbb{Z}$-module.
In particular, the powers of some root we added won’t satisfy any integer
linear combinations.
This is a problem since in $\mathbb{C}$ the subfield generated by the
roots of some integer polynomial will be finite dimensional over
$\mathbb{Q}$, and thus powers of any root <em>will</em> satisfy some integer linear combination!</p>
<p>Since we no longer have an embedding, truth is no longer reflected!
The core of Fiore and Leinster’s paper is giving conditions where this
doesn’t happen. <a href="#fnref:10" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:11" role="doc-endnote">
<p>Knapp also happens to spend some time talking about Permanence of Identities
(in ch. V.2), which is why I chose this book in particular to mention.
It turns out the exact example we’re about to work out is <em>also</em> in Artin’s
“Algebra” (ch. 12.3) alongside a discussion of Permanence of Identities!
So if you want to see some other perspectives on this topic, you can read
about it there too. <a href="#fnref:11" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Wed, 05 May 2021 00:00:00 +0000
https://grossack.site/2021/05/05/initial-polynomial-proofs.html
Two Sage Visuals<p>I’m in a reading group with Elliott Vest and <a href="https://sites.google.com/view/jacobgarcia/jacob-garcia">Jacob Garcia</a>
(supervised by <a href="https://sites.google.com/view/mgdurham/">Matt Durham</a>) where we’re talking about
CAT(0) Cube Complexes. We’re reading a set of lecture notes
by Sageev (pdf <a href="http://www.math.utah.edu/pcmi12/lecture_notes/sageev.pdf">here</a>, for the interested) and we came across
a fairly simple problem that we wanted to draw. In a completely
different vein, <a href="https://github.com/russphelan">Russell Phelan</a> asked a fun topological question
in the UCR math discord, and to solve it I ended up needing to draw
something else. I figured I would write up a quick post about both
visualizations, since these things can be a bit tricky to get right.</p>
<hr />
<h2 id="hyperbolic-circles">Hyperbolic Circles</h2>
<p>Let’s start with the cube complexes. One of the exercises in Sageev’s
notes<sup id="fnref:sageev-ex" role="doc-noteref"><a href="#fn:sageev-ex" class="footnote" rel="footnote">1</a></sup> asks a question which uses circles in the hyperbolic
plane $H^2$. We had an intuitive idea why this should be true, but
to really visualize it, I asked a question (which in hindsight was silly):</p>
<p>We know that in the <a href="https://en.wikipedia.org/wiki/Poincar%C3%A9_disk_model">disk model</a> hyperbolic circles and euclidean
circles agree (albeit the centers might not be where they appear).
By this I mean a hyperbolic circle</p>
\[C(r,x_0) = \{x \mid d_\text{hyperbolic}(x_0,x) = r \}\]
<p><em>looks</em> like a euclidean circle when you draw it. I didn’t know of any
such fact for the <a href="https://en.wikipedia.org/wiki/Poincar%C3%A9_half-plane_model">upper half plane model</a>, though, and I asked if
anyone knew what circles look like there. We didn’t, so I went on a
quest to just… draw a bunch of hyperbolic circles in <a href="https://www.sagemath.org/">sage</a>.</p>
<p>According to <a href="https://en.wikipedia.org/wiki/Poincar%C3%A9_half-plane_model#Distance_calculation">wikipedia</a>, the hyperbolic distance in the upper half plane
is given by</p>
\[d \big ( (x_1,y_1),(x_2,y_2) \big ) =
2 \text{arcsinh}
\left (
\frac{1}{2}
\sqrt{\frac{(x_2 - x_1)^2 + (y_2 - y_1)^2}{y_1 y_2}}
\right )\]
<p>So let’s go ahead and plot a circle in this metric! The relevant function
for this is <code class="language-plaintext highlighter-rouge">implicit_plot</code>, which does what it says on the tin.</p>
<div class="auto">
<script type="text/x-sage">
x,y = var('x,y')
d(x1,y1,x2,y2) = 2 * arcsinh(1/2 * sqrt(((x2-x1)^2 + (y2-y1)^2) / (y1*y2)))
# plot a circle of radius 1/2 centered at (0,1)
implicit_plot(d(x,y,0,1) - 1/2, (-5,5), (0,5))
</script>
</div>
<p>You can see this <em>looks</em> like a regular circle. Of course, we know that
distances should be distorted – just look at the notion of distance!
The distortion, it turns out, is in the apparent location of the center
and the apparent <em>size</em> of the circle.</p>
<p>To see exactly what I mean by this, let’s plot a bunch of circles
(with their centers marked) each of radius $1$.</p>
<div class="auto">
<script type="text/x-sage">
x,y = var('x,y')
d(x1,y1,x2,y2) = 2 * arcsinh(1/2 * sqrt(((x2-x1)^2 + (y2-y1)^2) / (y1*y2)))
colors = ["blue", "red", "green", "maroon", "olive", "pink", "silver", "navy"]
def draw_circle(x0,y0,r, c):
    """
    Draw a circle with center (x0,y0), radius r, and color c
    """
    # draw the circle
    p1 = implicit_plot(d(x,y,x0,y0) - r, (-5,5), (0,5), color=c)
    # draw the center
    p2 = point((x0,y0), color=c)
    return p1 + p2
out = Graphics()
for i in range(1,8):
    # draw a sequence of circles, all of radius 1,
    # but with centers moving closer to the x axis
    # (which we think of as a line at infinity)
    out += draw_circle(-3 + i, 1/i, 1, colors[i])
out.show()
</script>
</div>
<p>You can tell that the true center of the circle is not where one might
think. This is because distances near the $x$-axis are longer than they appear,
and so our center must be closer to points near the $x$-axis to compensate.</p>
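<p>To put numbers on this displacement, here is a small plain-Python sketch (not Sage, and not one of the cells this post runs). It assumes the standard fact that the hyperbolic circle of radius $r$ centered at $(x_0,y_0)$ is the euclidean circle with center $(x_0, y_0 \cosh r)$ and radius $y_0 \sinh r$. The helper names are made up for illustration. It checks that claim numerically against the distance formula above, then prints the apparent centers for the seven circles in the plot.</p>

```python
import math

def d_uhp(x1, y1, x2, y2):
    """Hyperbolic distance in the upper half plane (same formula as above)."""
    return 2 * math.asinh(0.5 * math.sqrt(((x2 - x1)**2 + (y2 - y1)**2) / (y1 * y2)))

def euclidean_circle(x0, y0, r):
    """Euclidean center and radius of the hyperbolic circle C(r, (x0, y0)).

    Assumes the standard formula: center (x0, y0*cosh(r)), radius y0*sinh(r).
    """
    return (x0, y0 * math.cosh(r)), y0 * math.sinh(r)

def max_deviation(x0, y0, r, samples=360):
    """Largest |d - r| over sample points of the euclidean circle."""
    (cx, cy), rho = euclidean_circle(x0, y0, r)
    worst = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        px, py = cx + rho * math.cos(t), cy + rho * math.sin(t)
        worst = max(worst, abs(d_uhp(px, py, x0, y0) - r))
    return worst

# the seven centers from the plot above, all with hyperbolic radius 1
for i in range(1, 8):
    (cx, cy), rho = euclidean_circle(-3 + i, 1 / i, 1)
    print(f"true center y = {1/i:.3f}, apparent center y = {cy:.3f}, apparent radius = {rho:.3f}")
```

<p>Since $\cosh r \gt 1$, the apparent center always sits above the true one, which matches the picture.</p>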
<div class="boxed">
<p>As an aside, it might seem magical that this wild distance formula makes
circles look like euclidean circles. But like a good magic trick, there’s
a shockingly simple explanation under the surface.</p>
<p>If we take for granted that circles in the disk model are euclidean circles,
can you show that this must be true for the upper half plane model as well?</p>
<p>As a (possibly too helpful) hint, you might consider <a href="https://en.wikipedia.org/wiki/M%C3%B6bius_transformation">Möbius transformations</a>.</p>
</div>
<p>To really have fun experimenting, let’s make the above graphic interactive,
and let’s throw in an interactive disk model graphic as well, just for fun.</p>
<p>Be a bit careful with the disk model – because I’m using the same sliders
as the upper half plane model, you need to make sure that your center
stays in the unit circle.</p>
<div class="sage">
<script type="text/x-sage">
x,y = var('x,y')
dUHP(x1,y1,x2,y2) = 2 * arcsinh(1/2 * sqrt(((x2-x1)^2 + (y2-y1)^2) / (y1*y2)))
dPD(x1,y1,x2,y2) = arccosh(1 + (2 * ((x2-x1)^2 + (y2-y1)^2))/((1 - (x1^2 + y1^2))*(1 - (x2^2 + y2^2))))
@interact
def _(model=selector(['upper half plane', 'poincare disk'], buttons=True),
      x0=slider(-5,5,step_size=0.1, default=0),
      y0=slider(0.1,5,step_size=0.1, default=1/2),
      r=slider(0,3,step_size=0.1, default=1)):
    if model == "upper half plane":
        show("Upper Half Plane Circles")
        # draw the circle
        p1 = implicit_plot(dUHP(x,y,x0,y0) - r, (-5,5), (0,5))
        # draw the center
        p2 = point((x0,y0))
        show(p1+p2)
    else:
        show("Poincare Disk Circles")
        # draw the boundary circle of the poincare disk
        p1 = implicit_plot(x^2 + y^2 - 1, (-1.5,1.5), (-1.5,1.5), color="black")
        # draw the circle
        p2 = implicit_plot(dPD(x,y,x0,y0) - r, (-1.5,1.5), (-1.5,1.5))
        # draw the center
        p3 = point((x0,y0))
        show(p1+p2+p3)
</script>
</div>
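<p>The two distance formulas in the cell above can be sanity-checked against each other. The sketch below is plain Python rather than Sage. It assumes the Cayley transform $z \mapsto (z - i)/(z + i)$ as the identification between the two models, which is one standard choice and not something this post fixes. It checks numerically that this map carries the upper half plane distance to the disk distance.</p>

```python
import math

def d_uhp(z, w):
    """Upper half plane distance between complex numbers z, w with Im > 0."""
    return 2 * math.asinh(0.5 * abs(z - w) / math.sqrt(z.imag * w.imag))

def d_pd(z, w):
    """Poincare disk distance between complex numbers z, w with |z|, |w| < 1."""
    return math.acosh(1 + 2 * abs(z - w)**2 / ((1 - abs(z)**2) * (1 - abs(w)**2)))

def cayley(z):
    """Moebius map sending the upper half plane onto the unit disk (i goes to 0)."""
    return (z - 1j) / (z + 1j)

z, w = complex(0.3, 0.5), complex(-1.2, 2.0)
print(d_uhp(z, w), d_pd(cayley(z), cayley(w)))  # the two numbers should agree
```

<p>Writing points as complex numbers keeps both formulas short; the sample points are arbitrary.</p>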
<hr />
<h2 id="a-topological-problem">A Topological Problem</h2>
<p><br /></p>
<p><img src="/assets/images/two-sage-visuals/completely-different.gif" /></p>
<p>In the second half of this post we’ll go over a fun problem that Russell
put in the UCR discord. It’s Question 1 from Example 3.1.10 in Burago, Burago,
and Ivanov’s <a href="https://bookstore.ams.org/gsm-33">A Course in Metric Geometry</a>:</p>
<div class="boxed">
<p>What topological space do you get when you quotient $\mathbb{R}^2$ by
$(x,y) \sim (-y,2x)$?</p>
</div>
<p>I encourage you to give this a go by yourself before reading ahead! It took
me a few days to be really confident in my answer, and it was a
<em>lot</em> of fun to work through ^_^.</p>
<p>After a bit of looking for low hanging fruit (which, as far as I can tell,
wasn’t there), I decided to just hunker down and look for a fundamental domain.
It’s easy to see that this quotient space is the orbit space of the action
of $\mathbb{Z}$ on $\mathbb{R}^2$ where the generator acts by
\(T = \begin{pmatrix} 0 & -1 \\ 2 & 0 \end{pmatrix}\).</p>
<p>A little bit of experimentation lets you find a fundamental domain.
We know that $T^2 = -2I$, so applying $T$ twice scales everything by a factor of $2$ (up to a sign),
and that clues us in to look around the annulus between $1/2$ and $1$.</p>
<div class="auto">
<script type="text/x-sage">
x,y = var('x,y')
xr = (x,-2,2)
yr = (y,-2,2)
out = Graphics()
# the first annulus
region = [1/4 < x^2 + y^2, x^2 + y^2 < 1, x > 0, y > 0]
out += region_plot(region, xr, yr, incol="purple")
# the second annulus
region = [1/16 < x^2 + y^2, x^2 + y^2 < 1/4, x > 0, y > 0]
out += region_plot(region, xr, yr, incol="cyan")
out.show()
</script>
</div>
<p>We can convince ourselves that this really is a fundamental domain by
plotting the orbit of all of these regions and checking that they cover the
whole plane
(barring the origin, of course. Do you see why we have to treat it specially?).</p>
<div class="auto">
<script type="text/x-sage">
x,y = var('x,y')
N = 5
T = matrix([[0,-1],[2,0]])
xr = (x,-2,2)
yr = (y,-2,2)
def drawRegion(n):
    # apply T^n to the symbolic point (x,y). Multiplying by a vector
    # (rather than a 2x1 matrix) gives back two symbolic expressions,
    # so there's no need to round-trip through strings
    v1, v2 = T^n * vector([x,y])
    out = Graphics()
    # the first annulus
    region = [1/4 < v1^2 + v2^2, v1^2 + v2^2 < 1, v2 > 0, v1 > 0]
    out += region_plot(region, xr, yr, incol="purple")
    # the second annulus
    region = [1/16 < v1^2 + v2^2, v1^2 + v2^2 < 1/4, v2 > 0, v1 > 0]
    out += region_plot(region, xr, yr, incol="cyan")
    return out
out = Graphics()
for n in range(-N,N+1):
    out += drawRegion(n)
out.show()
</script>
</div>
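<p>Another way to test the fundamental domain is to write down the reduction explicitly: given a nonzero point, find the power of $T$ that moves it into the domain. The plain-Python sketch below does this. The function names are hypothetical, and the half-open boundary convention ($x \gt 0$, $y \geq 0$, $1/4 \leq |p| \lt 1$) is an assumption made here so that points on the axes get exactly one representative; the plots above use the open region.</p>

```python
def T(p):
    """Apply T = [[0, -1], [2, 0]] to the point p = (x, y)."""
    x, y = p
    return (-y, 2 * x)

def T_inv(p):
    """Apply the inverse of T, which sends (x, y) to (y/2, -x)."""
    x, y = p
    return (y / 2, -x)

def apply_power(p, n):
    """Apply T^n to p, for any integer n."""
    step = T if n >= 0 else T_inv
    for _ in range(abs(n)):
        p = step(p)
    return p

def in_domain(p):
    """Half-open quarter annulus: x > 0, y >= 0, 1/4 <= |p| < 1."""
    x, y = p
    return x > 0 and y >= 0 and 1 / 16 <= x * x + y * y < 1

def reduce_to_domain(p):
    """Return (n, q) with q = T^n p lying in the domain. The origin has no such n."""
    if p == (0, 0):
        raise ValueError("the origin is a fixed point and needs separate treatment")
    n, q = 0, p
    # T cycles the four quadrants, so at most 3 steps reach x > 0, y >= 0
    while not (q[0] > 0 and q[1] >= 0):
        n, q = n + 1, T(q)
    # T^4 = 4I, so powers of T^4 rescale without leaving the quadrant
    while q[0] ** 2 + q[1] ** 2 >= 1:
        n, q = n - 4, (q[0] / 4, q[1] / 4)
    while q[0] ** 2 + q[1] ** 2 < 1 / 16:
        n, q = n + 4, (4 * q[0], 4 * q[1])
    return n, q

n, q = reduce_to_domain((3.0, -5.0))
print(n, q)
```

<p>Two points in the same orbit reduce to the same representative, so comparing <code class="language-plaintext highlighter-rouge">reduce_to_domain(p)[1]</code> with <code class="language-plaintext highlighter-rouge">reduce_to_domain(T(p))[1]</code> is a quick way to test the claim on examples.</p>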
<p>From this information we can piece together the quotient space!
The pretty picture tells us that we have a (topological) hexagon
from the two cyan sides, the two purple sides, the purple top and the
cyan bottom.</p>
<div class="boxed">
<p>As a quick exercise, write down the hexagon above and figure out
which sides are identified. Why is this a torus?</p>
</div>
<p>So we understand \(\mathbb{R}^2 \setminus \{ 0 \} \bigg / \langle T \rangle\),
and the day of reckoning is upon us. We need to handle the origin.
It’s easy to see that our equivalence relation is not closed in
$\mathbb{R}^2 \times \mathbb{R}^2$, so our quotient space <em>cannot</em> be
Hausdorff.
Notice also that any neighborhood of the origin contains a tail of fundamental
domains. That tells us that every neighborhood in the quotient should do the same.</p>
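<p>One way to see that the relation is not closed, sketched here using only the fact that $T^4 = 4I$: fix any $p \neq 0$ and look at the pairs</p>

\[\big ( T^{-4k} p, \ p \big ) = \big ( 4^{-k} p, \ p \big ) \xrightarrow{k \to \infty} (0, p)\]

<p>Every pair on the left is in the relation, but the limit is not, since the orbit of $0$ is just $\{ 0 \}$. A Hausdorff quotient would force the relation to be closed in $\mathbb{R}^2 \times \mathbb{R}^2$, so this rules that out.</p>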
<p>So our picture is of a torus, plus one <em>really big</em> “generic” point.
Any neighborhood of this point contains the entire torus.</p>
<p>What a bizarre space, right? I had a <em>lot</em> of fun working this out!
Before I typed this up I was feeling a bit insecure about the solution
(I also had a much grosser fundamental domain at first), and asked about it
on <a href="https://math.stackexchange.com/q/4117907/655547">mse</a>. It doesn’t have an answer yet, but I’m feeling more confident
in my computation now, so I’m posting this anyways. I can always edit this post
if someone leaves an answer that totally changes how I think about the problem,
and you can always follow that link if you (in the future) want to see what
people had to say.</p>
<p>Finally, if you want to think about some similar things, I have a fun problem
for you ^_^</p>
<div class="boxed">
<p>Let’s look at the action of $T^2$ and $T^4$ on $\mathbb{R}^2$ instead.
These matrices are much easier to understand (they’re diagonal, for instance).</p>
<p>What are the quotient spaces of these actions? They generate subgroups of
$\langle T \rangle$, so (at least morally) we expect them to correspond to
covering spaces of $\mathbb{R}^2 \big / \langle T \rangle$. Do they?
If so, what does the covering action look like? If not, what goes wrong?</p>
</div>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:sageev-ex" role="doc-endnote">
<p>Exercise 2.15, for the curious <a href="#fnref:sageev-ex" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Tue, 27 Apr 2021 00:00:00 +0000
https://grossack.site/2021/04/27/two-sage-visuals.html
Quick Analysis Trick 5<p>The quarter is over, and now that I’m vaccinated (twice!) I feel comfortable
seeing people again. So I flew to the east coast to see my family and a bunch of friends.
Before I left, I had a few ideas for blog posts, and figured I would get around
to writing one now.</p>
<p>I’ve made it known that I struggle with analysis<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, and one manifestation of
this is a complete inability to remember elementary facts about inequalities.
It took me a long time to feel comfortable with things as basic as
“which way does the triangle inequality go?”, and until fairly recently things
like Cauchy-Schwarz were almost entirely beyond me. Over the past year or two,
I’ve been trying to answer lots of analysis questions on mse, as well as read
lots of books on analysis and solve lots of problems, and (thankfully) some of
it is starting to stick. But one inequality that I <em>always</em> seem to forget is
the <a href="https://en.wikipedia.org/wiki/Triangle_inequality#Reverse_triangle_inequality">reverse triangle inequality</a>:</p>
<div class="boxed">
<p>\(\Bigg | |x| - |y| \Bigg | \leq |x-y|\)</p>
</div>
<p>I don’t know many ways for showing a lower bound on absolute values,
but almost every time I need one, I go through the following process:</p>
<ol>
<li>“Doesn’t the reverse triangle inequality give a lower bound?”</li>
<li>“I wonder if I should use that. Let me google it!”</li>
<li>“Oh right, <em>that’s</em> what it says. How do I always forget this?”</li>
<li>“This is actually not as useful as I would have liked. Oh well.”</li>
</ol>
<p>The most recent time I went through this, something on the wikipedia page
really clicked with me, and I’m not sure why it never clicked before.
The geometric intuition<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> never really stays in my head, but for some reason
this did:</p>
<div class="boxed">
<p>The reverse triangle inequality says that the norm $\lVert \cdot \rVert$
of some vector space $X$ is 1-<a href="https://en.wikipedia.org/wiki/Lipschitz_continuity">lipschitz</a> as a function from $X \to \mathbb{R}$:</p>
\[\Bigg | \lVert x \rVert - \lVert y \rVert \Bigg | \leq \lVert x-y \rVert\]
<p>Or, even more suggestively:</p>
<p>\(d_\mathbb{R}(\lVert x \rVert, \lVert y \rVert) \leq d_X(x,y)\)</p>
</div>
<p>I’m trying to see why this is more memorable for me, and moreover why it’s
<em>suddenly</em> more memorable. Because I know that I’ve seen this before.</p>
<p>I think part of it is the visual and semantic distinction that we get by
writing $\lVert \cdot \rVert$ instead of $|\cdot|$. When everything in sight
was a real number, there were too many combinations of what we should and
shouldn’t be absolute-value-ing. As with many things in math and computer
science, taking some time to recognize the <a href="https://en.wikipedia.org/wiki/Type_system">types</a> involved in an equation
or proof, and then making sure to distinguish these types in your mind,
helps a lot for keeping the structures straight.</p>
<p>I think another reason this is memorable is because the notion of lipschitz
maps has become something I feel familiar with. When I was taking my first
undergraduate analysis class, I really didn’t know why we should care about
the various strengthenings of continuity. Over time I’ve learned to better
appreciate their differences, and I feel like lipschitz-ness is one of the
regularity conditions that I understand best<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>. It also makes intuitive
sense that taking the norm of a vector should be a very regular operation.</p>
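<p>That regularity takes only two lines to check from the ordinary triangle inequality. As a sketch, assuming nothing beyond the norm axioms:</p>

\[\lVert x \rVert = \lVert (x-y) + y \rVert \leq \lVert x-y \rVert + \lVert y \rVert
\quad \implies \quad
\lVert x \rVert - \lVert y \rVert \leq \lVert x-y \rVert\]

<p>Swapping the roles of $x$ and $y$ gives $\lVert y \rVert - \lVert x \rVert \leq \lVert y-x \rVert = \lVert x-y \rVert$, and the two bounds together are exactly the 1-lipschitz statement.</p>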
<p>Anyways, this isn’t so much a “trick” as a “mnemonic”, but I wanted to say
something about it anyways because I think it would have helped younger me.
At the very least, it was nice to write up a really short post with a
somewhat obvious observation. To make it somewhat more worth your time,
here’s a picture of my old cat Oreo. I got to visit her while I was visiting
<a href="https://remydavison.com">Remy</a> in New York!</p>
<p><img src="/assets/images/quick-analysis-trick-5/oreo.jpg" alt="My daughter, a gremlin" style="width: 400px; height: auto; display: block; margin-left: auto; margin-right: auto" /></p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Though I’ve done fairly well the past two quarters, which has been a
real confidence boost… It’s feeling better, but I still don’t feel
like I understand things as well as I should, and while it’s coming
faster, I still don’t feel like it’s coming naturally…
Maybe it’s imposter syndrome? Who’s to say ¯_(ツ)_/¯ <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>The difference between two legs of a triangle must be less than the
length of the third leg. Otherwise, by adding the length of the shorter
leg to both sides you would violate the triangle inequality. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Not that that’s saying much. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Sun, 21 Mar 2021 00:00:00 +0000
https://grossack.site/2021/03/21/quick-analysis-trick-5.html
Checking Concavity with Sage<p>I haven’t been on MSE lately, because I’ve been fairly burned out from a
few weeks of more work than I’d have liked. I’m actually still catching up,
with a commutative algebra assignment that should have been done last week.
I was (very kindly) given an extension, and I’ll be finishing it soon, though.</p>
<p>I meant to do it today, but I got my second covid vaccination earlier and it
really took a lot out of me. I’m feverish and have a pretty bad migraine, so
I didn’t want to work on “real things”, but I still wanted to feel productive…
MSE it is.</p>
<p>Today someone asked <a href="https://math.stackexchange.com/q/4055724/655547">a question</a>, which again I’ll paraphrase here:</p>
<div class="boxed">
<p>Why is \(\left ( 1+\frac{1}{x} \right )^x\) <a href="https://en.wikipedia.org/wiki/Concave_function">concave</a> (on $x > 0$)?</p>
</div>
<p>It clearly <em>is</em> concave. Here’s a picture of it:</p>
<p><img src="/assets/images/sage-concave/desmos.png" /></p>
<p>Obviously it has an asymptote at $e$, and should always be $\lt e$, so
it really should be concave… showing that is a bit of a hassle, though.</p>
<p>Thankfully, we can use <a href="https://sagemath.org">sage</a> to automate away most of the difficulties. I’ll
more or less be rehashing my answer here. The idea is to put this example of
using sage “in the wild” somewhere a bit easier to find than a stray mse post.</p>
<p>Showing that a function is convex (resp. concave) is routine but tedious,
especially when that function is twice differentiable. Then we can just check
$\frac{d^2f}{dx^2} \geq 0$ (resp. $\leq 0$) on the region of interest.
The issue here, of course, is that
$\frac{d^2}{dx^2} \left ( 1 + \frac{1}{x} \right )^x$ is… unpleasant.
Thankfully, sage doesn’t care in the least! Let’s see if we can bash out
the second derivative and show it’s $\leq 0$ (whenever $x > 0$, of course).</p>
<p>We start by defining $f$ and its second derivative</p>
<div class="linked_auto">
<script type="text/x-sage">
f(x) = (1+1/x)^x
secondDerivative = diff(f,x,2)
show(secondDerivative)
</script>
</div>
<p>In a perfect world, we could just… ask sage if this is $\leq 0$.
Unfortunately, the expression is a bit too complicated, and we don’t get
a clean answer<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>:</p>
<div class="linked_auto">
<script type="text/x-sage">
solve(secondDerivative < 0, x)
</script>
</div>
<p>This gives a list of lists of domains, and if you intersect all the domains
in some fixed list, you get a region where the second derivative is $\lt 0$.
Of course, these domains are far too complicated to be useful. We’ll need to
try something else.</p>
<p>Let’s look at the second derivative again.
We as humans can see how to clean it up a little, so let’s do that first:</p>
\[\left ( 1 + \frac{1}{x} \right )^x
\left [
\left ( \frac{1}{1+x} - \log \left ( 1 + \frac{1}{x} \right ) \right )^2 -
\frac{1}{x^3 + 2x^2 + x}
\right ]\]
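<p>Since this cleanup was done by hand, it's worth a quick numerical cross-check. The sketch below is plain Python (standard library only, not Sage) and compares the cleaned-up expression with a central second difference of $f$. The step size and sample points are arbitrary choices for illustration.</p>

```python
import math

def f(x):
    return (1 + 1 / x) ** x

def second_derivative_formula(x):
    """The hand-simplified second derivative displayed above."""
    g = 1 / (1 + x) - math.log(1 + 1 / x)
    return f(x) * (g ** 2 - 1 / (x ** 3 + 2 * x ** 2 + x))

def second_derivative_numeric(x, h=1e-3):
    """Central second difference; h is an arbitrary illustrative step size."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

for x in (0.5, 1.0, 2.0, 5.0):
    print(x, second_derivative_formula(x), second_derivative_numeric(x))
```

<p>The two columns should agree to several decimal places, which is evidence (not proof) that nothing went wrong in the simplification.</p>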
<p>Since $(1 + 1/x)^x$ is always positive when $x$ is, the sign of this expression
is controlled by the second factor. We might try to ask sage about the second
factor again, but you can check that it’s still too complicated for sage to
handle it automatically. We’ll need to simplify the expression if we want
to proceed.</p>
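<p>One obvious way we might try to simplify things is by getting rid of the
$\log$. After all, algebraic expressions are more combinatorial
in nature than things like $\log$, and so sage is better equipped to handle
them. Your first instinct might be to kill the $\log$s with taylor series,
since $x - \frac{x^2}{2} \leq \log(1+x)$, but we can be a bit
more efficient. It’s <a href="https://math.stackexchange.com/q/324345/655547">well known</a> that</p>
\[\frac{x}{1+x} \leq \log(1+x) \leq x\]
<p>So plugging in $1/x$ and negating, we see</p>
\[-\log(1+1/x) \leq - \frac{1}{1+x}\]
<p>This says the quantity we’re squaring is $\leq 0$, so to bound its square
from <em>above</em> we need an upper bound on $\log$ as well. Plugging $1/x$
into $\log(1+x) \leq x$ only settles the case $x \gt 1$, but the sharper estimate
$\log(1+t) \leq \frac{t}{\sqrt{1+t}}$ (for $t \geq 0$) is enough. Both sides
vanish at $t = 0$, and comparing derivatives comes down to
$2 \sqrt{1+t} \leq 2 + t$. Taking $t = 1/x$ gives</p>
\[0 \leq \log(1+1/x) - \frac{1}{1+x} \leq \frac{1}{\sqrt{x(x+1)}} - \frac{1}{1+x} = \frac{\sqrt{x+1} - \sqrt{x}}{(x+1)\sqrt{x}}\]
<p>But that means our expression of interest is upper bounded by</p>
\[\frac{\left ( \sqrt{x+1} - \sqrt{x} \right )^2 - 1}{x^3 + 2x^2 + x}\]
<p>The numerator is negative, since $0 \lt \sqrt{x+1} - \sqrt{x} \lt 1$,
and we’ve reduced the problem to showing</p>
\[- \frac{1}{x^3 + 2x^2 + x} \lt 0 \quad \quad (\text{when } x \gt 0)\]
<p>and in the interest of offloading as much thinking as possible to sage,
we see</p>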
<div class="linked_auto">
<script type="text/x-sage">
assume(x > 0)
bool(-1/(x^3 + 2*x^2 + x) < 0)
</script>
</div>
<p>and so $f$ is, in fact, concave.</p>
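<p>As an independent cross-check of the conclusion (again, not a proof), we can evaluate the second factor of the second derivative on a wide range of sample points and confirm it's negative at each one. This is a plain-Python sketch. The sampling range is an arbitrary choice, and <code class="language-plaintext highlighter-rouge">math.log1p</code> is used only to keep the evaluation accurate for large $x$.</p>

```python
import math

def second_factor(x):
    """Bracketed factor of the second derivative of (1 + 1/x)^x."""
    g = 1 / (1 + x) - math.log1p(1 / x)
    return g ** 2 - 1 / (x ** 3 + 2 * x ** 2 + x)

# log-spaced sample points from 1e-6 to 1e6 (the range is an arbitrary choice)
xs = [10 ** (k / 10) for k in range(-60, 61)]
worst = max(second_factor(x) for x in xs)
print(worst)  # negative if the factor is negative at every sample point
```

<p>A finite sample can't replace the inequality argument, but it's a cheap way to catch a sign error.</p>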
<hr />
<p>This was fairly painless, but we got pretty lucky with that estimate for
$\log$. I’m curious if there’s a way to completely automate this process,
and to remove all need for creativity. If anyone knows a simpler way to do
this, where we can just directly ask if the second derivative is negative,
I would love to hear about it!</p>
<p>We’re at least a little bit out of luck, since <a href="https://en.wikipedia.org/wiki/Richardson%27s_theorem">Richardson’s Theorem</a>
shows that it’s undecidable whether certain (very nice!) functions are
nonnegative. As an easy exercise, can you turn this into a proof that
checking convexity is undecidable on some similarly nice class of functions?</p>
<p>Even though logicians came to ruin the fun
(as we have a reputation for doing, unfortunately…),
I’m curious if any kind of result is possible. Maybe there’s some
hacky solution that works fairly well in practice?
Approximating every nonpolynomial by the first, say, 50 terms of its
taylor series comes to mind, but I’m currently struggling to get sage
to expand and simplify expressions in a way that makes me happy, so
manipulating huge expressions like that is, at least for now, a bit beyond me<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
<p>Again, all ideas are welcome ^_^</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Another thing I tried was to get sage to do <em>everything</em> for us. But for
some reason <code class="language-plaintext highlighter-rouge">bool(secondDerivative < 0)</code> returns false, even when we
<code class="language-plaintext highlighter-rouge">assume(x > 0)</code>… I suspect this is (again) because our expression is too
complicated. After all, it seems like there are <a href="https://ask.sagemath.org/question/42825/assumptions-and-inequalities/">issues</a> with <em>much</em>
simpler expressions than this one. If anyone knows how to make this kind
of direct query work, I would love to hear about it! <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>Wow. I know I speak (and write) in run-on sentences, but this one’s on
another level. I feel like I need a swear jar but for commas. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Tue, 09 Mar 2021 00:00:00 +0000
https://grossack.site/2021/03/09/sage-concave.html
Talk - Problem Solving Without Ansibles -- An Introduction to Communication Complexity<p>Wow, two talk posts in one day! Thankfully the actual talks were a week apart!</p>
<p>Earlier today I gave <em>another</em> talk in the grad student seminar here at UCR.
I wanted to break out of my logician mold a <em>little</em> bit, and so I decided to
talk about a result which absolutely blew my mind when I first saw it:
the (public coin) randomized communication complexity of equality is $O(1)$
for any fixed error tolerance $\epsilon$!</p>
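<p>This post doesn't spell the protocol out, so as a hedged sketch here is the textbook public-coin protocol for equality, which may differ in details from what the slides present. Alyss and Bob share random strings $r_1, \ldots, r_k$; Alyss sends the $k$ bits $\langle x, r_i \rangle \bmod 2$, and Bob accepts exactly when his own inner products match. Equal inputs are always accepted, and unequal inputs slip through with probability $2^{-k}$, so $k = \lceil \log_2(1/\epsilon) \rceil$ bits suffice no matter how long the inputs are. In the code, <code class="language-plaintext highlighter-rouge">random.Random(seed)</code> stands in for the shared public coins, and all the function names are made up for illustration.</p>

```python
import random

def parity(v):
    """Parity of the number of 1 bits in the non-negative integer v."""
    return bin(v).count("1") % 2

def alice_message(x, n_bits, k, seed):
    """Alice sends k bits: the inner products <x, r_i> mod 2 for shared random r_i."""
    coins = random.Random(seed)  # stands in for the public random string
    return [parity(x & coins.getrandbits(n_bits)) for _ in range(k)]

def bob_decides(y, message, n_bits, seed):
    """Bob recomputes the same inner products with y and compares."""
    coins = random.Random(seed)  # same seed, so Bob sees the same r_i
    return all(parity(y & coins.getrandbits(n_bits)) == bit for bit in message)

n_bits = 1000
x = random.Random(0).getrandbits(n_bits)
y = x ^ 1  # differs from x in a single bit

msg = alice_message(x, n_bits, k=20, seed=42)
print(len(msg), bob_decides(x, msg, n_bits, seed=42), bob_decides(y, msg, n_bits, seed=42))
```

<p>The message length is $k$ regardless of <code class="language-plaintext highlighter-rouge">n_bits</code>, which is the whole point: only the error tolerance determines how much Alyss has to send.</p>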
<p>Since communication complexity is all about measuring the number of messages
you send, I thought a fun framing device would be to imagine Alyss and Bob on
separate planets. If they’re very far from each other, but they each have the
computational power of an entire planet at their disposal, then it makes sense
to measure communication as the limiting factor in their computation. This was
in part inspired by <a href="http://www.cs.cmu.edu/~odonnell/">Ryan O’Donnell</a>’s excellent videogame themed talk
(<a href="https://www.youtube.com/watch?v=4B0jwIu9fPs">here</a>), and indeed I tried to make my slides in google slides instead of
beamer as an homage to him.</p>
<p>It definitely had advantages and disadvantages, but I liked a lot of the
flexibility it offers. I think he does his in powerpoint, which might solve
some of my bigger gripes. Notably drawing on the slides directly was impossible
(because any time you release your pen, the scribble tool closes itself…
that really needs a rethink on google’s end), so I had to do all the handwriting
in gimp, then insert the writing as an image in google slides. This was annoying
at first, but I gradually got into the flow of it. The most damning problem was
how annoying it is to write mathematical symbols. Every single $\epsilon$ gave
me a headache, and my entire browser lagged anytime the symbols menu was open.
I know there are some add-ons which might make this easier, but nothing can
beat a raw latex engine. In a more technical talk, I don’t think I could have
possibly made the slides in this way.</p>
<p>All in all, I was really pleased with how the talk went, though! I think it’s
an interesting enough topic to stand on its own, and it was fun getting to
evangelize computer science to a crowd of mathematicians. CMU’s math department
obviously worked quite closely with its CS department, and I forget sometimes
that that isn’t the norm. I knew going into the talk that I wanted to spend
some time talking about an interpretation of this using error-correcting codes
(<a href="https://en.wikipedia.org/wiki/Hamming_code">Hamming Codes</a> in particular), but I ended up scrapping it and not writing
the slides for it. In hindsight, I should have just made the slides, because
as soon as someone asked a question that even <em>hinted</em> at this idea, I pounced
on it and went on a mild tangent. I suspect I lost a lot of people during that,
and it would have been a lot easier to retain them if I’d just organized the
big ideas into slides. Oh well, I’m not going to fault past me too much for
their laziness.</p>
<p>All in all, I really enjoyed giving this talk, and it seemed like the
audience really enjoyed watching it. This was almost certainly due to the
influence of Ryan’s lecturing style, and anyone familiar with his (excellent)
“Theorist’s Toolkit” lectures (which I reference in the talk, and which you
can find <a href="https://www.youtube.com/playlist?list=PLm3J0oaFux3ZYpFLwwrlv_EHH9wtH6pnX">here</a>) will recognize his impact.</p>
<p>With that out of the way, here are the things:</p>
<hr />
<p>Problem Solving Without Ansibles: An Introduction to Communication Complexity</p>
<p>In the world of Science Fiction, an “ansible” is a device that allows for
faster-than-light communication. Without ansibles, interstellar travel
puts an interesting constraint on computation. If two planets want to
collaborate on solving a problem, the obstruction will likely not be the
computation that either planet does individually. Instead, what matters
is the <em>Communication Complexity</em> which tracks the amount of messages
the planets have to send each other to solve their problem. In this talk
we will solve a prototypical problem in communication complexity. But be
warned: the answer may surprise you!</p>
<p>You can find the slides <a href="/assets/docs/problem-solving-without-ansibles/handout.pdf">here</a></p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/ImCFucEag3I" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
Fri, 05 Mar 2021 00:00:00 +0000
https://grossack.site/2021/03/05/problem-solving-without-ansibles.html
Talk - Categories, Modalities, and Type Theories: Oh My<p>Last week I gave a talk at CMU’s Graduate Student Workshop on
Homotopy Type Theory (HoTT). You can see the schedule of talks
<a href="https://cmu-hott.github.io/workshop2021.html">here</a>.</p>
<p>A good friend of mine from undergrad, <a href="https://jacobneu.github.io/">Jacob Neumann</a>,
reached out to me about speaking here, and I super appreciate it.
It was great to see some familiar faces from the HoTT group at CMU,
and the talks were well worth the 8am start time<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>! Jacob’s talk on
<a href="https://ncatlab.org/nlab/show/allegory">Allegories</a> was well exposited, though it was regrettably cut short
before we could get to the fun modal logic interpretation. Similarly,
<a href="http://www.andrew.cmu.edu/user/wcaldwel/">Wes Caldwell</a>’s talk was a return to form for HoTT! It was cool seeing
some heavy duty algebraic topology recast in this language, even if the
ending was a bit beyond me right now. Steve Awodey was certainly impressed by
it, so I hope Wes is proud ^_^. I was upset to have missed
<a href="https://colinzwanziger.com/">Colin Zwanziger</a>’s talk, but I couldn’t miss the GSS at UCR<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>.</p>
<p>I felt like I should present some thoughts of my own at a seminar like
this one, and the subject matter is <em>just</em> on the boundary of what I feel
comfortable with. Since the organizers asked for a 30-45 minute talk, I
decided to stay on the shorter side:
“Better to remain silent and be thought a fool than to speak and remove all doubt”,
you know?</p>
<p>One idea that had been on my mind for a little while was making
sense of first order modal logic using presheaf categories. In particular,
for <a href="https://en.wikipedia.org/wiki/Algebraic_theory">algebraic theories</a>, we can view a model in a presheaf category as
a presheaf of models. Then if $\mathfrak{F}$ is a Kripke frame for $\mathsf{S4}$,
it is a preorder and thus a category in a natural way. So we should be able
to use modal logic to talk about models in $\mathsf{Set}^\mathfrak{F}$.</p>
<p>This idea was (and is) a bit half-baked, but it was an interesting and stressful
feeling to talk about ideas of my own (particularly ideas that haven’t been
fully realized yet) in a talk. <em>Particularly</em> a talk with so many people I
respect in attendance. I’m really grateful to the organizers and the attendees
for making it such a safe space for me to discuss these things, even if I’m sure
a lot of what I said was obvious to many people in the room. I really miss the
CMU HoTT group, and I might start attending the seminars again since they’re
online anyways.</p>
<p>I knew my idea wasn’t fully fleshed out<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>, but I also knew that something
like it must be right. So I took a moment to ask the audience if they knew
of any references for similar ideas. I had found a paper of Awodey and Kishida<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>
which uses topological spaces and a kind of “Étale Space” version of this idea,
but nothing which used presheaves directly. Thankfully, the audience gave me
a veritable barrage of papers to read
(many of which I’m moving to the front of my ever-increasing “to read” pile)!
For the interested:</p>
<ul>
<li><a href="https://www.andrew.cmu.edu/user/awodey/students/kishida.pdf">Kishida’s Thesis</a></li>
<li><a href="https://arxiv.org/abs/1403.0020">An extension to HOL</a> (this one was mentioned twice, so it <em>must</em> be good)</li>
<li><a href="https://www.sciencedirect.com/science/article/pii/S1571066111001393">Another Kishida paper</a>
(I’d actually skimmed this one already, but I’m including it for completeness)</li>
<li><a href="https://www.sciencedirect.com/science/article/pii/0168007293000854">A 77 pager on categorical modal logic</a>
(This is one that I’m definitely going to try to get through soon)</li>
</ul>
<p>Even if last Friday was <em>extremely</em> long, with seminars straight through from
8am - 3pm, then teaching from 3-5, it was entirely worth it! And now, for the
abstract and recording (as usual):</p>
<hr />
<p>Categories, Modalities, and Type Theories: Oh my!</p>
<p>Category theory and logic have a tight interplay, with
structured categories providing semantics for certain
logics, and “internal logics” providing a useful language for
speaking about structured categories. In this introductory
talk we will survey both directions of this correspondence
from the point of view of modal logic.</p>
<p>The slides are <a href="/assets/docs/categories-modalities-and-type-theories/handout.pdf">here</a>,
and the recording is below.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/qh96mjSmEyI" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Sometimes living on the west coast has its down sides… <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>And I’m <em>extra</em> glad I went that week, because it was Jonathan Alcaraz’s
talk on LO Groups, which led me to a problem I talked about
<a href="/2021/02/26/lo-groups">last week</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>Even calling it that is kind. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>Topology and Modality: the Topological Interpretation of First-Order Modal Logic,
DOI: 10.1017/S1755020308080143 <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Fri, 05 Mar 2021 00:00:00 +0000
https://grossack.site/2021/03/05/categories-modalities-and-type-theories.html
Cohomology Intuitively<p>So I was on mse the other day…</p>
<p><img src="/assets/images/cohomology-intuitively/letterkenny.jpg" /></p>
<p>Anyways, somebody <a href="https://math.stackexchange.com/q/4011756/655547">asked a question</a> about finding generators
in cohomology groups. I think understanding how to compute the generators
is important, but it’s equally important to understand what that computation
is doing. Regrettably, while there’s some very nice visual intuition<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> for
homology classes and what they represent, cohomology groups tend to feel a bit
more abstract.</p>
<p>This post is going to be an extension of the answer I gave on the linked
question. Cohomology is a big subject, and there were a lot of things that
I wanted to include in that answer that I didn’t have space for. A blog post
is a much more reasonable setting for something a bit more rambling anyways.
That said, everything contained in that answer will <em>also</em> be discussed here,
so it’s far from prerequisite reading.</p>
<p>In particular, we’re going to go over <a href="https://en.wikipedia.org/wiki/Simplicial_homology">Simplicial Cohomology</a><sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>,
but we’ll steal language from <a href="https://en.wikipedia.org/wiki/De_Rham_cohomology">De Rham Cohomology</a>, and our first
example will be a kind of informal cohomology just to get the idea across.</p>
<p>There’s a really nice example that <a href="https://www.math.cmu.edu/~ffrick/">Florian Frick</a> gave when I took his
algebraic topology class, and it was such good pedagogy I have to evangelize it.
The idea is to study simplicial cohomology for graphs – it turns out to say
something which is very down to earth, and we can then view cohomology of more
complicated simplicial complexes as a generalization of this.</p>
<p>Graphs are one of very few things that we can really understand, and so using
them as a case study for more complex theorems tends to be a good idea.
As such, we’ll study what cohomology on graphs is all about, and there will
even be some sage code at the bottom so you can check some small cases yourself!</p>
<p>With that out of the way, let’s get started ^_^</p>
<hr />
<p>First things first. Let’s give a high level description of what
cohomology does for us.</p>
<p>Say you have a geometric object, and you want to define a function
on it. This is a very general framework, and “geometric” can mean a lot
of different things. Maybe you want to define a continuous function on some
topological space. Or perhaps you’re interested in smooth functions on a
manifold. The same idea works for defining functions on schemes as well,
and the rabbit hole seems to go endlessly deep!</p>
<div class="boxed">
<p>For concreteness, let’s say we want to define a square root function
on the complex plane. So our “geometric object” will be $\mathbb{C}$
and our goal will be to define a continuous $\sqrt{\cdot}$ on it.</p>
</div>
<p>It’s often the case that you know what you <em>want</em> your function to do
somewhere (that’s why we want to define it at all!), and then you would like
to <em>extend</em> that function to a function defined everywhere.</p>
<div class="boxed">
<p>For us, then, we know we want $\sqrt{1} = 1$.</p>
<p>This is an arbitrary choice, but it certainly seems like a natural one.</p>
<p>We now want to extend $\sqrt{\cdot}$ (continuously!) to the rest of the plane.</p>
</div>
<p>It is <em>also</em> often the case that the continuity/smoothness/etc. constraint
means that there’s only one way to define your function locally. So it
should be “easy” (for a certain notion of easy) to do the extension in a small
neighborhood of anywhere it’s already been defined.</p>
<div class="boxed">
<p>So we know that $\sqrt{1} = 1$. What should $\sqrt{i}$ be?</p>
<p>Well, the real part of $\sqrt{1} = 1$ is positive. So if we want
$\sqrt{\cdot}$ to be continuous we had better make sure $\sqrt{i}$
has positive real part as well
(otherwise we would contradict the intermediate value theorem).</p>
<p>So we’re forced into defining $\sqrt{i} = \frac{1 + i}{\sqrt{2}}$.</p>
</div>
<p>However, sometimes the geometry of your space prevents you from gluing all
of these small easy solutions together. You might have all of the pieces lying
around to build your function, but the pieces don’t quite fit together right.</p>
<div class="boxed">
<p>As before, we know $\sqrt{1} = 1$, and this forces
$\sqrt{i} = \frac{1 + i}{\sqrt{2}}$. This has positive imaginary part.</p>
<p>If we want to extend this continuously from $\sqrt{i}$ to $\sqrt{-1}$,
we have no choice but to define $\sqrt{-1} = i$
(which also has positive imaginary part).
This is the intermediate value theorem again.</p>
<p>But now we can go from $\sqrt{-1}$ to $\sqrt{-i}$. Again, we have to keep
the imaginary part positive, and we’re forced into choosing
$\sqrt{-i} = \frac{-1 + i}{\sqrt{2}}$.</p>
<p>And lastly, we go from $\sqrt{-i}$ back to $\sqrt{1}$. The real part is negative
now, and we’re forced by continuity into choosing $\sqrt{1} = -1$…</p>
<p>Uh oh.</p>
</div>
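<p>To watch this failure happen numerically, here is a minimal sketch in plain Python. It uses only the standard library, and the function name and step count are made up for illustration. We walk once counterclockwise around the unit circle, and at each step we pick whichever of the two square roots is closest to the previous choice, which is the only option continuity leaves us.</p>

```python
import cmath
import math

def continue_sqrt_around_circle(steps=1000):
    """Follow a continuous choice of sqrt(z) once around the unit circle.

    Returns the chosen root after each quarter turn, keyed by 1, 2, 3, 4
    (that is, at z = i, -1, -i, and back at z = 1).
    `steps` is assumed to be a multiple of 4 so the checkpoints are hit exactly.
    """
    w = 1 + 0j  # our starting choice: sqrt(1) = 1
    checkpoints = {}
    for k in range(1, steps + 1):
        z = cmath.exp(2j * math.pi * k / steps)
        r = cmath.sqrt(z)  # one square root of z; the other one is -r
        # continuity forces us to take the root nearest the previous value
        w = min((r, -r), key=lambda c: abs(c - w))
        if (4 * k) % steps == 0:
            checkpoints[(4 * k) // steps] = w
    return checkpoints

for quarter, value in continue_sqrt_around_circle().items():
    print(quarter, value)
```

<p>With the default step count the four checkpoints come out (up to rounding) as $\frac{1+i}{\sqrt{2}}$, $i$, $\frac{-1+i}{\sqrt{2}}$, and finally $-1$, so we arrive back at $z = 1$ holding the “wrong” square root.</p>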
<p>Obviously the above argument isn’t entirely rigorous. That said, it does a
good job outlining what problem cohomology solves. We had only one choice
at every step, and at every step nothing could go wrong. Yet somehow, when
we got back where we started, our function was no longer well defined.
We thus come to the following obvious question:</p>
<div class="boxed">
<p>If you have a way to solve your problem <em>locally</em>, can we tell if those local
solutions patch together to form a <em>global</em> solution?</p>
</div>
<p>It turns out the answer is yes! Our local solutions come from a global
solution exactly when the “cohomology class” associated to our function
vanishes<sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup>.</p>
<p>There’s a bit of a zoo of cohomology theories depending on exactly what kinds
of functions you’re trying to define<sup id="fnref:4" role="doc-noteref"><a href="#fn:4" class="footnote" rel="footnote">4</a></sup>. They all work in a similar way, though:
your local solutions piece together exactly when their cohomology class is $0$.
So the nonzero cohomology classes are all of the different “obstructions” to
piecing your solutions together. Rather magically, cohomology groups tend to
be finite dimensional, and so there are finitely many “basic” obstructions
which are responsible for all the ways you might fail to glue your pieces
together.</p>
<p><span class="defn">De Rham Cohomology</span> makes this precise by looking at
certain differential equations which can be solved locally. Then the cohomology
theory tells us which differential equations admit a global solution. In this
post, though, we’re going to spend our time thinking about
<span class="defn">Simplicial Cohomology</span>. Simplicial Cohomology doesn’t
have quite as snappy a description, but it’s more combinatorial in nature,
which makes it easier to play around with.</p>
<hr />
<p>Actually setting up cohomology requires a fair bit of machinery, so before
we go through the formalities I want to take a second to detail the problem
we’ll solve.</p>
<p>Take your favorite graph, but make sure you label the vertices.
My favorite graph (at least for the purposes of explanation) is this one:</p>
<p><img src="/assets/images/cohomology-intuitively/naked-graph.png" width="50%" /></p>
<p>Notice the edges are always oriented from the smaller vertex to the bigger one.
This is not an accident, and keeping a consistent choice of orientation is
important for what follows. The simplest approach is to order your vertices,
then follow the convention of $\text{small} \to \text{large}$.</p>
<p>Now our problem will be to “integrate” a function defined on the edges
to one defined on the vertices. What do I mean by this? Let’s see some
concrete examples:</p>
<p><img src="/assets/images/cohomology-intuitively/coboundary.png" width="50%" /></p>
<p>Here we see a function defined on the edges. Indeed, we could write this
more formally as</p>
\[\begin{aligned}
f(e_{01}) &= 5 \\
f(e_{02}) &= 5 \\
f(e_{12}) &= 0
\end{aligned}\]
<p>The goal now is to find a function on the vertices whose difference
along each edge agrees with our function. This is what I mean when I
say we’re “integrating” this edge function to the vertices.
It’s not hard to see that the following works:</p>
<p><img src="/assets/images/cohomology-intuitively/coboundary-integrated.png" width="50%" /></p>
<p>Again, if you like symbols, we can write this as</p>
\[\begin{aligned}
F(v_0) &= 3 \\
F(v_1) &= 8 \\
F(v_2) &= 8 \\
\end{aligned}\]
<p>Then we see for each edge $f(e_{ij}) = F(v_j) - F(v_i)$. This may seem
like a weird problem to try and solve, but at least we solved it!
Notice we pick up an arbitrary constant when we do this –
we can set $F(v_0) = C$ for any $C$ we want as long as $F(v_1) = F(v_2) = C+5$.
This is one parallel with integration, and helps justify our language.</p>
<p>As some more justification, notice this obeys a kind of “fundamental theorem
of calculus”: If you want to know the total edge values along some path,
\(\displaystyle \sum_{v_{k_1} \to v_{k_2} \to \ldots \to v_{k_n}} f(e_{k_i, k_{i+1}})\),
that turns out to be exactly $F(v_{k_n}) - F(v_{k_1})$ for some “antiderivative” $F$ of $f$.</p>
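<p>If you would rather check this by machine, here is a small sketch in plain Python using the values of $f$ and $F$ from the example above. The dictionary encoding and the helper names are just one convenient choice, not anything canonical.</p>

```python
# The edge function f and vertex function F from the example.
# An edge i -> j (with i smaller than j) is stored as the pair (i, j).
f = {(0, 1): 5, (0, 2): 5, (1, 2): 0}
F = {0: 3, 1: 8, 2: 8}

def is_antiderivative(F, f):
    """Check that f(e_ij) = F(v_j) - F(v_i) on every edge."""
    return all(F[j] - F[i] == value for (i, j), value in f.items())

def path_sum(f, path):
    """Add up f along a path that follows the edge orientations."""
    return sum(f[(a, b)] for a, b in zip(path, path[1:]))

print(is_antiderivative(F, f))               # the differences match f
print(path_sum(f, [0, 1, 2]), F[2] - F[0])   # the two numbers agree

# Shifting F by a constant gives another antiderivative.
shifted = {v: value + 10 for v, value in F.items()}
print(is_antiderivative(shifted, f))
```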
<div class="boxed">
<p>As a (fun?) exercise, you might try to formulate and prove an analogue of the
other half of the fundamental theorem of calculus. That is, can you formulate
a kind of “derivative” $d$ which takes functions on the vertices to functions
on the edges? Once you have, can you show that differentiating an antiderivative
gets you your original function?</p>
<p>For (entirely imaginary) bonus points, you might try to come up with a
parallel between edge functions of the form $dF$ (that is, edge functions
which have an antiderivative) and <a href="https://en.wikipedia.org/wiki/Conservative_vector_field">conservative vector fields</a>.</p>
</div>
<p>Let’s look at a different function now:</p>
<p><img src="/assets/images/cohomology-intuitively/cocycle.png" width="50%" /></p>
<p>You can quickly convince yourself that no matter how hard you try,
you can’t integrate this function. There is no antiderivative in the sense that
no function on the vertices can possibly be compatible with our function on the edges.</p>
<p>After all, say we assign $v_0$ the value $C$. Then $v_1$ is forced into
the value $C+5$ in order to agree with $e_{01}$. But then because of
$e_{12}$ we must set $v_2 = C+5$ as well, and uh oh! Our hand was forced
at every step, but looking at $e_{02}$ we see we’re out of luck.</p>
<p>This should feel at least superficially similar to the $\sqrt{\cdot}$
example from earlier. At each step along the way, it’s easy to solve
our problem: If you know what $F(v_i)$ is, and you see an edge $v_i \to v_j$,
just assign $F(v_j)$ to $F(v_i) + f(e_{ij})$. The problem comes from making
these choices <em>consistently</em>, which turns out to not always be possible!</p>
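<p>The “follow your nose” procedure in the last paragraph can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and since the picture’s value on $e_{02}$ isn’t written out in the text, the second example uses $0$ there as a stand-in (any value other than $5$ leads to the same conclusion).</p>

```python
from collections import deque

def integrate(vertices, f, base_value=0):
    """Try to build F with f(e_ij) = F(v_j) - F(v_i) on every edge.

    Returns a dict F if the forced choices are consistent, and None
    if some edge disagrees with the values we were forced into.
    """
    # walking an edge backwards subtracts its value instead of adding it
    neighbors = {v: [] for v in vertices}
    for (i, j), value in f.items():
        neighbors[i].append((j, value))
        neighbors[j].append((i, -value))

    F = {}
    for start in vertices:
        if start in F:
            continue
        F[start] = base_value  # the arbitrary constant, one per component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w, value in neighbors[u]:
                if w not in F:
                    F[w] = F[u] + value      # our hand is forced
                elif F[w] != F[u] + value:
                    return None              # the pieces don't fit together
                else:
                    continue
                queue.append(w)
    return F

good = {(0, 1): 5, (0, 2): 5, (1, 2): 0}
bad = {(0, 1): 5, (0, 2): 0, (1, 2): 0}   # value on (0, 2) is a stand-in

print(integrate([0, 1, 2], good, base_value=3))
print(integrate([0, 1, 2], bad))
```

<p>The first call recovers the antiderivative from before, and the second returns <code>None</code> because the triangle’s cycle makes the forced choices collide.</p>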
<div class="boxed">
<p>As an aside, you can see that the problem comes from the fact that our
graph has a cycle in it. Can you show that, on an acyclic graph,
<em>every</em> edge function can be integrated to a function on the vertices?</p>
<p>We will soon see that the functions which can’t be integrated are
(modulo an equivalence relation) exactly the cohomology classes. So the
presence of a function which can’t be integrated means there must be a cycle
in our graph, and it is in this sense that cohomology “detects holes”.</p>
<p>This is entirely analogous to the fact that every (irrotational) vector
field on a simply connected domain is conservative. It seems the presence
of some “hole” is the reason some functions don’t have primitives.</p>
</div>
<hr />
<p>Ok, so now we know what problem we’re trying to solve. When can we
find an antiderivative for one of these edge functions? The machinery ends up
being a bit complicated, but that’s in part because we’re working with
graphs, which are one dimensional simplicial complexes.
This <em>exact same setup</em> works for spaces of arbitrary dimension,
so it makes sense that it would feel a bit overpowered for our comparatively
tiny example.</p>
<p>First things first, we look at the <a href="https://en.wikipedia.org/wiki/Free_abelian_group">free abelian groups</a> generated
by our $n$-dimensional cells. For us, we only have $0$-dimensional vertices
and $1$-dimensional edges. So we have to consider two groups:</p>
\[\mathbb{Z}E \text{ and } \mathbb{Z}V\]
<p>For the example from before, that means</p>
\[\mathbb{Z} \{ e_{01}, e_{12}, e_{02} \} \text{ and } \mathbb{Z} \{ v_0, v_1, v_2 \}\]
<p>which are both isomorphic to $\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}$.</p>
<p>Second things second. We want to <em>connect</em> these two groups together in a way
that reflects the combinatorial structure. We do this with the
<span class="defn">Boundary Map</span> $\partial : \mathbb{Z}E \to \mathbb{Z}V$.
This map takes an edge to its “boundary”, so $\partial e_{01} = v_1 - v_0$.
Since we have a basis floating around anyways, it’s convenient to represent
this map by a matrix:</p>
\[\partial =
\begin{pmatrix}
-1 & 0 & -1 \\
1 & -1 & 0 \\
0 & 1 & 1
\end{pmatrix}\]
<p>So for instance,</p>
\[\partial e_{01} =
\begin{pmatrix}
-1 & 0 & -1 \\
1 & -1 & 0 \\
0 & 1 & 1
\end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} =
\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} =
v_1 - v_0\]
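<p>Here is a hedged sketch of how one might build this matrix from an edge list in plain Python. The helper names are invented for illustration, and the columns are ordered $e_{01}, e_{12}, e_{02}$ to match the basis above.</p>

```python
def boundary_matrix(vertices, edges):
    """Rows are indexed by vertices, columns by edges.

    The column for an edge i -> j has -1 in row i and +1 in row j,
    which encodes the chain v_j - v_i.
    """
    row_of = {v: r for r, v in enumerate(vertices)}
    matrix = [[0] * len(edges) for _ in vertices]
    for col, (i, j) in enumerate(edges):
        matrix[row_of[i]][col] -= 1
        matrix[row_of[j]][col] += 1
    return matrix

def apply(matrix, vector):
    """Plain matrix-vector multiplication."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

vertices = [0, 1, 2]
edges = [(0, 1), (1, 2), (0, 2)]
d = boundary_matrix(vertices, edges)

for row in d:
    print(row)
print(apply(d, [1, 0, 0]))  # the coordinates of v_1 - v_0
```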
<p>Now our groups assemble into a <span class="defn">Chain Complex</span>:</p>
\[\cdots \to
0 \to
0 \to
\mathbb{Z}E \overset{\partial}{\longrightarrow}
\mathbb{Z}V\]
<p>The extra groups $0$ correspond to higher dimensional simplices that
aren’t present for us. If we filled in our cycle with a $2$-dimensional
triangular <em>face</em>, then we would have an extra group $\mathbb{Z}F$ and
an extra map (which is <em>also</em> called $\partial$, rather abusively)
from $\mathbb{Z}F \to \mathbb{Z}E$ which takes a face to its boundary
(this might also help explain the term “boundary”). Then if we had a
“cycle” of faces, we could fill them in with a (solid) tetrahedron.
So we would have a new group $\mathbb{Z}T$, equipped with a map
$\partial : \mathbb{Z}T \to \mathbb{Z}F$ taking each tetrahedron to
its boundary of faces. Of course, this goes on and on into higher and
higher dimensions<sup id="fnref:5" role="doc-noteref"><a href="#fn:5" class="footnote" rel="footnote">5</a></sup>.</p>
<p>There’s actually a technical condition to be a chain complex that’s
automatically satisfied for us because our chain only has one interesting
“link”. Given an $(n+2)$-dimensional simplex $\sigma$, we need to know that
$\partial \partial \sigma = 0$.
I won’t say much more about it now, but I might write a blog post
giving examples of higher-dimensional simplicial cohomology at some point.
When that happens, we’ll have no choice but to go into more detail.</p>
<div class="boxed">
<p>As a quick exercise:</p>
<p>What is the boundary $\partial (e_{01} + e_{12})$? What, intuitively,
does $e_{01} + e_{12}$ represent? Does it make sense why the <em>boundary</em> of
this figure should be what it is?</p>
<p>What about $\partial (e_{01} + e_{12} - e_{02})$? Again, what does
$e_{01} + e_{12} - e_{02}$ represent? Does it make sense why
the <em>boundary</em> of this figure should be what it is?</p>
</div>
<p>So we know that elements of $\mathbb{Z}E$ (resp. $\mathbb{Z}V$)
represent (linear combinations of) edges (resp. vertices). Of course,
we want to look at <em>functions</em> defined on the edges and vertices.
So our next step is to <em>dualize</em> this chain:</p>
\[\cdots \leftarrow
\text{Hom}(0, \mathbb{R}) \leftarrow
\text{Hom}(0, \mathbb{R}) \leftarrow
\text{Hom}(\mathbb{Z}E, \mathbb{R}) \overset{\partial^T}{\longleftarrow}
\text{Hom}(\mathbb{Z}V, \mathbb{R})\]
<p>We’re now looking at all (linear) functions from
$\mathbb{Z}E \to \mathbb{R}$ (resp. $\mathbb{Z}V \to \mathbb{R}$).
By the universal property of free abelian groups, though, we know
that the functions $\mathbb{Z}E \to \mathbb{R}$ correspond exactly
to the functions $E \to \mathbb{R}$ (extended linearly).</p>
<p>Moreover, our boundary operator $\partial$ has become a <em>coboundary</em>
operator $\partial^T$ that points the other direction<sup id="fnref:6" role="doc-noteref"><a href="#fn:6" class="footnote" rel="footnote">6</a></sup>. Here if
$F : V \to \mathbb{R}$ then we define $\partial^T F : E \to \mathbb{R}$ by</p>
\[(\partial^T F) (e) = F(\partial e)\]
<p>Moreover, our notation $\partial^T$ is not misleading.
$\text{Hom}(\mathbb{Z}V, \mathbb{R})$ has a basis of characteristic functions</p>
\[\{ \chi_{v_0}, \chi_{v_1}, \chi_{v_2} \}\]
<p>where</p>
\[\chi_{v_i}(v_j) = \begin{cases} 1 & i=j \\ 0 & i \neq j \end{cases}\]
<p>Similarly $\text{Hom}(\mathbb{Z}E, \mathbb{R})$ has a basis of characteristic
functions, and it turns out that, with respect to these “dual bases”, the map
$\partial^T$ is actually represented by the transpose of $\partial$!</p>
<div class="boxed">
<p>If you haven’t seen this before, you should convince yourself that
it’s true. Remember that the transpose of a matrix has
<a href="https://en.wikipedia.org/wiki/Transpose_of_a_linear_map">something to do with</a> dualizing.</p>
<p>Moreover, you should check that a function $f$ on the edges
is in the image of $\partial^T$ exactly when it can be integrated.
Moreover, if $f = \partial^T F$, then $F$ <em>is</em> an antiderivative
for $f$.</p>
</div>
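<p>A numerical sanity check is no substitute for convincing yourself, but it can be reassuring. The sketch below (plain Python, with made-up helper names) computes $\partial^T F$ in two ways for the antiderivative $F$ from earlier: once straight from the definition $F(\partial e)$, and once by multiplying by the transpose of the matrix for $\partial$.</p>

```python
# The matrix of the boundary map: rows v_0, v_1, v_2 and columns e_01, e_12, e_02.
boundary = [
    [-1,  0, -1],
    [ 1, -1,  0],
    [ 0,  1,  1],
]
edges = [(0, 1), (1, 2), (0, 2)]
F = [3, 8, 8]  # F(v_0), F(v_1), F(v_2)

def coboundary_by_definition(F, edges):
    """(dF)(e_ij) = F(boundary of e_ij) = F(v_j) - F(v_i), by linearity."""
    return [F[j] - F[i] for (i, j) in edges]

def coboundary_by_transpose(F, boundary):
    """Multiply the vector F by the transpose of the boundary matrix."""
    transpose = [list(column) for column in zip(*boundary)]
    return [sum(a * x for a, x in zip(row, F)) for row in transpose]

print(coboundary_by_definition(F, edges))
print(coboundary_by_transpose(F, boundary))
```

<p>Both lines print the values of the edge function $f$ we integrated before, listed in the order $e_{01}, e_{12}, e_{02}$.</p>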
<p>We’re in the home stretch! The second half of that box alludes to
something important: A function $f$ can be integrated exactly when
it is in the image of $\partial^T$. With this in mind, we’re finally
led to the definition of the cohomology group of our graph:</p>
<p>Since the only map $0 \to \mathbb{R}$ is the trivial one, we can
rewrite our complex as:</p>
\[\cdots \overset{0}{\leftarrow}
0 \overset{0}{\leftarrow}
0 \overset{0}{\leftarrow}
\text{Hom}(\mathbb{Z}E, \mathbb{R}) \overset{\partial^T}{\longleftarrow}
\text{Hom}(\mathbb{Z}V, \mathbb{R})\]
<p>Then we define<sup id="fnref:7" role="doc-noteref"><a href="#fn:7" class="footnote" rel="footnote">7</a></sup></p>
\[H^1 =
\frac
{
\text{Ker}\big ( \partial^T : \text{Hom}(\mathbb{Z}E, \mathbb{R}) \to 0 \big )
}{
\text{Im} \big ( \partial^T : \text{Hom}(\mathbb{Z}V, \mathbb{R}) \to \text{Hom}(\mathbb{Z}E, \mathbb{R}) \big )
}\]
<p>Since there are no two dimensional faces, $\partial^T : \text{Hom}(\mathbb{Z}E, \mathbb{R}) \to 0$
is the trivial map, and so its kernel is everything. In light of this,
we see a slightly simpler definition of $H^1$:</p>
\[H^1 =
\frac
{
\text{Hom}(\mathbb{Z}E, \mathbb{R})
}{
\text{Im} \big ( \partial^T : \text{Hom}(\mathbb{Z}V, \mathbb{R}) \to \text{Hom}(\mathbb{Z}E, \mathbb{R}) \big )
}\]
<p>This says the elements of $H^1$ are exactly the functions on edges, but
we’ve quotiented out by all the functions that we can integrate to the vertices!
So if we want to check if a function can be integrated, we just compute its
cohomology class and check if it’s $0$.</p>
<p>Moreover, a <em>basis</em> of $H^1$ as an $\mathbb{R}$-vector space gives us
a collection of “basic” non-integrable functions. Then <em>every</em> function
on the edges can be written as an integrable one, plus some linear
combination of the basic nonintegrable ones. This dramatically reduces
the number of things we have to think about! From the point of view of
integration, we only need to worry about the “good” functions
(which admit antiderivatives) and (typically finitely many) “bad”
functions which we can handle on a case-by-case basis.</p>
<div class="boxed">
<p>If we put $0$s to the right of $\mathbb{Z}V$ as well as to the left of
$\mathbb{Z}E$, we can also look at</p>
\[H^0 =
\frac
{
\text{Ker}\big ( \partial^T : \text{Hom}(\mathbb{Z}V, \mathbb{R}) \to \text{Hom}(\mathbb{Z}E, \mathbb{R}) \big )
}{
\text{Im} \big ( \partial^T : 0 \to \text{Hom}(\mathbb{Z}V, \mathbb{R}) \big )
} =
\text{Ker}\big ( \partial^T : \text{Hom}(\mathbb{Z}V, \mathbb{R}) \to \text{Hom}(\mathbb{Z}E, \mathbb{R}) \big )\]
<p>What is the dimension of $H^0$ in our example? What about for a graph with
multiple connected components? In this sense, $H^0$ detects “$0$-dimensional holes”.</p>
</div>
<hr />
<p>We’ve spent some time now talking about what cohomology is. But again,
part of its power comes from how <em>computable</em> it is. Without the exposition,
you can see it’s really a three step process:</p>
<ol>
<li>
<p>Turn your combinatorial data into a chain complex by taking free
abelian groups and writing down boundary matrices $\partial$.</p>
</li>
<li>
<p>Dualize by hitting each group with $\text{Hom}(\cdot, \mathbb{R})$</p>
</li>
<li>
<p>Compute the kernels and images of $\partial^T$, then take quotients.</p>
</li>
</ol>
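<p>For graphs, the three steps above can be sketched by hand in plain Python. This is a rough illustration under some assumptions: it only handles graphs (so the kernel in $H^1$ is everything and $\dim H^1$ is the number of edges minus the rank of $\partial^T$), it works over the rationals via <code>Fraction</code> rather than the reals, and all of the function names are invented.</p>

```python
from fractions import Fraction

def coboundary_matrix(vertices, edges):
    """Steps 1 and 2: the transpose of the boundary matrix.

    Rows are indexed by edges, columns by vertices.
    """
    col_of = {v: c for c, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in edges]
    for row, (i, j) in enumerate(edges):
        matrix[row][col_of[i]] -= 1
        matrix[row][col_of[j]] += 1
    return matrix

def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(len(rows)):
            if r != pivots and rows[r][col] != 0:
                factor = rows[r][col] / rows[pivots][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def dim_H1(vertices, edges):
    """Step 3 for graphs: dim(all edge functions) - dim(image of the coboundary)."""
    if not edges:
        return 0
    return len(edges) - rank(coboundary_matrix(vertices, edges))

print(dim_H1([0, 1, 2], [(0, 1), (1, 2), (0, 2)]))  # the triangle
print(dim_H1([0, 1, 2], [(0, 1), (1, 2)]))          # a path, which is acyclic
```

<p>The triangle gives dimension $1$ and the path gives dimension $0$, in line with the earlier claim that every edge function on an acyclic graph can be integrated.</p>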
<p>Steps $1$ and $2$ should feel very good, and hopefully you’re aware that
taking kernels and images of a linear map <em>should</em> be easy
(even if I know I’m pretty rusty). It turns out computing a vector space
quotient is <em>also</em> easy, though it’s much less widely taught. That doesn’t
matter, though, since <a href="https://www.sagemath.org">sage</a> absolutely remembers
how to do it.</p>
<p>Since it’s so computable, and the best way to gain intuition for things
is to work through examples, I’ve included some code to do just that!</p>
<hr />
<div class="boxed">
<p>Enter a description of a graph, and then try to figure out what you think
the cohomology should be.</p>
<p>See if you can find geometric features of your graph which make the
dimension obvious! If you want a bonus challenge, can you guess what the
generators will be? Keep in mind there are lots of generating sets, so you
may get a different answer from what sage tells you even if you’re right.</p>
<p>You might also try to <em>implement</em> the algorithm we described yourself,
at least for simple cases like graphs. You can then check yourself against
the built in sage code below!</p>
</div>
<div class="linked_auto">
<script type="text/x-sage">
# Write the edges in the box.
# You can add isolated vertices by including
# an 'edge' with only one vertex
import sys  # for the manual flushes below
@interact
def _(Simplices = input_box([["a"],["b","c"],["c","d"],["b","d"]], width=50), auto_update=False):
    show("The graph is:")
    S = SimplicialComplex(Simplices)
    show(S.graph()) # It looks like there's no builtin way to draw complexes...
    show("The chain complex is:")
    # we did it over the reals in the post,
    # but if we use the reals here, sage will
    # print 1.00000000000000 instead of 1...
    # so we're using the rationals instead
    C = S.chain_complex(base_ring=QQ)
    # mathjax uses its own font and I'm too lazy to change it
    # but it's not monospace, so the ascii_art looks silly
    # when we `show` it...
    # the solution is to print it instead, since I have
    # control over non-mathjax fonts.
    # but printing doesn't flush the output buffer, so
    # things show up in a silly order!
    # we can fix this by manually flushing the buffer ourselves.
    # this means the cell complexes are going to be left-aligned, though
    # which we'll just have to deal with.
    print(ascii_art(C))
    sys.stdout.flush()
    show("Which dualizes to:")
    Cdual = C.dual()
    print(ascii_art(Cdual))
    sys.stdout.flush()
    show("So the cohomology is:")
    # the cohomology of the original complex is
    # exactly the homology of the dual complex.
    H1 = Cdual.homology(deg=1,generators=True)
    show(QQ^(len(H1)))
    # Remember, the outputs here represent functions!
    # The entry in position i is the value that our
    # function assigns edge i
    show("With generators:")
    for g in H1:
        show(g[1].vector(1))
</script>
</div>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>See, for instance, <a href="https://jeremykun.com/2013/04/03/homology-theory-a-primer/">this</a>
wonderful series by Jeremy Kun, and even the
<a href="https://en.wikipedia.org/wiki/Homology_(mathematics)#Background">wikipedia page</a>.
The basic idea is that homology groups correspond to “holes” in your space.
These correspond to subsurfaces that aren’t “filled in”. That is, they
aren’t the boundary of another subsurface. This is where the “boundary”
terminology comes from. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>I know this is a link to simplicial <em>homology</em>, but there’s no
good overview page (at least on the first page of google) for
simplicial cohomology. It’s close enough, though, especially since
we’re going to be spending a lot of time talking about simplicial
cohomology in this post. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:3" role="doc-endnote">
<p>If you’ve heard of <a href="https://en.wikipedia.org/wiki/Sheaf_(mathematics)">sheaves</a>
before, this is also why we care about sheaves! They are the right
“data structure” for keeping track of these “locally defined functions”
that we’ve been talking about. <a href="#fnref:3" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:4" role="doc-endnote">
<p>We can tell we’re onto something important, though, because for nice
spaces, all the different definitions secretly agree! Often when you
have a topic that is very robust under changes of definition, it means
you’re studying something real. We see a similar robustness in, for
instance, the notion of computable function. There’s at least a half
dozen useful definitions of computability, and it’s often useful to
switch between them fluidly to solve a given problem. Analogously, we
have a bunch of definitions of cohomology theories which are known to
be equivalent in many contexts. It’s similarly useful to keep multiple
in your head at once and use the one best suited to a given problem. <a href="#fnref:4" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:5" role="doc-endnote">
<p>For $\partial$ from edges to vertices, we know what our orientation
should be (always subtract the low vertex from the high vertex),
but it’s less clear what signs each of the edges bounding a triangle
should receive… It’s even <em>less</em> clear what signs the faces bounding
a tetrahedron should get! In fact, the issue of signs
(and orientation in general) is a <a href="https://en.wikipedia.org/wiki/Simplicial_homology#Boundaries_and_cycles">bit fussy</a>. Once you pick a convention,
though (for us, it’s high minus low), the orientation in higher dimensions
is set in stone. You shouldn’t worry too much about the formulas for $\partial$.
What matters is the signs are chosen to make $\partial \partial \sigma = 0$
for every $n+2$-simplex $\sigma$. This should make a certain amount of sense,
as the boundary of a figure should not have its own boundary…
That’s worth some meditation. <a href="#fnref:5" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:6" role="doc-endnote">
<p>Oftentimes you’ll see this written as $d$ in the literature, since it acts
like a differential. Indeed in the case of De Rham Cohomology it literally
is the derivative! <a href="#fnref:6" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:7" role="doc-endnote">
<p>In general, if we have a complex
\(\cdots \overset{\partial^T_{n+2}}{\longleftarrow}
C^{n+1} \overset{\partial^T_{n+1}}{\longleftarrow}
C^{n} \overset{\partial^T_{n}}{\longleftarrow}
C^{n-1} \overset{\partial^T_{n-1}}{\longleftarrow} \cdots\)
The $n$th cohomology group is
\(H^n =
\frac
{
\text{Ker} \big ( \partial^T_{n+1} : C^n \to C^{n+1} \big )
}{
\text{Im} \big ( \partial^T_n : C^{n-1} \to C^n \big )
}\)
Again, if I end up writing a follow-up post with higher dimensional
examples, you’ll hear <em>lots</em> more about this. <a href="#fnref:7" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Mon, 01 Mar 2021 00:00:00 +0000
https://grossack.site/2021/03/01/cohomology-intuitively.html
Linearly Ordered Groups and CH<p>Earlier today <a href="https://math.jonathanalcaraz.com/">Jonathan Alcaraz</a> gave a GSS talk about
Linearly Ordered (LO) Groups, which are a fun topic with connections to
dynamics, topology, geometric group theory, etc. This reminded me of a
problem I told myself to think about a while ago, and so I decided to
finally do that. After a bit of thought, a friend from CMU (Pedro Marun)
and I were able to figure it out. This post is going to be somewhat
more meandering than usual (if you can imagine such a thing), because I want
to showcase what the flow of thoughts was in solving the problem.
At the end I’ll clean things up and write them linearly.</p>
<p>I guess we should start with what a LO Group is, but it’s pretty much
what it says on the tin:</p>
<div class="boxed">
<p>A (Left) <span class="defn">Linearly Ordered Group</span> is a
group $G$ equipped with a total order $\leq$ which is compatible
with (left) multiplication in the following sense:</p>
<p>\(g_1 \leq g_2 \quad \implies \quad hg_1 \leq hg_2\)</p>
</div>
<p>I first heard of LO Groups from Marker’s
“Model Theory: An Introduction”, where an exercise has you
use compactness to show every torsion free abelian group admits
a compatible linear order. I heard about them again on mse,
to the surprise of nobody. Somebody <a href="https://math.stackexchange.com/q/3928388/655547">asked</a> for examples of
finitely generated left orderable groups. I knew about the abelian
example because of Marker, but I was curious about nonabelian examples.</p>
<p>This led me down a rabbit hole of papers to skim, including
Kathryn Mann’s “Left Orderable Groups that Don’t Act on the Line”
(see <a href="https://e.math.cornell.edu/people/mann/papers/germsatinfinity.pdf">here</a>). This paper mentions a classical result:</p>
<div class="boxed">
<p>A countable group is LO if and only if it embeds in $\text{Homeo}_+(\mathbb{R})$,
the group of orientation preserving homeomorphisms of $\mathbb{R}$.</p>
<p>The order on $\text{Homeo}_+(\mathbb{R})$ is as follows:</p>
<p>Enumerate \(\mathbb{Q} = \{q_n\}\). Then we say $f \lt g$ exactly when
$f q_i \lt g q_i$, where $i$ is least with $f q_i \neq g q_i$
(this is more or less the lex order on $\prod_{\omega} \mathbb{R}$).</p>
</div>
<p>As soon as I saw this, I wondered if anything was special about “countable” here.
If we assume the Continuum Hypothesis (CH) fails, what can we say about other
LO groups of size $\lt \mathfrak{c}$? Do they all have to embed in
$\text{Homeo}_+(\mathbb{R})$ as well?</p>
<p>I keep a list of “problems to think about”, so I added this and left some
brief thoughts before going back to answering mse questions.</p>
<hr />
<p>When Jonathan brought up this theorem in his talk, it reminded me to think
about that problem. I found a proof of the result to see if it immediately
worked for larger cardinalities, and much to my surprise it relies <em>heavily</em>
on the countability of $G$! This is a summary of a proof from Clay and Rolfsen’s
“Ordered Groups and Topology” (see <a href="https://arxiv.org/abs/1511.05088">here</a>), Theorem 2.23.</p>
<p>$\ulcorner$
Let $G$ be countable and LO. Then by looking at $G \times \mathbb{Q}$ with
the lex order, we can assume $G$’s ordering is dense. Moreover, it is easy
to see that $G$ is torsion free, so for any element $g$, there is always
some $g_L \lt g$ and some $g_R \gt g$ ($g^{-1}$ works for one, and $g^2$ for the other).</p>
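<p>To spell out the “easy to see” step (a short derivation added for convenience, not quoted from Clay and Rolfsen): if $1 \lt g$, then left-multiplying by $g$ over and over gives</p>
\[1 \lt g \lt g^2 \lt g^3 \lt \cdots\]
<p>so no positive power of $g$ is the identity. If instead $g \lt 1$, the same argument gives $1 \gt g \gt g^2 \gt \cdots$. Either way every $g \neq 1$ has infinite order. In the first case $g^{-1} \lt 1 \lt g \lt g^2$, and in the second $g^2 \lt g \lt 1 \lt g^{-1}$, so $g^{-1}$ and $g^2$ always land on opposite sides of $g$.</p>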
<p>So $G$ is a countable dense linear order without endpoints! If you’re a
logician your heart should be leaping now. Cantor’s famed
<a href="https://en.wikipedia.org/wiki/Back-and-forth_method">back and forth argument</a> shows that <em>any</em> such ordering is isomorphic
(as an order) to $(\mathbb{Q}, \lt)$. It was really exciting to see this
familiar face pop up in this proof! But since $(G, \lt) \cong (\mathbb{Q}, \lt)$
embeds densely in $(\mathbb{R}, \lt)$, we can extend the left action of $G$ on
itself to a homeomorphism of $\mathbb{R}$.
<span style="float:right">$\lrcorner$</span></p>
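<p>If you haven’t seen back and forth before, here is a minimal sketch of its first few rounds in plain Python. Everything in it is an assumption made for illustration: the two orders ($\mathbb{Q}$ and the positive rationals), the Calkin-Wilf style enumerations, and the function names are not from the book. The real argument runs forever and produces a full isomorphism. The sketch only builds a finite order-preserving matching, which is the part that actually uses density and the lack of endpoints.</p>

```python
from fractions import Fraction


def positive_rationals():
    """Calkin-Wilf enumeration: 1, 1/2, 2, 1/3, 3/2, ... (each positive rational once)."""
    q = Fraction(1)
    while True:
        yield q
        # Newman's formula for the next term of the Calkin-Wilf sequence
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)


def all_rationals():
    """Enumerate Q as 0, 1, -1, 1/2, -1/2, 2, -2, ..."""
    yield Fraction(0)
    for q in positive_rationals():
        yield q
        yield -q


def back_and_forth(enum_a, enum_b, rounds):
    """Return a finite order-preserving matching [(a, b), ...] built in `rounds` rounds."""
    pairs = []

    def first_unused(enum, used):
        return next(x for x in enum() if x not in used)

    def partner(x, enum, flip):
        # First unused y that sits relative to the already matched points
        # exactly as x does. Density and "no endpoints" guarantee it exists.
        used = {p[1 - flip] for p in pairs}
        return next(
            y for y in enum()
            if y not in used
            and all((p[flip] < x) == (p[1 - flip] < y) for p in pairs)
        )

    for _ in range(rounds):
        a = first_unused(enum_a, {p[0] for p in pairs})  # "forth" step
        pairs.append((a, partner(a, enum_b, 0)))
        b = first_unused(enum_b, {p[1] for p in pairs})  # "back" step
        pairs.append((partner(b, enum_a, 1), b))
    return pairs


matching = back_and_forth(all_rationals, positive_rationals, 5)
for a, b in sorted(matching):
    print(a, "->", b)
```

<p>Sorting the matching by its first coordinate shows the second coordinates increasing too. Alternating which side picks the next element is what makes sure every element of both orders eventually gets matched.</p>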
<p>This theorem relies on a back and forth argument for most of the heavy lifting,
and that argument fails <em>spectacularly</em> for uncountable cardinalities.
In fact, for any uncountable $\kappa$ there are $2^\kappa$ nonisomorphic
dense linear orders without endpoints of cardinality $\kappa$
(see <a href="https://math.stackexchange.com/q/2580875/655547">here</a>, for instance). This made me start wondering if the theorem is
actually <em>false</em> for groups of size, say, $\aleph_1$.</p>
<p>I texted Pedro, a close friend and set theorist, with some ideas that he
pretty quickly found flaws in. He had a good idea, though, and reminded me
that $\mathbb{R}$ doesn’t contain any chains of length $\omega_1$. That is,
there’s no monotone function $f : \omega_1 \to \mathbb{R}$.</p>
<p>I thought if we could find a LO group $G$ with some $\omega_1$ chain,
then we should be done. My thought process was basically:</p>
<ul>
<li>If \(G \hookrightarrow \text{Homeo}_+(\mathbb{R})\), then $G \curvearrowright \mathbb{R}$.</li>
<li>If \(\{ g_\alpha \}_{\omega_1}\) is a chain in $G$, then \(\{ g_\alpha x \}_{\omega_1}\) should be a chain in $\mathbb{R}$ once we pick
some initial value $x$.</li>
</ul>
<p>Of course, this turned out to be wrong too. It’s not hard to find homeomorphisms
$f \lt g$ where $fx \not \lt gx$. It was a good start, though, on the way to
the right answer.</p>
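<p>Here is a small sketch of such a pair in plain Python. The enumeration of $\mathbb{Q}$ is my own choice for illustration (the theorem only needs <em>some</em> enumeration), and the names <code>rationals</code>, <code>compare</code>, <code>shift</code> and <code>double</code> are made up for this example. The <code>limit</code> cutoff is only there to keep the search finite: two distinct homeomorphisms always disagree at some rational.</p>

```python
from fractions import Fraction
from itertools import islice


def rationals():
    """Assumed enumeration of Q: 0, 1, -1, 1/2, -1/2, 2, -2, ..."""
    yield Fraction(0)
    q = Fraction(1)
    while True:
        yield q
        yield -q
        # Newman's formula for the next term of the Calkin-Wilf sequence
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)


def compare(f, g, limit=1000):
    """-1 if f < g, 1 if f > g, 0 if f and g agree on the first `limit` rationals."""
    for q in islice(rationals(), limit):
        if f(q) != g(q):
            return -1 if f(q) < g(q) else 1
    return 0


def shift(x):   # the homeomorphism x -> x + 1
    return x + 1


def double(x):  # the homeomorphism x -> 2x
    return 2 * x


print(compare(double, shift))   # -1, so double < shift in this order
x = Fraction(2)
print(double(x) > shift(x))     # True, even though double < shift
```

<p>The comparison is decided at $q_0 = 0$, where $2 \cdot 0 \lt 0 + 1$, so what happens at $x = 2$ never gets a vote. This is why a chain in the group doesn’t simply push forward to a chain in $\mathbb{R}$.</p>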
<p>If nothing else, we should just build such a group to show we know how, right?
This is a simple compactness argument:</p>
<ul>
<li>Look at the language of ordered groups, but add \(\omega_1\) many ~ bonus constants ~ \(x_\alpha\).</li>
<li>Look at the theory which includes the sentences
<ul>
<li>“I am an ordered group”</li>
<li>“\(x_\alpha \lt x_\beta\)” for each $\alpha \lt \beta$.</li>
</ul>
</li>
<li>Now each finite subtheory only refers to finitely many constants, so $\mathbb{Z}$ is a model.</li>
<li>Then compactness buys us a model of the whole theory – an ordered group with a chain of length $\omega_1$.</li>
<li>Now by Löwenheim-Skolem, we look at an elementary submodel containing this chain.
This is also an ordered group with a chain of length $\omega_1$, but it’s guaranteed to
have cardinality $\aleph_1$.</li>
</ul>
<p>So we’ve successfully found a LO group of size $\aleph_1$ which contains an
increasing chain of length $\omega_1$… But didn’t we say this doesn’t actually
solve our problem?</p>
<p>This is where I remembered a fact from Descriptive Set Theory: For a
compact metric space $X$, we actually know that \(\text{Homeo}(X)\) is
<a href="https://en.wikipedia.org/wiki/Polish_space">polish</a> (see Kechris’s “Classical Descriptive Set Theory”, I.9B, example 8).
There’s a classic argument that $\mathbb{R}$ doesn’t contain any chains of
length $\omega_1$ which seems to only use separability<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, and the
dream would be to show this continues to hold in a general polish space.</p>
<p>Of course, we <em>also</em> need to check that \(\text{Homeo}_+(\mathbb{R})\) is
actually polish. The above theorem only guarantees polishness for <em>compact</em>
spaces $X$, and the reals are (among other things) not compact.</p>
<p>First, I searched for “borel ordering” in Kechris’s book, and found a
reference to Harrington, Marker, and Shelah’s “Borel Orderings”
(see <a href="https://www.ams.org/journals/tran/1988-310-01/S0002-9947-1988-0965754-3/S0002-9947-1988-0965754-3.pdf">here</a>). Corollary 3.2 gives exactly what we want<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, but it’s phrased
in terms of subsets of $\mathbb{R}$… But now we know what to search for,
and we quickly find a <a href="https://math.stackexchange.com/q/184200/655547">mse question</a> which cites the paper and makes me
feel confident that I’m not misinterpreting it.</p>
<p>All that’s left is to show $\text{Homeo}_+(\mathbb{R})$ is really polish,
but our journey ends like it began, on <a href="https://math.stackexchange.com/q/732380/655547">mse</a>.</p>
<div class="boxed">
<p>As a nice exercise, can you show that the order on $\text{Homeo}_+(\mathbb{R})$
really is borel? That is, can you show</p>
\[\{ (f,g) ~|~ f \leq g \}
\subseteq
\text{Homeo}_+(\mathbb{R}) \times \text{Homeo}_+(\mathbb{R})\]
<p>is a borel subset?</p>
</div>
<hr />
<p>Ok. Now that the exposition is out of the way, we’re holding a draft of a
proof in our heads. It was a wandering path, but look how deceptively simple
it looks once we organize our thoughts and write it down:</p>
<div class="boxed">
<p>Theorem ($\lnot \mathsf{CH}$):</p>
<p>There exists a LO group of size $\aleph_1$ which does not embed in
$\text{Homeo}_+(\mathbb{R})$.</p>
</div>
<p>$\ulcorner$
Since $(\mathbb{Z}, \lt)$ is an infinite LO group, a standard
logical compactness argument furnishes an LO group of size
$\aleph_1$ which contains an increasing sequence \(\{g_\alpha\}\)
of length $\omega_1$. Call such a group $G$.</p>
<p>Then since \(\text{Homeo}_+(\mathbb{R})\) is a polish space
(cf. <a href="https://math.stackexchange.com/q/732380/655547">here</a>) and its ordering is borel, a theorem of
Harrington, Marker, and Shelah (cf. Corollary 3.2 <a href="https://www.ams.org/journals/tran/1988-310-01/S0002-9947-1988-0965754-3/S0002-9947-1988-0965754-3.pdf">here</a>)
shows that no chain of length $\omega_1$ can exist in
$\text{Homeo}_+(\mathbb{R})$.</p>
<p>Since $G$ contains such a chain, it cannot embed into
$\text{Homeo}_+(\mathbb{R})$.
<span style="float:right">$\lrcorner$</span></p>
<p>Can you believe that teeny little proof took <em>hours</em> of reading and thinking
(times two people, no less!) to figure out? It really makes you appreciate
how much work goes into some of the longer and trickier theorems you come across.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>Any two consecutive elements
\(x_\alpha \lt x_{\alpha+1}\) must have a rational between them, and these
rationals are distinct for distinct $\alpha$. Since there are only countably
many rationals, we can’t have a chain of uncountable
length. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>And after looking through the paper, I’m <em>extremely</em> grateful I didn’t
try to stubbornly prove it myself. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>
Fri, 26 Feb 2021 00:00:00 +0000
https://grossack.site/2021/02/26/lo-groups.html
Sage Sums<p>Today I learned that <a href="https://www.sagemath.org">sage</a> can automatically
simplify lots of sums for us by interfacing with <a href="https://maxima.sourceforge.io/">maxima</a>.
I also learned recently that the <code class="language-plaintext highlighter-rouge">init.sage</code> file exists, which let me fix some
minor gripes with my sage. Notably, I was able to add commands <code class="language-plaintext highlighter-rouge">aa</code> and <code class="language-plaintext highlighter-rouge">nn</code>
which automatically get ascii art or a numeric answer for the most recent
expression! This is going to be a short post just to highlight how these things
work, since I had to figure them out for myself.</p>
<p>I was reading Concrete Mathematics the other night when I came across the
section on <a href="https://en.wikipedia.org/wiki/Gosper%27s_algorithm">Gosper’s Algorithm</a>.
This promises to solve a large class of sums
(the <a href="https://en.wikipedia.org/wiki/Hypergeometric_function">hypergeometric</a> ones)
algorithmically, which is extremely alluring. I periodically find myself trying to
simplify tricky sums (either for mse questions or for some problem I’m thinking about)
and it would be nice to offload that thinking to a computer.</p>
<p>Unfortunately, googling around for “gosper’s method sage” and similar didn’t
actually give anything useful (at least not quickly). In hindsight, it turns
out there’s actually a <code class="language-plaintext highlighter-rouge">gosper_sum</code> built in
(see the <a href="https://doc.sagemath.org/html/en/reference/calculus/sage/symbolic/expression.html#sage.symbolic.expression.Expression.gosper_sum">docs</a>),
but for some reason I didn’t find it at the time. After some searching I instead
found a <a href="https://github.com/benyoung/AeqB-sage">github repo</a> that coded
up gosper’s algorithm, as well as a bunch of other algorithms from a book.
This was how I found my way to Petkovsek, Wilf, and Zeilberger’s
<a href="https://www2.math.upenn.edu/~wilf/AeqB.html">A=B</a>, which has lots of similar
algorithms for simplifying sums. It’s an extremely interesting
read, both mathematically and historically, and I’ve been enjoying it so far.</p>
<p>Before I learned it was built-in after all, I was planning to put up a blog post
with an implementation of gosper’s algorithm so that people could come here to
simplify their sums. Thankfully, I did eventually find the implementation, which
saved me a bunch of coding! Sage actually aliases over
python’s default <code class="language-plaintext highlighter-rouge">sum</code> function, and will pass off symbolic sums to maxima
where they’re evaluated using tons of powerful techniques (one of which is
gosper’s algorithm). The reason I was having trouble finding a function for this
was (in part) because it’s baked into the <code class="language-plaintext highlighter-rouge">sum</code> function already!</p>
<p>I’m still putting up this post, in part to share this realization with
anyone else who didn’t know about it (which probably isn’t many people…),
but in part to still provide a place where people can come to simplify sums.
In case you don’t have a sage installation on your own computer, you can
modify one of the examples below and evaluate your favorite summation here!</p>
<p>Let’s see some quick examples. The syntax <code class="language-plaintext highlighter-rouge">sum(f,k,a,b)</code> corresponds to
$\sum_{k=a}^b f$.</p>
<div class="auto">
<script type="text/x-sage">
n,k = var('n,k')
# I think we're legally obligated to make this our first sum.
# hold=True keeps it from evaluating
sum1 = sum(binomial(n,k),k,0,n, hold=True)
# so that we can display the original sum as the LHS here
# unhold then lets the expression evaluate as it would naturally
show(sum1 == sum1.unhold())
# You can also define a symbolic expression, then use it in the sum
f = k * binomial(n,k)
sum2 = sum(f,k,0,n, hold=True)
show(sum2 == sum2.unhold())
</script>
</div>
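<p>If you don’t have sage handy, you can at least sanity check the answers in plain Python. This is only a brute-force check for small $n$, not a symbolic computation, and the function names are made up for this example. The closed forms being tested are the classical $2^n$ and $n 2^{n-1}$, which is what one would expect the cell above to report.</p>

```python
from math import comb


def sum1(n):
    """Brute-force value of sum_{k=0}^{n} binomial(n, k)."""
    return sum(comb(n, k) for k in range(n + 1))


def sum2(n):
    """Brute-force value of sum_{k=0}^{n} k * binomial(n, k)."""
    return sum(k * comb(n, k) for k in range(n + 1))


for n in range(1, 8):
    # Compare against the classical closed forms.
    assert sum1(n) == 2**n
    assert sum2(n) == n * 2**(n - 1)
    print(n, sum1(n), sum2(n))
```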
<p>This can solve fairly complex sums. These come from an exercise in A=B
(exercise 1d and 1e from chapter 5). After seeing the solutions,
I’m definitely glad I didn’t have to work them out by hand!</p>
<div class="auto">
<script type="text/x-sage">
n,k = var('n,k')
soln_d = sum(k^4 * 4^k / binomial(2*k,k), k, 0, n, hold=True)
show(soln_d == soln_d.unhold())
f = factorial(3*k) / (factorial(k) * factorial(k+1) * factorial(k+2) * 27^k)
soln_e = sum(f,k,0,n, hold=True)
show(soln_e == soln_e.unhold())
</script>
</div>
<p>This has already helped me “in the wild”. There is an
<a href="https://math.stackexchange.com/q/4039066/655547">mse question</a>
which asked about the sum $\sum_{n=0}^\infty \frac{16n^2 + 20n + 7}{(4n+2)!}$.
A commenter asks whether OP wants a closed form, or merely a convergence result.
The sum certainly <em>looks</em> like it doesn’t admit a nice closed form, but I’ve
been deceived before. Instead of wasting a few minutes trying to find a
nice closed form (which is what I would have done even a few days ago),
we can simply ask sage:</p>
<div class="auto">
<script type="text/x-sage">
n = var('n')
f = (16*n^2 + 20*n + 7) / factorial(4*n + 2)
# I also just learned oo is an alias for Infinity!
soln = sum(f,n,0,oo, hold=True)
show(soln)
print("This is exactly: ")
show(soln.unhold())
print("This is approximately: ")
show(soln.unhold().n())
</script>
</div>
<p>Sage happily computed a closed form for this sum… It just happens to use
a bunch of hypergeometric functions! This pretty quickly answers the
“does OP want a closed form” question, assuming OP’s professor isn’t a sadist<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>
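<p>For a second opinion that doesn’t go through sage at all, here is a rough numerical check in plain Python. It adds up the first few terms exactly with <code>Fraction</code> (the factorial in the denominator makes the tail negligible almost immediately) and compares the result with $e + \sin(1)$, the value mentioned in the footnote. The name <code>partial_sum</code> and the cutoff of 8 terms are arbitrary choices for this sketch.</p>

```python
from fractions import Fraction
from math import e, factorial, sin


def partial_sum(terms):
    """Exact sum of the first `terms` terms of (16n^2 + 20n + 7) / (4n + 2)!."""
    return sum(
        Fraction(16 * n * n + 20 * n + 7, factorial(4 * n + 2))
        for n in range(terms)
    )


approx = float(partial_sum(8))
print(approx)
print(abs(approx - (e + sin(1))) < 1e-12)   # True
```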
<hr />
<p>As an aside, in the above example we computed something exactly, but then
used <code class="language-plaintext highlighter-rouge">.n()</code> in order to get a numerical value
(which is often better for getting a sense of things). Since sage will let you
write <code class="language-plaintext highlighter-rouge">_</code> to get the output of the last command, <code class="language-plaintext highlighter-rouge">_.n()</code> is probably my most
typed command. Either that or <code class="language-plaintext highlighter-rouge">ascii_art(_)</code>, which draws an ascii version of
whatever your most recent output was. Since I use sage in a terminal, rather
than a jupyter notebook, this saves me endless parsing related headaches
when it comes to actually reading sage’s output.</p>
<p>If you also find yourself using these commands all the time, I can’t recommend
the following aliases enough. These are part of my <code class="language-plaintext highlighter-rouge">init.sage</code>, and have changed
my life. If you want to see my entire sage configuration, you can find it
(with the rest of my dotfiles)
<a href="https://github.com/HallaSurvivor/dotfiles/blob/master/init.sage">here</a>.</p>
<div class="no_out">
<script type="text/x-sage">
# get the ipython instance so we can
# do black magic with our repl
_ipy = get_ipython()
# add a macro so typing nn will
# automatically convert the most
# recent output to a numeric.
_ipy.define_macro('nn', '_.n()')
# add a macro so typing aa will
# automatically run ascii_art
# on the most recent output.
_ipy.define_macro('aa', 'ascii_art(_)')
</script>
</div>
<hr />
<p>Before I end this post, there are a few more parting observations
that I want to squeeze in.</p>
<p>First, sage can solve recurrences for you as well,
by using either maxima’s <code class="language-plaintext highlighter-rouge">solve_rec</code> or
<a href="https://www.sympy.org/en/index.html">sympy</a>’s <code class="language-plaintext highlighter-rouge">rsolve</code>. I
have a wrapper in my <code class="language-plaintext highlighter-rouge">init.sage</code> that makes using the latter
slightly more convenient.</p>
<p>Second, if you’re faced with a particularly stubborn sum that sage won’t
simplify for you, you should try using maxima directly. You can actually
do this from inside sage by using <code class="language-plaintext highlighter-rouge">maxima.console()</code> and then loading the
<code class="language-plaintext highlighter-rouge">simplify_sum</code> package. You can see a worked out example
<a href="https://stackoverflow.com/a/28663533/3911897">here</a>, and you can see all the
high-power tools that <code class="language-plaintext highlighter-rouge">simplify_sum</code> buys you
<a href="https://github.com/andrejv/maxima/blob/master/share/solve_rec/simplify_sum.mac">here</a>.</p>
<hr />
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
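<p>Rather magically, though, this numerically agrees with $e + \sin(1)$ up
to all available digits, according to an
<a href="http://wayback.cecm.sfu.ca/cgi-bin/isc/lookup?number=3.55975281326694&lookup_type=simple">inverse symbolic calculator</a>.
Sage reported that the two aren’t actually equal, but the agreement is no fluke:
since $16n^2 + 20n + 7 = (4n+2)(4n+1) + 2(4n+2) + 1$, each term of the sum is
$\frac{1}{(4n)!} + \frac{2}{(4n+1)!} + \frac{1}{(4n+2)!}$, and these three series add up to
$\frac{\cosh(1) + \cos(1)}{2} + \big ( \sinh(1) + \sin(1) \big ) + \frac{\cosh(1) - \cos(1)}{2} = e + \sin(1)$. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>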
</li>
</ol>
</div>
Thu, 25 Feb 2021 00:00:00 +0000
https://grossack.site/2021/02/25/sage-sums.html