Analytic Combinatorics -- A Worked Example
08 Apr 2025 - Tags: sage
Another day, another blog post that starts with
“I was on mse the other day…”. This time, someone asked
an interesting question amounting to “how many unordered rooted
ternary trees with
Now, you might be thinking – didn’t you write a very similar blog post three years ago? Yes. Yes I did. Did I also completely forget what was in that post? Yes. Yes I did, haha. For some reason I was getting it mixed up with this other post from even more years ago, which isn’t nearly as relevant. Thankfully it didn’t matter much, since I’m fairly sure what I wanted to do wouldn’t work in the framework of that post anyways, and now I have a better idea of what this machinery is actually doing which is really exciting (since I’ve been wanting to understand this stuff for years).
Let’s start with a warmup problem! How might we count the number of
rooted ordered ternary trees with
These are the kinds of “ternary trees” that you might remember from a first class on algorithms and data structures. Such a tree is either a leaf or an internal node with 1,2, or 3 children. Crucially these children come with an order, because the datatype keeps track of which child is on the left/right (in the case of 2 children) or on the left/middle/right (in the case of 3 children). After all, from a CS perspective, we need to remember this in order to traverse the tree!
This means that the following two trees are distinct for our purposes, even though they’re isomorphic as graphs:
In a functional programming language, you might describe this datatype by
T z = Leaf z
| Unary z (T z)
| Binary z (T z) (T z)
| Ternary z (T z) (T z) (T z)
and this translates immediately to give a functional equation for the
generating function of
We can rearrange this as
so that we can compute the power series expansion of
xxxxxxxxxx
R.<z> = QQbar[[]] # power series ring
T = (z/(1 + z + z^2 + z^3)).reverse()
show(T.O(10))
Incredibly, this agrees with A036765, which is the “number of ordered rooted trees with n non-root nodes and all outdegrees <= three”, as we hoped!
You might reasonably ask if there’s a closed form for these numbers, and this is too much to ask for (indeed, it’s already too much to ask for a closed form for fibonacci numbers, and this is more complicated). But like the fibonacci numbers, we can come up with an excellent approximation:
where
Indeed, this gives fantastic results! Let’s plot the ratio of this
approximation to the true value so we can see just how good this approximation
gets as
xxxxxxxxxx
def actual(n,N):
R.<z> = PowerSeriesRing(QQbar, default_prec=N)
T = (z/(1 + z + z^2 + z^3)).reverse()
return T.coefficients()[n-1:]
def approx(n,N):
# very very magic
C = 0.8936373911078061 / (2 * sqrt(pi))
w = 3.610718613276040
return [(C * w^k * (1/k^1.5)).n() for k in range(n,N+1)]
show(html("Plot the error ratios in the range [n,N]"))
def _(n=input_box(100, width=20, label="$n$"),
N=input_box(120, width=20, label="$N$"),
auto_update=False):
actuals = actual(n,N)
approxs = approx(n,N)
ratios = [(actuals[i]/approxs[i]) for i in range(N+1-n)]
show(html("ratio at n={}: {}".format(n, ratios[0])))
show(html("ratio at N={}: {}".format(N, ratios[-1])))
text_options = {'horizontal_alignment': 'right',
'vertical_alignment': 'bottom',
'fontsize': 'large'}
G = Graphics()
G += scatter_plot([(n+i,r) for (i,r) in enumerate(ratios)])
G += plot(1, (x,n,N))
G += plot(1-1/x, (x,n,N))
G += text('$y = 1$', (N, 1), **text_options)
G += text('$y = 1-1/x$', (N, 1-1/N), **text_options)
G.show()
This is extremely cool, but where the hell did this approximation come from?
The answer is called Singularity Analysis, and can be found in Chapter 2 Section 3 of Melczer’s excellent An Invitation to Analytic Combinatorics, or Chapters VI and VII of Flajolet and Sedgewick’s tome. See especially Theorem VII.3 on pg 468.
Like seemingly every theorem in complex analysis, this is basically an application of the Residue Theorem. I won’t say too much about why it works, but I’ll at least gesture at a proof. You can read the above references if you want something more precise.
First, we recall
where
Then we draw the most important picture in complex analysis:
Here the obvious marked point is our singularity
It turns out that the two horizontal pieces basically contribute
To estimate the integral around the little circle, it would be really
helpful to have a series expansion at
I won’t say much about what these are, especially since Richard Borcherds already put out such a great video on the topic. What matters is is that Sage can compute them for us3, so we can actually get our hands on the approximation!
We compute the integral4 around the little circle to be roughly:
In step
This gives us something which grows like
This looks a bit weird with the
Which finally shows us where our magic numbers came from!
Sage happily tells us that the dominant singularity for
so that for us
Finally, since
as promised!
Indeed, here’s how you could actually get sage to do all this for you!
xxxxxxxxxx
# at time of writing the abelfunctions package
# isn't installed on sagecell, so you'll need to
# run this locally if you want to play with it
import abelfunctions
# to compute the discriminant we need P to be a
# polynomial whose coefficients are polynomials.
R.<z> = QQbar[]
S.<T> = R[]
# our defining polynomial.
# we want to solve for T as a function of z
P = z*T^3 + z*T^2 + (z-1)*T + z
# following Example 2.12 in Melczer,
# we compute the dominant singularity w.
w = min([r for (r,_) in P.discriminant().roots() if r != 0],
key=lambda r: r.abs())
# to work with abelfunctions we need P
# to be a polynomial in two variables
R.<z,T> = QQbar[]
P = R(P)
# compute the puiseux expansion for T at w.
# I know from trial and error that the correct
# branch to pick is entry [1] in the list. You
# just need to check which one gives you
# positive real coefficients for T at 0.
s = abelfunctions.puiseux(P,w)[1]
c = -sqrt(-s.x0/s.xcoefficient)
As a fun exercise, you might modify this code (using s.add_term()
) to
compute a longer puiseux series and get asymptotics valid up to a
multiplicative error of
Try modifying the previous code block to see that our current approximation
is not accurate to
Ok, now that we understand the warmup, let’s get to the actual problem!
How many unordered rooted ternary trees are there, up to isomorphism?
Now we’re counting up to graph isomorphism, so that our two trees
are now considered isomorphic. It’s actually much less obvious how one might pin down a generating function for something like this, but the answer, serendipitously, comes from Pólya-Redfield counting! If this is new to you, you might be interested in my recent blog post where I talk a bit about one of its corollaries. Today, though, we’ll be using it in a much more sophisticated way.
Say you have a structure
We start by building the cycle index polynomial of the action
Say that
For us, we realize an unordered rooted ternary tree is either empty,
or a root with 3 recursive children8. In the recursive case we
also want to count up to the obvious
where
Now, how might we get asymptotics for
First let’s think about how our solution to the warmup worked. We
wrote
We’re going to play the same game here, except we’ll assume that
We’ll assume that the functions
Now for a touch of magic. Say we can find a pair
Since we know
so that a puiseux series for
If we write
See Flajolet and Sedgewick Theorem VII.3 on page 468 for a more careful proof of this theorem.
Now how do we use this? We can approximate
xxxxxxxxxx
# how far we taylor expand T. This controls the accuracy
# of our approximation of (s,r).
# the sagecell times out if you make this too big, so you
# might want to run this locally.
N = 5
# get the first n taylor coefficients for T
def approx(n):
B = PolynomialRing(QQ, 't', n+1)
t = B.gens()
R.<z> = B[[]]
# cycle index polynnomials
S.<x1,x2,x3> = QQ[]
P3 = (x1^3 + 3*x1*x2 + 2*x3)/6
# to start, the taylor coefficients of T are variables
T = sum([t[i] * z^i for i in range(n)]) + O(z^n)
lhs = T
rhs = 1 + z*P3.subs(x1=T(z), x2=T(z^2), x3=T(z^3))
# formally expand out the rhs and set the coefficients
# equal to each other. This gives a recurrence relation
# for t[n] in terms of the t[i] for i<n.
# Then we solve this recurrence via groebner bases,
# which is extremely fast.
I = B.ideal([lhs.coefficients()[i] - rhs.coefficients()[i]
for i in range(n)])
# add all our solved coefficients back together to
# get an approximation for T
return sum([I.reduce(t[i])*z^i for i in range(n)]) + O(z^n)
z,w = var('z,w')
T = SR(approx(N).truncate())
F = -w + 1 + (z/3)*T.subs(z=z^3) + (z/2)*T.subs(z=z^2)*w + (z/6)*w^3
eqns = [diff(F,w)==0, F==0]
solns = [[a.rhs(), b.rhs()] for [a,b] in solve(eqns,[z,w])]
realSolns = [[a,b] for [a,b] in solns if a in RR and b in RR]
[r,s] = realSolns[0]
show(html("$r={}$".format(r)))
show(html("$s={}$".format(s)))
gamma = -sqrt(2*r*diff(F,z).subs(z=r,w=s)/diff(F,w,2).subs(z=r,w=s))
show(html("$\\gamma={}$".format(gamma)))
If we run this code locally with
so that we expect
and indeed this seems to work really well!
Our taylor expansion for
xxxxxxxxxx
r = 0.3551817478731632
s = 2.1174205967276225
gamma = -1.8358222405236164
def approx(n):
return (gamma / (-2 * sqrt(pi))) * (r^(-n) / n^(3/2))
def approxList(n,N):
return [approx(k) for k in range(n,N)]
def actualList(n,N):
B = PolynomialRing(QQ, 't', N+1)
t = B.gens()
R.<z> = B[[]]
S.<x1,x2,x3> = QQ[]
P3 = (x1^3 + 3*x1*x2 + 2*x3)/6
T = sum([t[i] * z^i for i in range(N)]) + O(z^N)
lhs = T
rhs = 1 + z*P3.subs(x1=T(z), x2=T(z^2), x3=T(z^3))
I = B.ideal([lhs.coefficients()[i] - rhs.coefficients()[i]
for i in range(N)])
return [I.reduce(t[i]) for i in range(n,N)]
show(html("Plot the error ratios in the range [n,N]"))
show(html("This is likely to time out if you make N too large online,"))
show(html("so you might want to play around with it locally instead!"))
def _(n=input_box(10, width=20, label="$n$"),
N=input_box(30, width=20, label="$N$"),
auto_update=False):
actuals = actualList(n,N)
approxs = approxList(n,N)
ratios = [actuals[i]/approxs[i] for i in range(N-n)]
show(html("ratio at n={}: {}".format(n, ratios[0].n())))
show(html("ratio at N={}: {}".format(N, ratios[-1].n())))
text_options = {'horizontal_alignment': 'right',
'vertical_alignment': 'bottom',
'fontsize': 'large'}
G = Graphics()
G += scatter_plot([(n+i,r) for (i,r) in enumerate(ratios)])
G += plot(1, (x,n,N))
G += plot(1-1/x, (x,n,N))
G += text('$y = 1$', (N, 1), **text_options)
G += text('$y = 1-1/x$', (N, 1-1/N), **text_options)
G.show()
I’m pretty proud of this approximation, so I think this is a good place to stop ^_^.
As a fun exercise, can you write a program that outputs the number
of cyclic rooted ternary trees on
For bonus points, can you check that the number of such trees is, asymptotically,
Wow! It’s been super nice to be writing up so many posts lately! Like I said in one of my other recent posts, I’ve had a lot of time to think about more bite sized problems and mse stuff while waiting for my DOI generating code to run, so I’ve had more things that felt quick to write up on my mind.
My research is actually going quite well too! I have a few interesting directions to explore, and at least one project that might be wrapping up soon. Of course when that happens I’ll be sure to talk about it here, and I’m still planning out a series on fukaya categories, hall algebras, skein algebras, and more! That’s a pretty long one, though, so it’s easy for me to deprioritize, haha.
It’s already heating up in Riverside, consistently in the 80s (Fahrenheit), so I can really feel the summer coming. I’m enjoying the sunny days, though, and it’s been nice to spend time working outside and under trees.
Thanks for hanging out, all! Take care of each other, and I can’t wait to chat soon ^_^.
-
Just like
, a solution to has branches, so too does , a solution to . ↩ -
Really these are circles with a little arc cut out, say by integrating from
to … But we’ll end up taking and we’re all friends here, so let’s not worry about it. ↩ -
Though at time of writing you need the abelfunctions package to do it. There is a builtin implementation of puiseux series, but it won’t actually compute a series expansion for you. ↩
-
Of course, it’s easy to see how to extend this technique to get better asypmtotics. The first approach is to just keep more terms of the puiseux series of
at . Then apply the generalized binomial formula multiple times for each term you kept.You can also do better by keeping track of more of the singularities. Build a contour with multiple keyholes in order to get sharper lower order asymptotics:
Of course, you can also combine these approaches to keep track of both more singularities and more puiseux coefficients at each singularity. ↩
-
In fact we can push things even farther and work with multivariate generating functions, but we won’t need that here. ↩
-
This is why we plug
into . Because is responsible for -cycles, so we choose a single color for each cycle, but we have to count it -many times towards our total weight. See the proof on the wikipedia page for more information! ↩ -
This actually isn’t the version of the recurrence I used in my mse answer. There I used the convention that a rooted tree had to be nonempty, since… you know… it has a root.
But allowing possibly empty trees makes the recurrence much simpler, which in turn allows for much easier to analyze asymptotics. Hilariously this exact example is on the wikipedia page for the Pólya-Redfield theorem, which could have saved me a lot of time writing up that answer.
I was a bit worried at first about doing these asymptotics by myself, since this was my first serious attempt at using analytic combinatorics, but serendipitously this exact example was also worked out in Flajolet and Sedgewick VII.5, though slightly more tersely than I would have liked, haha. ↩
-
Officially we have to check that the choice of puiseux series matches up with our choice of taylor series (since there’s multiple branches to our function). But this is easy to arrange for us by choosing the branch of the puisuex series that leads to all our coefficients being positive reals. If you want to do this purely analytically you need to solve a “connection problem”. See figure VII.9 in Flajolet and Sedgewick, as well as the surrounding text. ↩
-
Under mild technical conditions this pair
is unique. See Flajolet and Sedgewick Theorem VII.3. ↩