Skip to main content

Analytic Combinatorics -- A Worked Example

08 Apr 2025 - Tags: sage

Another day, another blog post that starts with “I was on mse the other day…”. This time, someone asked an interesting question amounting to “how many unordered rooted ternary trees with n nodes are there, up to isomorphism?”. I’m a sucker for these kinds of combinatorial problems, and after finding a generating function solution I wanted to push myself to get an asymptotic approximation using Flajolet–Sedgewick style analytic combinatorics! I’ve never actually done this before, so I learned a lot, and I want to share some of the things I learned – especially how to do this stuff in sage!

Now, you might be thinking – didn’t you write a very similar blog post three years ago? Yes. Yes I did. Did I also completely forget what was in that post? Yes. Yes I did, haha. For some reason I was getting it mixed up with this other post from even more years ago, which isn’t nearly as relevant. Thankfully it didn’t matter much, since I’m fairly sure what I wanted to do wouldn’t work in the framework of that post anyways, and now I have a better idea of what this machinery is actually doing which is really exciting (since I’ve been wanting to understand this stuff for years).


Let’s start with a warmup problem! How might we count the number of rooted ordered ternary trees with n nodes?

These are the kinds of “ternary trees” that you might remember from a first class on algorithms and data structures. Such a tree is either a leaf or an internal node with 1,2, or 3 children. Crucially these children come with an order, because the datatype keeps track of which child is on the left/right (in the case of 2 children) or on the left/middle/right (in the case of 3 children). After all, from a CS perspective, we need to remember this in order to traverse the tree!

This means that the following two trees are distinct for our purposes, even though they’re isomorphic as graphs:

In a functional programming language, you might describe this datatype by

T z = Leaf z
    | Unary z (T z)
    | Binary z (T z) (T z) 
    | Ternary z (T z) (T z) (T z)

and this translates immediately to give a functional equation for the generating function of tn, which counts the number of rooted ordered ternary trees with n nodes:

T(z)=z+zT+zT2+zT3

We can rearrange this as

z=T1+T+T2+T3

so that we can compute the power series expansion of T by lagrange inversion:

Incredibly, this agrees with A036765, which is the “number of ordered rooted trees with n non-root nodes and all outdegrees <= three”, as we hoped!

You might reasonably ask if there’s a closed form for these numbers, and this is too much to ask for (indeed, it’s already too much to ask for a closed form for fibonacci numbers, and this is more complicated). But like the fibonacci numbers, we can come up with an excellent approximation:

tn=[zn]TC(3.6107)nn3/2(1±O(1n))

where C=(0.8936)2π is a constant.

Indeed, this gives fantastic results! Let’s plot the ratio of this approximation to the true value so we can see just how good this approximation gets as n gets large. Note that it respects the promised big-oh error bound too!

This is extremely cool, but where the hell did this approximation come from?

The answer is called Singularity Analysis, and can be found in Chapter 2 Section 3 of Melczer’s excellent An Invitation to Analytic Combinatorics, or Chapters VI and VII of Flajolet and Sedgewick’s tome. See especially Theorem VII.3 on pg 468.

Like seemingly every theorem in complex analysis, this is basically an application of the Residue Theorem. I won’t say too much about why it works, but I’ll at least gesture at a proof. You can read the above references if you want something more precise.

First, we recall

tn=12πiCTzn+1 dz

where C is any contour containing the origin inside a region where T is holomorphic.

Then we draw the most important picture in complex analysis:

Here the obvious marked point is our singularity ω, and we’ve chosen a branch cut for T (shown in blue1) so that T is holomorphic in the region where the pink curve lives. We’ll estimate the value of this integral by estimating the contribution from the big circle, the little circle, and the two horizontal pieces2.

It turns out that the two horizontal pieces basically contribute O(1) amount to our integral, so we ignore them. Since the big circle is compact, T will attain a maximal value on it, say M. Then the integral along the big circle (of radius ω+ϵ, say) is bounded by 2π(ω+ϵ)M(ω+ϵ)n+1=O((ω+ϵ)n)

To estimate the integral around the little circle, it would be really helpful to have a series expansion at ω since we’re staying really close to it… Unfortunately, ω is a singular point so we don’t have a taylor series, but fortunately there’s another tool for exactly this job: a Puiseux Series!

I won’t say much about what these are, especially since Richard Borcherds already put out such a great video on the topic. What matters is is that Sage can compute them for us3, so we can actually get our hands on the approximation!

We compute the integral4 around the little circle to be roughly:

12πiTzn+1 dz(1)12πicα(1zω)αzn+1 dz=(2)12πicαk(αk)(zω)kzn+1 dz=(3)cα(αn)(1)nωn

In step (1) we approximate T by the first term of its puiseux series, in step (2) we apply the generalized binomial theorem, so that in step (3) we can apply the residue theorem to realize this integral as the coefficient of z1 in our laurent expansion.

This gives us something which grows like O~(ωn), dominating the contribution O((ω+ϵ)n) from the big circle, so that asymptotically this is the only term that matters. If we were more careful to keep track of the big-oh error for the puiseux series for T we could easily sharpen this bound5 to see

tn=cα(αn)(1)nωn(1+O(1n))

This looks a bit weird with the (1)n, but remember that (αn) also alternates sign. Indeed, asymptotics for (αn) are well known so that we can rewrite this as

tn=cαΓ(α)ωnnα+1(1+O(1n))

Which finally shows us where our magic numbers came from! Sage happily tells us that the dominant singularity for T is at ω=(0.2769) and that a puiseux expansion for T at ω is

T(0.65)(0.8936)(1z/ω)1/2+

so that for us α=1/2, and C1/2=(0.8936).

Finally, since Γ(1/2)=2π and (0.2769)1=(3.6107) we see

tn=(0.8936)2π(3.6107)nn3/2(1+O(1n))

as promised!

Indeed, here’s how you could actually get sage to do all this for you!

As a fun exercise, you might modify this code (using s.add_term()) to compute a longer puiseux series and get asymptotics valid up to a multiplicative error of O(1n2) for the number of rooted ternary trees.

Try modifying the previous code block to see that our current approximation is not accurate to O(1n2), and then check that your approximation is!


Ok, now that we understand the warmup, let’s get to the actual problem!

How many unordered rooted ternary trees are there, up to isomorphism?

Now we’re counting up to graph isomorphism, so that our two trees

are now considered isomorphic. It’s actually much less obvious how one might pin down a generating function for something like this, but the answer, serendipitously, comes from Pólya-Redfield counting! If this is new to you, you might be interested in my recent blog post where I talk a bit about one of its corollaries. Today, though, we’ll be using it in a much more sophisticated way.

Say you have a structure X acted on by a group G and a collection C of “colors”. How can we count the number of ways to give each xX a color from C, up to the action of G?

We start by building the cycle index polynomial of the action GX, which we call PG(x1,,xn). Then we can plug all sorts of things into the variables xi in order to solve various counting problems. For example, if C is literally just a finite set of colors, we can plug in x1=x2==xn=|C| to recover the expression from my recent blog post. But we can also do much more!

Say that C is a possibly infinite family of allowable “colors”, each of a fixed “weight”. (For us, our “colors” will be trees, and the “weight” will be the number of nodes). Then we can arrange them into a generating function F(t)=citi, where ci counts the number of colors of weight i6. Then an easy argument (given in full on the wikipedia page) shows that PG(F(t),F(t2),,F(tn)) is a generating function counting the number of ways to color X by colors from C, counted up to the G action, and sorted by their total weight7. Check out the wikipedia page for a bunch of great examples!

For us, we realize an unordered rooted ternary tree is either empty, or a root with 3 recursive children8. In the recursive case we also want to count up to the obvious S3 action permuting the children, so by the previous discussion we learn that

T(z)=1+zPS3(T(z),T(z2),T(z3))

where PS3(x1,x2,x3)=16(x13+3x1x2+2x3) is the cycle index polynomial for the symmetric group on three letters. Plugging this in we see

T(z)=1+z6(T(z)3+3T(z)T(z2)+2T(z3))

Now, how might we get asymptotics for tn using this functional equation?

First let’s think about how our solution to the warmup worked. We wrote F(z,T)=0 for a polynomial F, used the implicit function theorem to get a taylor series for T at the origin, then got a puiseux series near the dominant singularity ω which let us accurately estimate the taylor coefficients tn9.

We’re going to play the same game here, except we’ll assume that F is merely holomorphic rather than a polynomial. Because we’re no longer working with a polynomial, this really seems to require an infinite amount of data, so I’m not sure how one might get an exact solution for the relevant constants… But following Section VII.5 in Flajolet and Sedgewick we can get as precise a numerical solution as we like!

We’ll assume that the functions T(z2) and T(z3) are already known analytic functions, so that we can write F(z,w)=w+1+z6(w3+3T(z2)w+2T(z3)). This is an analytic function satisfying F(z,T)=0.

Now for a touch of magic. Say we can find a pair (r,s) with both F(r,s)=0 and Fw|(r,s)=0… Then F is singular in the w direction at (r,s) so this is a branch point for T. Since both F and Fw=Fw vanish at (r,s) the taylor series for F at (r,s) starts

Fz(r,s)(zr)+12Fww(r,s)(ws)2

Since we know F(z,T) vanishes, we estimate up to smaller order terms

Fz(r,s)(zr)+12Fww(r,s)(Ts)2=0

so that a puiseux series for T at (r,s) begins

T=s±2rFz(r,s)Fww(r,s)(1zr)1/2

If we write γ=2rFz(r,s)Fww(r,s) and are slightly more careful with our error terms, the same technique from the warmup shows

tn=γΓ(1/2)rnn3/2(1+O(1n))

See Flajolet and Sedgewick Theorem VII.3 on page 468 for a more careful proof of this theorem.

Now how do we use this? We can approximate T by its taylor series at the origin, then numerically solve for the unique10 positive real solution to the system F(r,s)=Fw(r,s)=0 using this approximation:

If we run this code locally with N=20, we get the approximation

r0.3551817478731632s2.1174205967276225γ1.8358222405236164

so that we expect

tn(1.835)2π(0.355)nn3/2

and indeed this seems to work really well! Our taylor expansion for T agrees with A000598, as we expected, and comparing our approximation to our taylor expansion gives:

I’m pretty proud of this approximation, so I think this is a good place to stop ^_^.

As a fun exercise, can you write a program that outputs the number of cyclic rooted ternary trees on n vertices? For these we consider two trees the same if they’re related by cyclicly permuting their children. Compare your solution to A000625

For bonus points, can you check that the number of such trees is, asymptotically,

tn=(0.34630)(3.2871)nn3/2

Wow! It’s been super nice to be writing up so many posts lately! Like I said in one of my other recent posts, I’ve had a lot of time to think about more bite sized problems and mse stuff while waiting for my DOI generating code to run, so I’ve had more things that felt quick to write up on my mind.

My research is actually going quite well too! I have a few interesting directions to explore, and at least one project that might be wrapping up soon. Of course when that happens I’ll be sure to talk about it here, and I’m still planning out a series on fukaya categories, hall algebras, skein algebras, and more! That’s a pretty long one, though, so it’s easy for me to deprioritize, haha.

It’s already heating up in Riverside, consistently in the 80s (Fahrenheit), so I can really feel the summer coming. I’m enjoying the sunny days, though, and it’s been nice to spend time working outside and under trees.

Thanks for hanging out, all! Take care of each other, and I can’t wait to chat soon ^_^.


  1. Just like w=z, a solution to w2=z has branches, so too does T, a solution to T=z+zT+zT2+zT3

  2. Really these are circles with a little arc cut out, say by integrating from ϵ to 2πϵ… But we’ll end up taking ϵ0 and we’re all friends here, so let’s not worry about it. 

  3. Though at time of writing you need the abelfunctions package to do it. There is a builtin implementation of puiseux series, but it won’t actually compute a series expansion for you. 

  4. Thanks to Shane and Raj for helping me work this out! 

  5. Of course, it’s easy to see how to extend this technique to get better asypmtotics. The first approach is to just keep more terms of the puiseux series of T at ω. Then apply the generalized binomial formula multiple times for each term you kept.

    You can also do better by keeping track of more of the singularities. Build a contour with multiple keyholes in order to get sharper lower order asymptotics:

    Of course, you can also combine these approaches to keep track of both more singularities and more puiseux coefficients at each singularity. 

  6. In fact we can push things even farther and work with multivariate generating functions, but we won’t need that here. 

  7. This is why we plug F(tk) into xk. Because xk is responsible for k-cycles, so we choose a single color for each k cycle, but we have to count it k-many times towards our total weight. See the proof on the wikipedia page for more information! 

  8. This actually isn’t the version of the recurrence I used in my mse answer. There I used the convention that a rooted tree had to be nonempty, since… you know… it has a root.

    But allowing possibly empty trees makes the recurrence much simpler, which in turn allows for much easier to analyze asymptotics. Hilariously this exact example is on the wikipedia page for the Pólya-Redfield theorem, which could have saved me a lot of time writing up that answer.

    I was a bit worried at first about doing these asymptotics by myself, since this was my first serious attempt at using analytic combinatorics, but serendipitously this exact example was also worked out in Flajolet and Sedgewick VII.5, though slightly more tersely than I would have liked, haha. 

  9. Officially we have to check that the choice of puiseux series matches up with our choice of taylor series (since there’s multiple branches to our function). But this is easy to arrange for us by choosing the branch of the puisuex series that leads to all our coefficients being positive reals. If you want to do this purely analytically you need to solve a “connection problem”. See figure VII.9 in Flajolet and Sedgewick, as well as the surrounding text. 

  10. Under mild technical conditions this pair (r,s) is unique. See Flajolet and Sedgewick Theorem VII.3.