
Quantitative Cesaro Sums

18 Nov 2020 - Tags: quick-analysis-tricks

To the surprise of no one, I was on math stackexchange earlier and saw an interesting analysis question. I have a weird fascination with tricky limit questions because I feel like I’ve always been bad at them. I like working on them for the same reason I like practicing the difficult parts of pieces of music – it makes me feel like I’m improving (in a “no pain no gain” kind of way).

Anyways, the question was as follows (paraphrased):

Compute the following limit

$$\lim_{n \to \infty} \left(1 + \sqrt{2} + \sqrt[3]{3} + \cdots + \sqrt[n]{n}\right) \ln\left(\frac{2n+1}{2n}\right)$$

The beginning of the solution makes sense: We want to approximate the $\ln$, so we rewrite it as $\ln\left(1 + \frac{1}{2n}\right)$ and approximate that[^1] as $\frac{1}{2n} - O\left(\frac{1}{n^2}\right)$.
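If you want to sanity check this expansion, here's a minimal Python sketch (Python isn't part of the original solution, just a convenient way to double check): the error $\frac{1}{2n} - \ln\left(1 + \frac{1}{2n}\right)$, rescaled by $n^2$, should stay bounded. In fact, the next Taylor term predicts it approaches $\frac{1}{8}$.

```python
import math

# ln(1 + 1/(2n)) = 1/(2n) - O(1/n^2), so the error times n^2
# should stay bounded (the next Taylor term predicts 1/8).
for n in [10, 100, 1000, 10000]:
    err = 1 / (2 * n) - math.log(1 + 1 / (2 * n))
    print(n, err * n**2)
```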

Then (rewriting as a summation as well) we get

$$\lim_{n \to \infty} \left(\sum_{k \leq n} \sqrt[k]{k}\right)\left(\frac{1}{2n} - O\left(\frac{1}{n^2}\right)\right)$$

Now here’s the clever idea: We use Cesàro Averages backwards! This is mentioned in a comment on the original mse question, and it’s a trick I’ll absolutely have to remember!

To explain what I mean, let’s take a quick detour into the world of Cesàro Averages:


If $(x_n)$ is a sequence, then the Cesàro Average $a_k$ is the average of the first $k$ terms:

$$a_k = \frac{1}{k} \sum_{j \leq k} x_j$$

People care about Cesàro averages because it's possible for them to converge even if the original sequence doesn't. Moreover, the notion of "averaging" in this way comes up very naturally in Fourier Analysis (see here) and Ergodic Theory (see here).

The fundamental theorem in this area is this:

Whenever $x_n$ is already convergent, the sequence of Cesàro averages converges to the same limit.

So the notion of Cesàro convergence is a true generalization of the original notion of convergence. It lets us assign values to certain limits that would otherwise diverge, but it doesn't mess up any limits that already converged.
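For a concrete example (a minimal Python sketch): the sequence $x_n = (-1)^n$ has no limit at all, but its Cesàro averages converge to $0$.

```python
# The sequence x_n = (-1)^n diverges, but its Cesàro averages
# a_k = (1/k) * sum_{j <= k} x_j converge to 0.
def cesaro_averages(xs):
    total = 0
    for k, x in enumerate(xs, start=1):
        total += x
        yield total / k

xs = [(-1) ** n for n in range(1, 1001)]
print(list(cesaro_averages(xs))[-5:])  # the last few averages hug 0
```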


And now we get to the application here:

We recognize $\frac{1}{n} \sum_{k \leq n} \sqrt[k]{k}$ as a Cesàro average of the sequence $\sqrt[k]{k}$. Since we know $\sqrt[k]{k} \to 1$, we conclude the same is true of the Cesàro averages. So

$$\begin{aligned}
\lim_{n \to \infty} \left(\sum_{k \leq n} \sqrt[k]{k}\right)\left(\frac{1}{2n} - O\left(\frac{1}{n^2}\right)\right)
&= \lim_{n \to \infty} \frac{\sum_{k \leq n} \sqrt[k]{k}}{n}\left(\frac{1}{2} - O\left(\frac{1}{n}\right)\right) \\
&= \lim_{n \to \infty} 1 \cdot \left(\frac{1}{2} - O\left(\frac{1}{n}\right)\right) \\
&= \frac{1}{2}
\end{aligned}$$
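Before we go any further, a quick numerical check (a minimal Python sketch, where `lhs` is just my name for the left hand side of the original limit):

```python
import math

def lhs(n):
    """(1 + sqrt(2) + 3**(1/3) + ... + n**(1/n)) * ln((2n+1)/(2n))."""
    s = sum(k ** (1 / k) for k in range(1, n + 1))
    return s * math.log((2 * n + 1) / (2 * n))

for n in [10, 100, 1000, 10000]:
    print(n, lhs(n))  # slowly creeps down toward 0.5
```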

So we did properly figure out what the limit should be… But I did a little bit of sleight of hand in the above manipulation. Did you catch it?

As people interested in computer science, we have to be slightly pickier than a lot of mathematicians when it comes to computing limits. It's often important exactly how quickly we get convergence[^2], and so we kept track of the big-Oh for the whole problem.

But above we just substituted $\sqrt[n]{n} \to 1$. If we're careful, it's not too hard to get error bounds for the convergence of this sequence… But who's to say what the error bounds will be for the Cesàro averages? These averages keep track of terms from earlier in the sequence. It's reasonable to worry that the convergence might be slower, and this worry turns out to be legitimate:

Say $x_n = L \pm O(e(n))$. That is, $x_n$ converges to a limit $L$, and we can bound the error by $O(e)$. Then

$$\frac{\sum_{k \leq n} x_k}{n} = \frac{\sum_{k \leq n} L \pm O(e(k))}{n} = \frac{nL \pm O\left(\sum_{k \leq n} e(k)\right)}{n} = L \pm O\left(\frac{1}{n}\sum_{k \leq n} e(k)\right)$$
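For a concrete instance: if $x_n = L \pm O\left(\frac{1}{n}\right)$, then since the harmonic sum is $\sum_{k \leq n} \frac{1}{k} = O(\log n)$, the best we can say about the Cesàro averages is

$$\frac{\sum_{k \leq n} x_k}{n} = L \pm O\left(\frac{\log n}{n}\right)$$

so the averaging really can cost us a log factor.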

So the error in the Cesàro averages is the average of the errors. Let’s see how this comes up in our analysis of this particular problem.


First, we need to know the error bounds on $\sqrt[n]{n}$. This isn't too hard to figure out:

$$\sqrt[n]{n} = e^{\frac{1}{n}\ln(n)} = 1 + \frac{\ln(n)}{n} + O\left(\left(\frac{\ln(n)}{n}\right)^2\right) = 1 + O\left(\frac{\ln(n)}{n}\right)$$

Now we can find the error of the Cesàro averages:

$$\frac{1}{n}\sum_{k \leq n} \sqrt[k]{k} = \frac{1}{n}\sum_{k \leq n}\left(1 + O\left(\frac{\ln(k)}{k}\right)\right) = 1 + O\left(\frac{1}{n}\sum_{k \leq n}\frac{\ln(k)}{k}\right)$$

While this is technically true, I would argue it isn’t useful yet. We can clean it up.

The important observation is that, for $k \geq 4$, $\frac{\ln(k)}{k}$ is monotone decreasing. Then we do the old "approximate by the nearest power of $2$" trick that we use to prove the harmonic series diverges[^3]:

$$\begin{aligned}
&\frac{\ln(4)}{4} + \frac{\ln(5)}{5} + \frac{\ln(6)}{6} + \frac{\ln(7)}{7} + \frac{\ln(8)}{8} + \frac{\ln(9)}{9} + \cdots + \frac{\ln(15)}{15} + \frac{\ln(16)}{16} + \cdots \\
\leq\ &\frac{\ln(4)}{4} + \frac{\ln(4)}{4} + \frac{\ln(4)}{4} + \frac{\ln(4)}{4} + \frac{\ln(8)}{8} + \frac{\ln(8)}{8} + \cdots + \frac{\ln(8)}{8} + \frac{\ln(16)}{16} + \cdots \\
=\ &\ln(4) + \ln(8) + \ln(16) + \cdots
\end{aligned}$$

Notice we can safely assume $n$ is a power of $2$ by adding some extra terms. Since we're trying to upper bound the error anyways, this is no issue. Also, since we're not interested in additive constants, we can go ahead and approximate $\frac{\ln(3)}{3}$ by $\frac{\ln(2)}{2}$ to make the sum more uniform.

$$\begin{aligned}
O\left(\frac{1}{n}\sum_{k \leq n}\frac{\ln(k)}{k}\right)
&= O\left(\frac{1}{n}\left(\frac{\ln(2)}{2} + \frac{\ln(3)}{3} + \ln(4) + \ln(8) + \cdots + \ln(n)\right)\right) \\
&= O\left(\frac{1}{n}\sum_{i \leq \log_2 n}\ln\left(2^i\right)\right) \\
&= O\left(\frac{1}{n}\sum_{i \leq \log_2 n} i \ln(2)\right) \\
&= O\left(\frac{1}{n}\sum_{i \leq \log_2 n} i\right) \\
&= O\left(\frac{(\log n)^2}{n}\right)
\end{aligned}$$
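To see this bound in action numerically (another minimal Python sketch): $\sum_{k \leq n} \frac{\ln(k)}{k}$ should grow like $(\ln n)^2$. Comparing with the integral $\int \frac{\ln x}{x} \, dx = \frac{(\ln x)^2}{2}$, the ratio below should actually stabilize near $\frac{1}{2}$.

```python
import math

# sum_{k <= n} ln(k)/k grows like (ln n)^2 / 2
# (compare with the integral of ln(x)/x), so this ratio
# should settle down near 0.5.
for n in [10, 100, 1000, 10000, 100000]:
    s = sum(math.log(k) / k for k in range(2, n + 1))
    print(n, s / math.log(n) ** 2)
```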

Importantly, this error bound is different from the error bound of our original sequence! As expected, the averages converge slightly slower (by a log factor) because they're being weighed down by earlier terms of the sequence.

Why go through this pain, though? Because now we can finally put a nice bow on things: How quickly does the limit converge?

$$\left(1 + \sqrt{2} + \sqrt[3]{3} + \cdots + \sqrt[n]{n}\right)\ln\left(\frac{2n+1}{2n}\right) = \frac{1}{2}\left(1 \pm O\left(\frac{(\log n)^2}{n}\right)\right)$$
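As one last sanity check (a final Python sketch, with the same hypothetical `lhs` helper as before): if the $O\left(\frac{(\log n)^2}{n}\right)$ rate is right, then the rescaled error below should stay bounded. Numerically it seems to hover a little above $\frac{1}{4}$.

```python
import math

def lhs(n):
    s = sum(k ** (1 / k) for k in range(1, n + 1))
    return s * math.log((2 * n + 1) / (2 * n))

# |lhs(n) - 1/2| rescaled by n / (log n)^2 should stay bounded
# if the O((log n)^2 / n) error bound is correct.
for n in [10, 100, 1000, 10000]:
    print(n, abs(lhs(n) - 0.5) * n / math.log(n) ** 2)
```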
[^1]: A notational quirk I picked up from Ryan O'Donnell that my brain likes: I will only ever write $O(f)$ when $f$ is a positive function. So I meaningfully distinguish between $\text{blah} + O\left(\frac{1}{n}\right)$ and $\text{blah} - O\left(\frac{1}{n}\right)$.

    This is a pretty minor thing, but it's led to notably more comfort on my part, since my brain tends to implicitly make positivity assumptions when I'm not looking. Explicitly writing down when things can be negative forces my brain to pay extra attention to it.

[^2]: In fact, since I have Ryan O'Donnell on the brain: I audited his Theorist's Toolkit class during my year off. In one of the lectures, he asserted that the Central Limit Theorem was the most useless theorem in mathematics. Of course, he was being intentionally inflammatory. He prefaced that statement with some history: the Central Limit Theorem got its name not (as I originally thought) because sums of random variables "tend to lie at the center" of a Gaussian, but because Pólya thought it was the most important result in probability. It was central to the field.

    Ryan went on to tell us about the Berry-Esseen Theorem, which gives good error bounds on the convergence guaranteed by the Central Limit Theorem.

[^3]: As a quick exercise, why do we get an upper bound here? We prove the harmonic series diverges by lower bounding it…