A truly incredible fact about the number 37
08 Nov 2023 - Tags: sage , featured
So I was on math stackexchange the other day, and I saw a cute post looking for a book which lists, for many many integers, facts that Ramanujan could have told Hardy if he’d taken a cab other than 1729. A few days ago OP answered their own question, saying that the book in question was Those Fascinating Numbers by Jean-Marie De Koninck. I decided to take a glance through it to see what kinds of facts lie inside (and also to see just how many integers are covered!). Not only was I overwhelmed by the number of integers and the number of facts about them, the preface already includes one of the single wildest facts I’ve ever heard, and I have to talk about it here! Here’s a direct quote from the preface:
37, the median value for the second prime factor of an integer; thus the probability that the second prime factor of an integer chosen at random is smaller than 37 is approximately
; 
My jaw was on the floor when I read this, haha. First it sounded totally 
unbelievable, since 37 is a tiny number in the grand scheme of things. Then 
it started to sound slightly more plausible… After all, about half of all 
integers have 
So then let’s go ahead and do it ^_^
First, I think, the sage code. I want to know if this really works!
“Obvoiusly” there’s no uniform distribution on the natural numbers, so 
what does it even mean to choose a “random” one? The way the number theorists 
usually solve this problem is by fixing a large number 
So for us, we’ll want to first fix a large number 
xxxxxxxxxxdef getSecondSmallestFactors(N):    data = []    for n in range(1, N):        fs = factor(n)        if len(fs) > 1:            second_smallest = fs[1][0]            data += [second_smallest]        else:            # If there's only one prime factor            # say the second prime factor is infinity            data += [oo]        return datadef _(N=input_box(10^5, width=10, label="$N$")):    data = getSecondSmallestFactors(N)    med = numpy.median(data)    show("median second prime: ")    show(med)    show("")    below37 = len([d for d in data if d <= 37]) / N    show("fraction of numbers whose second prime is at most 37: ")    show(below37.n())    show("")When I first ran this code, it honestly felt like magic, haha. What the hell is going on here!?
The key idea, found in a paper of De Koninck and Tenenbaum2, 
is that we can compute the density of numbers whose second prime is 
Let’s do a simple example to start. What fraction of numbers have 
Well it’s not hard to see that the numbers whose second prime is 
or
so we need to count the density of numbers of these forms.
But a number is of the first form (
To bring this back to elementary school3, we can highlight all of 
our numbers with a factor of 
numbers with no factors of 
and numbers with a factor of 
Then the numbers whose prime factorization starts 
It’s intuitively clear that 
So now we have our hands on the density of numbers of the form 
It’s easy to see that these sets are disjoint, so their densities add, and 
Now with the warm-up out of the way, let’s see how we can compute 
We’ll play exactly the same game. How can 
for some 
But we can count these densities as before! For each choice of 
The density of numbers whose second prime is 
We can rearrange this to
As a cute exercise, write 
De Koninck and Tenenbaum mention in passing that
where
Can you prove that this formula is correct4?
But remember the goal of all this! We want to know the prime 
But we can implement 
xxxxxxxxxxdef lambda2(p):    """    Compute the density of the set of n whose 2nd prime is p.    See equation (1.3) in Koninck and Tenenbaum, 2002    """    s = 0    out = 1    for q in Primes():        if q >= p:            break        out *= (1 - (1/q))    for q in Primes():        if q >= p:            break        s += (1/q) * (1 - (1/q))^(-1)    out *= s    out *= 1/p     return outtotal = 0for p in Primes():    if total > 0.5: break    l = lambda2(p)    total += l    show("{} of numbers have {} as their second prime".format(l.n(),p))    show("so {} of numbers have second prime at most {}".format(total.n(), p))    show("")Again we see that 
Also, this shows that we expect the actual density to be 
As another cute exercise – using the ideas in this post, can you compute the median third prime?
As a (much) harder exercise6, can you get asymptotics for how 
the median 
Thanks for hanging out, all! This was a really fun post to write up, 
and I’m really excited to share it with everybody!
This fact about 
I have more blog posts coming, of course, so I’ll see you all soon!
Stay safe, and stay warm ^_^
- 
      
Absolutely the understatement of the year ↩
 - 
      
Sur la loi de répartition du k-ième facteur premier d’un entier
Yes, this paper is in french, but it’s really not so hard to read, especially with liberal use of google translate. Though if you want to avoid reading it, I’ve done the hard work for you, and everything in this blog post is in english.
It also wasn’t too hard to find this paper, thankfully. It’s mentioned in a footnote in the entry for
 in Those Fascinating Numbers, so I had a decent starting point. ↩ - 
      
I literaly got the base image by googling “grid of numbers high res” and clicking the first result, which was for elementary schoolers ↩
 - 
      
It might be helpful to remember a generating function trick that shows up fairly often (for instance in partitions and the riemann zeta function):
Don’t worry that this sum diverges for now. Just take note of why these two sides are equal. You should expand each term of the right hand side as a geometric series, then check what happens when you foil. ↩
 - 
      
(and run it locally, since factoring numbers that big takes so long that the online sagecell times out) ↩
 - 
      
If you read french, the De Koninck and Tenenbaum paper we’ve been referencing all post (Sur la loi de répartition du k-ième facteur premier d’un entier) is actually all about analyzing these asymptotics!
If we write
 for the median th prime, then they show:where
 and is the Euler-Mascheroni Constant. ↩ 
