
Case of the Missing Birth Month

Riddler Express solution for the problem of an unrepresented birth month for a group of officemates.


This week's Riddler Express was very simple.

What was the probability that none of the 40 people had birthdays [in March]? (For the purpose of this riddle, assume that a year consists of 12 equally long months. It’s a sufficiently good approximation!)

Solution

So there are 40 people, and we want to know the probability that none of them have a birthday in March. If the chance of being born in each of the months of the year is uniformly distributed, then the chance of one person not being born in March is simply $\Pr(\text{Person not born in March}) = \frac{11}{12}$. For forty people, since their birth months are all independent, it's $\Pr(\text{Forty people not born in March}) = \left(\frac{11}{12}\right)^{40} \approx 3.079\%$.

Thus, the probability is 3.079%.
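As a quick arithmetic check, the closed form can be evaluated directly in Julia. This is only a sketch; the names p_one and p_none are introduced here for illustration and do not appear elsewhere in the post.

```julia
# Probability that one person is NOT born in March,
# assuming 12 equally likely months (as the riddle states).
p_one = 11 / 12

# Probability that none of 40 independent people is born in March.
p_none = p_one^40

println(round(100 * p_none, digits = 3), "%")  # prints 3.079%
```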

Verification

We can verify this by simulation. We can generate uniformly random birth months for 40 people in Julia as rand(1:12, 40). That is one sample of birth months for the whole group. We check whether March is absent from the sample with 3 ∉ rand(1:12, 40). This function counts the number of successes in n simulations:

single_month(n) = sum(3 ∉ rand(1:12, 40) ? 1 : 0 for _ in 1:n)

The choice of 3 (March) makes no difference, since every month is equally likely. We can run this for different choices of n and compute a posterior distribution for the success probability using the sim function:

using Distributions
using Printf

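function sim(f, n)
    w = f(n)            # number of successes in n simulations
    l = n - w           # number of failures
    d = Beta(w+1, l+1)  # posterior for the success probability (uniform prior)
    # print the posterior mean ± the posterior standard deviation
    @printf "  %0.17f\n± %0.17f" mean(d) sqrt(var(d))
end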
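For reference, the two numbers that sim prints have simple closed forms. The sketch below assumes the uniform Beta(1, 1) prior that a Beta(w+1, l+1) posterior corresponds to, with w successes and l = n - w failures; the symbol p for the success probability is introduced here for illustration.

```latex
p \mid w, l \;\sim\; \mathrm{Beta}(w + 1,\; l + 1), \qquad n = w + l

\mathbb{E}[p \mid w, l] = \frac{w + 1}{n + 2}

\operatorname{Var}[p \mid w, l] = \frac{(w + 1)(l + 1)}{(n + 2)^2 (n + 3)}
```

For large n the mean is close to the raw success fraction w/n, and the standard deviation shrinks roughly like 1/√n, which is why each tenfold increase in n cuts the reported uncertainty by about a factor of 3.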

We'll derive the posterior mean and standard deviation for different values of n:

julia> sim(single_month, 10_000)
  0.03139372125574885
± 0.00174353192717317

julia> sim(single_month, 100_000)
  0.03108937821243575
± 0.00054883444787154

julia> sim(single_month, 1_000_000)
  0.03072593854812290
± 0.00017257394329145

julia> sim(single_month, 10_000_000)
  0.03078269384346123
± 0.00005462152565854

These agree nicely with our answer of 3.079%.

Extra credit

An extra credit problem was added after I originally wrote this. This time, instead of the probability that none of the 40 people has a birthday in March, we want the probability that at least one month is no one's birth month. The single-argument form of the ∉ operator, ∉(v), does not return a Boolean but another function, which tests whether its argument is not in v. E.g. if we define f = ∉([1, 3, 4]), then f(1) is false and f(2) is true. So we can check whether any month is missing from a random sample of 40 birthdays with the expression any(∉(rand(1:12, 40)), 1:12). Thus we can define a function similar to single_month, but now for any month, as follows:

any_month(n) = sum(any(∉(rand(1:12, 40)), 1:12) ? 1 : 0 for _ in 1:n)
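To make the single-argument form of ∉ concrete, here is a small deterministic sketch. It assumes a Julia version that supports ∉(v) as used in this post; the hand-picked vector months_seen is hypothetical and exists only for illustration.

```julia
f = ∉([1, 3, 4])   # f(x) is equivalent to x ∉ [1, 3, 4]

println(f(1))  # false, because 1 is in the collection
println(f(2))  # true, because 2 is not in the collection

# A hand-picked sample in which every month except 7 appears at least once.
months_seen = [1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 12]

println(any(∉(months_seen), 1:12))     # true: at least one month is missing
println(filter(∉(months_seen), 1:12))  # [7]: the months that are missing
```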

Now we can estimate the probability as before.

julia> sim(any_month, 10_000)
  0.32773445310937810
± 0.00469317061061426

julia> sim(any_month, 100_000)
  0.32668346633067341
± 0.00148308725472723

julia> sim(any_month, 1_000_000)
  0.32682234635530727
± 0.00046905099962754

julia> sim(any_month, 10_000_000)
  0.32688503462299306
± 0.00014833446759037

julia> sim(any_month, 100_000_000)
  0.32677092346458153
± 0.00004690327072209

Thus we estimate this probability to be about 32.7%, much larger than the single-month answer.
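As a cross-check on the simulation, the same probability can be computed exactly with inclusion-exclusion over the number of months forced to be empty. This is not part of the original write-up; it is a sketch under the same assumptions (12 equally likely months, 40 independent people), and the function name exact_any_month is hypothetical.

```julia
# Inclusion-exclusion over k, the number of months required to be empty:
# P(at least one of m months is missing among n people)
#   = sum over k = 1..m-1 of (-1)^(k+1) * C(m, k) * ((m - k) / m)^n
exact_any_month(m = 12, n = 40) =
    sum((-1)^(k + 1) * binomial(m, k) * ((m - k) / m)^n for k in 1:m-1)

println(exact_any_month())  # about 0.327
```

The k = 1 term alone is 12 times the single-month probability, which overcounts samples missing two or more months; the alternating terms correct for that overlap.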