Tribute to David Seal, 1956 - 2021
For those who have never seen a "Seal-O-gram" and are wondering what one might look like - here is one that David wrote just over a year ago in response to a simple, possibly rhetorical, question from Robin Saxby about the probability of 3 people in the first 20 ARM employees sharing a birthday:
About Robin's question, with a couple of simplifying assumptions it's quite easy to work out the probability that two people in a group of N share a birthday by instead working out the probability that no two of them share a birthday. Starting with N=1, there simply aren't two people in a group of 1, so the probability for N=1 is 100%. Then for N=2, it's certain that the first 1 of them don't contain two with the same birthday, so they 'occupy' one birthday and when you add the second person to the group, they have a 1 in 365 chance of sharing a birthday with someone in the group, or a 364 in 365 chance of not sharing a birthday with anyone in the group. So the probability for N=2 is 100% * 364/365 = 99.73%. A similar argument then says that the probability for N=3 is 99.73% * 363/365 = 99.18%, then that the probability for N=4 is 99.18% * 362/365 = 98.36%, and so on. As Allen says, the probability of no two sharing a birthday gets down to 50% (or rather slightly less at 49.27%) at N=23, and so the probability of two people sharing a birthday rises above 50% at a group of 23 people.
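As an illustration (this is not part of David's email), the step-by-step multiplication he describes can be sketched in a few lines of Python. The function name `prob_no_shared_birthday` and the `days` parameter are made up for this sketch, and it relies on the same simplifying assumptions: 365 equally likely birthdays and no February 29th.

```python
def prob_no_shared_birthday(n, days=365):
    """Probability that no two people in a group of n share a birthday.

    Assumes `days` equally likely birthdays and no February 29th,
    the same simplifications the email makes.
    """
    p = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already "occupied".
        p *= (days - k) / days
    return p


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 23):
        print(f"N={n}: {prob_no_shared_birthday(n):.2%}")
```

Run as a script, this should reproduce the figures quoted above, from 100% at N=1 down to 49.27% at N=23.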
The simplifying assumptions in that are that nobody is born on February 29th and that all the other birthdays are equally likely. The first is obviously not true, and the second isn't either according to https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/articles/howpopularisyourbirthday/2015-12-18 . I doubt that those assumptions not being true makes much difference to the answer, but a fully accurate calculation would be a lot more difficult and probably only something that can reasonably be done by a computer program.
And I'm afraid that I think the problem of calculating the probability that three people share a birthday in a group of N people is probably similar, either with or without the simplifying assumptions. I have however done a random simulation that indicates that the probability for 23 people is about 1.25%, i.e. only about half the figure that Allen has found or calculated. I will think a bit more about it, though, and let you know if I find a reasonably explainable way of calculating it!
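David's own simulation has not survived with the email, so the following is only a guess at what such a random simulation might look like, not his program. The function name `simulate_triple`, the trial count and the fixed seed are all assumptions made for this sketch. It also assumes 365 equally likely birthdays and reads "three people share a birthday" as "at least three".

```python
import random
from collections import Counter


def simulate_triple(n, trials=100_000, days=365, seed=0):
    """Estimate the probability that at least three of n people share a birthday.

    Monte Carlo sketch: assumes `days` equally likely birthdays
    and no February 29th.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        # Draw n random birthdays and count how often each day occurs.
        counts = Counter(rng.randrange(days) for _ in range(n))
        if max(counts.values()) >= 3:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Estimated probability for 23 people: {simulate_triple(23):.2%}")
```

With enough trials the estimate for 23 people should settle at a little over 1%, in line with the figure David reports; the exact value printed depends on the seed and the number of trials.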
By the way, that link indicates that February 4th is a less likely birthday than most - about three quarters of all the days of the year occur more commonly as people's birthdays.
John
13 January 2022