Celebrity deaths: A statistical analysis

Twice before I’ve written about the “celebrity deaths come in threes” superstition, in 2008 and 2009.

With the recent passing of Art Linkletter, Gary Coleman, and Dennis Hopper, this superstition has again resurfaced.

I feel my previous arguments have already been quite persuasive, but now let’s add a statistical debunking.

To analyze the superstition, we need to define it. That includes two tasks:

  1. Who is and isn’t a celebrity
  2. The timeframe for the deaths to occur

I extracted the data of all 1,422 celebrity deaths that have occurred between January 1, 1995 and May 31, 2010 from a site called stiffs.com, which is the location of a death pool contest. (The contest has entrants predict which celebrities will die in the upcoming year and assigns points for correct guesses. Last year’s winner took away over $3,000.)

This addresses the first question, who’s a celebrity. At stiffs.com they have a panel of judges determine whether or not a person who passes away is famous, based on simply whether or not five or more members of the panel have heard of the person. They then create a list of celebrities ahead of time, and then monitor that list to see who has passed away.

You may well disagree with the fame assessments of stiffs.com. Certainly the data included plenty of people I personally had never heard of. But it’s a list that exists independently of the superstition, and is pre-existing, so it doesn’t suffer from the selection bias that arises when you assess whether or not a person is famous only after they have died.

As for the timeline, I decided to analyze it with as much leeway as possible. One day between each death? Up to two days? Three? Five? Seven? Who knows. I analyzed with a number of tolerance days all the way up to 10.

Before we get into the numerical analysis, let’s visualize the data.

[Figure: a timeline chart showing all celebrity deaths from 2004-01 to 2010-06, using data extracted from stiffs.com]

As you scroll back and forth in the listing of deaths from 2004 through today, your mind can certainly pick out groups of three. But is it ALL groups of three? Is it even MOSTLY groups of three? Your eyes already tell you the truth, that of course it’s not.

The numbers back up that visual refutation.

There are quite a few ways to analyze the data, and I tried to be comprehensive. Here are the approaches I took:

  1. Rolling timeline: This is probably the best method. (It was suggested by Patri.) When a death occurs, I start a counter. The counter lasts up to x days. (I analyzed with x from 1 to 10.) I keep track of how many celebrity deaths occur within that period. The counter resets after x days, and starts again whenever the next death occurs. With x at 7, for example, it’s basically an analysis of how many deaths a week, using rolling weeks.
  2. Continuous grouping: When a death occurs, start a count. Look at the next death. Is it within x days? If so, increment the counter. If not, start over at 1. Again, I analyzed with x ranging from 1 to 10.
  3. Separate tests: For each death, I work out whether it’s part of a group by looking at the date of death of the first member of the group and seeing whether it’s within x days of the last death. For the first death, it should be more than x days. For the subsequent deaths, it should be within x days. I then judge “pass” or “fail” for each death. I applied this analysis to groups of 1, groups of 2, groups of 3, groups of 4, groups of 5, and groups of 6. I also let it “roll” by varying where I started the counter. This analysis also looked at x ranging from 1 to 10.
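
Here is a minimal Python sketch of the rolling timeline (method 1). The function name rolling_groups, the toy dates, and the boundary rule (a window covers x calendar days, counting the day of the death that opens it) are assumptions for illustration; the spreadsheet may count the edges of the window differently.

```python
from datetime import date

def rolling_groups(dates, x):
    """Return group sizes for a rolling timeline with an x-day window.

    Assumption: a window opens on the date of a death and covers x calendar
    days including that date, so a later death joins the group only if it
    falls fewer than x days after the window opened.
    """
    sizes = []
    window_start = None
    for d in sorted(dates):
        if window_start is not None and (d - window_start).days < x:
            sizes[-1] += 1        # still inside the open window
        else:
            window_start = d      # no open window: this death starts a new one
            sizes.append(1)
    return sizes

# Made-up dates for illustration only (not taken from stiffs.com).
toy = [date(2010, 1, 1), date(2010, 1, 2), date(2010, 1, 3),
       date(2010, 1, 10), date(2010, 1, 12)]
print(rolling_groups(toy, 3))   # [3, 2]
print(rolling_groups(toy, 10))  # [4, 1]
```

Note that the window is anchored to the first death of the group, not to the most recent one. That is what keeps groups from chaining together indefinitely.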

So, what are the results?

For rolling timeline, we see the following results:

Tolerance days (x)   Groups of 1   Groups of 2   Groups of 3   Groups of 4 or more
1                    75.7%         19.0%         4.6%          0.6%
2                    47.6%         35.3%         12.4%         4.6%
3                    28.3%         40.8%         20.0%         10.8%
4                    18.8%         39.8%         23.6%         17.8%
5                    12.7%         31.9%         23.8%         31.6%
6                    9.7%          26.0%         24.7%         39.6%
7                    7.5%          22.9%         25.1%         44.5%
8                    6.0%          18.0%         22.6%         53.4%
9                    4.7%          14.9%         20.5%         59.9%
10                   3.7%          12.1%         19.4%         64.8%

No matter how many days of leeway you give, groups of three never actually best explain the data. If you give a lot of leeway, such as 10 days, larger groups occur. If you give only a little leeway, most deaths happen alone or in pairs.
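
To make the table concrete, here is a hedged sketch of how one row could be computed from a list of group sizes. It assumes each percentage is the share of deaths (not the share of groups) that fall in groups of that size, which is how the 25% figure is described in the text. The function name share_of_deaths and the sample sizes are made up for illustration.

```python
from collections import Counter

def share_of_deaths(group_sizes, cap=4):
    """Percentage of deaths falling in groups of size 1, 2, 3, and cap-or-more.

    Assumption: each group contributes as many deaths as its size, and all
    groups of `cap` or more members are pooled into the last column.
    """
    total_deaths = sum(group_sizes)
    deaths_by_size = Counter()
    for size in group_sizes:
        deaths_by_size[min(size, cap)] += size
    return {k: 100.0 * deaths_by_size[k] / total_deaths
            for k in range(1, cap + 1)}

# Hypothetical group sizes: two singletons, a pair, a trio, and a group of five.
row = share_of_deaths([1, 1, 2, 3, 5])
print(row[3])  # 25.0
```

In this toy example only one group in five is a trio, yet it accounts for a quarter of the deaths, so it matters which of the two shares a table reports.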

The best performance for groups of three is when you allow a leeway of 7 days, but even then the superstition fits for just 25% of the deaths. (Groups of two deaths are not far behind, at 23%.) A superstition that’s only right one time out of four (and does no better than several variants of the superstition), well, that’s not a useful superstition.

So, for this methodology, groups of three never really succeed. With 7 or more days of leeway, three is the average and median for groups of deaths, but only with a 23% success rate. No interpretation of this data with this method would lead one to agree that celebrity deaths come in threes.

For the second method, continuous grouping, the results are similar. You can get some pretty big groups with this method — using three tolerance days, the largest group turns out to be a group of 21 celebrity deaths. And with 10 tolerance days, the largest group is of 243 deaths!

However, no matter how many tolerance days you allow, groups of three never amount to more than 14.3% of all groups. So at best, groups of three account for about 1 group in 7 with this method.
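
A minimal sketch of continuous grouping, to show why the groups can grow so large: each death is compared with the one immediately before it, so a steady trickle of deaths chains into one long group. The names continuous_groups and share_of_groups, the toy dates, and the choice to treat "within x days" as a gap of at most x days are assumptions for illustration.

```python
from datetime import date

def continuous_groups(dates, x):
    """Return group sizes when a death joins the current group if it falls
    within x days of the previous death (assumed here to mean gap <= x)."""
    sizes = []
    previous = None
    for d in sorted(dates):
        if previous is not None and (d - previous).days <= x:
            sizes[-1] += 1    # close enough to the previous death: same group
        else:
            sizes.append(1)   # gap too long: start over at 1
        previous = d
    return sizes

def share_of_groups(sizes, k):
    """Percentage of all groups that have exactly k members."""
    return 100.0 * sizes.count(k) / len(sizes)

# Made-up dates for illustration only (not taken from stiffs.com).
toy = [date(2010, 1, 1), date(2010, 1, 2), date(2010, 1, 3),
       date(2010, 1, 10), date(2010, 1, 12)]
print(continuous_groups(toy, 2))  # [3, 2]
print(continuous_groups(toy, 7))  # [5]
```

With a tolerance of 7 days the five toy deaths merge into a single group even though they span 11 days, which is the same chaining effect behind the very large groups reported above.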

The third method I used was separate tests. To be honest, this is a pretty stupid method, since if, say, two deaths in a group of three fit the pattern but one doesn’t, it still scores as two out of three when really the entire group should fail. And the groups are highly dependent on previous groups, so if there’s a missing celebrity or a person included who isn’t really a celebrity, it throws off the entire test.

Under this method, groups of three still score very poorly. No matter how many tolerance days you allow, from 1 to 10, it always turns out that some other grouping (such as groups of 2 or groups of 6) beats out groups of 3. Groups of 3 performed best with 10 days of tolerance, but with that high a tolerance, groups of 4, 5, or 6 fit even better. At most, 64% of celebrity deaths would pass a group-of-three test, but at the same time 72% fit a group of 4.

The data, analysis, and chart are all available for you to examine (Google docs share, 6.6 megs, Excel format).

If you ask me, the best method is the rolling timeline method, and the most reasonable number of days of tolerance is three. Going with that, we find that, on average, the group size is 1.7.

But “Celebrity deaths come in 1.7s” doesn’t have a winning ring to it.
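
The 1.7 figure can be roughly reproduced from the x = 3 row of the rolling timeline table. This is a back-of-the-envelope check under two assumptions: that the percentages are shares of deaths, and that every group in the "4 or more" column has exactly 4 members (which slightly understates the true average).

```python
# Shares of deaths by group size, from the x = 3 row of the table.
# Assumption: the "4 or more" column is treated as groups of exactly 4.
shares = {1: 28.3, 2: 40.8, 3: 20.0, 4: 10.8}

# A group of size k holds k deaths, so per 100 deaths there are
# (share / k) groups of that size.
groups_per_100_deaths = sum(pct / size for size, pct in shares.items())

# Average group size = deaths / groups. The shares sum to 99.9 because of
# rounding in the table, so divide by their actual sum rather than by 100.
mean_size = sum(shares.values()) / groups_per_100_deaths
print(round(mean_size, 2))  # 1.72
```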

4 Responses to “Celebrity deaths: A statistical analysis”

  1. Steve Says:

    I think there’s one more way to analyze the data:

    What is the average number of days for a span of any 3 deaths?

    The algorithm would be:
    1. Start with the oldest death, D.
    2. Move forward N deaths (N=2 to get to the 3rd death) to death D+2
    3. Record the number of days between deaths D and D+2
    4. Move forward to death D+1 and continue the algorithm at step 2.

    Then, you could compute the mean, median, standard deviation, etc. for the number of days between 3 deaths. You could even use the result of this computation as an input to your other analysis.

    I’m happy to code this up if you can provide a CSV or other easily parseable version of the data. 🙂
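
    Steve's algorithm could be sketched in Python roughly as follows. The function name spans_of_n and the toy dates are assumptions for illustration; no real data is used here.

```python
from datetime import date
from statistics import mean, median, stdev

def spans_of_n(dates, n=2):
    """For each death D, return the number of days between D and the death
    n places later (n=2 gives the span covered by 3 consecutive deaths)."""
    ordered = sorted(dates)
    return [(ordered[i + n] - ordered[i]).days
            for i in range(len(ordered) - n)]

# Made-up dates for illustration only (not taken from stiffs.com).
toy = [date(2010, 1, 1), date(2010, 1, 2), date(2010, 1, 3),
       date(2010, 1, 10), date(2010, 1, 12)]
spans = spans_of_n(toy)
print(spans)          # [2, 8, 9]
print(median(spans))  # 8
print(mean(spans), stdev(spans))
```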

  2. Otto Says:

    Looking at your analysis, I find it unusual that the groups of 3 percentages, while never really being the most common, do remain strangely consistent across the 1-10 days leeway period.

    The groups of 1 and 2 fall as you increase the number of days (as you’d expect) and the groups of 4+ rise along the same lines. But the groups of 3 number stays within 20-25%, for the most part.

    Perhaps this is what gives rise to the belief: groups of three stay noticeable no matter how much leeway people allow in their minds. Noticing it happen 25% of the time is plenty enough for it to cross the “coincidence threshold” for most people.

  3. Stephen Says:

    (Steve, see below, comment responded to on FriendFeed.)

    Otto, I think you’re right that in general (using the rolling method, method 1) with a wide range for x, groups of 3 performs slightly less miserably than other groupings, usually at best 25%.

    I suspect there’s a selective perception going on (where people remember the groups of 3 but forget the other times when there isn’t), and you’re right that 25% is high enough to reinforce the belief, if you already hold it.

  4. Stephen Says:

    (The detailed follow-up discussion is at http://friendfeed.com/zeigen/10c5af72/celebrity-deaths-statistical-analysis)
