Friday, November 30, 2012

Q: Event A has a probability of 70% of happening within the year and event B, 40%. The events are independent and uniformly distributed through the year. What is the probability that they will occur within 3 months of each other?

There is more than one way to answer this question!

The ambiguity comes down to exactly how we interpret the statement that "Events A and B are uniformly distributed across the year."

First interpretation: Events A and B are produced by a memoryless process with uniform hazard function. Every day, we wake up and Event A has a certain uniform probability of happening that day, the same as every other day. Event B is independent and has its own uniform probability of happening that day. If the event doesn't happen, we go to bed and wake up the next day, and it's the exact same story, with the same probabilities, all over again, just like "Groundhog Day."

I'm saying "every day," but to be mathematically precise, this is not about days -- it's true of every instant in time. Every infinitesimal fraction of a second is treated the same. Past performance is of no use in trying to predict future results.

The effect is sort of like playing daily Russian roulette. Every time you pull the trigger, you have uniform probability 1/6 of killing yourself. Each time you pull the trigger is the same setup as before. I'm belaboring the point here because I do want to persuade you that this interpretation of the question has some appeal.

What makes this tricky is that if you do play Russian roulette every morning starting January 1, you are more likely to die earlier in the year than later -- it's more likely that you will end up killing yourself on day 1 or day 5 than on day 500,000.
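Here is a minimal Python sketch of that claim, assuming one pull per day with the same 1/6 risk each time. The function name is just for illustration.

```python
def first_fatal_day_probability(day: int, p: float = 1 / 6) -> float:
    """Chance that the first fatal pull happens exactly on `day` (1-indexed).

    Assumes one pull per day, each fatal with the same probability p,
    independent of everything that came before.
    """
    if day < 1:
        raise ValueError("day must be 1 or greater")
    # Survive the first (day - 1) pulls, then lose on this one.
    return (1 - p) ** (day - 1) * p


for day in (1, 5, 50):
    print(day, first_fatal_day_probability(day))
```

The per-pull risk never changes, yet the printed probabilities shrink for later days, because reaching day 50 requires surviving the 49 pulls before it.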

The probability distribution of event times that reflects this uniform hazard function is the exponential distribution f(x) = \lambda e^{-\lambda x}, where \lambda controls the probability that the event will occur within the first year, which is the figure you gave. (Or in other words, it controls the number of bullets in the chamber.)

To match the values you gave (and with the units of time in years), event A will have \lambda = \log \frac{10}{3} and event B will have \lambda = \log \frac{5}{3}.

That is to say, in any given short snippet of time, Event A is roughly 2.4 times as likely to happen as Event B, since \log \frac{10}{3} \approx 1.204 and \log \frac{5}{3} \approx 0.511. We can verify that this meets the constraints outlined in the problem:

\int_0^1 \log{(10/3)}\,e^{-\log (10/3) x} \, dx = 0.7

\int_0^1 \log{(5/3)}\,e^{-\log (5/3) x} \, dx = 0.4
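Both integrals have the closed form 1 - e^{-\lambda}, so the rates can be read off directly as \lambda = -\log(1 - p). A small Python sketch of that check follows; the helper names are only for illustration.

```python
import math


def rate_from_yearly_probability(p: float) -> float:
    """Rate lambda (per year) such that P(event within 1 year) = p.

    Solves 1 - exp(-lambda) = p for an exponential waiting time.
    """
    return -math.log(1.0 - p)


def probability_within(rate: float, years: float = 1.0) -> float:
    """CDF of the exponential distribution evaluated at `years`."""
    return 1.0 - math.exp(-rate * years)


rate_a = rate_from_yearly_probability(0.7)  # log(10/3)
rate_b = rate_from_yearly_probability(0.4)  # log(5/3)

print(rate_a, probability_within(rate_a))
print(rate_b, probability_within(rate_b))
print(rate_a / rate_b)  # ratio of the two hazard rates
```

The last line prints the ratio of the two rates, which comes out to about 2.36.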

And Mathematica can help us verify that the probability of an event at any particular moment, when you reach that moment, really is uniform:

  • dist1 := ExponentialDistribution[Log[10/3]]
  • dist2 := ExponentialDistribution[Log[5/3]]
  • Simplify[HazardFunction[dist1, x], x >= 0]
  • Log[10/3] (is a constant)
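
The same result can be checked by hand. Taking the usual definition of the hazard function as the density divided by the probability of surviving up to x (the symbols h and F are notation introduced here, not part of the Mathematica session), the exponential terms cancel:

```latex
h(x) = \frac{f(x)}{1 - F(x)}
     = \frac{\lambda e^{-\lambda x}}{e^{-\lambda x}}
     = \lambda, \qquad x \ge 0
```

Here F(x) = 1 - e^{-\lambda x} is the cumulative distribution function. The same cancellation gives \log \frac{5}{3} for dist2.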

Ok, so what is the answer to the problem? We just need to integrate up the area of the joint probability density function where the two events occur within a quarter year of each other. So that's one integral to cover the whole year for the first event, and then a second integral for the second event where we just cover the ground within plus or minus a quarter-year of the first event (making sure not to count area outside the year itself). Using Mathematica:

  • Integrate[PDF[dist1, x] PDF[dist2, y], {x, 0, 1}, {y, Max[0, x - 1/4], Min[1, x + 1/4]}]
  • (1/(50*Log[50/9]))*(15^(1/4)*(3*Sqrt[3]*(-1 + 2^(1/4)) + 5*Sqrt[5]*(-2 + 2^(3/4)))*Log[10/3] - (-59 + Sqrt[900 + 277*Sqrt[30]])*Log[50/9])

That's a bit nasty, but we can approximate it:

  • N[%,24]
  • 0.125554938301265317927791

So there you have it: if you have two independent memoryless processes, each with uniform hazard, and one has 70% probability of happening in a year and the other has 40% probability of happening in a year, then the probability they will both happen within the year and within 1/4 year of each other is about 12.6%.
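
As an independent sanity check on the integral, here is a rough Monte Carlo sketch in Python. It assumes the two event times are independent exponential draws with the rates above. The function name, trial count, and seed are arbitrary choices for this example.

```python
import math
import random


def simulate_memoryless(n_trials: int = 200_000, window: float = 0.25,
                        seed: int = 0) -> float:
    """Estimate P(both events occur within the year and within `window` years)."""
    rng = random.Random(seed)
    rate_a = math.log(10 / 3)  # 70% chance within one year
    rate_b = math.log(5 / 3)   # 40% chance within one year
    hits = 0
    for _ in range(n_trials):
        # Waiting times (in years) until each event, drawn independently.
        t_a = rng.expovariate(rate_a)
        t_b = rng.expovariate(rate_b)
        if t_a <= 1 and t_b <= 1 and abs(t_a - t_b) <= window:
            hits += 1
    return hits / n_trials


print(simulate_memoryless())
```

With a couple of hundred thousand trials the estimate should land within a fraction of a percentage point of the exact value, though the trailing digits will vary with the seed.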

Second interpretation: There's another way to interpret the question, which is to say that the day is chosen beforehand, by throwing a dart at a calendar, and just sticking to that secret date until it arrives. Here, there really is a memory. If the event has 100% probability of happening in a year, and we get to December 30th with no event, we know there's a 50-50 chance[*] it will happen that day. If it still doesn't happen, then we know there's a 100% chance it will happen on December 31st. Past performance is helpful in predicting the future.

[*] (A conditional probability of 0.5, given that the event hasn't happened yet and there are only two days left.)
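
That footnote can be sketched in a few lines of Python. This assumes a 365-day year and a date fixed in advance with every day equally likely. The function name and parameters are made up for illustration.

```python
from fractions import Fraction


def chance_today(day: int, yearly_probability: Fraction = Fraction(1),
                 days_in_year: int = 365) -> Fraction:
    """Conditional chance the event happens on `day`, given no event so far.

    Assumes the date was fixed in advance: each day of the year is equally
    likely, with total probability `yearly_probability` of landing in the year.
    """
    per_day = yearly_probability / days_in_year
    # Probability that the event has not happened on days 1 .. day - 1.
    still_waiting = 1 - per_day * (day - 1)
    return per_day / still_waiting


print(chance_today(1))    # January 1
print(chance_today(364))  # December 30
print(chance_today(365))  # December 31
```

Unlike the memoryless case, the conditional chance climbs as the year runs out: from 1/365 on January 1 to 1/2 on December 30 and 1 on December 31.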

Under this interpretation, we are just asking whether two darts thrown at a calendar will both hit the calendar and will be within 1/4 year of each other. Or in other words, if we pick a random number uniformly between 0 and 10/7, and another random number between 0 and 10/4, whether the two random numbers will both (a) lie between 0 and 1 and (b) lie within 1/4 of each other.

Or in still other words, if you make a rectangle that's 10/4 high and 10/7 wide, then what fraction of the area of the whole rectangle lies (a) within a unit square and (b) in the diagonal band of points that have an x-value within 1/4 of the corresponding y-value.

This can be done by adding up the area of a few triangles, but because I am lazy I also used Mathematica:

  • Integrate[(7/10) (4/10), {x, 0, 1}, {y, Max[0, x - 1/4], Min[1, x + 1/4]}]
  • 49/400
  • N[%]
  • 0.1225

So there you have it: if you select two instants in time uniformly at random, such that one has 70% probability of happening in a year and the other has 40% probability of happening in a year, then the probability they will both happen within the year and within a quarter-year of each other is 49/400, or 12.25%.
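
The triangle-based calculation mentioned above can be sketched in Python, alongside a quick simulation of the two darts. The band of points within 1/4 of the diagonal is the unit square minus two corner triangles with legs of 3/4. The function names, trial count, and seed are assumptions made for this example.

```python
import random
from fractions import Fraction


def band_fraction(window: Fraction) -> Fraction:
    """Fraction of the unit square where |x - y| <= window (0 <= window <= 1).

    The band is the square minus two right triangles with legs (1 - window).
    """
    return 1 - (1 - window) ** 2


def exact_dart_probability(p_a: Fraction, p_b: Fraction, window: Fraction) -> Fraction:
    """P(both darts land in the year and within `window` of each other)."""
    return p_a * p_b * band_fraction(window)


def simulated_dart_probability(n_trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo version: uniform draws on [0, 10/7] and [0, 10/4]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        x = rng.uniform(0, 10 / 7)
        y = rng.uniform(0, 10 / 4)
        if x <= 1 and y <= 1 and abs(x - y) <= 0.25:
            hits += 1
    return hits / n_trials


exact = exact_dart_probability(Fraction(7, 10), Fraction(4, 10), Fraction(1, 4))
print(exact, float(exact))
print(simulated_dart_probability())
```

The exact value is (7/10)(4/10)(7/16) = 49/400, and the simulated estimate should sit close to it.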


In summary, the answer does depend (albeit to a small degree) on exactly how we translate the somewhat intuitive statements of English into the language of probability. In the first case, the risk of the event remains uniform, and each day gets an equal and identical opportunity when it arrives. In the second case, the date of the event itself is chosen uniformly at random.

