Friday, December 21, 2012

Q: The probability of success in each of a series of independent trials is constant. How can a 95% confidence interval for this proportion be obtained?

This is called the "binomial confidence interval," and there are a few solutions. Wikipedia discusses this here: Binomial proportion confidence interval

Let's say you did 300 trials, and observed 30 successes and 270 failures.

  • One approximate method is to use the normal approximation to the binomial distribution. The resulting interval is called the "Wald interval."

    With the example above, your 95% confidence interval would be 30/300 plus or minus 1.96 * sqrt( 30*270 / 300^3 ), or in other words between 6.6% and 13.4%.
  • Another approximate method is to treat the underlying parameter p as if it had been drawn from a uniform distribution between zero and one, and then to find an interval that contains 95% of the conditional probability mass for p given the observation. There are many such intervals you could choose, but one popular choice is to arrange it so that the conditional probability is 2.5% that p lies below the lower limit of the interval, and 2.5% that p lies above the upper limit.

    In Mathematica:
    • limits[successes_, total_] := InverseBetaRegularized[{0.025, 0.975}, successes + 1, (total - successes) + 1]
    • limits[30, 300]
      {0.0710558, 0.13922}

    This technique gives an interval between 7.1% and 13.9%. This is known as a Bayesian credible interval with a uniform prior.
  • A sophisticated exact method is the Blyth-Still-Casella confidence interval. This method was first described in a 1986 paper by Casella and is pretty complicated. It is the preferred interval of the StatXact software from Cytel.

    I have implemented it and worked to match the output of StatXact.

    • $ ./bsctester 300 30 0.95
      Observed 30 successes in 300 trials.
      Blyth-Still-Casella 95% confidence interval: [0.06849168 0.13821040]

    The Blyth-Still-Casella interval for this test case goes between 6.8% and 13.8% and is "exact." That means that, unlike the above two methods, this procedure is guaranteed going into the experiment to have at least the specified probability of including the true value, no matter what that value is. For example, if you did N experiments, each with 300 trials and each with the same true value of the p parameter, and calculated a Blyth-Still-Casella 95% interval from each experiment, then as N went to infinity the fraction of the N intervals that include the true value would be at least 95%.

    Blyth-Still-Casella is not the only "exact" interval for this situation -- the simpler Clopper-Pearson interval is often taught in introductory statistics classes and is also exact. But the Clopper-Pearson interval will be wider than the Blyth-Still-Casella interval. "Exact" doesn't mean the interval limits are particular values; it just means the guarantee works for all values of the p parameter. We can also create an exact interval procedure that always returns the entire interval [0,1], but this is not very helpful!
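
For readers without Mathematica, here is a rough Python sketch of the two approximate methods above (the Wald interval and the uniform-prior interval). It is not the code that produced the numbers in this post. The function names are my own, and instead of InverseBetaRegularized it leans on the identity that, for whole-number a and b, the Beta(a, b) CDF at x equals P(X >= a) for X ~ Binomial(a + b - 1, x), inverted by plain bisection.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal (about 1.96)


def wald_interval(successes, total, z=Z_95):
    """Normal-approximation (Wald) interval for a binomial proportion."""
    p_hat = successes / total
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat - half_width, p_hat + half_width


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, valid only for positive integers a and b."""
    return binomial_upper_tail(a, a + b - 1, x)


def beta_quantile_int(q, a, b, iterations=60):
    """Invert beta_cdf_int by bisection (the CDF is increasing in x)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def uniform_prior_interval(successes, total, level=0.95):
    """Equal-tailed posterior interval for p under a uniform prior.

    The posterior is Beta(successes + 1, failures + 1).
    """
    tail = (1 - level) / 2
    a = successes + 1
    b = (total - successes) + 1
    return beta_quantile_int(tail, a, b), beta_quantile_int(1 - tail, a, b)


if __name__ == "__main__":
    for name, interval in [("Wald", wald_interval), ("Uniform prior", uniform_prior_interval)]:
        lower, upper = interval(30, 300)
        print(f"{name}: {100 * lower:.1f}% to {100 * upper:.1f}%")
```

For 30 successes in 300 trials this should agree with the figures quoted above: roughly 6.6% to 13.4% for the Wald interval and 7.1% to 13.9% for the uniform-prior interval. The bisection is slow but has no dependencies; a library routine for the inverse regularized beta function would be the more usual choice.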

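The "exact" guarantee can be checked directly for small cases. The sketch below is my own illustration, not the Blyth-Still-Casella code: it builds the simpler Clopper-Pearson interval by bisection on the binomial CDF, then computes the true coverage probability of a procedure by summing the binomial probabilities of every outcome whose interval contains p. The names are hypothetical, and the choice of 50 trials with p = 0.1 is only there to keep the brute-force sum fast.

```python
import math

Z_95 = 1.959963984540054  # 97.5th percentile of the standard normal (about 1.96)


def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def solve_decreasing(f, target, iterations=60):
    """Bisection for the p in (0, 1) where a decreasing function f crosses target."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the crossing lies at larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, total, level=0.95):
    """Clopper-Pearson interval: invert two one-sided binomial tests."""
    tail = (1 - level) / 2
    # Lower limit: the p at which P(X >= successes) equals tail.
    if successes == 0:
        lower = 0.0
    else:
        lower = solve_decreasing(lambda p: binom_cdf(successes - 1, total, p), 1 - tail)
    # Upper limit: the p at which P(X <= successes) equals tail.
    if successes == total:
        upper = 1.0
    else:
        upper = solve_decreasing(lambda p: binom_cdf(successes, total, p), tail)
    return lower, upper


def wald(successes, total, z=Z_95):
    """Normal-approximation (Wald) interval, for comparison."""
    p_hat = successes / total
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat - half_width, p_hat + half_width


def coverage(interval, total, p):
    """Probability that interval(k, total) contains p when k ~ Binomial(total, p)."""
    covered = 0.0
    for k in range(total + 1):
        lower, upper = interval(k, total)
        if lower <= p <= upper:
            covered += binom_pmf(k, total, p)
    return covered


if __name__ == "__main__":
    print("Clopper-Pearson for 30 of 300:", clopper_pearson(30, 300))
    print("Wald coverage, 50 trials, p = 0.1:", coverage(wald, 50, 0.1))
    print("Clopper-Pearson coverage, 50 trials, p = 0.1:", coverage(clopper_pearson, 50, 0.1))
```

With 50 trials and p = 0.1, the Wald procedure covers the true value noticeably less than 95% of the time, while Clopper-Pearson stays at or above 95%. That gap is the difference between an approximate and an "exact" interval.
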