Let's say you did 300 trials, and observed 30 successes and 270 failures.

- One **approximate** method is to use the normal approximation to the binomial distribution, which gives the "Wald interval."
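As a rough sketch, the Wald formula (observed proportion plus or minus z standard errors) could be written in Python like this. The function name `wald_interval` is just my own label for illustration, and z = 1.96 is the usual normal quantile for a 95% interval.

```python
import math

def wald_interval(successes, total, z=1.96):
    """Normal-approximation (Wald) interval for a binomial proportion."""
    p_hat = successes / total
    # Standard error of p_hat: sqrt(p_hat * (1 - p_hat) / total), which is the
    # same quantity as sqrt(successes * failures / total**3).
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total)
    return p_hat - half_width, p_hat + half_width

low, high = wald_interval(30, 300)
print(f"{low:.3f} to {high:.3f}")
```

Note that nothing in this formula keeps the limits inside [0, 1], which is one symptom of it being only an approximation.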

With the example above, your 95% confidence interval would be 30/300 plus or minus 1.96 * sqrt( 30*270 / 300^3 ), or in other words between **6.6% and 13.4%**.

- Another **approximate** method is simply to assume that the underlying parameter p was drawn from a uniform distribution between zero and one, and then to find a region that contains 95% of the conditional probability mass given the observation. There are many regions you could choose, but one popular choice is to arrange it so that the conditional probability is 2.5% that p lies below the lower limit of the interval, and 2.5% that p lies above the upper limit.

In Mathematica:

    limits[successes_, total_] := InverseBetaRegularized[{0.025, 0.975}, successes + 1, (total - successes) + 1]
    limits[30, 300]
    {0.0710558, 0.13922}
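If you don't have Mathematica handy, here is a minimal Python sketch of the same calculation using only the standard library. It leans on the identity that, for whole-number parameters, the Beta(successes + 1, failures + 1) CDF at x equals the probability that a Binomial(total + 1, x) variable is at least successes + 1, and it finds the quantiles by bisection. The helper names `posterior_cdf` and `posterior_quantile` are my own, not from any library.

```python
import math

def posterior_cdf(x, successes, total):
    """CDF at x of the Beta(successes + 1, total - successes + 1) posterior.

    For integer parameters this equals P(Binomial(total + 1, x) >= successes + 1).
    """
    m = total + 1
    return sum(math.comb(m, j) * x**j * (1 - x)**(m - j)
               for j in range(successes + 1, m + 1))

def posterior_quantile(q, successes, total, tol=1e-10):
    """Find x with posterior_cdf(x) = q by bisection (the CDF is increasing in x)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if posterior_cdf(mid, successes, total) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def limits(successes, total):
    """Equal-tailed 95% interval: 2.5% of posterior mass below, 2.5% above."""
    return (posterior_quantile(0.025, successes, total),
            posterior_quantile(0.975, successes, total))

print(limits(30, 300))
```

The result should agree closely with the Mathematica output above.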

This technique gives an interval between **7.1% and 13.9%**. This is known as a Bayesian credible interval with a uniform prior.

- A sophisticated **exact** method is the Blyth-Still-Casella confidence interval. This method was first described in a 1986 paper by Casella and is pretty complicated. It is the preferred interval of the StatXact software from Cytel.

I have implemented it here: https://github.com/keithw/biostat , and worked to match the output of StatXact.

    $ ./bsctester 300 30 0.95
    Observed 30 successes in 300 trials.
    Blyth-Still-Casella 95% confidence interval: [0.06849168 0.13821040]

The Blyth-Still-Casella interval for this test case goes between **6.8% and 13.8%** and is "exact." That means that unlike the above two methods, going into the experiment this procedure is **guaranteed** to have at least the specified probability of including the true value, no matter what it is. For example, if you did N experiments, each with 300 trials and each with the same true value of the p parameter, and calculated a Blyth-Still-Casella 95% interval from each experiment, as N went to infinity at least 95% of the N intervals would include the right answer.

Blyth-Still-Casella is not the only "exact" interval for this situation -- the simpler Clopper-Pearson interval is often taught in introductory statistics classes and is also exact. But the Clopper-Pearson interval is generally wider than the Blyth-Still-Casella interval. "Exact" doesn't mean the interval limits are particular values; it just means the guarantee works for all values of the p parameter. We can also create an exact interval procedure that always returns the entire interval [0,1], but this is not very helpful!
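To make the "exact" guarantee concrete, here is a small Python sketch. It is not the Blyth-Still-Casella procedure (that one really is complicated). Instead it builds the simpler Clopper-Pearson interval by bisection on binomial tail probabilities, then computes the true coverage of a procedure at a given p by summing the binomial probabilities of every outcome whose interval contains p. All function names are my own, and I use 50 trials for the coverage check purely to keep it fast.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); increasing in p when k >= 1."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))

def solve_increasing(f, target, tol=1e-9):
    """Bisection for f(p) = target, assuming f is increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, conf=0.95):
    alpha = 1 - conf
    # Lower limit: the p at which k or more successes has probability alpha/2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: upper_tail(k, n, p), alpha / 2)
    # Upper limit: the p at which k or fewer successes has probability alpha/2,
    # i.e. P(X >= k + 1) = 1 - alpha/2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: upper_tail(k + 1, n, p), 1 - alpha / 2)
    return lower, upper

def wald(k, n, z=1.96):
    p_hat = k / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(interval, n, p):
    """Exact probability that interval(k, n) contains p when k ~ Binomial(n, p)."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = interval(k, n)
        if lo <= p <= hi:
            total += binom_pmf(k, n, p)
    return total

print("Clopper-Pearson for 30/300:", clopper_pearson(30, 300))
print("Clopper-Pearson coverage at p = 0.1, n = 50:", coverage(clopper_pearson, 50, 0.1))
print("Wald coverage at p = 0.1, n = 50:", coverage(wald, 50, 0.1))
```

In this setup the Clopper-Pearson coverage comes out at 95% or better, as the guarantee requires, while the Wald coverage falls noticeably short of 95%.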
