Brighton Webs Ltd.
statistical and data services for industry

Confidence limits for Proportion - Normal Approximation

There are several ways of obtaining confidence limits for a proportion.  A proportion is defined as the ratio of the number of successful outcomes to the number of trials:

p = x / n

where x is the number of successes and n is the number of trials.

The terms trial and success can have varied meanings ranging from the number of positive outcomes in the clinical trial of a pharmaceutical product to an affirmative answer to the question "I enjoy breakfast more if I have the results of a political poll to read" in a consumer behavior survey.

The method used to calculate the confidence interval should be appropriate to the application and the sample size.  There are several accepted methodologies, including "Clopper-Pearson", which is known as the exact method and tends to give wide (conservative) limits, and the "Wilson score" method, which remains reliable for smaller samples and for proportions close to 0 or 1.  This page is based on the normal approximation to the binomial distribution (often called the Wald interval), which is appropriate for large samples drawn from large populations.  This method is used extensively by market research and polling organizations.

Related Pages

Supporting material can be found at:

Binomial Distribution
Normal Distribution
Standard Error

The Formula

The confidence interval is based on the z value appropriate to the confidence level and the standard error of the normal approximation to the binomial distribution:

p ± z √(p(1-p)/n)

The product of z and the standard error is often referred to as the margin of error.

For the 95% and 99% intervals, the values of z are 1.96 and 2.58 respectively.
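The z values quoted above can be checked from the standard normal distribution.  The following is a minimal sketch using Python's standard library; the function name z_for_confidence is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def z_for_confidence(level: float) -> float:
    """Return the two-sided critical value z for a confidence level such as 0.95."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    # A two-sided interval leaves (1 - level) / 2 in each tail,
    # so z is the quantile at 0.5 + level / 2 of the standard normal.
    return NormalDist().inv_cdf(0.5 + level / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.2f}")
```

Other confidence levels can be handled the same way, without a lookup table.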

Example

In a survey of 392 consumers, 186 responded that they would be more likely to buy a book if it won a major literary prize.  Provided the population from which the sample was drawn was very large, it can be inferred that this response is typical of 47.4% of people with a margin of error of 4.9% at the 95% confidence level.

The calculation is shown below.  The estimated probability of a positive response to the question is:

p = 186/392 = 0.474 (to 3 d.p.)

The standard error is:

√(0.474(1-0.474)/392) = 0.025

Plugging the numbers into the formula for the 95% confidence interval yields:

0.474 ± 1.96 × 0.025 = 0.474 ± 0.049

In percentage terms the confidence interval is 42.5% to 52.3%.
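The worked example can be reproduced in a few lines.  This is a sketch of the normal-approximation calculation only; the function name proportion_ci and its return layout are assumptions made for illustration.

```python
import math


def proportion_ci(successes: int, trials: int, z: float = 1.96):
    """Normal-approximation interval for a proportion.

    Returns (p, margin, lower, upper), where margin = z * standard error.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    # Standard error of the normal approximation to the binomial.
    standard_error = math.sqrt(p * (1.0 - p) / trials)
    margin = z * standard_error
    return p, margin, p - margin, p + margin


# Survey example: 186 positive responses out of 392, 95% level (z = 1.96).
p, margin, lower, upper = proportion_ci(186, 392)
print(f"p = {p:.3f}, margin = {margin:.3f}, interval = {lower:.3f} to {upper:.3f}")
```

The code carries p and the standard error unrounded, so its limits may differ in the last digit from a hand calculation that rounds at each step.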

Sample Size

The formula for the margin of error can be rearranged to provide the sample size for a given proportion, confidence level and margin of error E:

n = z² p(1-p) / E²

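Fixing the confidence level at 95% (z = 1.96) and the margin of error at 2.5% gives a curve of sample size against p that rises from zero at p = 0, peaks at about 1,537 at p = 0.5, and falls symmetrically back to zero at p = 1.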
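The shape of the curve can be tabulated directly.  The sketch below assumes the rearrangement n = z² p(1-p) / E², with E the margin of error as a proportion, and rounds up to a whole respondent; the name required_sample_size is hypothetical.

```python
import math


def required_sample_size(p: float, margin: float, z: float = 1.96) -> int:
    """Sample size needed for a given proportion p and margin of error."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    n = z * z * p * (1.0 - p) / (margin * margin)
    # Round away floating-point noise first, then round up to a whole person.
    return math.ceil(round(n, 6))


# 95% confidence (z = 1.96) and a 2.5% margin of error, as in the text.
for p in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"p = {p:.1f}: n = {required_sample_size(p, 0.025)}")
```

The printed values are symmetric about p = 0.5, which is where the largest sample is needed.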

In some situations, experience may suggest a value for p which can then be used as the basis for selecting the sample size.  However, in many situations the objective of the research is to measure p, so it is common practice to assume p = 0.5, which requires the largest sample size.  In that case, the sample size is determined by the margin of error, which is itself determined by the research budget.  Fixing p = 0.5 simplifies the formula for sample size to:

n = z² / (4E²)

where E is the margin of error expressed as a proportion.

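For a confidence level of 95%, it can be seen that the sample size increases rapidly as the margin of error decreases.  Because the sample size is proportional to 1/E², halving the margin of error quadruples the sample size.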
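To illustrate how quickly the requirement grows, the sketch below evaluates the worst-case (p = 0.5) form n = z² / (4E²) for a few margins of error.  The function name worst_case_sample_size and the chosen margins are assumptions for illustration.

```python
import math


def worst_case_sample_size(margin: float, z: float = 1.96) -> int:
    """Sample size for a given margin of error, assuming p = 0.5."""
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    n = z * z / (4.0 * margin * margin)
    # Round away floating-point noise first, then round up to a whole person.
    return math.ceil(round(n, 6))


# 95% confidence level (z = 1.96).
for margin in (0.05, 0.04, 0.03, 0.025, 0.02, 0.01):
    print(f"margin = {margin:.1%}: n = {worst_case_sample_size(margin)}")
```

Comparing the 5% and 2.5% rows shows the roughly fourfold increase that follows from halving the margin of error.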

Population Size

This discussion is based on the assumption that the population being sampled is very large.  If the population is small, the sample size may have to be adjusted.
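This page does not say which adjustment is intended.  One commonly used option is the finite population correction, n' = n / (1 + (n - 1)/N), where N is the population size.  The sketch below assumes that correction; the function name finite_population_sample_size is hypothetical.

```python
import math


def finite_population_sample_size(n: int, population: int) -> int:
    """Adjust a large-population sample size n for a finite population.

    Assumes the standard finite population correction; this is an
    illustrative choice, not necessarily the adjustment meant by the page.
    """
    if n <= 0 or population <= 0:
        raise ValueError("n and population must be positive")
    adjusted = n / (1.0 + (n - 1) / population)
    # Round away floating-point noise first, then round up to a whole person.
    return math.ceil(round(adjusted, 6))


# Worst-case size for a 2.5% margin at 95% confidence is 1537 (p = 0.5).
for population in (2_000, 10_000, 100_000, 1_000_000_000):
    print(f"N = {population}: n = {finite_population_sample_size(1537, population)}")
```

For very large populations the adjusted figure approaches the unadjusted one, which is consistent with the large-population assumption used elsewhere on this page.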

Page Updated: 26-Feb-2008

 

For more information: info@brighton-webs.co.uk