# Confidence intervals for percentages

As in the case of the universe mean, the researcher may also wish to construct confidence limits for universe percentages. Fortunately, the theory is identical to that used to construct confidence limits for the universe mean since a percentage is but a special case of the mean. It follows that the sampling distribution of a percentage is, for large samples, approximately normally distributed. The standard error of a percentage from a simple random sample is estimated by the formula
$S_p = \sqrt{\dfrac{pq}{n}}$
where:

p = percentage of items in the sample possessing a given characteristic
q = percentage of items not possessing the characteristic (q = 100 − p)
n = sample size

Example: A simple random sample of 100 families shows 40 own a dog and 60 do not. The estimated standard error $S_p$ would be computed as follows:

$S_p = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(40)(60)}{100}} = \sqrt{24} \approx 4.9\%$

The 95.4 percent confidence interval would be

$p \pm 2S_p = 40\% \pm 2(4.9\%) = 40\% \pm 9.8\%$

Thus, one would be 95.4 percent confident that the true percentage of dog ownership was between 30.2 and 49.8 percent. In the above example, if the sample size were 600 instead of 100, the standard error would fall to $\sqrt{(40)(60)/600} = 2\%$, and the confidence interval would be 40% ± 2(2%), or 36 to 44 percent. This again demonstrates how researchers can control the width of the confidence interval through the choice of sample size.
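The dog ownership arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `percentage_confidence_interval` and its parameters are assumptions introduced here, and percentages are expressed on a 0 to 100 scale to match the text.

```python
import math


def percentage_confidence_interval(p, n, z=2.0):
    """Return (standard_error, lower, upper) for a sample percentage.

    p is the sample percentage on a 0-100 scale, n is the sample size,
    and z is the number of standard errors (2 for 95.4 percent confidence).
    """
    q = 100.0 - p
    standard_error = math.sqrt(p * q / n)
    return standard_error, p - z * standard_error, p + z * standard_error


# The two sample sizes discussed in the text: 100 and 600 families.
for n in (100, 600):
    se, low, high = percentage_confidence_interval(40, n)
    print(f"n={n}: Sp={se:.1f}%, interval {low:.1f}% to {high:.1f}%")
```

With p = 40, the function reproduces the 30.2 to 49.8 percent interval for n = 100 and the 36 to 44 percent interval for n = 600.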

Larger samples may be needed for percentages

Caution is needed when constructing confidence intervals for values of p below 30% or above 70%. In such situations, a sample of more than 100 is needed if the normal approximation to the sampling distribution is to be satisfactory. For example, if only 2% of a universe had a certain characteristic, the sample percentage from that universe would not be approximately normally distributed unless n were extremely large.
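One rough way to see why extreme percentages need larger samples is to look at two diagnostics: the limits produced by the normal approximation, and the skewness of the binomial distribution, (q − p)/√(npq) with p and q as proportions. The sketch below is an illustration under assumptions, not a method from the text: the function names are hypothetical, and the sample size of 5,000 is an arbitrary value chosen for comparison.

```python
import math


def normal_limits(p, n, z=2.0):
    """Normal-approximation limits p +/- z*Sp, with p on a 0-100 scale."""
    standard_error = math.sqrt(p * (100.0 - p) / n)
    return p - z * standard_error, p + z * standard_error


def binomial_skewness(p, n):
    """Skewness of the sample percentage; 0 means perfectly symmetric."""
    prop = p / 100.0
    return (1.0 - 2.0 * prop) / math.sqrt(n * prop * (1.0 - prop))


# (percentage, sample size) pairs; 5000 is an arbitrary "large" n.
for p, n in [(40, 100), (2, 100), (2, 5000)]:
    low, high = normal_limits(p, n)
    skew = binomial_skewness(p, n)
    print(f"p={p}%, n={n}: limits {low:.2f}% to {high:.2f}%, skewness {skew:.2f}")
```

For p = 2% and n = 100 the lower limit falls below zero, which is impossible for a percentage, and the skewness is far from zero. Both problems shrink as n grows.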

Concluding Comments on Interpretation of Confidence Interval Results:

Before leaving this subject, it must be emphasized that confidence interval interpretation is based on what would happen if a very large number of samples were drawn from the particular universe of interest. Ordinarily only one sample is chosen, and the calculated confidence interval either will or will not cover the universe mean. What is guaranteed when a specific confidence level is used (for example, 95.4 percent) is that 95.4 percent of the confidence interval statements will be correct in the long run.

For instance, in the food expenditure example, the confidence interval statement is that, with 95.4 percent confidence, the interval \$384–\$416 will bracket the universe mean expenditure. This particular statement is either true or false. What the researcher is guaranteed is that if a large number of samples were drawn, and the interval calculated this way each time, then about 95.4 percent of the calculated intervals would in fact include the universe mean.
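The long-run interpretation can be illustrated with a small simulation. The sketch below assumes a hypothetical, normally distributed universe with a mean of 400 and a standard deviation of 80, sampled 100 items at a time. These values are not from the text; they are picked only so that the intervals have roughly the width of the food expenditure example. The function name and the fixed random seed are also assumptions.

```python
import random
import statistics


def coverage_rate(universe_mean, universe_sd, n, trials, z=2.0, seed=0):
    """Fraction of intervals (mean +/- z standard errors) that cover the universe mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    covered = 0
    for _ in range(trials):
        # Draw one simple random sample from the hypothetical universe.
        sample = [rng.gauss(universe_mean, universe_sd) for _ in range(n)]
        sample_mean = statistics.fmean(sample)
        standard_error = statistics.stdev(sample) / n ** 0.5
        lower = sample_mean - z * standard_error
        upper = sample_mean + z * standard_error
        if lower <= universe_mean <= upper:
            covered += 1
    return covered / trials


rate = coverage_rate(universe_mean=400, universe_sd=80, n=100, trials=2000)
print(f"{rate:.1%} of the calculated intervals covered the universe mean")
```

Each individual interval either covers 400 or does not. Only the proportion across many repetitions should settle near 95 percent.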

Example: Consider the sample size problem posed at the beginning of this article. Incentive travel usage is projected to be 40 percent, and the hotel chain wishes to be virtually certain (i.e., 99.7 percent confident) that its sample-based percentage will be within 10 percentage points of the actual universe percentage.

1) Use the desired precision and the desired confidence level to calculate the required standard error of the percentage. Because the hotel chain wishes to be virtually certain (99.7 percent) that the sample result will be within ± 10 percentage points of the universe percentage, three standard errors of the percentage must be set equal to 10 percentage points. The sample size needed will then be such that:

$3S_p = 10\%$

$S_p = 3.33\%$

2) Calculate the required sample size. Because incentive travel usage is projected to be 40 percent, that percentage can be used in the formula as an estimate of the percentage of items in the sample possessing the given characteristic (p). Then, using $S_p$ = 3.33%, p = 40% and q = 60% in the above formula, the hotel chain can calculate that it must take a simple random sample of size:

$n = \dfrac{pq}{S_p^2} = \dfrac{(40)(60)}{(3.33)^2} \approx 216$

Because this sample size is less than 5 percent of the universe size (N = 21,476), no adjustment is needed.
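The two steps of the hotel chain example can be combined into a short Python sketch. The function names are assumptions made for illustration. Rounding up to a whole respondent is a convention adopted here rather than something stated in the text, and it makes no difference in this example.

```python
import math


def required_sample_size(p, precision, z):
    """Sample size so that z standard errors equal the desired precision.

    p and precision are in percentage points (0-100 scale). Setting
    Sp = precision / z and solving n = p*q / Sp**2 gives
    n = p*q*z**2 / precision**2.
    """
    q = 100 - p
    n = p * q * z ** 2 / precision ** 2
    return math.ceil(n)  # round up to a whole respondent (convention assumed here)


def needs_adjustment(n, universe_size, threshold=0.05):
    """True when the sample is at least 5 percent of the universe."""
    return n / universe_size >= threshold


n = required_sample_size(p=40, precision=10, z=3)
print(f"required sample size: {n}")
print(f"adjustment needed: {needs_adjustment(n, 21476)}")
```

With the projected 40 percent usage, a precision of 10 percentage points, and three standard errors, this gives the 216 shown above. Because 216 is about 1 percent of the 21,476 items in the universe, the 5 percent check reports that no adjustment is needed.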