# Cluster Sampling

In other probability sampling methods, such as simple random sampling, each item in the sample is chosen one at a time from a complete list of universe elements. In marketing research practice, it is sometimes more efficient and expedient to select clusters or groups of universe elements rather than to choose sample items individually. For example, it would be more expedient to interview in person a cluster of nearby households than to interview the same number of geographically dispersed households.

Sampling methods in which universe elements are chosen in groups rather than individually are called cluster sampling methods. They are widely used in the sampling of human populations. When no complete universe listing exists (e.g. of adults over 40), a type of cluster sampling called area sampling may be the only practically feasible form of probability sampling.

Example: The following universe of 16 houses located on four city blocks illustrates the concept of cluster sampling and how it differs from simple random sampling:

| Block | Houses |
|-------|--------|
| 1 | X1, X2, X3, X4 |
| 2 | X5, X6, X7, X8 |
| 3 | X9, X10, X11, X12 |
| 4 | X13, X14, X15, X16 |

Suppose a probability sample of eight houses is to be chosen from this universe.

One way would be to choose a simple random sample of eight houses. But suppose that, for some reason, it is not desirable to carry out such a sampling method. An alternative way would be to select two of the four blocks at random and then sample all four houses on each of the two selected blocks.

Such a sampling technique would be a probability sampling scheme because every house in the universe would have a known probability of being chosen, namely a chance of one in two.

However, not every possible sample of eight houses would have the same chance of being selected (as would be true in simple random sampling). The selection of one house on a block automatically means the inclusion in the sample of all other houses on that block. With this procedure (cluster sampling) it is impossible for certain samples to be selected. For example, in the cluster sampling process described above, the following combination of houses could not occur: X1, X2, X5, X6, X9, X10, X13, X14. This is because the original universe of 16 houses has been redefined as a universe of four clusters, and a random sample of two clusters has been drawn.

This procedure amounts to sampling from a universe of groups or clusters of items rather than sampling from a universe of individual items.
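The house example can be checked by brute-force enumeration. The following is a minimal sketch, assuming the two blocks are chosen with equal probability and without replacement; the variable names (`blocks`, `cluster_samples`, `inclusion`, `mixed`) are illustrative and not part of the original example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb
import random

# The 16-house universe from the example, grouped by city block.
blocks = {
    1: ["X1", "X2", "X3", "X4"],
    2: ["X5", "X6", "X7", "X8"],
    3: ["X9", "X10", "X11", "X12"],
    4: ["X13", "X14", "X15", "X16"],
}
houses = [h for block in blocks.values() for h in block]

# Every possible cluster sample: pick 2 of the 4 blocks, keep all their houses.
cluster_samples = [blocks[a] + blocks[b] for a, b in combinations(blocks, 2)]

# Simple random sampling would allow any 8 of the 16 houses.
n_srs_samples = comb(len(houses), 8)

# Share of the equally likely cluster samples that contain each house.
inclusion = {
    h: Fraction(sum(h in s for s in cluster_samples), len(cluster_samples))
    for h in houses
}

# The combination cited in the text as impossible under cluster sampling.
mixed = {"X1", "X2", "X5", "X6", "X9", "X10", "X13", "X14"}
mixed_possible = any(set(s) == mixed for s in cluster_samples)

# Draw one cluster sample (seed fixed only to make the sketch repeatable).
rng = random.Random(0)
drawn = rng.choice(cluster_samples)

print(len(cluster_samples), n_srs_samples)
print(set(inclusion.values()))
print(mixed_possible)
print(drawn)
```

Under these assumptions there are only 6 possible cluster samples, against 12,870 possible simple random samples of eight houses. Every house still has an inclusion probability of one in two, and the mixed combination cited above never appears.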

Relative Efficiency of Cluster Sampling and Random Sampling: The statistical efficiency of two sampling systems can be evaluated by comparing the standard errors of the mean that result from each system when both use the same sample size. The sampling system that results in the smaller standard error of the mean is judged the more statistically efficient.

The statistical efficiency of cluster sampling depends on the composition of the clusters. Cluster sampling will be more statistically efficient than simple random sampling (i.e. have a smaller standard error for the same sample size) if each cluster can be made to represent most of the possible observations that can be obtained from the universe. In contrast, if each cluster represents only a few different universe observations, then cluster sampling will be less statistically efficient (i.e. it will have a larger standard error) than a simple random sample of the same size.

For example, a sample of two city blocks and 30 households on each block would almost certainly be less statistically efficient than a simple random sample of 60 households drawn from the entire city. Intuitively, the reason is that two blocks (with a sample of 30 households on each) would be less likely to represent the entire city than would a widely scattered simple random sample of 60 households.

In practice, cluster samples are often less statistically efficient than simple random samples of the same size. This is because the items within each cluster are relatively homogeneous. Clusters are often defined in terms of geographic areas, such as city blocks or ZIP code areas, in order to reduce the travel time and interviewing costs associated with personal interviews. Because universe items in the same area often have similar demographic characteristics and are subject to common socio-economic forces, they tend to give similar responses to marketing research questions.
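The effect of cluster composition on the standard error can be illustrated with a small exact calculation. This is a sketch under stated assumptions, not data from any study: the 16 universe values (the integers 1 to 16), the two ways of grouping them into four clusters, and the helper names `se_of_mean` and `cluster_samples` are all hypothetical. Each standard error is computed as the spread of the sample mean over every equally likely sample of eight items.

```python
from itertools import combinations
from statistics import pstdev

def se_of_mean(samples):
    """Standard error: spread of the sample mean across all equally likely samples."""
    return pstdev([sum(s) / len(s) for s in samples])

def cluster_samples(clusters, n_clusters):
    """All samples formed by taking every item in n_clusters of the clusters."""
    return [sum(chosen, []) for chosen in combinations(clusters, n_clusters)]

# Hypothetical universe: 16 items with values 1..16.
universe = list(range(1, 17))

# Homogeneous clusters: neighbours with similar values share a cluster.
homogeneous = [universe[i:i + 4] for i in range(0, 16, 4)]  # [1,2,3,4], [5,6,7,8], ...

# Heterogeneous clusters: each cluster spans the range of universe values.
heterogeneous = [universe[i::4] for i in range(4)]          # [1,5,9,13], [2,6,10,14], ...

# Same sample size (8 items) under all three designs.
se_srs = se_of_mean(list(combinations(universe, 8)))
se_homogeneous = se_of_mean(cluster_samples(homogeneous, 2))
se_heterogeneous = se_of_mean(cluster_samples(heterogeneous, 2))

print(f"simple random sample:   {se_srs:.2f}")
print(f"homogeneous clusters:   {se_homogeneous:.2f}")
print(f"heterogeneous clusters: {se_heterogeneous:.2f}")
```

With these made-up values, the standard errors come out at roughly 1.19 for the simple random sample, 2.58 for homogeneous clusters and 0.65 for heterogeneous clusters. The ordering matches the argument above: clusters of similar items lose efficiency relative to simple random sampling, while clusters that each span the universe gain it.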