Issues in the Selection of a Stratified Random Sample

The preceding part dealt with how to make certain simple estimates from the data secured by a stratified random sample. It is now necessary to consider some of the issues involved in setting up a stratified random sample:

1. What characteristics should be used to subdivide the universe into different strata?
2. How many strata should be constructed?
3. How many observations should be taken in each stratum?

In a practical situation, these questions must be answered before data can be gathered and the estimates prepared. They are considered at this point, rather than earlier, because it is easier to discuss them after the general procedure of stratified random sampling and estimation has been sketched out.

How should the universe be stratified? As a general rule, a reasonable approach is to create strata on the basis of a variable known to be correlated with the variable of interest, and for which information on each universe element is known. In the Wheaties example, the individual universe elements (stores) are classified into an appropriate number of strata based on the known variable (total store sales) that is correlated with the measurement being studied (Wheaties sales).
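The classification step described above can be sketched in a few lines of Python. This is only an illustration: the store names, sales figures, and cutoff values are hypothetical, and in practice the cutoffs would be chosen from the actual distribution of total store sales.

```python
from bisect import bisect_right

def assign_stratum(total_sales, cutoffs):
    """Return the 0-based stratum index for one store.

    cutoffs must be sorted in ascending order. A store whose sales equal
    a cutoff is placed in the higher stratum.
    """
    return bisect_right(cutoffs, total_sales)

# Hypothetical cutoffs (annual total store sales, in dollars) giving 3 strata:
# stratum 0: below 100,000; stratum 1: 100,000 up to 499,999; stratum 2: 500,000+
cutoffs = [100_000, 500_000]

# Hypothetical universe listing: store name -> known total store sales
stores = {"A": 40_000, "B": 100_000, "C": 750_000}

stratum_of = {name: assign_stratum(sales, cutoffs) for name, sales in stores.items()}
print(stratum_of)  # {'A': 0, 'B': 1, 'C': 2}
```

Note that the sketch needs the stratification variable (total store sales) for every store in the universe, which is exactly the requirement stated in the general rule.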

Strata should be constructed in a way that minimizes differences among sampling units within strata and maximizes differences among strata. As a result, a relatively small sample within each stratum will provide a precise measurement of that stratum's mean. The weighting together of the different stratum sample means will generally provide a better estimate of the universe mean than would be provided by a simple random sample of the same total number of units.
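The weighting of stratum sample means can be illustrated with a minimal sketch. It assumes each stratum's weight is its share of the universe (the number of elements in the stratum divided by the universe total); the stratum sizes and sample values below are hypothetical and are not taken from the Wheaties example.

```python
def stratified_mean(strata):
    """Estimate the universe mean from per-stratum samples.

    strata: list of (stratum_size, sample_values) pairs, where stratum_size
    is the number of universe elements in that stratum.
    """
    universe_size = sum(size for size, _ in strata)
    estimate = 0.0
    for size, sample in strata:
        weight = size / universe_size          # stratum's share of the universe
        stratum_mean = sum(sample) / len(sample)
        estimate += weight * stratum_mean      # weighted contribution
    return estimate

# Hypothetical data: (number of stores in stratum, sampled unit sales)
strata = [
    (800, [2, 4, 3, 3]),   # small stores, sample mean 3
    (150, [10, 14, 12]),   # medium stores, sample mean 12
    (50, [40, 60]),        # large stores, sample mean 50
]

print(round(stratified_mean(strata), 2))  # 6.7
```

An unweighted average of the nine sampled values would be much higher, because the large stores make up two of nine sampled units but only 5 percent of the universe. The weights correct for this.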

In practical marketing research, it has been found that geography and population density are useful stratification bases when sampling human populations. When sampling institutions that vary greatly in size, such as stores and manufacturers, one will almost always stratify on some measure of size, such as store dollar volume or number of employees. A measure of size is used because of the great variability typically found within such universes in important variables such as sales and inventories.

How many strata should be constructed? Common sense suggests that as many strata as possible be used, so that each stratum will be as homogeneous as possible. If estimates are wanted for particular universe subgroups (e.g., different store size categories), then it will be necessary to set up a separate stratum for each. Each stratum mean can then be estimated with high precision. In turn, the overall population mean will be estimated with high precision.

However, practical considerations limit the number of strata that is feasible. Costs of adding more strata may soon outrun benefits. Also, as with simple random sampling, one must have a separate listing of all the items comprising each individual stratum in order to sample separately from each. In many situations such lists are not available.

How many observations should be taken in each stratum? Once the composition and number of strata have been decided, the next question is how many sampling units should be drawn from each stratum. To illustrate the problem, suppose one has a fixed budget and that the cost per observation is the same for all strata. This amounts to saying that the total sample size (for all strata) is fixed.

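Proportional Allocation: The most obvious approach, and generally the most common when sampling human populations, is proportional allocation. Every stratum is sampled at the same rate, so each stratum's share of the sample equals its share of the universe.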
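A minimal sketch of proportional allocation follows, assuming a fixed total sample size and known stratum sizes. The stratum sizes are hypothetical, and the rule used to handle fractional sample sizes (giving leftover units to the largest fractional remainders) is one reasonable choice among several.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Fractional results are rounded down, and any units left over are
    assigned to the strata with the largest fractional remainders, so the
    allocation always sums to total_sample.
    """
    universe_size = sum(stratum_sizes)
    exact = [total_sample * size / universe_size for size in stratum_sizes]
    alloc = [int(x) for x in exact]                 # round down
    leftover = total_sample - sum(alloc)
    by_remainder = sorted(range(len(exact)),
                          key=lambda i: exact[i] - alloc[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc

# Hypothetical universe of 1,000 stores in three strata, total sample of 100
print(proportional_allocation([800, 150, 50], 100))  # [80, 15, 5]
```

Here every stratum is sampled at the same rate of 10 percent, which is what makes the allocation proportional.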

Disproportional Allocation: There are circumstances where some form of disproportional allocation should be considered (i.e., sampling different strata at different rates). These arise most commonly when sampling institutional universes (e.g., grocery stores, manufacturers) rather than human universes. As a general principle, when the variability among observations within a stratum is high, one samples that stratum at a higher rate than strata with less internal variation.
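One standard way to act on this principle, when the total sample size is fixed and the cost per observation is the same in every stratum, is Neyman allocation: each stratum's sample size is made proportional to its size multiplied by its within-stratum standard deviation. The text does not name this method, so the sketch below is offered only as one possible form of disproportional allocation. The stratum sizes and standard deviations are hypothetical; in practice the standard deviations would have to be estimated from earlier data or a pilot study.

```python
def neyman_allocation(stratum_sizes, stratum_sds, total_sample):
    """Allocate total_sample in proportion to (stratum size x stratum SD).

    Returns unrounded sample sizes. Rounding, and capping a stratum's
    sample at its stratum size, are left to the caller.
    """
    products = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [total_sample * p / total for p in products]

# Hypothetical strata: small, medium, and large stores
sizes = [800, 150, 50]   # number of stores in each stratum
sds = [1, 4, 20]         # assumed within-stratum standard deviation of sales

alloc = neyman_allocation(sizes, sds, 60)
rates = [n / size for n, size in zip(alloc, sizes)]

print(alloc)  # [20.0, 15.0, 25.0]
print(rates)  # [0.025, 0.1, 0.5]
```

In this illustration the large-store stratum holds only 5 percent of the universe but receives the largest share of the sample, because its internal variability is so much greater. Proportional allocation of the same 60 units would have given it only 3.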