# Sample Design

Sample design is a problem that must be addressed in any sampling operation. This subject may be divided into (1) determining sampling units, (2) selecting the sample items and (3) estimating universe characteristics from sample data.

The sampling section will be devoted to an examination of these topics, with particular emphasis on methods of sample selection.

Determining Sampling Units: Consider the problem of finding the proportion of grocery stores in the New York metropolitan area that stock Claussen pickles. Here grocery stores would be the units observed, and it would therefore be reasonable to consider a direct sampling procedure. Given a list of all New York metropolitan area grocery stores, it would be relatively easy to choose a sample.

If no such list were available, however, it would be necessary to resort to some indirect method of sampling stores. One might, for example, choose a sample of areas (such as city blocks) and observe all, or a specific fraction, of the grocery stores located in the chosen blocks. Thus, where a list of the units to be studied is not available, one can use sampling units (such as blocks) that contain the units being studied (such as stores) and for which a list does exist.
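The indirect procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block identifiers, the store names, and the helper `sample_stores_by_block` are hypothetical, and the sketch observes all stores in each chosen block rather than a fraction of them.

```python
import random


def sample_stores_by_block(stores_by_block, n_blocks, seed=None):
    """Choose n_blocks blocks at random and return every store on them."""
    rng = random.Random(seed)
    # The list of blocks is the sampling frame; stores are reached through it.
    chosen_blocks = rng.sample(sorted(stores_by_block), n_blocks)
    return {block: stores_by_block[block] for block in chosen_blocks}


# Hypothetical frame: each block mapped to the grocery stores located on it.
stores_by_block = {
    "block_01": ["store_A", "store_B"],
    "block_02": [],
    "block_03": ["store_C"],
    "block_04": ["store_D", "store_E", "store_F"],
    "block_05": ["store_G"],
}

observed = sample_stores_by_block(stores_by_block, n_blocks=2, seed=1)
for block, stores in observed.items():
    print(block, stores)
```

Only a list of blocks is needed to draw the sample; the stores themselves are enumerated in the field once the blocks are chosen.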

Selecting the Sample:

Another part of the sample design problem is the method of choosing the sample items. Two general classes of methods exist for selecting samples: probability methods and non-probability methods.

Probability sampling methods are those in which every item in the universe (e.g. every grocery store in the New York metropolitan area) has a known chance, or probability, of being chosen for the sample. This implies that the selection of sample items is independent of the person making the study; that is, the sampling operation is controlled objectively so that the items will be chosen strictly at random.

Non-probability sampling methods are those that do not provide every item in the universe with a known chance of being included in the sample. The selection process is at least partially subjective.

The Claussen pickle example provides an illustration of each of these two general classes of sampling methods.

A Probability Sampling Method:

From a list of all New York metropolitan area grocery stores, select a sample of 50 stores at random, that is, in such a way as to give every store an equal chance of being selected. Field workers visit all 50 stores and observe whether Claussen is in stock.
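A minimal Python sketch of this selection step follows. The store list, the seed, and the function names are assumptions made for illustration, and the in-stock observations are placeholders rather than real field data.

```python
import random


def select_simple_random_sample(store_list, n, seed=None):
    """Draw n distinct stores so that every store has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(store_list, n)


def estimate_proportion(in_stock_flags):
    """Share of sampled stores where the product was observed in stock."""
    return sum(in_stock_flags) / len(in_stock_flags)


# Hypothetical list of 2,000 stores; a real study would use the actual list.
store_list = [f"store_{i:04d}" for i in range(1, 2001)]
sample = select_simple_random_sample(store_list, n=50, seed=42)
print(sample[:5])

# Placeholder observations standing in for the field workers' reports.
in_stock_flags = [True] * 30 + [False] * 20
print(estimate_proportion(in_stock_flags))  # 0.6
```

Because the random number generator, not the field worker, decides which stores are visited, the selection does not depend on anyone's judgment.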

A Non-Probability Sampling Method:

Ten New York metropolitan area field workers each visit five ‘average’ grocery stores near their homes and observe whether Claussen is in stock.

Each method will provide a sample of 50 stores. The first method will cost more because the sample stores will likely be distributed throughout the New York area. The use of such a method, however, would guarantee that every store had an equal chance of being included in the sample. The second method of sample selection would cost less, as the stores would be near the observers’ homes and the observers would not spend as much time traveling among them. But there is no rigorous way of determining whether the sample is representative of all the stores in the New York area.

The major emphasis will be placed on methods of probability sampling. One cannot think intelligently about non-probability samples without using probability sampling theory as a reference point. However, several non-probability techniques are widely used.

Estimating Universe Characteristics from Sample Data:

Marketing researchers are often interested in numbers that describe particular universe properties, for example, the arithmetic mean or the percentage possessing a certain characteristic. Because these descriptive numbers will usually be unknown, the researchers have to estimate them by measuring sample data. Thus, it is often necessary to rely on an estimate of a universe value, which will generally be different from the true universe value.

It is important to note that any universe value (e.g. mean or percentage) is a fixed number, although generally unknown. In contrast, the estimate of the universe value obtained from sampling will vary from one sample to the next. For example, if one were to take 25 independently selected random samples, each of 100 items and each from the same universe, a different sample mean would be expected each time. This would be anticipated even though there was only one real universe mean.
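The contrast between a fixed universe value and varying sample estimates can be seen in a short simulation. The universe below is an assumption made purely for illustration (10,000 integer values generated at random); only the 25 samples of 100 items come from the text.

```python
import random
from statistics import mean

rng = random.Random(0)

# Hypothetical universe of 10,000 values; its mean is fixed once it exists.
universe = [rng.randint(1, 100) for _ in range(10_000)]
universe_mean = mean(universe)

# 25 independently selected random samples of 100 items each.
sample_means = [mean(rng.sample(universe, 100)) for _ in range(25)]

print("universe mean:", universe_mean)
print("lowest sample mean:", min(sample_means))
print("highest sample mean:", max(sample_means))
```

The universe mean is computed once and never changes, while the sample means scatter around it.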

Example: Consider the universe that exists when the face cards are removed from a deck of playing cards. Each card in this universe of 40 items has a numerical value (ace through ten, with the ace counted as 1), and the universe mean is 5.5.

From this universe of 40 cards, 50 independent random samples of 5 cards each were drawn. After each sample was chosen, its mean was calculated, the five cards selected were returned to the universe, and a new sample was drawn.
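The card experiment can be reproduced with a short Python sketch. It assumes the ace counts as 1, which is what a universe mean of 5.5 implies; the seed and variable names are illustrative, and the sample means it produces will not match those of the original drawing.

```python
import random
from statistics import mean

# Universe: values 1 (ace) through 10 in each of four suits, 40 cards in all.
universe = [value for value in range(1, 11) for _suit in range(4)]
print(len(universe), mean(universe))  # 40 5.5

rng = random.Random(7)  # arbitrary seed so the run is repeatable
sample_means = []
for _ in range(50):
    # Five distinct cards are drawn for each sample. The universe list is
    # never altered, so the cards are in effect replaced before the next draw.
    hand = rng.sample(universe, 5)
    sample_means.append(mean(hand))

print(sample_means[:10])
print("mean of the 50 sample means:", mean(sample_means))
```

Individual sample means can fall well above or below 5.5, although their average over many samples tends to lie close to the universe mean.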