Such tests are similar to the side-by-side comparison tests discussed above, but they differ in that respondents use one product first and then, days or weeks later, try the second product.
The identities of the two products are masked. One half of the respondents receive product A first and the other half receive product B first. Such a split is necessary to avoid “tried last” bias. Staggered comparison tests have many of the disadvantages of paired comparison tests, but in theory they better replicate the actual market, since customers usually buy one product at a time instead of two different brands of the same product at one time. In practice, however, there is little difference in the results obtained by the two tests.
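The half-and-half split described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `assign_trial_order`, the respondent identifiers, and the fixed seed are all assumptions made for the example.

```python
import random


def assign_trial_order(respondent_ids, seed=0):
    """Randomly split respondents so half try A first and half try B first.

    Hypothetical helper for illustration. With an odd number of
    respondents, the "A first" group receives the extra person.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)          # random order avoids systematic assignment
    half = (len(shuffled) + 1) // 2
    orders = {}
    for position, rid in enumerate(shuffled):
        orders[rid] = ("A", "B") if position < half else ("B", "A")
    return orders


orders = assign_trial_order(range(1, 9))
a_first = sorted(r for r, order in orders.items() if order[0] == "A")
b_first = sorted(r for r, order in orders.items() if order[0] == "B")
print("A first:", a_first)
print("B first:", b_first)
```

Shuffling before splitting keeps the two order groups comparable, so any remaining difference between them can more plausibly be attributed to the order of trial.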
Difficulties in Conducting Paired Comparison Tests: In appraising the paired comparison technique, it is important to keep in mind some basic weaknesses that reduce the confidence one should have in the results. For example, it is difficult to obtain and maintain the cooperation of members of a consumer use panel. To increase cooperation, incentives may be used; if respondents are paid, they are more apt to feel some responsibility for completing the test. Even so, the sample may be biased because some people will not participate.
Another difficulty comes from the fact that the test can never simulate precisely the conditions in the marketplace under which buying decisions are made. In a paired comparison test, respondents usually have no knowledge of price differentials and must, therefore, assume that no difference exists. It is doubtful whether respondent statements about how much more they would be willing to pay for a given product are very meaningful.
There is a further question of how valid the findings are relative to actual behavior. The typical user of a product does not compare the merits of one product with those of others on a side-by-side basis. Participants in a consumer test realize they are test subjects. They assume that differences exist among the test products and that their job is to find them. In other words, in a test situation, differences are apt to be magnified out of proportion to their importance in the ‘normal’ market.
Another difficulty is that consumers are inconsistent in their preferences over a number of trial uses. Ralph Day reports that choice reversals of 40 percent are not uncommon; this has led him to conclude that consumer behavior in preference tests should be viewed as probabilistic rather than deterministic. This implies that the smaller the differences between the test products, the greater the instability of the test results.
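A simple calculation shows why smaller product differences imply less stable results. The sketch below rests on an illustrative assumption that is not Day's own formulation: a respondent picks product A with the same probability p on each of two independent trials, so the two choices disagree with probability 2p(1 - p). The function name `reversal_probability` is hypothetical.

```python
def reversal_probability(p):
    """Chance that two independent choices disagree, given P(choose A) = p.

    Assumed model for illustration: the same p applies on both trials.
    """
    return 2 * p * (1 - p)


# p = 0.5 means the products are indistinguishable; p near 1 means a clear winner.
for p in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    print(f"p = {p:.1f}  reversal rate = {reversal_probability(p):.2f}")
```

Under this assumed model, a reversal rate near 40 percent corresponds to p of roughly 0.72, a fairly modest preference, and the rate climbs toward 50 percent as the products become indistinguishable.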
Order bias in paired comparison tests poses yet another problem. There seems little doubt that preferences are often related to the order in which the two items are tested. Further, there is evidence that the greater the difference between the test units, the less important the order bias. Day explains the situation:
When faced with a choice between two items perceived to be similar the consumer is prone to look for clues of any sort to help him [or her] choose and may react to extraneous factors or very weak stimuli. For example, a group of thirsty people who are presented two similar or identical samples of a cola drink may have a definite tendency to report that the first item was best. However, if there are substantial differences on one or more cola attributes such as sweetness, flavor strength, or carbonation the individual can recognize this and is much less likely to be influenced by any weak stimulus associated with the testing order.
One evaluation of the results from some 50 paired comparison product tests indicates a clear bias in favor of the product tried first, especially when the first item is rated before the second is tested. Its authors conclude that there is no easy solution to the problem, since juxtaposing products, undertaking sequential placement tests, introducing a third product to be used as a reference point for the two test items, and using an experimental design in which products are tested against themselves do not remove the difficulties and, in some cases, further confound the test situation. They advocate controlling the order of trial but allowing respondents to try both products before committing themselves to any responses. A monadic rating of products after both have been tried would be free of the “warming up” phenomenon between first and second trials and would thus allow an estimation of order bias, product preference, and interaction. The danger is that, if interaction is identified, one may not be able to interpret any of the data.
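When the order of trial is controlled, order bias and product preference can be separated with simple arithmetic. The sketch below shows one way to do it, under stated assumptions: each respondent gives a single stated preference after trying both products, the order effect is additive, the counts are invented for illustration, and the function name `split_order_and_preference` is hypothetical. It does not attempt to estimate the interaction term.

```python
def split_order_and_preference(a_first_prefers_a, a_first_n,
                               b_first_prefers_a, b_first_n):
    """Separate product preference from order bias in a counterbalanced test.

    Returns a pair:
      - share preferring A, averaged over both trial orders
      - advantage of the first-tried product over an even 50 percent
    """
    share_a_when_first = a_first_prefers_a / a_first_n
    share_a_when_second = b_first_prefers_a / b_first_n
    # Under the assumed additive model, averaging over the two
    # orders cancels the order effect.
    preference_for_a = (share_a_when_first + share_a_when_second) / 2
    # Half the gap between the two orders is the first-position advantage.
    order_bias = (share_a_when_first - share_a_when_second) / 2
    return preference_for_a, order_bias


# Invented counts: 100 respondents in each order group.
preference, bias = split_order_and_preference(66, 100, 50, 100)
print(f"preference for A: {preference:.2f}, first-position advantage: {bias:.2f}")
```

With these invented counts, A is preferred by 58 percent of respondents on average, and the product tried first gains about 8 percentage points from its position alone.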