Reliability, while indispensable, only tells you that the test is measuring something consistently. It does not prove that you are measuring what you intend to measure. A mis-manufactured 33-inch yardstick will consistently tell you that a 33-inch board is 33 inches long. Unfortunately, if what you're looking for is a board that is one full yard long, then your 33-inch yardstick, though reliable, is misleading you. What you need is a valid yardstick. Reliability is the first major requirement for a test: if it isn't measuring whatever it's measuring consistently, you can't trust it at all. Validity is the second major requirement. Validity tells you whether the test (or yardstick) is measuring what you think it's supposed to be measuring.
A test is a sample of a person's behavior, but some tests are more clearly representative of the behavior being sampled than others. A typing test, for example, clearly corresponds to an on-the-job behavior. At the other extreme, there may be no apparent relationship between the items on the test and the behavior. This is the case with projective personality tests. Thus, in the Thematic Apperception Test, the psychologist asks the person to explain how he or she interprets an ambiguous picture. The psychologist uses that interpretation to draw conclusions about the person's personality and behavior. In such tests, it is more difficult to prove that the tests are measuring what they are said to measure, that is, that they're valid.
Test validity answers the question, "Does this test measure what it's supposed to measure?"
Put another way, validity refers to the correctness of the inferences we can make based on the test. For example, if JA gets a higher score on a mechanical comprehension test than JI, can we be sure that JA possesses more mechanical comprehension than JI? With respect to employee selection tests, validity often refers to evidence that the test is job related; in other words, that performance on the test is a valid predictor of subsequent performance on the job. A selection test must be valid since, without proof of validity, there is no logical or legally permissible reason to continue using it to screen job applicants. In employment testing, there are two main ways to demonstrate a test's validity: criterion validity and content validity.
Demonstrating criterion validity means demonstrating that those who do well on the test also do well on the job, and that those who do poorly on the test do poorly on the job. Thus, the test has validity to the extent that the people with higher test scores perform better on the job. In psychological measurement, a predictor is the measurement (in this case, the test score) that you are trying to relate to a criterion, like performance on the job. The term criterion validity reflects that terminology.
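In practice, the predictor-criterion relationship described above is typically quantified as a correlation coefficient between applicants' test scores and their later performance ratings. The short sketch below illustrates the idea with entirely hypothetical numbers (the function name and the sample data are assumptions for illustration, not part of any standard validation procedure):

```python
# Illustrative sketch: criterion validity viewed as the correlation between
# a predictor (selection-test score) and a criterion (job-performance rating).
# All scores below are hypothetical.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

test_scores = [62, 70, 75, 81, 88, 93]        # predictor: selection-test scores
performance = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2]  # criterion: supervisor ratings

r = pearson_r(test_scores, performance)
print(round(r, 2))  # a strong positive r supports criterion validity
```

A coefficient near +1 would mean higher scorers reliably perform better on the job; a coefficient near zero would mean the test tells you little about subsequent performance.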
Employers demonstrate the content validity of a test by showing that the test constitutes a fair sample of the content of the job. The basic procedure here is to identify job tasks and behaviors that are critical to performance, and then randomly select a sample of those tasks and behaviors to be tested. A data entry test used to hire a data entry clerk is an example. If the content you choose for the data entry test is a representative sample of what the person needs to know for the job, then the test is probably content valid.
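The selection step in the procedure above — drawing a random sample from an inventory of critical job tasks — can be sketched in a few lines. This is a minimal illustration only; the task list and function name are hypothetical:

```python
# Illustrative sketch: choosing a content-valid sample of test tasks by
# randomly sampling from an inventory of critical job tasks.
# The task inventory below is hypothetical (a data entry clerk's job).
import random

critical_tasks = [
    "enter customer records",
    "verify field formats",
    "correct rejected entries",
    "merge duplicate records",
    "export daily batch files",
    "reconcile entry counts",
]

def sample_test_tasks(tasks, k, seed=None):
    """Randomly select k tasks so the test fairly samples the job content."""
    rng = random.Random(seed)  # seeded so a sampling plan can be reproduced
    return rng.sample(tasks, k)

print(sample_test_tasks(critical_tasks, 3, seed=42))
```

The random draw is what supports the "fair sample" claim: every critical task has an equal chance of appearing on the test, rather than the test builder hand-picking the easiest tasks to assess.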
Demonstrating content validity sounds easier than it is in practice. It is not always easy to show (1) that the tasks the person performs on the test are really a comprehensive and random sample of the tasks performed on the job, and (2) that the conditions under which the person takes the test resemble the work situation. For many jobs, employers therefore opt to demonstrate other evidence of a test's validity, such as its criterion validity.