Evaluating Secondary Data

On many occasions, researchers must choose from two or more sources of data. The choice should be guided by the determination of which data score highest on the following considerations:

Pertinency: To be usable, the data must be expressed in the units of measurement specified in the project, must apply to the periods of time in question, and must be derived from the universe of interest. Classes of data must also be defined in the same way as in the project.

Who Collected and Published the Data and Why: In evaluating secondary data, the researcher must examine the organization that collected the data and the purposes for which they were published. An organization that makes the collection and publication of data its chief function is apt to furnish accurate data. Obviously, the success of such a firm depends on the long-run satisfaction of its clients that the information supplied is accurate.

The ability of an organization to procure the wanted information is a pivotal consideration. This often comes down to a matter of authority and prestige. The US Bureau of Internal Revenue, for instance, can obtain accurate information about income more easily than any private firm simply because it has legal authority to do so.

When feasible, the capabilities and motivation of the individuals responsible for the data collection should also be appraised. Reputation, experience, and degree of independence on the particular project are all genuine considerations in assessing the reliability of an ‘expert’. An individual working for an independent research agency would be more likely to turn out an accurate report than the same individual working for an organization committed to one side of a question.

Discovering the purpose for which data are published is mandatory for an adequate evaluation of secondary data. Data published to promote the interest of a particular group, whether political, commercial, or social, are suspect. At the same time, not all data credited to sources with an axe to grind should be dismissed out of hand. Nevertheless, information so procured should always be handled with care.

Data Collection Methods: If a source fails to give a detailed description of its method of data collection, researchers should be hesitant about using the information provided. All too often, reticence about revealing the procedures used to collect data suggests the employment of inadequate methods. Most primary sources, however, describe their methods, even if only briefly.

When the methodology is described, researchers should subject it to a painstaking examination. Even if the procedures appear sound, caution must be exercised because weaknesses tend to be camouflaged. Searching questions must be answered positively before the data can be used. If a sample was used, was it selected objectively? Was it large enough, particularly for the sub-samples? Was it chosen from the universe of interest? Was the questionnaire adequate for getting the desired information? What kind of supervision was exercised over the people who actually collected the data? Were any checks made on the accuracy of the field workers’ results?

General Evidences of Careful Work: An indispensable point of evaluation is the general evidence that the data have been collected and processed carefully. Is the information presented in a well-organized manner? Are the tables constructed properly, and are they consistent within and among themselves? Are the conclusions supported by the data?
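One way to test whether a published table is consistent within itself is to confirm that the components in each row add up to the stated total. The following is a minimal sketch in Python, not a prescribed procedure: the function name, the table figures, and the rounding tolerance are all hypothetical and chosen only for illustration.

```python
def inconsistent_rows(rows, tolerance=0.5):
    """Return the labels of rows whose components do not sum to the stated total.

    `tolerance` is an assumed allowance for rounding in the published figures.
    """
    return [
        label
        for label, parts, total in rows
        if abs(sum(parts) - total) > tolerance
    ]


# Hypothetical published table: (row label, component figures, stated total)
table = [
    ("North", [120, 80, 40], 240),
    ("South", [95, 60, 30], 185),
    ("West", [70, 55, 25], 160),  # components sum to 150, not 160
]

flagged = inconsistent_rows(table)
print(flagged)
```

A flagged row does not prove that the source is unreliable, but it is the kind of internal inconsistency that should prompt a closer look before the figures are used.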

Conflicting Data: If several sources of data relating to a researcher’s problem are available, the data can be submitted to a quality control analysis of the sort applied in production. After dividing the data into ‘good’ and ‘poor’ on the basis of criteria like those mentioned above, correlations on points of interest can be run between the two groups, and statistical tests can be made. In projects that rely heavily on secondary data, this technique is particularly valuable.
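As an illustration of the kind of statistical test that might be made between the two groups, the sketch below compares estimates of the same quantity drawn from sources rated ‘good’ and ‘poor’. It uses Welch's t statistic, which is an assumption made here for illustration; the text does not name a specific test, and the figures are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error of each group's mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical estimates of one quantity reported by each group of sources
good = [102.0, 98.0, 101.0, 99.0, 100.0]
poor = [110.0, 90.0, 120.0, 95.0, 115.0]

t, df = welch_t(good, poor)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The statistic would then be compared with a critical value from a t table at the chosen significance level. A large difference suggests that the ‘poor’ sources depart systematically from the ‘good’ ones on that point; a small one suggests that the two groups agree despite the difference in apparent quality.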