Establishing Categories and Coding after customer response

After the questionnaires have been edited, the responses to individual questions can be assigned numerical codes. Because dichotomous and multiple-choice questions have specified response categories, coding the responses to such questions consists of assigning a different numerical code to each response category (e.g. 1 = Yes; 2 = Undecided; 3 = No). When the editor sees the answer a respondent gave to a question, the editor writes the corresponding numerical code in the margin next to that question. On questionnaires where the dichotomous and multiple-choice questions are pre-coded, the editor does not have to code them. Coding an individual’s responses to dichotomous and multiple-choice questions is therefore simple and straightforward; unfortunately, this is not true for open questions.
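The coding of closed questions described above can be sketched in a few lines of Python. This is only an illustration: the codes are the ones given in the text (1 = Yes; 2 = Undecided; 3 = No), while the names `RESPONSE_CODES` and `code_closed_response` are hypothetical and chosen for this example.

```python
# Hypothetical codebook for one closed question, using the codes from the text.
RESPONSE_CODES = {"yes": 1, "undecided": 2, "no": 3}


def code_closed_response(answer: str) -> int:
    """Return the numerical code for a closed-question answer."""
    key = answer.strip().lower()
    if key not in RESPONSE_CODES:
        # An answer outside the specified categories should go back to the editor.
        raise ValueError(f"No code defined for answer: {answer!r}")
    return RESPONSE_CODES[key]


answers = ["Yes", "No", "undecided", "yes"]
codes = [code_closed_response(a) for a in answers]
print(codes)  # [1, 3, 2, 1]
```

Because every permitted answer already has a code, the lookup either succeeds or flags an invalid entry; no judgment is required, which is why closed questions are simple to code.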

Researchers must review the answers to each open question and establish meaningful categories that will effectively report the findings of these questions.

Example: In the yogurt study described at the beginning, one of the questions asked about the household’s consumption of yogurt in the previous 30 days. After reviewing the answers given to this question by a large number of respondents, researchers established the following four categories and codes.

If the respondent answered      Category established    Code

More than 5 containers/month    Heavy consumer          1
2 to 5 containers/month         Moderate consumer       2
Less than 2 containers/month    Light consumer          3
Zero containers/month           Non-consumer            4
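As a minimal sketch, the yogurt categories above could be applied to a reported number of containers as follows. Two points are assumptions made for this example and are not stated in the text: an answer of zero is checked first (otherwise it would also satisfy "less than 2"), and the "between 2 and 5" category is taken to include both 2 and 5. The function name `code_yogurt_consumption` is hypothetical.

```python
def code_yogurt_consumption(containers: int) -> tuple[str, int]:
    """Map containers consumed in the previous 30 days to (category, code)."""
    if containers < 0:
        raise ValueError("Number of containers cannot be negative")
    if containers == 0:          # assumption: zero is tested before "less than 2"
        return ("Non-consumer", 4)
    if containers < 2:
        return ("Light consumer", 3)
    if containers <= 5:          # assumption: 2 and 5 are both "moderate"
        return ("Moderate consumer", 2)
    return ("Heavy consumer", 1)


for n in (0, 1, 2, 5, 6):
    print(n, code_yogurt_consumption(n))
```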

Some open questions can elicit answers that fall into many different (and some unforeseen) response categories. Researchers typically must make two decisions before such open questions can be coded.

Determine the most relevant set of factors: The researchers must first determine the most relevant set of factors to use when setting up the categories. This may not always be a simple decision, as the following example illustrates.

Example: A variety of answers would be given in response to the question, “What do you dislike about the car you drive most frequently?” The answers to this question might be grouped according to the various parts of the car, such as engine, body, and interior. Or the answers might be grouped according to what these dislikes mean to the respondent, such as inconveniences, discomfort, expenses, pride, and fear. These are two alternative sets of factors that could be used to establish categories, and researchers would have to choose between them before coding could begin.

Establish appropriate categories for each relevant factor: After the most relevant factors have been selected, researchers must establish categories that accurately reveal the information contained in the answers people gave to the question asked.

Example: In the study concerned with drivers’ dislikes about the car they drive, the researchers selected “inconveniences” as one of the relevant factors. A review of the responses to the open question then led the researchers to establish the following six categories of “inconveniences”: (1) entering and exiting from the car; (2) the dashboard and the driver’s controls; (3) other parts of the interior; (4) the motor compartment; (5) the trunk; and (6) other inconveniences. Two or more categories were also established for each of the other relevant factors selected, e.g. “discomfort”, “expenses”, “pride” and “fear”.

After the relevant factors have been selected and the categories have been established for an open question, each questionnaire must be reviewed for the purpose of identifying the category into which a particular response falls.

Example: Assume the researchers established the categories and codes shown above for the “inconveniences” factor for the question, “What do you dislike about the car you drive most frequently?” The response on one questionnaire may be “the lack of leg room in the rear seat”, while the response on another questionnaire may be “the doors can only be locked from the outside with a key”. The editor may decide that the first response belongs in the “other parts of the interior” category and the second response belongs in the “entering and exiting from the car” category. Codes of “3” and “1” would be written in the margins of the respective questionnaires.
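The bookkeeping in this example can be sketched as a simple codebook lookup. The category names and codes come from the text; the variable names `INCONVENIENCE_CODES` and `editor_decisions` are hypothetical. Note that the judgment about which category a response belongs to still comes from the editor; the code only records it and looks up the code.

```python
# Codebook for the "inconveniences" factor, as listed in the text.
INCONVENIENCE_CODES = {
    "entering and exiting from the car": 1,
    "the dashboard and the driver's controls": 2,
    "other parts of the interior": 3,
    "the motor compartment": 4,
    "the trunk": 5,
    "other inconveniences": 6,
}

# Each pair is (verbatim response, category chosen by the editor).
editor_decisions = [
    ("the lack of leg room in the rear seat",
     "other parts of the interior"),
    ("the doors can only be locked from the outside with a key",
     "entering and exiting from the car"),
]

# Attach the numerical code that would be written in the margin.
coded = [(response, INCONVENIENCE_CODES[category])
         for response, category in editor_decisions]

for response, code in coded:
    print(code, "-", response)
```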

When classifying data from open questions, it is essential that the established categories be mutually exclusive and at the same time cover all possible answers. Ideally, each category should contain similar responses so that overall there will be homogeneity within categories and differences between categories.
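For categories defined by numerical ranges, the two requirements above (mutually exclusive, and covering all possible answers) can be checked mechanically. The sketch below is one possible way to do so, not a standard procedure; `check_categories` and the two rule sets are hypothetical names. It uses the yogurt categories from the earlier example: read literally, "less than 2" and "zero" overlap at zero, and an assumed tightening of "light consumer" to exclude zero removes the overlap.

```python
def check_categories(categories, sample_values):
    """Return (uncovered, overlapping) answers for a set of category rules."""
    uncovered, overlapping = [], []
    for value in sample_values:
        matches = [name for name, rule in categories.items() if rule(value)]
        if not matches:
            uncovered.append(value)               # not exhaustive
        elif len(matches) > 1:
            overlapping.append((value, matches))  # not mutually exclusive
    return uncovered, overlapping


# Literal reading of the yogurt categories.
literal = {
    "Heavy consumer": lambda n: n > 5,
    "Moderate consumer": lambda n: 2 <= n <= 5,
    "Light consumer": lambda n: n < 2,
    "Non-consumer": lambda n: n == 0,
}

# Assumed tightening: "light" excludes zero.
tightened = dict(literal)
tightened["Light consumer"] = lambda n: 0 < n < 2

print(check_categories(literal, range(0, 11)))
print(check_categories(tightened, range(0, 11)))
```

The first call reports that an answer of zero falls into two categories; the second reports no uncovered and no overlapping answers for the sampled values.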