Explore key types of questionnaire questions — dichotomous, multiple-choice, rank order, rating scales, open-ended, and sensitive questions — with examples and expert guidance for effective research survey design.
What Are the Types of Questionnaire Questions in Research?
Dichotomous Questions
A highly structured questionnaire will ask closed questions.
Definition and Examples
These can take several forms. Dichotomous questions require a ‘yes’/‘no’ response, e.g. ‘Have you ever had to appear in court?’, ‘Do you prefer didactic methods to child-centred methods?’
Advantages of Dichotomous Questions
The dichotomous question is useful, for it compels respondents to ‘come off the fence’ on an issue. Further, it is possible to code responses quickly, there being only two categories of response. A dichotomous question is also useful as a funneling or sorting device for subsequent questions, for example: ‘if you answered “yes” to question X, please go to question Y; if you answered “no” to question X, please go to question Z’. Sudman and Bradburn (1982:89) suggest that if dichotomous questions are being used, then it is desirable to use several to gain data on the same topic, in order to reduce the problems of respondents’ ‘guessing’ answers.
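The funneling described here is, in effect, simple branching logic. Below is a minimal sketch in Python, with hypothetical question IDs, of how such routing might be implemented in an electronic questionnaire; nothing in it is prescribed by the discussion above.

```python
# Minimal sketch of dichotomous funneling (hypothetical question IDs X, Y, Z).
# A 'yes'/'no' answer to question X routes the respondent to Y or Z.

def next_question(answer_to_x: str) -> str:
    """Return the ID of the follow-up question for a dichotomous answer."""
    if answer_to_x.strip().lower() == "yes":
        return "Y"  # follow-up question for 'yes' respondents
    return "Z"      # follow-up question for 'no' respondents

print(next_question("Yes"))  # -> Y
print(next_question("no"))   # -> Z
```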
Limitations and Potential Bias
On the other hand, the researcher must ask, for instance, whether a ‘yes’/‘no’ response actually provides any useful information. Requiring respondents to make a ‘yes’/‘no’ decision may be inappropriate; it might be more appropriate to have a range of responses, for example in a rating scale. There may be comparatively few complex or subtle questions which can be answered with a simple ‘yes’ or ‘no’.
A ‘yes’ or a ‘no’ may be inappropriate for a situation whose complexity is better served by a series of questions which catch that complexity. Further, Youngman (1984:163) suggests that it is a natural human tendency to agree with a statement rather than to disagree with it; this suggests that a simple dichotomous question might build in respondent bias.
Dichotomous Variables in Research
In addition to dichotomous questions (‘yes’/‘no’ questions) a piece of research might ask for information about dichotomous variables, for example gender (male/female), type of school (elementary/secondary), or type of course (vocational/non-vocational). In these cases only one of two responses can be selected. This enables nominal data to be gathered, which can then be processed using the chi-square statistic, the binomial test, the G-test, and cross-tabulations (see Cohen and Holliday (1996) for examples).
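As an illustration of the kind of processing mentioned above, the sketch below applies a chi-square test to a 2×2 cross-tabulation of two dichotomous variables. The counts, and the choice of Python with scipy, are illustrative assumptions rather than anything specified in the text.

```python
# Illustrative sketch: chi-square test on two dichotomous variables.
# The observed counts are invented for demonstration purposes.
from scipy.stats import chi2_contingency

# Rows: gender (male, female); columns: school type (elementary, secondary).
observed = [[30, 20],
            [25, 25]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}, dof = {dof}")
```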
Multiple Choice Questions
To try to gain some purchase on complexity, the researcher can move towards multiple choice questions, where the range of choices is designed to capture the likely range of responses to given statements.
Definition and Examples
For example, the researcher might ask a series of questions about a new Chemistry scheme in the school; a statement precedes a set of responses thus: The New Intermediate Chemistry Education (NICE) is:
(a) A waste of time
(b) An extra burden on teachers
(c) Not appropriate to our school
(d) A useful complementary scheme
(e) A useful core scheme throughout the school
(f) Well-presented and practicable
Guidelines for Constructing Multiple Choice Questions
The categories would have to be discrete (i.e. having no overlap and being mutually exclusive) and would have to exhaust the possible range of responses. Guidance would have to be given on the completion of the multiple-choice item, clarifying, for example, whether respondents are able to tick only one response (a single answer mode) or several responses (multiple answer mode) from the list.
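The two answer modes imply different codings. The sketch below, using the option letters (a) to (f) from the example above and invented ticks, shows one common convention: multiple-answer data stored as one 0/1 indicator per option.

```python
# Sketch of coding the two answer modes for the multiple-choice item above.
# Option letters a-f come from the example; the ticked options are invented.
single_answer = "d"            # single answer mode: exactly one option ticked
multiple_answers = {"d", "f"}  # multiple answer mode: several options ticked

# Multiple-answer data are often stored as one 0/1 indicator per option,
# which keeps each option's frequency easy to aggregate later.
indicators = {option: int(option in multiple_answers) for option in "abcdef"}
print(indicators)  # {'a': 0, 'b': 0, 'c': 0, 'd': 1, 'e': 0, 'f': 1}
```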
Advantages and Limitations
Like dichotomous questions, multiple choice questions can be quickly coded and quickly aggregated to give frequencies of response. If that is appropriate for the research, then this might be a useful instrument. Just as dichotomous questions have their parallel in dichotomous variables, so multiple choice questions have their parallel in multiple elements of a variable. For example, the researcher may be asking to which form a student belongs, there being up to, say, forty forms in a large school, or the researcher may be asking which post-16 course a student is following (e.g. academic, vocational, manual, non-manual).
In these cases only one response may be selected. As with the dichotomous variable, the listing of several categories or elements of a variable (e.g. form membership and course followed) enables nominal data to be collected and processed using the chi-square statistic, the G-test, and cross tabulations (Cohen and Holliday, 1996).
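A cross-tabulation of nominal variables of this kind can be produced very simply; the sketch below uses Python's pandas library with invented data, purely by way of illustration.

```python
# Sketch of a cross-tabulation of two nominal variables (invented data).
import pandas as pd

responses = pd.DataFrame({
    "course": ["academic", "vocational", "academic", "vocational", "academic"],
    "gender": ["male", "female", "female", "male", "male"],
})

# Frequency table: course followed, broken down by gender.
print(pd.crosstab(responses["course"], responses["gender"]))
```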
The multiple choice questionnaire seldom gives more than a crude statistic, for words are inherently ambiguous. In the example above the notion of ‘useful’ is unclear, as are ‘appropriate’, ‘practicable’ and ‘burden’. Respondents could interpret these words differently in their own contexts, thereby rendering the data ambiguous. One respondent might see the utility of the chemistry scheme in one area and thereby say that it is useful, ticking (d). Another respondent might see the same utility in that same one area but, because it is only useful in that single area, may see this as a flaw and therefore not tick category (d).
With an anonymous questionnaire this difference would be impossible to detect. This is the heart of the problem of questionnaires: that different respondents interpret the same words differently. ‘Anchor statements’ can be provided to allow a degree of discrimination in response (e.g. ‘strongly agree’, ‘agree’, etc.), but there is no guarantee that respondents will always interpret them in the way that was intended. In the example above this might not be a problem, as the researcher might only be seeking an index of utility, without wishing to know the areas of utility or the reasons for that utility.
The evaluator might only be wishing for a crude statistic (which might be very useful statistically in making a decisive judgement about a program) in which case this rough and ready statistic might be perfectly acceptable. What one can see in the example above is not only ambiguity in the wording but a very incomplete set of response categories which is hardly capable of representing all aspects of the chemistry scheme.
Bias and Wording Issues
That this might be politically expedient cannot be overlooked, for if the choice of responses is limited, then those responses might enable bias to be built into the research. For example, if the responses were limited to statements about the utility of the chemistry scheme, then the evaluator would have little difficulty in establishing that the scheme was useful. By avoiding the inclusion of negative statements or the opportunity to record a negative response the research will surely be biased. The issue of the wording of questions has been discussed earlier.
Rank Ordering Questions
The rank order question is akin to the multiple choice question in that it identifies options from which respondents can choose, yet it moves beyond multiple choice items in that it asks respondents to identify priorities. This enables a relative degree of preference, priority, intensity etc. to be charted.
Purpose and Structure
In the rank ordering exercise a list of factors is set out and the respondent is required to place them in a rank order, for example:

Please indicate your priorities by placing numbers in the boxes to indicate the ordering of your views, 1 = the highest priority, 2 = the second highest, and so on.
Example of Rank Ordering in Education Research
The proposed amendments to the mathematics scheme might be successful if the following factors are addressed:
- The appropriate material resources are in school
- The amendments are made clear to all teachers
- The amendments are supported by the mathematics team
- The necessary staff development is assured
- There are subsequent improvements to student achievement
- The proposals have the agreement of all teachers
- They improve student motivation
- Parents approve of the amendments
- They will raise the achievements of the brighter students
- The work becomes more geared to problem solving

In this example ten items are listed. Whilst this might be enticing for the researcher, enabling fine distinctions possibly to be made in priorities, it might be asking too much of the respondents to make such distinctions. They genuinely might not be able to differentiate their responses, or they simply might not feel strongly enough to make such distinctions. The inclusion of too long a list might be overwhelming.
Practical Limitations
Indeed, Wilson and McLean (1994:26) suggest that it is unrealistic to ask respondents to arrange priorities where more than five ranks are requested. In the case of the list of ten points above, the researcher might approach this problem in one of two ways.
How to Manage Large Lists
The list in the questionnaire item can be reduced to five items only, in which case the range and comprehensiveness of responses that fairly catches what the respondent feels is significantly reduced. Alternatively, the list of ten items can be retained, but the request can be made to the respondents only to rank their first five priorities, in which case the range is retained and the task is not overwhelming (though the problem of sorting the data for analysis is increased). Rankings are useful in indicating degrees of response. In this respect they are like rating scales, discussed below.
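Before turning to rating scales, here is a minimal sketch (Python with pandas, invented data and hypothetical item names) of how ‘rank your first five priorities’ responses might be aggregated, treating unranked items as missing rather than assigning them an arbitrary rank.

```python
# Sketch of aggregating 'rank your top five of ten' data (invented responses).
# Unranked items are left as None and treated as missing in the summary.
import pandas as pd

# Each row is one respondent; values are ranks 1-5, None = not ranked.
ranks = pd.DataFrame([
    {"resources": 1, "clarity": 2, "team_support": 3, "staff_dev": 4, "achievement": 5},
    {"resources": 2, "clarity": 1, "team_support": None, "staff_dev": 3, "achievement": 4},
])

# Two simple summaries: how often an item was ranked at all, and its mean rank.
summary = pd.DataFrame({
    "times_ranked": ranks.notna().sum(),
    "mean_rank": ranks.mean(),
}).sort_values("mean_rank")
print(summary)
```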
Rating Scales
Introduction to Rating Scales
One way in which degrees of response, intensity of response, and the move away from dichotomous questions have been managed can be seen in the notion of rating scales: Likert scales, semantic differential scales, Thurstone scales and Guttman scaling. These are very useful devices for the researcher, as they build in a degree of sensitivity and differentiation of response whilst still generating numbers. This topic will focus on the first two of these, though readers will find the others discussed in Oppenheim (1992).
Likert Scales
A Likert scale (named after its deviser, Rensis Likert, 1932) provides a range of responses to a given question or statement, for example:

How important do you consider work placements to be for secondary school students?
1 = not at all
2 = very little
3 = a little
4 = a lot
5 = a very great deal

All students should have access to free higher education.
1 = strongly disagree
2 = disagree
3 = neither agree nor disagree
4 = agree
5 = strongly agree
In these examples the categories need to be discrete and to exhaust the range of possible responses which respondents may wish to give. Notwithstanding the problems of interpretation which arise, as in the previous example (one respondent’s ‘agree’ may be another’s ‘strongly agree’; one respondent’s ‘very little’ might be another’s ‘a little’), the greater subtlety of response which is built into a rating scale renders it a very attractive and widely used instrument in research.

These two examples both indicate an important feature of an attitude scaling instrument, viz. the assumption of unidimensionality in the scale: the scale should only be measuring one thing at a time (Oppenheim, 1992:187–8). Indeed, this is a cornerstone of Likert’s own thinking (1932). It is also a very straightforward matter to convert a dichotomous question into a rating scale: instead of asking ‘do you?’, ‘have you?’, ‘are you?’, ‘can you?’ type questions in a dichotomous format, a simple change of wording to ‘to what extent?’, ‘how far?’, ‘how much?’ and the like converts the item into a much more subtle rating scale.
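To make the coding of such scales concrete, here is a minimal sketch, with invented answers, of how five-point Likert responses might be coded as the numbers 1 to 5 and summarised as frequencies; the mapping and the data are illustrative only.

```python
# Sketch: coding five-point Likert responses as 1-5 and summarising them
# with frequencies (the scale is ordinal, so frequencies are a safe summary).
from collections import Counter

codes = {"strongly disagree": 1, "disagree": 2,
         "neither agree nor disagree": 3, "agree": 4, "strongly agree": 5}

answers = ["agree", "agree", "strongly agree", "disagree", "agree"]  # invented
coded = [codes[a] for a in answers]

print(Counter(coded))  # frequency of each scale point
```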
Semantic Differential Scales
A semantic differential is a variation of a rating scale which operates by putting an adjective at one end of a scale and its opposite at the other, for example:

How informative do you consider the new set of history textbooks to be?

useful 1 2 3 4 5 6 7 useless

The respondent indicates a position on the scale by circling or marking the number which most closely represents what she or he feels.
Osgood et al. (1957), the pioneers of this technique, suggest that semantic differential scales are useful in three contexts: evaluative (e.g. valuable-valueless, useful-useless, good-bad); potency (e.g. large-small, weak-strong, light-heavy); and activity (e.g. quick-slow, active-passive, dynamic-lethargic).
Advantages of Rating Scales
Rating scales are widely used in research, and rightly so, for they combine the opportunity for a flexible response with the ability to determine frequencies, correlations and other forms of quantitative analysis. They afford the researcher the freedom to fuse measurement with opinion, quantity and quality.
Limitations and Common Errors
Though rating scales are powerful and useful in research, the researcher nevertheless needs to be aware of their limitations. For example, the researcher may be tempted to infer from the data a degree of sensitivity and subtlety that they cannot bear. There are other cautionary factors about rating scales, be they Likert scales or semantic differential scales:
- There is no assumption of equal intervals between the categories; hence a rating of 4 indicates neither that it is twice as powerful as 2 nor that it is twice as strongly felt. One cannot infer that the intensity of feeling in the Likert scale between ‘strongly disagree’ and ‘disagree’ somehow matches the intensity of feeling between ‘strongly agree’ and ‘agree’. These are illegitimate inferences. The problem of equal intervals has been addressed in Thurstone scales (Thurstone and Chave, 1929; Oppenheim, 1992:190–5).
- We have no check on whether the respondents are telling the truth. Some respondents may be deliberately falsifying their replies.
- We have no way of knowing if the respondent might have wished to add any other comments about the issue under investigation. It might have been the case that there was something far more pressing about the issue than the rating scale included but which was condemned to silence for want of a category. A straightforward way to circumvent this issue is to run a pilot and also to include a category entitled ‘other (please state)’.
- Most of us would not wish to be called extremists; we often prefer to appear like each other in many respects. For rating scales this means that we might wish to avoid the two extreme poles at each end of the continuum of the rating scales, reducing the number of positions in the scales to a choice of three (in a five-point scale). That means that in fact there could be very little choice for us. The way round this is to create a larger scale than a five-point scale, for example a seven-point scale. To go beyond a seven-point scale is to invite a degree of detail and precision which might be inappropriate for the item in question, particularly if the argument set out above is accepted, viz. that one respondent’s scale point three might be another’s scale point four.
- On the scales so far there have been midpoints; on the five-point scale it is category three, and on the seven-point scale it is category four. The use of an odd number of points on a scale enables this to occur. However, choosing an even number of scale points, for example a six-point scale, removes the midpoint and so requires respondents to come down on one side or the other. For example, suppose a new staffing structure has been introduced into a school and the head teacher is seeking some guidance on its effectiveness. A six-point rating scale might ask respondents to indicate their response to the statement:

The new staffing structure in the school has enabled teamwork to be managed within a clear model of line management. (Circle one number)

strongly agree 1 2 3 4 5 6 strongly disagree
Data Interpretation and Analysis
Let us say that one member of staff circled 1, eight staff circled 2, twelve staff circled 3, nine staff circled 4, two staff circled 5, and seven staff circled 6. There being no mid-point on this continuum, the researcher could infer that those respondents who circled 1, 2, or 3 were in some measure of agreement, whilst those respondents who circled 4, 5, or 6 were in some measure of disagreement.
That would be very useful for, say, a head teacher in publicly displaying agreement, there being twenty-one staff (1+8+12) in some measure of agreement with the statement and eighteen (9+2+7) displaying some measure of disagreement. However, one could point out that the ‘strongly disagree’ category attracted seven staff, a very strong feeling, which was not true of the ‘strongly agree’ category, which attracted only one member of staff.

The extremity of the voting has been lost in a crude aggregation. Further, if the researcher were to aggregate the scoring around the two mid-point categories (3 and 4) there would be twenty-one members of staff represented, leaving nine (1+8) from categories 1 and 2 and nine (2+7) from categories 5 and 6; adding together categories 1, 2, 5 and 6 yields a total of eighteen, which is less than the twenty-one total of the two categories 3 and 4.
It seems on this scenario that it is far from clear that there was agreement with the statement from the staff; indeed taking the high incidence of ‘strongly disagree’, it could be argued that those staff who were perhaps ambivalent (categories 3 and 4), coupled with those who registered a ‘strongly disagree’ indicate not agreement but disagreement with the statement. The interpretation of data has to be handled very carefully; ordering them to suit a researcher’s own purposes might be very alluring but illegitimate. The golden rule here is that crude data can only yield crude interpretation; subtle statistics require subtle data. The interpretation of data must not distort the data unfairly.
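The aggregations discussed above can be reproduced directly; the short sketch below simply recomputes the figures from the example.

```python
# Recomputing the aggregations for the six-point item discussed above:
# counts of staff circling each point, 1 (strongly agree) .. 6 (strongly disagree).
counts = {1: 1, 2: 8, 3: 12, 4: 9, 5: 2, 6: 7}

agree    = counts[1] + counts[2] + counts[3]              # 21 'agreeing'
disagree = counts[4] + counts[5] + counts[6]              # 18 'disagreeing'
middle   = counts[3] + counts[4]                          # 21 near the middle
extremes = counts[1] + counts[2] + counts[5] + counts[6]  # 18 nearer the poles

print(agree, disagree, middle, extremes)  # -> 21 18 21 18
```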
Summary and Recommendations
It has been suggested that the attraction of rating scales is that they provide more opportunity than dichotomous questions for rendering data more sensitive and responsive to respondents. This makes rating scales particularly useful for tapping attitudes, perceptions and opinions of respondents. The need for a pilot to devise and refine categories, making them exhaustive and discrete, has been suggested as a necessary part of this type of data collection. Questionnaires that are going to yield numerical or word-based data can be analyzed using computer programs (for example, SPSS or SphinxSurvey for numerical data, and Ethnograph for word-based data).
If the researcher intends to process the data using a computer package, it is essential that the layout and coding system of the questionnaire are appropriate for that package. Instructions for layout in order to facilitate data entry are contained in the manuals that accompany such packages. Rating scales are more sensitive instruments than dichotomous scales. Nevertheless, they are limited in their usefulness to researchers by their fixity of response caused by the need to select from a given choice. A questionnaire might be tailored even more to respondents by including open-ended questions to which respondents can reply in their own terms and with their own opinions, and these we now consider.
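As a minimal illustration of such a layout, the sketch below writes coded responses to a flat CSV file, one row per respondent and one column per item. The column names and codes are hypothetical; any real layout must follow the target package's own manual.

```python
# Sketch: exporting coded questionnaire responses as a flat CSV file,
# one row per respondent and one column per item, ready for import into
# a statistics package. Column names and numeric codes are hypothetical.
import csv

rows = [
    {"id": 1, "q1_gender": 1, "q2_school": 2, "q3_likert": 4},
    {"id": 2, "q1_gender": 2, "q2_school": 1, "q3_likert": 5},
]

with open("responses.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "q1_gender", "q2_school", "q3_likert"])
    writer.writeheader()
    writer.writerows(rows)
```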
Open-Ended Questions
The open-ended question is a very attractive device for smaller scale research or for those sections of a questionnaire that invite an honest, personal comment from the respondents in addition to ticking numbers and boxes.
Definition and Purpose
The questionnaire simply puts the open-ended questions and leaves a space (or draws lines) for a free response.
Advantages of Open-Ended Questions
It is the open-ended responses that might contain the ‘gems’ of information that otherwise might not have been caught in the questionnaire. Further, it puts the responsibility for, and ownership of, the data much more firmly into the respondents’ hands. That said, an open-ended question can still frame the answer, just as the stem of a rating scale question might frame the response given.
However, an open-ended question can catch the authenticity, richness, depth of response, honesty and candour which, as is argued elsewhere in these topics, are the hallmarks of qualitative data. Oppenheim (1992:56–7) suggests that a sentence-completion item is a useful adjunct to an open-ended question, for example:

Please complete the following sentence in your own words:
An effective teacher…
or
The main things that I find annoying with disruptive students are…
Limitations and Data Handling Challenges
Open-endedness also carries problems of data handling. For example, if one tries to convert opinions into numbers (e.g. so many people indicated some degree of satisfaction with the new principal’s management plan), then it could be argued that the questionnaire should have used rating scales in the first place. Further, it might well be that the researcher is in danger of violating one principle of word-based data, which is that they are not validly susceptible to aggregation, i.e. that it is trying to bring to word based data the principles of numerical data, borrowing from one paradigm (quantitative methodology) to inform another paradigm (qualitative methodology).
Further, if a genuinely open-ended question is being asked, it is unlikely that responses will bear such a degree of similarity to each other as to enable them to be aggregated tightly. Open-ended questions make it difficult for the researcher to make comparisons between respondents, as there may be little in common to compare. Moreover, to complete an open-ended questionnaire takes much longer than placing a tick in a rating scale response box; not only will time be a constraint here, but there is an assumption that respondents will be sufficiently, and equally, capable of articulating their thoughts and committing them to paper.
Best Practices for Using Open-Ended Items
Despite these cautions, the space provided for an open-ended response is a window of opportunity for the respondent to shed light on an issue or course. Thus, an open-ended questionnaire has much to recommend it.
Asking Sensitive Questions
Sudman and Bradburn (1982) draw attention to the important issue of including sensitive items in a questionnaire.
Why Sensitivity Matters in Questionnaires
Whilst the anonymity of a questionnaire and, frequently, the lack of face-to-face contact between the researcher and the respondents might facilitate responses to sensitive material, the issues of sensitivity and threat cannot be avoided, as they might lead to under-reporting and over-reporting by participants.
Strategies for Handling Sensitive Topics
Sudman and Bradburn (1982:55–6) identify several important considerations in addressing potentially threatening or sensitive issues, for example socially undesirable behavior (e.g. drug abuse, sexual offences, violent behavior, criminality, illnesses, employment and unemployment, physical features, sexual activity, behavior and sexuality, gambling, drinking, family details, political beliefs, social taboos). They suggest that:
- Open rather than closed questions might be more suitable to elicit information about socially undesirable behavior, particularly frequencies.
- Long rather than short questions might be more suitable for eliciting information about socially undesirable behavior, particularly frequencies.
- Using familiar words might increase the number of reported frequencies of socially undesirable behavior.
- Using data gathered from informants, where possible, can enhance the likelihood of obtaining reports of threatening behavior.
- Deliberately loading the question so that overstatements of socially desirable behavior and understatements of socially undesirable behavior are reduced might be a useful means of eliciting information.
- With regard to socially undesirable behavior, it might be advisable, firstly, to ask whether the respondent has ever engaged in that behavior, and then to move to asking about his or her current behavior. By contrast, when asking about socially acceptable behavior the reverse might be true, i.e. asking about current behavior before asking about past behavior.
- In order to defuse threat, it might be useful to locate the sensitive topic within a discussion of other more or less sensitive matters, in order to suggest to respondents that this issue might not be too important.
- Use alternative ways of asking standard questions, for example sorting cards, or putting questions in sealed envelopes, or repeating questions over time (this has to be handled sensitively, so that respondents do not feel that they are being ‘checked’), and in order to increase reliability.
- Ask respondents to keep diaries in order to increase validity and reliability.
- At the end of an interview, ask respondents their views on the sensitivity of the topics that have been discussed.
- If possible, find ways of validating the data.

Indeed, the authors suggest (ibid.: 86) that, as the questions become more threatening and sensitive, it is wise to expect greater bias and unreliability. They draw attention to the fact (ibid.: 208) that several nominal, demographic details might be considered threatening by respondents. This has implications for their location within the questionnaire (discussed below).
Researcher Perspective vs Respondent Perspective
The issue here is that sensitivity and threat are to be viewed through the eyes of respondents rather than the questionnaire designer; what might appear innocuous to the researcher might be highly sensitive or offensive to the respondent.