Selecting an Evaluation Instrument

After selecting a suitable evaluation model and identifying the variables and their relationships, the next step is choosing evaluation instruments that will efficiently collect the required data. The choice of these instruments depends on the evaluation question and the model chosen for the assessment.

Types of Instruments

Numerous evaluation instruments are available and are often located through a review of the literature. Before using a published tool, it is essential to contact the publisher and obtain permission.

Questionnaire

A questionnaire is a written, usually self-administered form on which participants read and answer questions according to the instructions provided. Although cost-effective, questionnaires may lack depth. The questions need to be clear, concise, and straightforward (Polit & Beck, 2013).

This tool is commonly used to measure qualitative variables such as attitudes or feelings. For instance, it can assess a student’s confidence level in clinical settings or measure student satisfaction with a nursing program after graduation.
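
As an illustration only (not drawn from the source), a short program-satisfaction questionnaire might be represented and tallied as in the sketch below; the item wording and the 1-5 response scale are assumptions made for the example.

    # A minimal sketch of a self-administered satisfaction questionnaire.
    # Item wording and the 1-5 response scale are illustrative assumptions.
    items = [
        "The clinical rotations prepared me for practice.",
        "Course content reflected current nursing standards.",
        "Faculty were available when I needed guidance.",
    ]

    scale = {1: "Strongly disagree", 2: "Disagree", 3: "Neutral",
             4: "Agree", 5: "Strongly agree"}

    # One graduate's responses, keyed by item index.
    responses = {0: 4, 1: 5, 2: 3}

    for idx, rating in responses.items():
        print(f"{items[idx]} -> {scale[rating]}")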

Interview

An interview involves direct interaction with participants. For example, exit interviews are often conducted when faculty leave a nursing school or students graduate. Interviews can capture both qualitative and quantitative data and can be done individually or in focus groups. Students or external evaluators may be tasked with collecting the information.

Interviews should be conducted at a time convenient for both the interviewer and the interviewee, ideally in a quiet, private setting to encourage open communication. The interviewer must avoid personal biases and follow an objective outline during the interview. While interviews can yield rich data, they are often time-consuming (Polit & Beck, 2013).

Guidelines for interviews (Sanders & Sullins, 2006):

  1. Use language that matches the respondent’s comprehension level.
  2. Explain the purpose of the interview, who will access the transcripts or recordings, and how confidentiality will be maintained.
  3. Encourage honesty but allow respondents the option to skip questions. Build rapport with simple, impersonal questions at the start.
  4. Avoid long, ambiguous, or leading questions.
  5. Focus on one idea per question and don’t assume prior knowledge.

Rating Scale

A rating scale places an abstract concept on a descriptive continuum, which helps increase the objectivity of an evaluation. Rating scales can be used for norm-referenced evaluation, although they may not be the best tool for that purpose, and ratings can also be converted into grades.
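
A hedged sketch of how such ratings might be converted into grades follows; the descriptors, scale points, and grade cut-offs are assumptions for the example, not a published standard.

    # Illustrative rating scale for one clinical behavior; the descriptors and
    # grade cut-offs are assumptions made for this sketch.
    continuum = {
        1: "Rarely demonstrates the behavior",
        2: "Demonstrates the behavior with frequent prompting",
        3: "Demonstrates the behavior with occasional prompting",
        4: "Consistently demonstrates the behavior independently",
    }

    def rating_to_grade(mean_rating: float) -> str:
        """Convert a mean rating across observations to a letter grade."""
        if mean_rating >= 3.5:
            return "A"
        if mean_rating >= 2.5:
            return "B"
        if mean_rating >= 1.5:
            return "C"
        return "D"

    observations = [4, 3, 4, 3]  # ratings from several observations
    print(rating_to_grade(sum(observations) / len(observations)))  # -> "A"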

Checklist

A checklist is a two-dimensional tool that lists expected behaviors or competencies on one side and the degree to which these expectations are met on the other side. With detailed items and clearly defined criteria, a checklist helps the evaluator easily identify expected behaviors or levels of competence.

This tool is useful for both formative and summative evaluations. In clinical settings, a checklist may be used to assess whether a student is following the correct steps during a procedure. The evaluator can then mark whether each step was completed or missed.
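
A minimal sketch of such a checklist follows, assuming a hypothetical procedure whose step names are invented for illustration; the evaluator simply marks each step as completed or missed.

    # Hypothetical procedure checklist; the step names are invented for illustration.
    steps = [
        "Verifies patient identity",
        "Performs hand hygiene",
        "Gathers equipment",
        "Explains procedure to patient",
        "Documents the procedure",
    ]

    # Evaluator's observations: True = completed, False = missed.
    observed = {step: True for step in steps}
    observed["Explains procedure to patient"] = False

    for step in steps:
        mark = "x" if observed[step] else " "
        print(f"[{mark}] {step}")

    completed = sum(observed.values())
    print(f"{completed}/{len(steps)} steps completed")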

Attitude Scale

An attitude scale measures a participant’s feelings about a topic at the time they respond. One of the most widely used types is the Likert scale, where respondents express their opinions on 10–15 statements about a specific issue.

  • For example, a Likert scale might ask respondents to share their views on diversity in nursing, with items expressing different opinions about Latino students. Participants indicate their level of agreement or disagreement. To minimize bias, the scale should include an equal number of positively and negatively worded items.

Another option is the semantic differential scale, which uses bipolar adjectives (e.g., good-bad, active-passive, positive-negative) to measure a participant’s reaction. Each item is followed by these adjectives, and the respondent selects their position on a scale with an odd number of intervals, ensuring that the middle point is neutral.

Typically, scales consist of five to seven intervals. For analysis, scores are summed similarly to the Likert scale (Polit & Beck, 2013). It’s essential to avoid treating Likert and semantic differential scale data as interval data. Instead, the Rasch model offers a more appropriate analysis method.
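
For illustration, the sketch below shows the conventional summated-score approach just described, including reverse-coding of the negatively worded items mentioned earlier; the item wording and responses are assumptions, and, per the caveat above, the resulting total is ordinal rather than interval.

    # Summated scoring for a balanced 5-point Likert scale.
    # Item wording and responses are illustrative assumptions.
    MAX_POINT = 5

    items = [
        {"text": "Diverse perspectives strengthen nursing practice.", "positive": True},
        {"text": "Diversity initiatives distract from clinical training.", "positive": False},
    ]

    responses = [5, 2]  # one respondent's ratings, aligned with the items

    total = 0
    for item, rating in zip(items, responses):
        # Reverse-code negatively worded items so a higher score is always more favorable.
        total += rating if item["positive"] else (MAX_POINT + 1 - rating)

    print(total)  # 5 + (6 - 2) = 9, an ordinal summated score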

The Rasch model treats Likert responses as ordered categories and provides the mathematical justification for transforming those raw scores into interval-level measures based on empirical evidence rather than assumption. This approach is more defensible than simply adding up ordinal data.

The Rasch model is the only model that offers the objectivity needed to build scales that are independent of the distribution of the attribute in the respondents (Bond & Fox, 2007). It is part of item response theory, which uses test scores and specific item responses to draw conclusions from the mathematical relationship between abilities or attitudes and the respondent's answers (Rudner, 2001). Through its diagnostic procedures, the Rasch model can assess how well the tool captures both what the author intended to measure and how participants actually responded.
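
As a point of reference rather than a full treatment, the dichotomous form of the Rasch model expresses the probability that a person endorses an item as a logistic function of the difference between the person's attitude (theta) and the item's difficulty (b); Likert-type data are usually handled with a polytomous extension such as the rating scale model, and the person and item values below are invented for the sketch.

    import math

    def rasch_probability(theta: float, b: float) -> float:
        """Dichotomous Rasch model: P(endorse) = exp(theta - b) / (1 + exp(theta - b))."""
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    # Invented person attitude and item difficulty, both on the same logit scale;
    # this shared logit scale is what gives Rasch measures their interval properties.
    print(round(rasch_probability(theta=1.0, b=0.0), 2))   # ~0.73
    print(round(rasch_probability(theta=-0.5, b=0.0), 2))  # ~0.38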

The design of a rating scale significantly impacts the quality of responses. The Rasch model’s diagnostic capabilities provide a powerful tool for creating, analyzing, and revising attitude scales, ensuring they are effective for evaluation.