Data Reliability and Its Measurement
Methods of Data Measurement Reliability: Rules, Instruments, and Attributes in Nursing Research
Ideal Data
An ideal data collection procedure is one that captures a construct in a way that is relevant, credible, accurate, truthful, and sensitive. For most concepts of interest to nurse researchers, there are few data collection procedures that match this ideal. Biophysiologic methods have a higher chance of success in attaining these goals than self-report or observational methods, but no method is flawless.
In this chapter, we discuss criteria for evaluating the quality of data obtained in a study. We begin by discussing principles of measurement and assessments of quantitative data. Later in this chapter, we discuss assessments of qualitative data.
Measurement
Quantitative studies derive data through the measurement of variables. Measurement involves the assignment of numbers to represent the amount of an attribute present in an object or person, using a specified set of rules.
As this definition implies, quantification and measurement go hand in hand. An often-quoted statement by the early American psychologist L. L. Thurstone advances a fundamental position:
“Whatever exists, exists in some amount and can be measured.”
Attributes are not constant: they vary from day to day, from situation to situation, or from one person to another. This variability is presumed to be capable of a numeric expression that signifies how much of an attribute is present.
The purpose of assigning numbers is to differentiate between people or objects that possess varying degrees of critical attributes.
Rules and Measurement
Measurement involves assigning numbers to objects according to rules, rather than haphazardly. Rules for measuring temperature, weight, blood pressure, and other physical attributes are familiar to us.
Rules for measuring many variables for nursing research studies, however, have to be invented. Whether the data are collected by observation, self-report, or some other method, researchers must specify under what conditions and according to what criteria the numeric values are to be assigned to the characteristic of interest. As an example, suppose we were studying attitudes toward distributing condoms in school-based clinics and asked parents to express their extent of agreement with a statement on the issue, choosing among seven response options that range from “strongly agree” to “strongly disagree.”
Responses to this question can be quantified by developing a system for assigning numbers to them. Note that any rule would satisfy the definition of measurement. We could assign the value of 30 to “strongly agree,” 27 to “agree,” 20 to “slightly agree,” and so on, but there is no justification for doing so. In measuring attributes, researchers strive to use good, meaningful rules.
Without any a priori information about the “distance” between the seven options, the most defensible procedure is to assign a 1 to “strongly agree” and a 7 to “strongly disagree.” This rule would quantitatively differentiate, in increments of one point, among people with seven different reactions to the statement.
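The coding rule just described can be sketched in a few lines of Python. This is only an illustration: the text names four of the seven options, so the remaining labels, the dictionary name SCORING_RULE, and the helper score_response are assumptions made for the example.

```python
# Seven response options, ordered from most to least agreement.
# Only "strongly agree", "agree", "slightly agree", and "strongly disagree"
# appear in the text; the other three labels are assumed for illustration.
RESPONSE_OPTIONS = [
    "strongly agree",
    "agree",
    "slightly agree",
    "uncertain",          # assumed midpoint label
    "slightly disagree",  # assumed label
    "disagree",           # assumed label
    "strongly disagree",
]

# The rule: number the options 1 through 7 in equal, one-point steps.
SCORING_RULE = {
    label: value for value, label in enumerate(RESPONSE_OPTIONS, start=1)
}


def score_response(response: str) -> int:
    """Return the numeric code for one response (hypothetical helper)."""
    key = response.strip().lower()
    if key not in SCORING_RULE:
        raise ValueError(f"Unrecognized response: {response!r}")
    return SCORING_RULE[key]


for answer in ["Strongly agree", "slightly agree", "Strongly disagree"]:
    print(answer, "->", score_response(answer))
```

Writing the rule down as an explicit lookup makes it easy to check that every respondent is scored the same way, which is the point of measuring by rule rather than haphazardly.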
Innovation In Measuring Instruments
With a new instrument, researchers seldom know in advance if their rules are the best possible. New measurement rules reflect researchers’ hypotheses about how attributes function and vary. The adequacy of the hypotheses (that is, the worth of the instruments) needs to be assessed empirically. Researchers endeavor to link numerical values to reality.
To state this goal more technically, measurement procedures must be isomorphic to reality. The term isomorphism signifies equivalence or similarity between two phenomena. An instrument cannot be useful unless the measures resulting from it correspond with the real world.
To illustrate the concept of isomorphism, suppose the Scholastic Assessment Test (SAT) were administered to 10 students, who obtained the following scores: 345, 395, 430, 435, 490, 505, 550, 570, 620, and 640. Now suppose that the true scores of these same students on a hypothetically perfect test were as follows: 360, 375, 430, 465, 470, 500, 550, 610, 590, and 670.
A comparison of the two sets of scores, with the students labeled A through J in the order listed, shows that, although not perfect, the test came close to representing true scores; only two people (H and I) were improperly ordered in the actual test. This example illustrates a measure whose isomorphism with reality is high, although such close correspondence is improbable in practice. Researchers almost always work with fallible measures.
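The comparison of obtained and true scores can be checked with a short Python sketch. It assumes the ten students are labeled A through J in the order their scores are listed, which is consistent with the reference to H and I. The Spearman rank correlation at the end is not part of the original example; it is added here only as one common way to summarize how closely two orderings agree.

```python
# Scores from the example, listed in the same student order for both tests.
# Labeling the students A through J in that order is an assumption.
students = list("ABCDEFGHIJ")
obtained = [345, 395, 430, 435, 490, 505, 550, 570, 620, 640]
true_scores = [360, 375, 430, 465, 470, 500, 550, 610, 590, 670]


def rank_order(scores, labels):
    """Return the labels sorted from lowest to highest score."""
    return [label for _, label in sorted(zip(scores, labels))]


def ranks(scores):
    """Return each score's rank (1 = lowest); assumes no tied scores."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def spearman_rho(x, y):
    """Spearman rank correlation for two untied lists of equal length."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


obtained_order = rank_order(obtained, students)
true_order = rank_order(true_scores, students)

# Students whose position differs between the two orderings.
misordered = [a for a, b in zip(obtained_order, true_order) if a != b]
print("Misordered students:", misordered)
print("Spearman rho:", round(spearman_rho(obtained, true_scores), 3))
```

Under these assumptions the two orderings differ only for H and I, which matches the pattern described in the text.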
Instruments that measure psychological phenomena are less likely to correspond to reality than physical measures, but few instruments are error-free.
Advantages of Measurement
What exactly does measurement accomplish?
Consider how handicapped health care professionals and researchers would be in the absence of measurement. What would happen, for example, if there were no measures of body temperature or blood pressure? Subjective evaluations of clinical outcomes would have to be used. A main strength of measurement is that it removes subjectivity and guesswork.
Rules for Measurement
Because measurement is based on explicit rules, the resulting information tends to be objective; that is, it can be independently verified. Two people measuring the weight of a person using the same scale would likely get identical results. Not all measures are completely objective, but most incorporate mechanisms for minimizing subjectivity.
Measurement also makes it possible to obtain reasonably precise information. Instead of describing Nathan as “rather tall,” we can depict him as being 6 feet 2 inches tall. If we chose, we could obtain even greater precision. With precise measures, researchers can more readily differentiate among people with different degrees of an attribute.
Finally, measurement is a language of communication. Numbers are less vague than words and therefore can communicate information more accurately. If a researcher reported that the average oral temperature of a sample of patients was “somewhat high,” different readers might develop different conceptions about the sample’s physiologic state.
However, if the researcher reported an average temperature of 99.6°F, there would be no ambiguity.
Errors of Measurement
Both the procedures involved in applying measurements and the objects being measured are susceptible to influences that can alter the resulting data.
Some influences can be controlled to a certain degree, and attempts should always be made to do so, but such efforts are rarely completely successful. Instruments that are not perfectly accurate yield measurements containing some error. Conceptually, an observed (or obtained) score can be decomposed into two parts, an error component and a true component. This can be written symbolically as follows:
Obtained score = True score ± Error, or XO = XT ± XE
The first term in the equation (XO) is an observed score, for example a systolic blood pressure reading or a score on an anxiety scale. XT is the true score, the value that would be obtained with an infallible measure. The true score is hypothetical; it can never be known because measures are not infallible. The final term in the equation (XE) is the error of measurement. The difference between true and obtained scores is the result of factors that distort the measurement.
Decomposing obtained scores in this fashion highlights an important point. When researchers measure an attribute, they are also measuring attributes that are not of interest. The true score component is what they hope to isolate; the error component is a composite of other factors that are also being measured, contrary to their wishes.
This concept can be illustrated with an exaggerated example. Suppose a researcher measured the weight of 10 people on a spring scale. As subjects step on the scale, the researcher places a hand on their shoulders and applies some pressure.
The resulting measures (the XOs) will be biased upward because the scores reflect both actual weight (XT) and the researcher’s pressure (XE). Errors of measurement are problematic because their value is unknown and also because they are variable. In this example, the amount of pressure applied would likely vary from one subject to the next.
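The spring-scale example can be simulated to show both properties of measurement error mentioned here: the obtained scores are biased upward, and the size of the error differs from subject to subject. The sketch below treats the error as a positive amount added to the true score. The true weights, the maximum pressure, the random seed, and the function name weigh_with_pressure are all hypothetical values chosen for illustration; in a real study XT and XE could not be observed separately.

```python
import random

# Hypothetical true weights (XT) in pounds for 10 subjects.
TRUE_WEIGHTS = [128.0, 135.5, 142.0, 150.0, 158.5,
                163.0, 171.5, 180.0, 192.0, 205.5]


def weigh_with_pressure(true_weights, max_pressure=5.0, seed=0):
    """Simulate obtained scores XO = XT + XE for the spring-scale example.

    XE stands for the researcher's hand pressure, drawn separately for
    each subject so that the error varies from one subject to the next.
    Returns a list of (XT, XE, XO) tuples.
    """
    rng = random.Random(seed)
    records = []
    for x_t in true_weights:
        x_e = rng.uniform(0.0, max_pressure)  # error component, assumed range
        x_o = x_t + x_e                       # obtained score
        records.append((x_t, x_e, x_o))
    return records


records = weigh_with_pressure(TRUE_WEIGHTS)
for x_t, x_e, x_o in records:
    print(f"true={x_t:6.1f}  error={x_e:4.2f}  obtained={x_o:6.2f}")

mean_error = sum(x_e for _, x_e, _ in records) / len(records)
print(f"Mean upward bias: {mean_error:.2f}")
```

The simulation can print the error column only because the true weights were made up in advance. With real data only the obtained column would be available, which is why the unknown and variable error is a problem.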