Nursing Research Data Management
Data Management
Data management is generally defined as the procedures taken to ensure the
accuracy of data, from data entry through data transformations. Although often
a tedious and time-consuming process, data management is absolutely essential
for good science.
Data Entry
The first step is data entry. Although this may occur in a variety of ways,
from being scanned in to being entered manually, the crucial point is that the
accuracy of the data be assessed before any manipulations are performed or
statistics produced.
To assess accuracy, frequency distributions and descriptive statistics are
generated first. Then each variable is inspected, as appropriate, for
out-of-range values, outliers, equality of groups, skewness, and missing data.
Decisions
must be made about dealing with each of these.
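The screening step described above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name `screen_variable`, the variable `pain_item`, and the valid range of 1 to 5 are hypothetical and do not come from any particular study.

```python
from collections import Counter
from statistics import mean, median, stdev

def screen_variable(values, valid_min, valid_max):
    """Summarize one numeric variable before any analysis is run.

    Missing responses are assumed to be coded as None.
    """
    present = [v for v in values if v is not None]
    out_of_range = [v for v in present if not (valid_min <= v <= valid_max)]
    return {
        "n": len(values),
        "missing": len(values) - len(present),
        "frequencies": dict(Counter(present)),  # frequency distribution
        "mean": mean(present),
        "median": median(present),
        "sd": stdev(present),
        "out_of_range": out_of_range,
    }

# Hypothetical responses to an item scored 1-5; 55 is a likely keying error.
pain_item = [3, 4, 2, 55, None, 5, 3, 1, 4, None]
report = screen_variable(pain_item, valid_min=1, valid_max=5)
print(report["missing"], report["out_of_range"])
```

Note that the mean and standard deviation in this report are still distorted by the out-of-range value, which is one reason the frequencies are inspected before any statistics are trusted.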
Incorrect values must be
replaced with correct values or assigned to the missing values category.
Outliers must be investigated and dealt with. If a categorical variable is
supposed to have four categories but only three have adequate numbers of
subjects, one must decide about eliminating the fourth category or combining it
with one of the others.
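As a minimal sketch of these two decisions, the following Python assigns invalid codes to the missing category and then combines a sparse category with a neighboring one. The helper names, the education categories, and the minimum count of 2 are assumptions made for illustration; which categories may sensibly be combined is a substantive judgment, not a mechanical one.

```python
from collections import Counter

def recode_invalid_to_missing(values, valid_values):
    """Assign any value outside the codebook to the missing category (None)."""
    return [v if v in valid_values else None for v in values]

def collapse_sparse_categories(values, min_count, merge_into):
    """Combine categories with fewer than min_count subjects into merge_into."""
    counts = Counter(v for v in values if v is not None)
    sparse = {cat for cat, n in counts.items() if n < min_count}
    return [merge_into if v in sparse else v for v in values]

# Hypothetical data: "colege" is an entry error, "doctoral" has one subject.
valid_levels = {"high school", "college", "graduate", "doctoral"}
education = ["high school", "college", "graduate", "college", "high school",
             "doctoral", "graduate", "colege", "high school", "college",
             "graduate"]

cleaned = recode_invalid_to_missing(education, valid_levels)
collapsed = collapse_sparse_categories(cleaned, min_count=2,
                                       merge_into="graduate")
print(Counter(collapsed))
```

Where the original record is available, replacing the incorrect value with the correct one is preferable to recoding it as missing.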
If a continuous variable is skewed, data transformations
may be attempted or nonparametric statistics employed. Once each variable has
been inspected and corrected where necessary, new variables may be created.
This might include the development of total scores for a group of items,
subscores, and so forth. Each of these new variables must also be checked for
outliers, skewness, and out-of-range values. The creation of some new variables
may involve the use of sophisticated techniques such as factor and reliability
analyses.
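The sketch below illustrates three of these steps with the Python standard library: measuring skewness before and after a log transformation, building a total score from a group of items, and estimating reliability. Cronbach's alpha is used here as one common reliability coefficient; the text does not name a specific one, and all data and function names are hypothetical.

```python
import math
from statistics import mean, pstdev, variance

def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

def cronbach_alpha(item_rows):
    """Cronbach's alpha; item_rows holds one list of item scores per subject."""
    k = len(item_rows[0])
    item_vars = [variance([row[i] for row in item_rows]) for i in range(k)]
    total_var = variance([sum(row) for row in item_rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical right-skewed variable (e.g., length of stay in days).
length_of_stay = [1, 2, 2, 3, 3, 4, 5, 8, 15, 30]
logged = [math.log(v) for v in length_of_stay]  # requires values > 0
print(skewness(length_of_stay), skewness(logged))

# Hypothetical three-item scale, one row per subject.
items = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 4, 4]]
total_scores = [sum(row) for row in items]  # new variable: total score
print(total_scores, cronbach_alpha(items))
```

The total scores produced here are themselves new variables and would be screened for outliers, skewness, and out-of-range values in the same way as the original items.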
Cautions About Data Management
Prior to each statistical test, the assumptions underlying the test must be
checked. If they are violated, alternative approaches must be sought. Careful
attention to data management must underlie data analysis. It ensures the
validity of the data and the appropriateness of the analyses.
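As one hedged illustration of checking an assumption before a test, the sketch below compares the variances of two groups before a pooled-variance t test is chosen. The variance-ratio threshold of 4 is an adjustable rule of thumb assumed for this example, not a value taken from the text, and the function names and data are hypothetical. A formal test of homogeneity of variance could be substituted.

```python
from statistics import variance

def check_equal_variance(group_a, group_b, max_ratio=4.0):
    """Return (variance ratio, whether it is within the assumed threshold)."""
    var_a, var_b = variance(group_a), variance(group_b)
    smaller, larger = min(var_a, var_b), max(var_a, var_b)
    ratio = float("inf") if smaller == 0 else larger / smaller
    return ratio, ratio <= max_ratio

def choose_two_group_test(group_a, group_b, max_ratio=4.0):
    """Suggest an approach based only on the equal-variance check."""
    _, ok = check_equal_variance(group_a, group_b, max_ratio)
    if ok:
        return "pooled-variance t test"
    return "alternative approach (e.g., Welch t test or Mann-Whitney U)"

# Hypothetical scores for two groups with very different spread.
control = [12, 14, 13, 15, 14, 13]
treatment = [10, 22, 5, 30, 18, 2]

ratio, ok = check_equal_variance(control, treatment)
print(round(ratio, 1), ok)
print(choose_two_group_test(control, treatment))
```

Only one assumption is checked here; a given test may also assume normality, independence of observations, or a particular level of measurement, and each would need its own check.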