# Nursing Research and Factor Analysis

## What Is Factor Analysis?

**Factor analysis is a multivariate technique for determining the underlying structure and dimensionality of a set of variables.** By analyzing intercorrelations among variables, factor analysis shows which variables cluster together to form unidimensional constructs. It is useful in elucidating the underlying meaning of concepts.

However, it involves a higher degree of subjective interpretation than is common with most other statistical methods. In nursing research, factor analysis is commonly used for instrument development (Ferketich & Muller, 1990), theory development, and data reduction.

Accordingly, factor analysis is used for identifying the number, nature, and importance of factors, comparing factor solutions for different groups, estimating scores on factors, and testing theories (Nunnally & Bernstein, 1994).

## Types of Factor Analysis

There are two major types of factor analysis: **exploratory and confirmatory**. In exploratory factor analysis, the data are described and summarized by grouping together related variables. The variables may or may not be selected with a particular purpose in mind.

Exploratory factor analysis is commonly used in the early stages of research, when it provides a method for consolidating variables and generating hypotheses about underlying processes that affect the clustering of the variables.

Confirmatory factor analysis is used in later stages of research for theory testing related to latent processes or to examine hypothesized differences in latent processes among groups of subjects. In confirmatory factor analysis, the variables are carefully and specifically selected to reveal underlying processes or associations.

## Variable Characteristics

**The raw data should be at or applicable to the interval level**, such as the data obtained with Likert-type measures. In addition, a number of assumptions relating to the sample, variables, and factors should be met.

First, the sample size must be sufficiently large to avoid erroneous interpretations of random differences in the magnitude of correlation coefficients. As a rule of thumb, a minimum of five cases for each observed variable is recommended; however, Knapp and Brown (1995) reported that ratios as low as three subjects per variable may be acceptable. Others generally recommend a sample of 100 to 200 subjects (Nunnally & Bernstein, 1994).
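The rules of thumb above can be sketched as a small check. This is only an illustration of the cited thresholds (five cases per variable, an overall minimum of 100), not a fixed standard:

```python
# Hypothetical helper reflecting the rules of thumb quoted above:
# at least five cases per observed variable (Knapp & Brown, 1995,
# report that three per variable may be acceptable) and an overall
# minimum of roughly 100 cases (Nunnally & Bernstein, 1994).

def sample_size_adequate(n_cases: int, n_variables: int,
                         cases_per_variable: int = 5,
                         overall_minimum: int = 100) -> bool:
    """Return True if the sample meets both rules of thumb."""
    return (n_cases >= cases_per_variable * n_variables
            and n_cases >= overall_minimum)

print(sample_size_adequate(150, 20))  # 150 >= 5 * 20 and 150 >= 100 -> True
print(sample_size_adequate(90, 20))   # fails both rules -> False
```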

**Second, the variables should be normally distributed**, with no substantial evidence of skewness or kurtosis. Third, scatterplots should indicate that the associations between pairs of variables are linear.

Fourth, outliers among cases should be identified and their influence reduced, either by transformation or by arbitrarily replacing the outlying value with a less extreme score.

Fifth, instances of multicollinearity and singularity among the variables should be identified, and the offending variables deleted, by examining whether the determinant of the correlation matrix or the eigenvalues associated with some factors approach zero. In addition, a squared multiple correlation equal to 1 indicates singularity, and squared multiple correlations close to 1 indicate multicollinearity.
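A minimal numpy sketch of this screen, using made-up data in which one variable nearly duplicates another: the determinant of the correlation matrix collapses toward zero, and the squared multiple correlations (SMC), computed as 1 − 1/(R⁻¹)ᵢᵢ, approach 1 for the offending variables:

```python
import numpy as np

# Illustrative data (not from any cited study): four independent
# variables plus a fifth that nearly duplicates the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X = np.column_stack([X, X[:, 0] + 0.01 * rng.normal(size=200)])

R = np.corrcoef(X, rowvar=False)
det = np.linalg.det(R)                    # near zero -> multicollinearity
smc = 1 - 1 / np.diag(np.linalg.inv(R))   # SMC of each variable with the rest

print(f"determinant of R: {det:.6f}")
print(np.round(smc, 3))  # first and last SMCs are close to 1
```

SMCs close to 1 flag the variable pair that should be examined and one member deleted.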

Sixth, outliers among variables, indicated by a low squared multiple correlation with all other variables and low correlations with all important factors, suggest the need for cautious interpretation and possible elimination of the variables from the analysis.

Seventh, there should be adequate factorability within the correlation matrix, which is indicated by several sizable correlations between pairs of variables that exceed .30. Finally, screening is important for identifying outlying cases among the factors.

Such outliers can be identified by large Mahalanobis distances (estimated as chi-square values) from the location of the case in the space defined by the factors to the centroid of all cases in the same space; when such outliers are present, factor analysis is not considered appropriate.
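The Mahalanobis screen above can be sketched as follows, using simulated factor scores with one planted extreme case. Under multivariate normality the squared distances follow a chi-square distribution with degrees of freedom equal to the number of factors; the .001 critical value for 3 df is about 16.27:

```python
import numpy as np

# Simulated scores on three factors (illustrative data only),
# with one deliberately extreme case planted at index 0.
rng = np.random.default_rng(1)
scores = rng.normal(size=(100, 3))
scores[0] = [6.0, -6.0, 6.0]

# Squared Mahalanobis distance of each case from the centroid.
centered = scores - scores.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
d2 = np.einsum('ij,jk,ik->i', centered, cov_inv, centered)

# Compare against the chi-square(df=3) critical value at alpha = .001.
outliers = np.flatnonzero(d2 > 16.27)
print(outliers)  # the planted case is flagged
```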

## Considering the Variables

When planning for factor analysis, the first step is to identify a theoretical model that will guide the statistical model (Ferketich & Muller, 1990). The next step is to select the psychometric measurement model, either classic or neoclassical, that will reflect the nature of measurement error.

The classic model assumes that all measurement error is random and that all variance is unique to individual variables and not shared with other variables or factors. The neoclassical model recognizes both random and systematic measurement error, which may reflect common variance that is attributable to unmeasured or latent factors.

The selection of the classic or neoclassical model influences whether the researcher chooses principal-components analysis or common factor analysis (Ferketich & Muller, 1990).
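One common way to express this choice computationally, sketched below with made-up data, is through the diagonal of the matrix that gets factored: principal-components analysis factors the correlation matrix with 1s on the diagonal (all variance analyzed), while common factor methods such as principal factors replace the diagonal with communality estimates, often squared multiple correlations:

```python
import numpy as np

# Illustrative data: five indicators driven by one latent factor.
rng = np.random.default_rng(2)
f = rng.normal(size=(300, 1))
X = f @ rng.normal(size=(1, 5)) + rng.normal(size=(300, 5))

R = np.corrcoef(X, rowvar=False)

R_pca = R.copy()   # principal components: diagonal stays 1.0
R_pf = R.copy()    # principal factors: diagonal holds communality estimates
np.fill_diagonal(R_pf, 1 - 1 / np.diag(np.linalg.inv(R)))  # SMC estimates

print(np.round(np.diag(R_pca), 3))  # all 1.0 -> total variance analyzed
print(np.round(np.diag(R_pf), 3))   # below 1.0 -> only common variance
```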

## Mathematical Description of Analysis

Mathematically speaking, factor analysis generates factors that are linear combinations of variables. The first step in factor analysis is factor extraction, which involves the removal of as much variance as possible through the successive creation of linear combinations that are orthogonal (unrelated) to previously created combinations.

The principal-components method of extraction is widely used for analyzing all the variance in the variables. However, other methods of factor extraction, which analyze common factor variance (i.e., variance that is shared with other variables), include the principal-factors method, the alpha method, and the maximum likelihood method (Nunnally & Bernstein, 1994).
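A minimal sketch of principal-components extraction, assuming simulated data: eigendecompose the correlation matrix and form each loading as eigenvector element × √eigenvalue, so that successive orthogonal components each remove as much remaining variance as possible:

```python
import numpy as np

# Illustrative data: six indicators driven by two latent factors.
rng = np.random.default_rng(3)
f = rng.normal(size=(500, 2))
X = f @ rng.normal(size=(2, 6)) + rng.normal(size=(500, 6))

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs * np.sqrt(eigvals)        # unrotated loading matrix
print(np.round(eigvals, 3))                  # variance extracted per component
print(np.round(loadings[:, :2], 3))          # first two components
```

With all components retained, the loadings exactly reproduce the correlation matrix (loadings @ loadings.T == R), which is why this method is said to analyze all the variance.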

Various criteria have been used to determine how many factors account for a substantial amount of variance in the data set. One criterion is to accept only those factors with an eigenvalue equal to or greater than 1.0 (Guttman, 1954). An eigenvalue is a standardized index of the amount of variance extracted by each factor. Another approach is to use a scree test to identify sharp discontinuities in the eigenvalues for successive factors (Cattell, 1966).
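Both retention criteria can be sketched numerically (the drop between successive eigenvalues serving as a rough, non-graphical stand-in for the scree plot), again on made-up data:

```python
import numpy as np

# Illustrative data: eight indicators driven by two latent factors.
rng = np.random.default_rng(4)
f = rng.normal(size=(400, 2))
X = f @ rng.normal(size=(2, 8)) + rng.normal(size=(400, 8))

eigvals = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

n_kaiser = int(np.sum(eigvals >= 1.0))  # eigenvalue >= 1.0 criterion
drops = -np.diff(eigvals)               # discontinuities a scree plot would show

print(np.round(eigvals, 3))
print("factors retained by eigenvalue >= 1:", n_kaiser)
print("largest drop after factor:", int(np.argmax(drops)) + 1)
```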

## Outcomes of Analysis

Factor extraction results in a factor matrix that shows the relationship between the original variables and the factors by means of factor loadings. A factor loading, when squared, equals the variance in the variable accounted for by the factor.

For each variable, the sum of its squared loadings across all of the extracted factors represents the communality (shared variance) of that variable. The sum of a factor's squared loadings across all variables equals that factor's eigenvalue (Nunnally & Bernstein, 1994).
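These two identities can be checked directly on an illustrative (made-up) loading matrix for four variables on two factors:

```python
import numpy as np

# Hypothetical factor matrix: rows are variables, columns are factors.
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.1, 0.6],
])

squared = loadings ** 2
communalities = squared.sum(axis=1)  # per variable: variance shared with factors
eigenvalues = squared.sum(axis=0)    # per factor: variance it extracts

print(np.round(communalities, 3))  # [0.65 0.53 0.85 0.37]
print(np.round(eigenvalues, 3))    # [1.18 1.22]
```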

Because the initial factor matrix may be difficult to interpret, factor rotation is commonly used when more than one factor emerges. Factor rotation involves the movement of the reference axes within the factor space so that the variables align more closely with single factors (Nunnally & Bernstein, 1994).

Orthogonal rotation keeps the reference axes at right angles and results in factors that are uncorrelated. Orthogonal rotation is usually performed through a method known as varimax, but other methods (quartimax and equamax) are also available. Oblique rotation allows the reference axes to rotate into acute or oblique angles, thus resulting in correlated factors (Nunnally & Bernstein, 1994).
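A compact sketch of Kaiser's varimax criterion follows; statistical packages provide tested implementations, and this version is only illustrative. Because orthogonal rotation multiplies the loading matrix by an orthogonal matrix, each variable's communality is unchanged while its loadings concentrate on a single factor:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Iterative SVD-based varimax rotation (Kaiser's method, sketched)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated ** 3 - (gamma / p) * rotated @ np.diag(
            (rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_var = s.sum()
        if new_var < var * (1 + tol):   # stop when the criterion plateaus
            break
        var = new_var
    return loadings @ rotation

# Hypothetical unrotated loadings: every variable splits across both factors.
unrotated = np.array([
    [0.6, 0.6],
    [0.7, 0.5],
    [0.6, -0.5],
    [0.5, -0.6],
])
rotated = varimax(unrotated)
print(np.round(rotated, 3))  # each variable now loads mainly on one factor
```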

When oblique rotation is used, there are two resulting matrices: a pattern matrix that reveals partial regression coefficients between variables and factors, and a structure matrix that shows variable-to-factor correlations. Factors are interpreted by examining the pattern and magnitude of the factor loadings in the rotated factor matrix (orthogonal rotation) or pattern matrix (oblique rotation).

Ideally, there are one or more marker variables, variables with a very high loading on one and only one factor (Nunnally & Bernstein, 1994), that can help in the interpretation and naming of factors. Generally, factor loadings of .30 and higher are large enough to be meaningful (Nunnally & Bernstein, 1994).

Once a factor is interpreted and labeled, researchers usually determine factor scores, which are scores on the abstract dimension defined by the factor.
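One common way to estimate such scores, sketched here on simulated data, is the regression method: weights W = R⁻¹L applied to the standardized variables give each case a score on each factor:

```python
import numpy as np

# Illustrative data: four indicators of one latent factor f.
rng = np.random.default_rng(5)
f = rng.normal(size=(300, 1))
X = f @ np.full((1, 4), 0.8) + 0.6 * rng.normal(size=(300, 4))

Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardized variables
R = np.corrcoef(X, rowvar=False)

# One-factor principal-components loadings (largest eigenvalue).
eigvals, eigvecs = np.linalg.eigh(R)
L = eigvecs[:, -1:] * np.sqrt(eigvals[-1:])

W = np.linalg.solve(R, L)   # regression weights, R^-1 L
scores = Z @ W              # one estimated factor score per case
print(scores.shape)         # (300, 1)
```

With these simulated data, the estimated scores track the true latent variable closely, which is the sense in which they are "scores on the abstract dimension defined by the factor."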

Replication of factor solutions in subsequent analyses with different populations gives increased credibility to the findings. Comparisons between factor-analytic solutions can be made by visual inspection of the factor loadings or by using formal statistical procedures, such as the computation of Cattell's salient similarity index and the use of confirmatory factor analysis (Gorsuch, 1983).