Guideline For Clinical Practice Evaluation
Clinical Practice Guidelines Development and Evaluation
View of Rona F. Levin and Susan Kaplan Jacobs
Clinical decision making that is grounded in the best available
evidence is essential to promote patient safety and quality health care.
With the knowledge base for geriatric nursing rapidly expanding,
assessing geriatric clinical practice guidelines (CPGs) for their validity and
incorporation of the best available evidence is critical to the safety and
outcomes of care.
In the second edition of this book, Lucas and Fulmer (2003)
challenged geriatric nurses to take the lead in the assessment of geriatric
clinical practice guidelines (CPGs), recognizing that in the absence of best
evidence, guidelines and protocols have little value for clinical decision making.
In the third edition of this book, Levin, Singleton, and Jacobs (2008)
proposed a method for ensuring that the protocols included in the book were
based on a systematic review of the literature and synthesis of best evidence.
Before then, there was no standard process or specific criteria for protocol development, nor
was there any indication of the “level of evidence” of each source cited in the
chapter (ie, the evidence base for the protocol).
Definition Of Terms
Evidence-based practice (EBP) is a framework for clinical practice
that integrates the best available scientific evidence with the expertise of
the clinician and with patients’ preferences and values to make decisions about
health care (Levin & Feldman, 2006; Straus, Richardson, Glasziou, &
Haynes, 2005).
Health care professionals often use the terms recommendations,
guidelines, and protocols interchangeably, but they are not synonymous.
Guideline Protocols
A recommendation is a suggestion for practice, not necessarily
sanctioned by a formal, expert group. A clinical practice guideline is an
“official recommendation” or suggested approach to diagnose and manage a broad
health condition (eg, heart failure, smoking cessation, or pain management).
A protocol is a more detailed guide for approaching a clinical problem or health
condition and is tailored to a specific practice situation.
For example,
guidelines for falls prevention recommend developing a protocol for toileting
elderly, sedated, or confused patients (Rich & Newland, 2006). The specific
practices or protocol each agency implements, however, are agency specific.
The validity of any of these practice guides can vary depending on the type and the
level of evidence on which they are based. Using standard criteria to develop
or refine CPGs or protocols ensures reliability of their content.
Standardization gives both nurses, who use the guideline/protocol, and
patients, who receive care based on the guideline/protocol, assurance that the
geriatric content and practice recommendations are based on the best evidence.
Standard Practices
In contrast to these practice guides, “standards of practice” are
not specific or necessarily evidence based; rather, they are a generally
accepted, formal, published framework for practice.
As an example, the American
Nurses Association document, Nursing Scope and Standards of Practice (American
Nurses Association, 2010), contains a standard regarding nurses’ accountability
for making an assessment of a patient’s health status. The standard is a
general statement.
A protocol, on the other hand, may specify the assessment
tool(s) to use in that assessment, for example, an instrument to predict
pressure-ulcer risk.
The AGREE Instrument
The AGREE (Appraisal of Guidelines for Research & Evaluation)
instrument was created and evaluated by
international guideline developers and researchers for use by the National
Health Services (AGREE Collaboration, 2001).
It was initially supported by the
UK National Health Services Management Executive and later by the European
Union (Cluzeau, Littlejohns, Grimshaw, Feder, & Moran, 1999).
The purpose of the instrument is to provide standard criteria with which to appraise CPGs. This
appraisal includes evaluation of the methods used to develop the CPG,
assessment of the validity of the recommendations made in the guideline, and
consideration of factors related to the use of the CPG in practice.
Although the AGREE instrument was created to critically appraise CPGs, the process and
criteria can also be applied to the development and evaluation of clinical
practice protocols.
Thus, the AGREE instrument has been expanded for that
purpose: to standardize the creation and revision of the geriatric nursing
practice protocols in this book.
Instruments for Clinical Guideline Protocols
The initial AGREE instrument, and the one used for clinical
guideline/protocol development in the third edition of this book, has six
quality domains:
(a) scope and purpose
(b) stakeholder involvement
(c) rigor of development
(d) clarity and presentation
(e) application
(f) editorial independence.
A total of 23 items divided among the domains were
rated on a 4-point Likert-type scale from strongly disagree to strongly agree.
Appraisers evaluate how well the guideline they are assessing meets the
criteria (ie, items) of the six quality domains. For example, when
evaluating the rigor of development, appraisers rate seven items.
The reliability of the AGREE instrument is increased when each guideline is
appraised by more than one appraiser. Each of the six domains receives an
individual domain score and, based on these scores, the appraiser subjectively
assesses the overall quality of a guideline.
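The domain scoring described above can be illustrated with a short sketch. The chapter does not give the arithmetic, so the scaling used here (obtained score minus the minimum possible score, divided by the possible range, expressed as a percentage) is an assumption modeled on the convention in the AGREE user manual and should be checked against that manual. The function name and the sample ratings are hypothetical.

```python
def scaled_domain_score(ratings, scale_min=1, scale_max=4):
    """Return a 0-100 domain score from several appraisers' item ratings.

    ratings: one list per appraiser, each holding that appraiser's ratings
    for every item in a single domain. The 1-4 default matches the 4-point
    scale of the original AGREE instrument; the scaling formula itself is
    an assumption (see the lead-in).
    """
    if not ratings or not ratings[0]:
        raise ValueError("need at least one appraiser and one item")
    n_items = len(ratings[0])
    for appraiser in ratings:
        if len(appraiser) != n_items:
            raise ValueError("every appraiser must rate every item")
        if any(not scale_min <= r <= scale_max for r in appraiser):
            raise ValueError("rating outside the scale")
    n_appraisers = len(ratings)
    obtained = sum(sum(appraiser) for appraiser in ratings)
    lowest = scale_min * n_items * n_appraisers
    highest = scale_max * n_items * n_appraisers
    return 100.0 * (obtained - lowest) / (highest - lowest)


# Hypothetical ratings: two appraisers, seven rigor-of-development items.
rigor_ratings = [
    [3, 4, 2, 3, 3, 4, 2],
    [4, 4, 3, 3, 2, 4, 3],
]
rigor_score = scaled_domain_score(rigor_ratings)
print(f"Rigor of development: {rigor_score:.1f}%")
```

Pooling the ratings of both appraisers before scaling reflects the point that reliability increases with more than one appraiser. A different rating scale can be handled by changing scale_min and scale_max.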
Important to note, however, is that the original AGREE instrument
was revised in 2009, is now called AGREE II, and
is the version that we used for this fourth edition (AGREE Next Steps
Consortium, 2009).
The revision added one new item to the rigor of development
domain. This is the current Item 9, which underscores the importance of
evaluating the evidence that is applied to practice. Item 9 reads: “The
strengths and limitations of the body of evidence are clearly described.”
The remainder of the changes included a revision of the Likert-type scale used to
evaluate each item in the AGREE II, a reordering of the number assigned to each
item based on the addition of the new Item 9, and minor editing of items for
clarity. No other substantive changes were made.
The AGREE II instrument includes standards for literature searching and for documenting the databases and terms used.
Adhering to these criteria to find and use the best available
evidence on a clinical question is critical to the validity of geriatric
nursing protocols and, ultimately, to patient safety and outcomes of care.
Published guidelines can be appraised using the AGREE instrument as
discussed previously. In the process of guideline development, however, the
clinician is faced with the added responsibility of appraising all available
evidence for its quality and relevance.
In other words, how well does the
available evidence support recommended clinical practices? The clinician needs
to be able to support or defend the inclusion of each recommendation in the
protocol based on its level of evidence. To do so, the guideline must reflect a
systematic, structured approach to find and assess the available evidence.
The Search for Evidence Process
Models of EBP describe the evidence-based process in five steps:
1. Develop an answerable question.
2. Locate the best evidence.
3. Critically appraise the evidence.
4. Integrate the evidence into practice using clinical expertise
with attention to patient’s values and perspectives.
5. Evaluate outcome(s) (Flemming, 1998; McKibbon, Wilczynski,
Eady, & Marks, 2009; Melnyk & Fineout-Overholt, 2011).
Locating evidence to support development of protocols, guidelines, and reviews requires
a comprehensive and systematic review of the published literature, following
Steps 1 and 2.
A search begins with Step 1, developing an answerable question,
which may be in the form of a specific “foreground” question (one that is
focused on a particular clinical issue), or it may be a broad question (one
that asks for overview information about a disease, condition, or aspect of
healthcare) (Flemming, 1998; Melnyk & Fineout-Overholt, 2011; Straus et
al., 2005) to gain an overview of the practice problem and interventions and
gain insight into its significance.
This step is critical to identifying
appropriate search terms and possible synonyms, constructing a search strategy,
and retrieving relevant results.
One example of an answerable foreground
question asked in this book is “What is the effectiveness of restraints in
reducing the occurrence of falls in patients 65 years of age and older?”
Foreground questions are best answered by individual primary studies or
syntheses of studies, such as systematic reviews or meta-analyses.
PICO templates work best to gather the evidence for focused clinical questions
(Glasziou, Del Mar, & Salisbury, 2003). PICO is an acronym for population,
intervention (or occurrence or risk factor), comparison (or control), and
outcome. In the preceding question, the population is patients at risk of
falling, 65 years of age and older; the intervention is use of restraints; the
implied comparison or control is no restraints; and the desired outcome is
decreased incidence of falls.
An initial database search would consider the
problem (falls) and the intervention (restraints) to begin to cast a wide net
to gather evidence.
A broader research query, related to a larger category of
disease or problem and encompassing multiple interventions, might be “What is
the best available evidence regarding the use of restraints in residential
facilities?” (Grigs, 2009).
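As an illustration of how the elements of the restraints question might be turned into an initial database search, the following sketch joins synonyms for each concept with OR and joins the concepts with AND. It is a generic Boolean string, not the syntax of any particular database and not the strategy the librarians actually used. The function name and the synonym lists are hypothetical.

```python
def build_boolean_query(concepts):
    """Combine synonym lists into one Boolean search string.

    Synonyms for a single concept are joined with OR; the concepts
    themselves are joined with AND. Multiword terms are quoted as phrases.
    """
    groups = []
    for synonyms in concepts:
        if not synonyms:
            raise ValueError("each concept needs at least one term")
        terms = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Problem and intervention from the restraints question.
# The synonyms are illustrative assumptions, not controlled vocabulary.
problem = ["falls", "accidental falls"]
intervention = ["restraints", "physical restraint"]

query = build_boolean_query([problem, intervention])
print(query)
```

Starting with only the problem and the intervention keeps the net wide; population or outcome terms could be appended as further concept lists if the result set proves too large.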
General or overview/background questions may be answered in
textbooks, review articles, and “point-of-care” tools that aggregate overviews
of best evidence, for example, online encyclopaedias, systematic reviews, and
synthesis tools (BMJ Publishing Group Limited; The Cochrane Collaboration;
Joanna Briggs Institute; UpToDate; Wolters Kluwer Health).
These sources may be helpful
in the initial steps of gathering external evidence to support the significance
of the problem you believe exists prior to developing your PICO question and
investing a great deal of time in a narrow question for which there might be
limited evidence.
Step 2, locating the evidence, requires a literature search based
on the elements identified in the clinical question.
Gathering the evidence for
the protocols in this book presented the challenge of conducting literature
reviews that encompass both the breadth of overview information and the
depth of specificity represented in high-level systematic reviews and clinical
trials to answer specific clinical questions.
For All Nursing Professionals
Not every nurse, whether clinician, educator, or administrator, has developed proficient database search skills to
conduct a literature review to locate evidence.
Beyond a basic knowledge of
Boolean logic, truncation, and applying categorical limits to filter results,
competence in “information literacy” (Association of College & Research
Libraries, 2000) requires experience with the idiosyncrasies of databases,
selection of terms, and ease with controlled vocabularies and database
Many nurses report that limited access to resources, gaps in
information literacy skills, and, most of all, a lack of time are barriers to
“readiness” for EBP (Pravikoff, Tanner, & Pierce, 2005).
For both the third and current editions of this book, the authors
enlisted the assistance of a team of New York University health sciences
librarians to ensure a standard and efficient approach to collecting evidence
on clinical topics.
Librarians as intermediaries have been called “an essential
part of the health care team by allowing knowledge consumers to focus on the
wise interpretation and use of knowledge for critical decision making, rather
than spending unproductive time on its access and retrieval” (Homan, 2010).
The Cochrane Handbook for Systematic Reviews of Interventions points out
the complexity of conducting a systematic literature review and highly
recommends enlisting the help of a healthcare librarian when searching for
studies for systematic reviews (Section 6.3.1;
Higgins & Green, 2008).
The team of librarian/searchers was given the
topics, keywords, and suggested synonyms, as well as the evidence pyramid we
agreed upon, and they were asked to locate the best available evidence for each
broad area addressed in the following chapters.
Search Strategies for Broad Topics
The literature search begins with database selection and
translation of search terms into the controlled vocabulary of the database if
possible. The major databases for finding the best primary evidence for most
clinical nursing questions are CINAHL (Cumulative Index to Nursing and Allied
Health Literature) and MEDLINE.
The PubMed interface to MEDLINE was used, as it
includes added “unprocessed” records to provide access to the most recently
published citations. For most topics, the PsycINFO database was searched to
ensure capturing relevant evidence in the literature of psychology and
behavioral sciences.
The Cochrane Database of Systematic Reviews and the Joanna
Briggs Institute’s evidence summaries (The Cochrane Collaboration; Joanna
Briggs Institute) were also searched to provide authors with another
synthesized source of evidence for broad topic areas.
The AGREE II instrument was used as a standard against which we
could evaluate the process for evidence searching and use in chapter and
protocol development (AGREE Next Steps Consortium, 2009).
Domain 3, rigor of
development, Item 7, states: “The search strategy should be as comprehensive as
possible and executed in a manner free from potential biases and sufficiently
detailed to be replicated.”
Taking a tip from the Cochrane Handbook, a
literature search should capture both the subject terms and the methodological
aspects of studies when gathering relevant records (Higgins & Green, 2008).
Both of these directions were used to develop search strategies and deliver
results to chapter authors using the following guidelines:
To facilitate replication and update of searches in all databases,
search results sent to authors were accompanied by a search strategy listing
the keywords/descriptors and search string used in each database searched.
The time period searched was specified (eg, 2006-2010). Categorical
limits or methodological filters were specified. (Some examples are the article
type: “meta-analysis” or the “systematic review subset” in PubMed; the
“methodology” limit in PsycINFO for meta-analysis OR clinical trial; the
“research” limit in CINAHL.)
To facilitate replication and update of MEDLINE/PubMed searches,
searches were saved and chapter authors were supplied with a login and password
for a My NCBI account (National Center for Biotechnology Information, US
National Library of Medicine), linking to Saved Searches to be rerun at later
dates. The librarian then aggregated evidence in a RefWorks database and sent
this output to all chapter authors to enhance their knowledge base and provide
a foundation for further exploration of the literature.
Limits, Hedges, and Publication Types
Databases allow the searcher to use the architecture of the individual citations to limit results to articles tagged with
publication types (such as “meta-analysis” or “randomized controlled trial” in
PubMed). In CINAHL, methodological filters or “hedges” (Haynes, Wilczynski,
McKibbon, Walker, & Sinclair, 1994) for publication types “systematic
review,” “clinical trials,” or “research” articles are available.
The commonly
used PubMed “Clinical Queries” feature is designed for specific clinical questions such as
the example mentioned previously. Gathering evidence to support broader topics,
such as the protocols in this book, presents the searcher with a larger
challenge. Limiting searches by methodology can unwittingly eliminate the best
evidence for study designs that do not lend themselves to these methods. For
example, a cross-sectional retrospective design may provide the highest level
of evidence for a study that examines “nurses’ perception” of the practice
environment (Boltz et al., 2008).
Methodological filters have other
limitations, such as retrieving citations tagged “randomized controlled trials
as topic” or abstracts that state a “systematic review of the literature” was
conducted (which is not the same as retrieving a study that is actually a
systematic review).
Authors were cautioned that the CINAHL database
assigns publication type “systematic review” to numerous citations that, upon
review, we judged to be “Level V” review articles (narrative reviews or
literature reviews), not necessarily the high level of evidence we would call
“Level I” (which, according to our scheme, are studies that do a rigorous
synthesis and pooling or analysis of research results).
It may not be easily
discernible from an article title and abstract whether the study is a
systematic review with evidence synthesis or a narrative literature review
(Lindbloom, Brandt, Hough, & Meadows, 2007). These pitfalls of computerized
retrieval justify a review by the searcher to weed out false hits
from the retrieved list of articles.
Precision and Recall
An additional challenge to an intermediary searcher is the need to
balance the comprehensiveness of recall (or “sensitivity”) with precision
(“specificity”) to retrieve a “useful” number of references. The Cochrane
Handbook states: “Searches should seek high sensitivity, which may result in
relatively low precision” (Section 6.3; Higgins & Green, 2008).
Retrieving a large set of articles may include many irrelevant hits.
Conversely, putting too many restrictions on a search may exclude relevant
studies. The goal of retrieving the relevant studies for broad topic areas
required “sacrificing precision” and deferring to the chapter authors to filter
false or irrelevant hits (Jenkins, 2004; Matthews et al., 1999).
The iterative
nature of a literature search requires that an initial set of relevant
references for either broad or specific research questions serves to point
authors toward best evidence as an adjunct to their own knowledge, their own
pursuit of “chains of citation” (McLellan, 2001) and related records, and
their clinical expertise.
Thus, a list of core references on physical
restraints, supplied to a chapter author, might lead to exploring citations
related to wandering, psychogeriatric care, or elder abuse (Fulmer, 2002).
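The trade-off between recall and precision discussed in this section can be shown with simple arithmetic. In the sketch below, recall is the share of all relevant records that a search retrieves, and precision is the share of retrieved records that are relevant. All counts are invented for illustration; in a real search the total number of relevant records is rarely known and can only be estimated.

```python
def recall(relevant_retrieved, relevant_total):
    """Share of all relevant records that the search found (sensitivity)."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved, retrieved_total):
    """Share of the retrieved records that are actually relevant."""
    if retrieved_total <= 0:
        raise ValueError("retrieved_total must be positive")
    return relevant_retrieved / retrieved_total


# Hypothetical counts: 40 relevant records exist for the topic.
RELEVANT_TOTAL = 40
searches = {
    "broad": {"retrieved": 400, "relevant_retrieved": 38},
    "narrow": {"retrieved": 30, "relevant_retrieved": 24},
}

results = {}
for name, counts in searches.items():
    results[name] = (
        recall(counts["relevant_retrieved"], RELEVANT_TOTAL),
        precision(counts["relevant_retrieved"], counts["retrieved"]),
    )
    print(f"{name}: recall={results[name][0]:.2f}, precision={results[name][1]:.3f}")
```

With these invented numbers, the broad search misses almost nothing but leaves the reader to sift hundreds of irrelevant hits, whereas the narrow search is easier to read but omits a large share of the relevant studies. This is the sense in which precision was sacrificed for the broad topic areas.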
Levels of Evidence
Step 3, critical appraisal of the evidence, begins with identifying
the methodology used in a study (often evident from reviewing the article
abstract) followed by a critical reading and evaluation of the research
methodology and results.
The coding scheme described in the subsequent text
provides the first step in filtering retrieved studies based on research
methodology. Levels of evidence offer a schema that, once known, helps the
reader to understand the value of the information presented to the clinical
topic or question under review.
Many schemas are used to
identify the level of evidence of sources. Although they differ, they
have commonalities in their hierarchical structure, often represented by a
pyramid or “publishing wedge” (DiCenso, Bayley, & Haynes, 2009; McKibbon
et al., 2009).
The highest level of evidence is seen at the top of a pyramid,
characterized by increased relevance to the clinical setting in a smaller
number of studies. The schema used by the authors in this book for rating the
level of evidence comes from the work of Stetler et al. (1998) and Melnyk and
Fineout-Overholt. A Level I evidence rating is given to evidence from synthesized
sources (systematic reviews), which can either be meta-analyses or structured
integrative reviews of evidence, and CPGs based on Level I evidence.
Evidence rated as Level II comes from a randomized controlled trial. A
quasi-experimental study, such as a nonrandomized controlled, single-group
pretest and posttest, time series, or matched case-controlled study, is considered
Level III evidence.
Level IV evidence comes from a nonexperimental study, such
as correlational descriptive research and qualitative or case-control studies.
A narrative literature review, a case report systematically obtained and of
verifiable quality, or program evaluation data are rated as Level V.
Level VI
evidence is identified as the opinion of respected authorities (eg, nationally
known) based on their clinical experience or the opinions of an expert
committee, including their interpretation of non-research-based information.
This level also includes regulatory or legal opinions. Level I evidence is
considered the strongest.
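The six-level scheme just described lends itself to a simple lookup table for coding retrieved references. The sketch below is one possible encoding. The design labels used as keys are shorthand chosen for this example, not terms from the source schema, and the sample references are placeholders.

```python
# Study design -> level, following the six levels described in the text.
# The key strings are hypothetical shorthand and must be lowercase.
EVIDENCE_LEVELS = {
    "meta-analysis": 1,
    "structured integrative review": 1,
    "guideline based on synthesized evidence": 1,
    "randomized controlled trial": 2,
    "quasi-experimental study": 3,
    "nonexperimental study": 4,
    "narrative literature review": 5,
    "case report": 5,
    "program evaluation": 5,
    "expert opinion": 6,
    "regulatory or legal opinion": 6,
}
ROMAN = {1: "I", 2: "II", 3: "III", 4: "IV", 5: "V", 6: "VI"}


def level_of_evidence(design):
    """Return the Roman-numeral level for a study design label."""
    key = design.strip().lower()
    if key not in EVIDENCE_LEVELS:
        raise KeyError(f"no level assigned for design: {design!r}")
    return ROMAN[EVIDENCE_LEVELS[key]]


def strongest_first(references):
    """Sort (citation, design) pairs so that Level I evidence comes first."""
    return sorted(
        references,
        key=lambda ref: EVIDENCE_LEVELS[ref[1].strip().lower()],
    )


# Placeholder references, not real citations.
coded = strongest_first([
    ("Reference A", "Expert opinion"),
    ("Reference B", "Randomized controlled trial"),
    ("Reference C", "Meta-analysis"),
])
for citation, design in coded:
    print(f"{citation}: Level {level_of_evidence(design)}")
```

A lookup of this kind only records the level implied by a study's design. As the text notes, a label such as “systematic review” can be misapplied, so each coded reference still needs a reader's judgment about what the study actually did.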
The retrieved evidence was placed in a searchable, web-based RefWorks “shared” folder and coded for level of
evidence. The authors were then charged with reviewing the evidence and
deciding on its quality and relevance for inclusion in their chapter or
protocol. The critical appraisal of research uses specialized tools designed to
evaluate the methodology of the study.
Examples are the AGREE instrument (which
this volume of protocols conforms to) (AGREE Next Steps Consortium, 2009), the
Critical Appraisal Skills Program (CASP) (Solutions for Public Health), and the
PRISMA Statement (PRISMA: Transparent reporting) among others.
An additional feature implemented in the previous edition of this
book is the inclusion of the level and type of evidence for each reference,
which leads to a recommendation for practice (see Exhibit 1.1).
Using this type
of standard approach ensures that this book contains protocols and
recommendations for use with geriatric patients and their families that are
based on the best available evidence.