Clinical Practice Guidelines Development and Evaluation

Rona F. Levin and Susan Kaplan Jacobs

This chapter addresses the evaluation of clinical practice guidelines: their purposes, definitions of related terms, the AGREE instrument, the search for evidence process, and search strategies.

Clinical decision making that is grounded in the best available evidence is essential to promote patient safety and quality health care outcomes. With the knowledge base for geriatric nursing rapidly expanding, assessing geriatric clinical practice guidelines (CPGs) for their validity and incorporation of the best available evidence is critical to the safety and outcomes of care. In the second edition of this book, Lucas and Fulmer (2003) challenged geriatric nurses to take the lead in the assessment of geriatric CPGs, recognizing that in the absence of best evidence, guidelines and protocols have little value for clinical decision making. In the third edition, Levin, Singleton, and Jacobs (2008) proposed a method for ensuring that the protocols included in the book were based on a systematic review of the literature and a synthesis of best evidence.

Purposes

Before that method was adopted, there was no standard process or specific criteria for protocol development, nor was there any indication of the "level of evidence" of each source cited in a chapter (i.e., the evidence base for the protocol).

Definition Of Terms

Evidence-based practice (EBP) is a framework for clinical practice that integrates the best available scientific evidence with the expertise of the clinician and with patients’ preferences and values to make decisions about health care (Levin & Feldman, 2006; Straus, Richardson, Glasziou, & Haynes, 2005). Health care professionals often use the terms recommendations, guidelines, and protocols interchangeably, but they are not synonymous.

Recommendations, Guidelines, and Protocols

A recommendation is a suggestion for practice, not necessarily sanctioned by a formal, expert group. A clinical practice guideline is an "official recommendation" or suggested approach to diagnose and manage a broad health condition (e.g., heart failure, smoking cessation, or pain management). A protocol is a more detailed guide for approaching a clinical problem or health condition and is tailored to a specific practice situation. For example, guidelines for falls prevention recommend developing a protocol for toileting elderly, sedated, or confused patients (Rich & Newland, 2006). The specific practices or protocol each agency implements, however, remain agency specific. The validity of any of these practice guides can vary depending on the type and the level of evidence on which they are based. Using standard criteria to develop or refine CPGs or protocols ensures the reliability of their content. Standardization gives both nurses, who use the guideline/protocol, and patients, who receive care based on the guideline/protocol, assurance that the geriatric content and practice recommendations are based on the best evidence.

Standard Practices

In contrast to these practice guides, "standards of practice" are not specific or necessarily evidence-based; rather, they are a generally accepted, formal, published framework for practice. As an example, the American Nurses Association document, Nursing Scope and Standards of Practice (American Nurses Association, 2010), contains a standard regarding nurses' accountability for making an assessment of a patient's health status. The standard is a general statement. A protocol, on the other hand, may specify the assessment tool(s) to use in that assessment, for example, an instrument to predict pressure-ulcer risk.

The AGREE Instrument

The AGREE (Appraisal of Guidelines for Research & Evaluation) instrument was created and evaluated by international guideline developers and researchers (AGREE Collaboration, 2001). It was initially supported by the UK National Health Services Management Executive and later by the European Union (Cluzeau, Littlejohns, Grimshaw, Feder, & Moran, 1999). Released in 2001 in its initial form, the AGREE instrument provides standard criteria with which to appraise CPGs. This appraisal includes evaluation of the methods used to develop the CPG, assessment of the validity of the recommendations made in the guideline, and consideration of factors related to the use of the CPG in practice. Although the AGREE instrument was created to critically appraise CPGs, the process and criteria can also be applied to the development and evaluation of clinical practice protocols. Thus, the AGREE instrument has been expanded for that purpose: to standardize the creation and revision of the geriatric nursing practice protocols in this book.

The AGREE Quality Domains

The initial AGREE instrument, which was used for clinical guideline/protocol development in the third edition of this book, has six quality domains:
(a) scope and purpose
(b) stakeholder involvement
(c) rigor of development
(d) clarity and presentation
(e) application
(f) editorial independence.

A total of 23 items divided among the domains were rated on a 4-point Likert-type scale from strongly disagree to strongly agree. Appraisers evaluate how well the guideline they are assessing meets the criteria (i.e., items) of the six quality domains. For example, when evaluating rigor of development, appraisers rate seven items. The reliability of the AGREE instrument is increased when each guideline is appraised by more than one appraiser. Each of the six domains receives an individual domain score and, based on these scores, the appraiser subjectively assesses the overall quality of a guideline.

Important to note, however, is that the original AGREE instrument was revised in 2009, is now called AGREE II, and is the version that we used for this fourth edition (AGREE Next Steps Consortium, 2009). The revision added one new item to the rigor of development domain. This is the current Item 9, which underscores the importance of evaluating the evidence that is applied to practice. Item 9 reads: "The strengths and limitations of the body of evidence are clearly described." The remaining changes included a revision of the Likert-type scale used to evaluate each item in the AGREE II, a renumbering of items following the addition of the new Item 9, and minor editing of items for clarity. No other substantive changes were made. The rigor of development section of the AGREE instrument provides standards for literature searching and for documenting the databases and terms searched. Adhering to these criteria to find and use the best available evidence on a clinical question is critical to the validity of geriatric nursing protocols and, ultimately, to patient safety and outcomes of care.

Published guidelines can be appraised using the AGREE instrument as discussed previously. In the process of guideline development, however, the clinician is faced with the added responsibility of appraising all available evidence for its quality and relevance. In other words, how well does the available evidence support recommended clinical practices? The clinician needs to be able to support or defend the inclusion of each recommendation in the protocol based on its level of evidence. To do so, the guideline must reflect a systematic, structured approach to find and assess the available evidence.

The Search for Evidence Process

Models of EBP describe the evidence-based process in five steps:

  1. Develop an answerable question.
  2. Locate the best evidence.
  3. Critically appraise the evidence.
  4. Integrate the evidence into practice using clinical expertise with attention to patient’s values and perspectives.
  5. Evaluate outcome(s). (Flemming, 1998; McKibbon, Wilczynski, Eady, & Marks, 2009; Melnyk & Fineout-Overholt, 2011).

Locating evidence to support development of protocols, guidelines, and reviews requires a comprehensive and systematic review of the published literature, following Steps 1 and 2. A search begins with Step 1, developing an answerable question, which may be in the form of a specific "foreground" question (one that is focused on a particular clinical issue) or a broad question (one that asks for overview information about a disease, condition, or aspect of health care) (Flemming, 1998; Melnyk & Fineout-Overholt, 2011; Straus et al., 2005) to gain an overview of the practice problem and interventions and gain insight into its significance. This step is critical to identifying appropriate search terms, possible synonyms, construction of a search strategy, and retrieving relevant results. One example of an answerable foreground question asked in this book is "What is the effectiveness of restraints in reducing the occurrence of falls in patients 65 years of age and older?" Foreground questions are best answered by individual primary studies or syntheses of studies, such as systematic reviews or meta-analyses. PICO templates work best to gather the evidence for focused clinical questions (Glasziou, Del Mar, & Salisbury, 2003). PICO is an acronym for population, intervention (or occurrence or risk factor), comparison (or control), and outcome. In the preceding question, the population is patients at risk of falling, 65 years of age and older; the intervention is use of restraints; the implied comparison or control is no restraints; and the desired outcome is decreased incidence of falls. An initial database search would consider the problem (falls) and the intervention (restraints) to begin to cast a wide net to gather evidence.
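The PICO decomposition above can be sketched as a simple data structure from which a broad Boolean search string is assembled. This is an illustrative sketch only: the `build_query` helper and the field names are assumptions for demonstration, not part of any database's actual query syntax (PubMed and CINAHL each have their own).

```python
# Illustrative sketch: decomposing a clinical question into PICO elements
# and assembling a broad Boolean search string from the problem and the
# intervention. The helper is hypothetical, not a real database API.

def build_query(population, intervention):
    """Combine two PICO concepts into a Boolean search string.

    Synonyms within a concept are OR'd together; concepts are AND'd,
    mirroring the "wide net" initial search described in the text.
    """
    concepts = [population, intervention]
    return " AND ".join(
        "(" + " OR ".join(terms) + ")" for terms in concepts
    )

pico = {
    "population": ["aged", "elderly"],
    "intervention": ["restraint", "physical restraints"],
    "comparison": ["no restraints"],          # implied control
    "outcome": ["falls", "accidental falls"],
}

query = build_query(pico["population"], pico["intervention"])
print(query)
# (aged OR elderly) AND (restraint OR physical restraints)
```

Note that the comparison and outcome elements are recorded but deliberately left out of the initial query, matching the strategy of starting with only the problem and the intervention before applying limits.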

A broader research query, related to a larger category of disease or problem and encompassing multiple interventions, might be "What is the best available evidence regarding the use of restraints in residential facilities?" (Grigs, 2009). General or overview/background questions may be answered in textbooks, review articles, and "point-of-care" tools that aggregate overviews of best evidence, for example, online encyclopedias, systematic reviews, and synthesis tools (BMJ Publishing Group Limited; The Cochrane Collaboration; Joanna Briggs Institute; UpToDate, Wolters Kluwer Health). These may be helpful in the initial steps of gathering external evidence to support the significance of the problem you believe exists, before developing your PICO question and investing a great deal of time in a narrow question for which there might be limited evidence.

Step 2, locating the evidence, requires a literature search based on the elements identified in the clinical question. Gathering the evidence for the protocols in this book presented the challenge to conduct literature reviews, encompassing both the breadth of overview information as well as the depth of specificity represented in high-level systematic reviews and clinical trials to answer specific clinical questions.

For All Nursing Professionals

Not every nurse, whether he or she is a clinical practitioner, educator, or administrator, has developed proficient database search skills to conduct a literature review to locate evidence. Beyond a basic knowledge of Boolean logic, truncation, and applying categorical limits to filter results, competence in “information literacy” (Association of College & Research Libraries, 2000) requires experience with the idiosyncrasies of databases, selection of terms, and ease with controlled vocabularies and database functionality. Many nurses report that limited access to resources, gaps in information literacy skills, and, most of all, a lack of time are barriers to “readiness” for EBP (Pravikoff, Tanner, & Pierce, 2005).

For both the third and current editions of this book, the authors enlisted the assistance of a team of New York University health sciences librarians to ensure a standard and efficient approach to collecting evidence on clinical topics. Librarians as intermediaries have been called "an essential part of the health care team by allowing knowledge consumers to focus on the wise interpretation and use of knowledge for critical decision making, rather than spending unproductive time on its access and retrieval" (Homan, 2010, p. 51). The Cochrane Handbook for Systematic Reviews of Interventions points out the complexity of conducting a systematic literature review and strongly recommends enlisting the help of a health care librarian when searching for studies to support systematic reviews (Section 6.3.1; Higgins & Green, 2008). The team of librarian/searchers was given the topics, keywords, and suggested synonyms, as well as the evidence pyramid we agreed on, and was asked to locate the best available evidence for each broad area addressed in the following chapters.

Search Strategies for Broad Topics

The literature search begins with database selection and translation of search terms into the controlled vocabulary of the database if possible. The major databases for finding the best primary evidence for most clinical nursing questions are CINAHL (Cumulative Index to Nursing and Allied Health Literature) and MEDLINE. The PubMed interface to MEDLINE was used, as it includes added “unprocessed” records to provide access to the most recently published citations. For most topics, the PsycINFO database was searched to ensure capturing relevant evidence in the literature of psychology and behavioral sciences. The Cochrane Database of Systematic Reviews and the Joanna Briggs Institute’s evidence summaries (The Cochrane Collaboration; Joanna Briggs Institute) were also searched to provide authors with another synthesized source of evidence for broad topic areas.

The AGREE II instrument was used as a standard against which we could evaluate the process for evidence searching and use in chapter and protocol development (AGREE Next Steps Consortium, 2009). Domain 3, rigour of development, Item 7 states: "The search strategy should be as comprehensive as possible and executed in a manner free from potential biases and sufficiently detailed to be replicated." Taking a tip from the Cochrane Handbook, a literature search should capture both the subject terms and the methodological aspects of studies when gathering relevant records (Higgins & Green, 2008). Both of these directions were used to develop search strategies and deliver results to chapter authors using the following guidelines:

  1. To facilitate replication and updating of searches in all databases, search results sent to authors were accompanied by a search strategy listing the keywords/descriptors and search string used in each database searched (e.g., MEDLINE, PsycINFO, CINAHL).
  2. The time period searched was specified (e.g., 2006-2010).
  3. Categorical limits or methodological filters were specified (e.g., the article type "meta-analysis" or the "systematic review subset" in PubMed; the "methodology" limit in PsycINFO for meta-analysis OR clinical trial; the "research" limit in CINAHL).
  4. To facilitate replication and updating of MEDLINE/PubMed searches, searches were saved and chapter authors were supplied with a login and password for a My NCBI account (National Center for Biotechnology Information, US National Library of Medicine), linking to Saved Searches to be rerun at later dates.
  5. The librarian then aggregated evidence in a RefWorks database and sent this output to all chapter authors to enhance their knowledge base and provide a foundation for further exploration of the literature.
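The documentation guidelines above amount to recording, for every search, enough detail that it can be rerun later. A minimal sketch of such a record, assuming hypothetical field names (this is not a standard schema, and the search string and date are illustrative):

```python
# Illustrative sketch: documenting a search strategy so it can be
# replicated and updated, per the guidelines above. Field names,
# the search string, and the run date are hypothetical examples.

search_log = {
    "database": "MEDLINE (PubMed)",
    "search_string": '"accidental falls" AND "restraint, physical"',
    "period_searched": "2006-2010",
    "limits": ["meta-analysis", "systematic review subset"],
    "date_run": "2010-06-15",  # hypothetical date the search was executed
}

def describe(log):
    """Render one search record as a single-line summary for authors."""
    return (f"{log['database']}: {log['search_string']} "
            f"[{log['period_searched']}; limits: {', '.join(log['limits'])}]")

print(describe(search_log))
```

Keeping the strategy alongside the results is what makes the AGREE II replication criterion (Item 7) satisfiable after the fact.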

Limits, Hedges, and Publication Types

Most bibliographic databases can exploit the architecture of individual citations to limit retrieval to articles tagged with publication types (such as "meta-analysis" or "randomized controlled trial" in MEDLINE). In CINAHL, methodological filters or "hedges" (Haynes, Wilczynski, McKibbon, Walker, & Sinclair, 1994) for the publication types "systematic review," "clinical trials," or "research" articles are available. The commonly used PubMed "Clinical Queries" feature is designed for specific clinical questions such as the example mentioned previously. Gathering evidence to support broader topics, such as the protocols in this book, presents the searcher with a larger challenge. Limiting searches by methodology can unwittingly eliminate the best evidence when study designs do not lend themselves to these methods. For example, a cross-sectional retrospective design may provide the highest level of evidence for a study that examines "nurses' perception" of the practice environment (Boltz et al., 2008). Methodological filters have other limitations, such as retrieving citations tagged "randomized controlled trials as topic" or abstracts that state a "systematic review of the literature" was conducted (which is not the same as retrieving a study that is actually a systematic review).

Authors were cautioned that the CINAHL database assigns the publication type "systematic review" to numerous citations that, upon review, we judged to be "Level V" review articles (narrative reviews or literature reviews), not necessarily the high level of evidence we would call "Level I" (which, according to our scheme, comprises studies that perform a rigorous synthesis and pooling or analysis of research results). It may not be easily discernible from an article title and abstract whether the study is a systematic review with evidence synthesis or a narrative literature review (Lindbloom, Brandt, Hough, & Meadows, 2007). These pitfalls of computerized retrieval are justification for review by the searcher to weed false hits from the retrieved list of articles.

Precision and Recall

An additional challenge to an intermediary searcher is the need to balance the comprehensiveness of recall (or "sensitivity") with precision ("specificity") to retrieve a "useful" number of references. The Cochrane Handbook states: "Searches should seek high sensitivity, which may result in relatively low precision" (Section 6.3; Higgins & Green, 2008). Thus, retrieving a large set of articles may include many irrelevant hits. Conversely, putting too many restrictions on a search may exclude relevant studies. The goal of retrieving the relevant studies for broad topic areas required "sacrificing precision" and deferring to the chapter authors to filter false or irrelevant hits (Jenkins, 2004; Matthews et al., 1999). The iterative nature of a literature search requires that an initial set of relevant references, for both broad and specific research questions, serve to point authors toward best evidence as an adjunct to their own knowledge, their own pursuit of "chains of citation" (McLellan, 2001) and related records, and their clinical expertise. Thus, a list of core references on physical restraints, supplied to a chapter author, might lead to exploring citations related to wandering, psychogeriatric care, or elder abuse (Fulmer, 2002).

Levels of Evidence

Step 3, critical appraisal of the evidence, begins with identifying the methodology used in a study (often evident from reviewing the article abstract) followed by a critical reading and evaluation of the research methodology and results. The coding scheme described in the subsequent text provides the first step in filtering retrieved studies based on research methods.

Levels of evidence offer a schema that, once known, helps the reader to understand the value of the information presented for the clinical topic or question under review. Many schemas are used to identify the level of evidence of sources; although they differ, they share a hierarchical structure, often represented by a pyramid or "publishing wedge" (DiCenso, Bayley, & Haynes, 2009; McKibbon et al., 2009). The highest level of evidence sits at the top of the pyramid, characterized by increased relevance to the clinical setting in a smaller number of studies. The schema used by the authors in this book for rating the level of evidence comes from the work of Stetler et al. (1998) and Melnyk and Fineout-Overholt (2011).

A Level I evidence rating is given to evidence from synthesized sources (systematic reviews), which can be either meta-analyses or structured integrative reviews of evidence, and to CPGs based on Level I evidence. Evidence rated as Level II comes from a randomized controlled trial. A quasi-experimental study, such as a nonrandomized controlled trial, a single-group pretest-posttest or time-series study, or a matched case-controlled study, is considered Level III evidence. Level IV evidence comes from a nonexperimental study, such as correlational, descriptive, qualitative, or case-control research. A narrative literature review, a case report systematically obtained and of verifiable quality, or program evaluation data are rated as Level V. Level VI evidence is identified as the opinion of respected authorities (e.g., nationally known) based on their clinical experience, or the opinions of an expert committee, including their interpretation of non-research-based information. This level also includes regulatory or legal opinions. Level I evidence is considered the strongest.
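The rating scheme above is, in effect, a lookup from study design to evidence level. A minimal sketch follows; the design labels are paraphrased from the text, and the `rate` helper is hypothetical, not an official coding tool from Stetler et al. or Melnyk and Fineout-Overholt.

```python
# Minimal sketch of the evidence hierarchy described above, as a lookup
# from study design to level. Design labels are paraphrased from the
# text; this is illustrative, not an official rating instrument.

LEVEL_OF_EVIDENCE = {
    "systematic review / meta-analysis": "I",
    "randomized controlled trial": "II",
    "quasi-experimental study": "III",
    "nonexperimental study": "IV",
    "narrative review / case report / program evaluation": "V",
    "expert or regulatory/legal opinion": "VI",
}

def rate(design):
    """Return the evidence level for a study design, or None if unknown.

    Level "I" (top of the pyramid) is the strongest; "VI" the weakest.
    """
    return LEVEL_OF_EVIDENCE.get(design)

rate("randomized controlled trial")  # returns "II"
```

Returning `None` for an unrecognized design mirrors the workflow described next: coding is only a first-pass filter, and ambiguous studies still require the appraiser's critical reading.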

For all topics, the results of literature searches were organized in a searchable, web-based RefWorks "shared" folder and coded for level of evidence. The authors were then charged with reviewing the evidence and deciding on its quality and relevance for inclusion in their chapter or protocol. The critical appraisal of research uses specialized tools designed to evaluate the methodology of the study. Examples are the AGREE instrument (to which this volume of protocols conforms) (AGREE Next Steps Consortium, 2009), the Critical Appraisal Skills Programme (CASP) (Solutions for Public Health), and the PRISMA Statement (PRISMA: Transparent reporting), among others.

An additional feature implemented in the previous edition of this book is the inclusion of the level and type of evidence for each reference that leads to a recommendation for practice (see Exhibit 1.1). Using this type of standard approach ensures that this book contains protocols and recommendations for use with geriatric patients and their families that are based on the best available evidence.
