A journal dedicated to allied health professional practice and education
http://ijahsp.nova.edu      Vol. 3 No. 1    ISSN 1540-580X 

A Peer Reviewed Publication of the College of Allied Health & Nursing at Nova Southeastern University

Development of a Generic Critical Appraisal Tool by Consensus: Presentation of First Round Delphi Survey Results


Jeannie Burnett, M.App.Sc
The Centre for Allied Health Evidence
University of South Australia

Karen Grimmer, PhD
The Centre for Allied Health Evidence
University of South Australia
 

Saravana Kumar, M.App.Sc
The Centre for Allied Health Evidence
University of South Australia
 

 

Correspondence:
 

Karen Grimmer, PhD
The Centre for Allied Health Evidence
University of South Australia
North Terrace,
Adelaide 5001.
 


Citation:
Burnett, J., Grimmer, K., Saravana, K. Development of a generic critical appraisal tool by consensus: presentation of first round Delphi survey results.  The Internet Journal of Allied Health Sciences and Practice. January 2005. Volume 3 Number 1.


Study Funded by: The Centre for Allied Health Evidence, University of South Australia
 

Abstract
 

The growing importance of evidence based practice requires academics and clinicians to make judgments about the quality of the body of research evidence pertaining to clinical questions. There are numerous critical appraisal tools to assist this process. These are mostly designed for specific research designs, and tend not to reflect the particular concerns of allied health professionals, such as accuracy of diagnosis, adequate description of intervention, and sensitivity and utility of outcome measures. This paper reports the findings of a study which sought expert opinion on the essential criteria for critical appraisal, and on whether a generic critical appraisal tool could be developed for allied health use. A modified Delphi technique was used to identify experts and determine key criteria.
 

Fifteen Australian allied health professionals participated, and identified the key criteria as clinical relevance, methodological robustness, statistical robustness, aims that are clearly stated and conclusions that are reasonable considering the results. Regarding the development of a generic critical appraisal tool for all research designs, respondents considered that dealing adequately with both qualitative and quantitative research designs within a single tool would be challenging.
 

Key words and terms: critical appraisal, allied health, Delphi survey, questionnaire


Introduction
 

Critical appraisal forms the basis of uptake of evidence in clinical practice. It is through the application of critical appraisal that researchers, clinicians and other stakeholders in health care can evaluate the strength of available evidence.1-3 This process enables stakeholders to make informed judgments about the effectiveness of therapies.4-7 Historically, evidence based medicine began in medical disciplines, with recent adoption into allied health.4 A recent systematic review of critical appraisal tools found one hundred and ninety-three different published critical appraisal tools.8 The 108 papers that were included in the review were, for the most part, specific to quantitative research designs, with very few being developed specifically with allied health requirements in mind. This review found no “Gold Standard” critical appraisal tool, and identified the need to further investigate the needs of allied health.

Allied health interventions differ from medical interventions in the following ways:
 

  • Diagnosis relies on clinical reasoning, as opposed to the diagnostic tests and imaging that are available in medical diagnosis.9

  • Often multiple interventions are provided in one treatment session. This requires research to provide reproducible descriptions of interventions in terms of their relevance to the diagnosis, their intensity and frequency, their order of administration and instructions given to the patient.10  

  • Clients are often seen over a period of time (an episode of care). This necessitates regular follow-up to ascertain short and long term effectiveness of interventions.11, 12

  • The need to demonstrate the use of appropriate outcome measures that can sensitively and reliably detect a change in impairment, function and participation status.13  Such outcome measures should reflect the needs of relevant stakeholders (clinician, patient, insurer etc).9, 10, 12
     

For these reasons, critical appraisal tools that do not reflect the perspectives of allied health may not provide sufficiently sensitive or appropriate information about the quality of the body of research evidence for therapies. While there are critical appraisal tools developed by allied health professionals, the systematic review by Katrak et al identified that consensus as to the appropriate criteria in critical appraisal tools is lacking.8, 14-16
 

The other feature of existing critical appraisal tools is that they are predominantly design-specific. The question therefore is whether a generic critical appraisal tool that can be applied across quantitative and qualitative research designs can be constructed. The abovementioned review of critical appraisal tools identified five papers that either presented a generic critical appraisal tool, or alluded to potential criteria relevant to such a tool.8, 17-21
 

Common themes were:
 

  • Sufficient number of people included in the study (power calculations for experimental designs).

  • Was the study methodology described in sufficient detail to allow replication?

  • Was subject compliance dealt with (eg was the intervention acceptable to subjects)?

  • Do the conclusions make sense biologically, sociologically and economically?

  • Are the results of the study applicable in a clinical setting, and was the clinical relevance of the findings dealt with?
     

These common generic criteria provide a reasonable basis upon which to develop an allied health generic tool. The aim of this study was to determine consensus amongst content experts as to essential criteria for a generic tool that could be applied across research designs and which was applicable to allied health requirements.

 

Method

 

A modified Delphi technique was used to investigate this question.22 The Delphi surveying technique is designed to turn opinion into consensus by asking content experts questions, the answers to which are then coded into key issues. These issues are re-presented to the respondents for further consideration and comment.23-25 Delphi and other consensus gaining techniques have previously been used in the development of critical appraisal tools.1, 26-33
 

Subject recruitment:

 

For the purposes of this study an expert was defined as someone who has a known or stated interest in the topic.34 In the first step, heads of Schools of Physiotherapy, Podiatry, Speech Pathology, Occupational Therapy, Nursing and Medical Radiations around Australia were contacted. They were asked to identify colleagues who had an interest in Evidence Based Practice. This list was supplemented by names of colleagues known to the authors as having an interest in evidence based practice. Those on this list were contacted via e-mail and invited to participate in the survey. They were advised that they would show their consent by completing and returning the questionnaire provided as an attachment to the email.

In step 2, a list of allied health professionals employed by university faculties was derived by performing an internet search of all Australian allied health faculties. Those who stated an interest in evidence based practice in their staff biography were added to the list. This list was mutually exclusive to the list derived in step 1.

 

Inclusion and Exclusion Criteria:

 

Inclusion was based on a known or reported interest in EBP. Response to the invitation to participate was considered to fill the inclusion criteria, and implied consent to participate in the study. There were no exclusion criteria, as all individuals approached were considered to have a significant interest in evidence-based practice. 

Information provision

Those on the list were e-mailed a letter inviting them to fill out a questionnaire that was sent as an attachment. Consent was obtained through the completion of the questionnaire (see Appendix 1, consent paragraph).

The questionnaire recorded the following information:

  • Demographic data: profession and position currently held.

  •  How many times participants had used a critical appraisal tool in the past month.

  • Which critical appraisal tool(s) they used most frequently.

Data collection and synthesis:

Respondents to the questionnaire were asked to list the core elements of a critical appraisal tool (see Figure 1). These elements were divided into those pertaining to internal validity of a study, and those to external validity. Respondents were initially asked to rank the elements in order of importance using numbers (1 being the most important). Finally respondents were asked to identify up to three colleagues who also had an interest in EBP. Those named would also receive an invitation to participate by completing the questionnaire.


Figure 1: Questionnaire emailed to expert list.

FIRST ROUND DELPHI QUESTIONNAIRE

 Development of a critical appraisal tool for use with allied health research.

 Thank you for agreeing to fill out the following questionnaire.

 PROFESSION:

 CURRENT PLACE OF EMPLOYMENT AND POSITION HELD:

1. How many times in the past month have you used a critical appraisal tool (approx).

2. Please indicate the tool(s) that you use most frequently (optional).

3. Please list criteria relevant to internal validity.

In the right hand column rank numerically those that you consider essential.

| CRITERIA | RATING |
|----------|--------|
|          |        |
|          |        |
|          |        |

4. Please list criteria relevant to external validity.

 
In the right hand column rank numerically those that you consider essential.

| CRITERIA | RATING |
|----------|--------|
|          |        |
|          |        |
|          |        |

5. Finally we ask that you identify up to 3 colleagues who share an interest in evidence based practice. They will be asked to fill out section 1 and 2 of this form. This section is optional.

 


  1.  

  2.  

Thank you for finding the time to fill out this questionnaire.


The interview delivered questionnaire was conducted in a semi-structured manner, hence rating of criteria was not called for. The interviews were conducted face to face or by telephone. The interviewer (JB) transcribed the interviews, which were then forwarded to the interviewee for verification. All interviewees were invited to alter the transcript to ensure that it represented their view correctly. This validated transcript was then used in data analysis. The interviewees were also asked for their opinion as to whether a generic tool would be useful in the process of collation of evidence for Allied Health therapies.

Due to the two methods of questionnaire delivery, a snowballing approach was taken when combining the findings to ensure validity of findings.

Data collation and analysis:

The responses were collated by one person (JB) and the list was cross-checked by a second independent person (KG) to address potential bias in the inclusion of appraisal elements.

Data collation was undertaken using an Excel spreadsheet. The frequencies with which items were mentioned were tabulated. To establish the relative importance of each criterion, the frequency with which it was entered at each position in the order of preference (first in importance, second in importance and so on) was determined. A summary list that was reflective of responses was then developed.
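
The tabulation described above can be sketched in a few lines of Python. This is an illustration only and not the authors' Excel workbook: the function name `tabulate`, the structure of `responses` and the example data are all hypothetical.

```python
from collections import Counter


def tabulate(responses):
    """Count mentions of each criterion and how often it was ranked first.

    `responses` is a list with one dict per respondent, mapping a coded
    criterion to the rank that respondent gave it (1 = most important).
    """
    mentions = Counter()
    ranked_first = Counter()
    for ranking in responses:
        for criterion, rank in ranking.items():
            mentions[criterion] += 1
            if rank == 1:
                ranked_first[criterion] += 1
    return mentions, ranked_first


# Hypothetical example data (not taken from the study).
responses = [
    {"clinical relevance": 1, "randomisation": 2},
    {"randomisation": 1, "clinical relevance": 2, "assessor blinding": 3},
    {"clinical relevance": 1},
]

mentions, ranked_first = tabulate(responses)
for criterion, count in mentions.most_common():
    print(criterion, count, ranked_first[criterion])
```

Counting mentions and first-place rankings separately keeps the two notions of importance (how often a criterion is raised, and how highly it is rated) distinct, which is how the summary list is described as being built.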

Results

Responses to surveying

 Figure 2 provides a flow chart of responses to contacts via the two steps in the emailed surveys. 

 


The respondent sample characteristics are presented in Tables 1 and 2 in terms of profession and employment type.


Table 1: Respondent sample by employment type (academic versus clinician).

| Employment type | Questionnaire (n=8) | Interview (n=7) | Total |
|---|---|---|---|
| Academic only | 6 | 5 | 11 (73%) |
| Clinical only | 1 |  | 1 (7%) |
| Clinical and Academic | 1 | 2 | 3 (20%) |
| Totals | 8 | 7 | 15 (100%) |
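
The percentages in Table 1 can be checked by recomputing them from the counts in the Total column. The sketch below is illustrative; the function name `percentage_breakdown` is an assumption, and only the three total counts are taken from the table.

```python
def percentage_breakdown(counts):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Total column of Table 1 (n = 15 respondents).
table_1_totals = {
    "Academic only": 11,
    "Clinical only": 1,
    "Clinical and Academic": 3,
}

shares = percentage_breakdown(table_1_totals)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```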


Table 2: Breakdown of respondent sample by profession.

| Profession | Questionnaire (n=8) | Interview (n=7) | Total |
|---|---|---|---|
| Physiotherapist | 7 | 2 | 9 (60%) |
| Occupational Therapist | 1 | 3 | 4 (26%) |
| Podiatrist | 1 | 1 | 2 (13%) |
| Social Worker |  |  |  |
| Speech Therapist |  | 1 | 1 (1%) |


As is evident from the critical appraisal tool usage responses detailed in Table 3, 10 of the respondents stated that they had not formally used a critical appraisal tool in the past month. Of these, one indicated that in the past 6 months she had read at least 10 articles per week for her doctoral studies and had used a mental checklist to rate the quality of each article. Four are involved in actively teaching evidence-based practice, one held a senior clinical position and the other two held academic positions. Thus, despite the lack of recent use of a critical appraisal tool, we were satisfied that those responding fitted our definition of an expert.34


Table 3: Frequency of use of critical appraisal tools.

| Times in the past month used CAT | Number of respondents |
|---|---|
| Daily | 1 |
| 2-3 daily | 1 |
| 2 times past month | 1 |
| 4 | 1 |
| 5 | 1 |
| 0 | 10 |


Tables 4 and 5 present the frequencies of criteria as mentioned in the questionnaires and interviews.


Table 4: Coded criteria for internal validity and their occurrence within the responses.

| Internal validity criteria | Questionnaire frequency | Interview frequency | Total |
|---|---|---|---|
| Outcome measures psychometry (ie reliable and valid) | 5 | 4 | 9 |
| Baseline equivalence in characteristics (including outcome measures) and inclusion/exclusion criteria stated | 5 | 1 | 6 |
| Appropriate study question and design | 3 | 4 | 7 |
| Randomisation | 4 | 1 | 5 |
| Assessor blinding | 4 |  | 4 |
| Sampling (technique, power, size) | 3 | 12 | 15 |
| Intention to treat, follow-up of dropouts, dropout <85% | 5 | 1 | 6 |
| Therapist blinding | 1 |  | 1 |
| Subject blind and compliant | 1 |  | 1 |
| Presence of bias | 1 |  | 1 |
| Triangulation (qual) | 2 |  | 2 |
| Quotes provided (qual) | 1 |  | 1 |
| Ethics approvals noted | 1 |  | 1 |
| History (consideration of natural history as effect or maturation) | 2 |  | 2 |
| Appropriate statistical tests | 5 | 1 | 6 |
| Confounders and effect modifiers dealt with | 1 | 1 | 2 |
| Methodological robustness |  | 3 | 3 |


Table 5: Occurrence of criteria for external validity within the responses.

| External validity criteria | Questionnaire frequency | Interview frequency | Total |
|---|---|---|---|
| Clinical relevance (includes applicability) | 9 | 9 | 18 |
| Statistical significance (including effect size) | 6 |  | 6 |
| Sample representative of larger population | 3 | 4 | 7 |
| Effect modifiers identified | 1 |  | 1 |
| Aims clear and contextualized | 1 | 2 | 3 |
| Conclusions appropriate and relevant to results | 2 | 3 | 5 |
| Drop outs reported | 1 |  | 1 |
| Time study undertaken | 1 |  | 1 |
| Patient compliance reported | 1 |  | 1 |
| Intervention described | 1 | 1 | 2 |
| Applicability to population (culturally sensitive) |  | 1 | 1 |
| Divergent findings |  | 1 | 1 |
| Presentation of stream to coding |  | 1 | 1 |
| Issues re bias in publication |  | 1 | 1 |



Coding the ‘clinical relevance’ responses
 

The most complex responses pertained to clinical relevance of the research.   Examples are provided below of quotations that were coded as ‘clinical relevance’.

  • “generalisability 2.extent to which theory derived from the research can be applied in other settings 3. Reproducibility of methods/findings."
     

  • “Discussion of clinical importance”
     

  • “Implication for field research. Clinical relevance”
     

  • “Specification of the intervention (dosage-response effect and timing; and disease outcomes, disease severity, co-interventions and patient characteristics.”
     

  • “..intervention appropriate to clinical setting (number of treatments. type of treatment etc).”

Coding the responses relevant to the other themes was straightforward and thus pertinent quotations were not considered necessary for this section.  

In the interview round of surveying only five new criteria were mentioned. These were:

1.   Methodological Robustness

2.   Applicability to Population (in particular taking into account the cultural mix of the target population).

3.   Divergent findings are presented.

4.   Presentation of stream to coding.

5.   Issues concerning bias in publication stated.

This illustrates that snowballing occurred with this second sample. It is apparent that some measure of how well the study was carried out (within the boundaries of whatever design has been chosen) was considered important. In reality, therefore, only four new criteria were mentioned, as “methodological robustness” encompasses the other stated criteria that deal with design specific requirements (eg sampling technique and randomisation for certain experimental designs).
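
The criteria that were new in the interview round can be picked out mechanically from the frequencies reported in Tables 4 and 5. The following is a minimal sketch, not the authors' procedure: it uses only a subset of rows from those tables, the function name `new_in_interviews` is hypothetical, and blank table cells are assumed to mean a frequency of zero.

```python
# Criterion -> (questionnaire frequency, interview frequency).
# Subset of rows from Tables 4 and 5; blank cells are ASSUMED to mean 0.
frequencies = {
    "Sampling (technique, power, size)": (3, 12),
    "Assessor blinding": (4, 0),
    "Methodological robustness": (0, 3),
    "Clinical relevance (includes applicability)": (9, 9),
    "Applicability to population (culturally sensitive)": (0, 1),
    "Divergent findings": (0, 1),
    "Presentation of stream to coding": (0, 1),
    "Issues re bias in publication": (0, 1),
}


def new_in_interviews(freqs):
    """Return criteria mentioned in interviews but never in questionnaires."""
    return [
        name
        for name, (questionnaire, interview) in freqs.items()
        if questionnaire == 0 and interview > 0
    ]


new_criteria = new_in_interviews(frequencies)
for name in new_criteria:
    print(name)
```

With these rows the function returns the five criteria named for the interview round.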

The common themes of criteria from the sample were:

  • Having a well described sample of justified size was important in making a judgment about internal validity as well as about how generalisable the results were (external validity).
     

  • That an appropriate study design for the research question was important.
     

  • Statistical tests that are appropriate to the design and research question.

  • Design specific indicators of robustness are important. Twenty nine of the responses were related to design specific features. These came from categories relating to blinding, randomisation, and statistical appropriateness for quantitative designs, and triangulation and provision of streaming to coding for qualitative designs.
     

  • Clinical relevance and applicability are vital in giving research results meaning, particularly when investigating clinical interventions. It is of note that one respondent indicated that clinical relevance was not an important part of critical appraisal. In her opinion it was the responsibility of the reader, not the author, to determine this construct. Clinical relevance can encompass the consideration of whether there is a clinically significant effect size, whether an intervention is applicable to the clinical setting, and whether the intervention or phenomenon under investigation is important to the various stakeholders.
     

  • Outliers in the data set need to be investigated, using intention to treat analysis where appropriate or presentation of outlying data or opinions.  Also important here was that dropouts were at an acceptable level.

Opinion surrounding the development of a generic critical appraisal tool for Allied Health.

In comments made as part of the questionnaire all eight of the questionnaire respondents indicated that design specific tools are perhaps the most valid. For example “I believe that the most appropriate items in a critical review form will depend on the type of evidence that is evaluated. For example, the qualitative and quantitative paradigms are completely different and therefore different items would be appropriate in each situation."

The range of responses in the interviews to the question of the usefulness and plausibility of developing a generic critical appraisal tool is quoted below:

  • “Not useful at all. Critical Appraisal Tools need to be design specific. Can’t see the call for it. As a profession physiotherapy deal with multiple outcome measures, so can deal with interpreting systematic reviews based on different critical appraisal tools.” (PT)
     

  • “Did mention that a tool for Allied Health would be beneficial as tools (eg Sackett criteria) are based on medical model.”(OT)
     

  • “Challenge is to capture the issues of internal validity that are design specific” (ST)
     

  • “Yes……Problem with PEDro (14) is that it only applies to certain designs…a tool that applied to cohorts, case control as well for example would be helpful. Interested to see design features pertaining to internal validity included.”(PT)
     

  • “Not sure that Generic tool would be useful”(OT)   
     

  • “….Qualitative  and quantitative studies have a different place in evidence. They ask quite different questions. Qualitative asks for opinion, experience of the subjects….quantitative looks for an effect. Therefore potentially a generic tool not helpful in determining levels of evidence.”

The last quotation came from a participant who had a self-professed bias towards qualitative research.

Usefulness of a generic critical appraisal tool

Table 6 outlines an even spread of opinion (for and against) on the development of a generic critical appraisal tool. The comments from those respondents who were not in favour of a generic tool centred on the challenge of including sufficient sensitivity regarding design specific features. Those who were for the development of a generic critical appraisal tool highlighted the need for consensus to develop such a tool. This group also pointed out the challenge of dealing with design specific criteria within a generic tool.

Discussion

This is the first known Australian study which has attempted to develop a general critical appraisal tool relevant to allied health. The overall findings from this study were the importance of clinical relevance, sample size and characteristics, and design specific robustness as features of a critical appraisal tool (CAT). The sample in this study was biased towards academics (73% employed in academic roles, 20% in combined clinical and academic roles). Only one respondent came from a purely clinical background. It is reasonable to expect that a sample of clinicians may provide a different set of criteria. The sampling process used in this study attempted to apply a systematic approach to inviting both clinicians and academics to respond. That such an academic sample was attained is perhaps indicative of where the interest and work in evidence based practice lies. Despite the loading of academics in the sample, it is noteworthy that criteria pertaining to clinical relevance were a prominent feature of the criteria list.

The question of whether this sample is representative of the broader allied health professions remains. The sample size was small (n=15), and the major proportion of respondents came from the Physiotherapy and Occupational Therapy professions. No social workers responded to our invitation to participate. Two speech therapists agreed and one was interviewed; the other for practical reasons could not be contacted within the time frame of the study. Future work into developing cons