
JMIR

J Med Internet Res

Journal of Medical Internet Research

1438-8871

JMIR Publications

Toronto, Canada

v25i1e43928

37279050

10.2196/43928

Original Paper

Similar Outcomes of Web-Based and Face-to-Face Training of the GRADE Approach for the Certainty of Evidence: Randomized Controlled Trial

Leung

Tiffany

Santesso

Nancy

Shubina

Ivanna

Ganesh

Shankar

Tokalić

Ružica

MD, PhD 1 2

https://orcid.org/0000-0002-0037-4905

Poklepović Peričić

Tina

DMD, PhD 1 2

https://orcid.org/0000-0002-9686-5062

Marušić

Ana

MD, PhD 1

Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine

Šoltanska 2

Split, 21000

Croatia

Phone: 385 21 557 812

Email: ana.marusic@mefst.hr

https://orcid.org/0000-0001-6272-0917

1 Department of Research in Biomedicine and Health, Center for Evidence-based Medicine, University of Split School of Medicine, Split, Croatia

2 Cochrane Croatia, University of Split School of Medicine, Split, Croatia

Corresponding Author: Ana Marušić ana.marusic@mefst.hr

2023

6 6 2023

e43928

30 10 2022 14 12 2022 21 2 2023 14 3 2023

©Ružica Tokalić, Tina Poklepović Peričić, Ana Marušić. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 06.06.2023.


This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Background

The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach is a system for transparent evaluation of the certainty of evidence used in clinical practice guidelines and systematic reviews. GRADE is a key part of evidence-based medicine (EBM) training of health care professionals.

Objective

This study aimed to compare web-based and face-to-face methods of teaching the GRADE approach for evidence assessment.

Methods

A randomized controlled trial was conducted on 2 delivery modes of GRADE education integrated into a course on research methodology and EBM with third-year medical students. Education was based on the Cochrane Interactive Learning “Interpreting the findings” module, which had a duration of 90 minutes. The web-based group received the web-based asynchronous training, whereas the face-to-face group had an in-person seminar with a lecturer. The main outcome measure was the score on a 5-question test that assessed confidence interval interpretation and overall certainty of evidence, among others. Secondary outcomes included writing a recommendation for practice and course satisfaction.

Results

In all, 50 participants received the web-based intervention, and 47 participants received the face-to-face intervention. The groups did not differ in the overall scores for the Cochrane Interactive Learning test, with a median of 2 (95% CI 1.0-2.0) correct answers for the web-based group and 2 (95% CI 1.3-3.0) correct answers for the face-to-face group. Both groups gave the most correct answers to the question about rating a body of evidence (35/50, 70% and 24/47, 51% for the web-based and face-to-face group, respectively). The face-to-face group answered the question about the overall certainty of evidence better. The understanding of the Summary of Findings table did not differ significantly between the groups, with a median of 3 correct answers to 4 questions for both groups (P=.352). The writing style for the recommendations for practice also did not differ between the 2 groups. Students’ recommendations mostly reflected the strengths of the recommendations and focused on the target population, but they used passive words and rarely mentioned the setting for the recommendation. The language of the recommendations was mostly patient centered. Course satisfaction was high in both groups.

Conclusions

Training in the GRADE approach could be equally effective when delivered asynchronously on the web or face-to-face.

Trial Registration

Open Science Framework akpq7; https://osf.io/akpq7/

Grading of Recommendations Assessment, Development and Evaluation; GRADE; education; online; face-to-face; evidence-based medicine; guideline; randomized controlled trial; RCT; randomized; evidence assessment; teaching; medical education; research method; online education; library science; information science; medical librarian

Introduction

Background

The extent of confidence that the desirable effects of an intervention outweigh the undesirable ones is a valuable indicator of the strength of recommendations for clinical practice [1]. The GRADE (Grading of Recommendations Assessment, Development and Evaluation) approach is a system designed by a group of international guideline developers for transparent evaluation of the certainty of evidence and the development of transparent, robust, and trustworthy guidelines. The process of GRADE-ing should begin with a health care question formulated in the patient, population, or problem; intervention; comparison; and outcome format, which is then used to systematically search relevant databases and select relevant studies. Patient-relevant outcomes are then assessed for each included study while focusing on study limitations, precision, the directness of evidence, the consistency of results, and possible confounders, among others. The certainty of evidence is then decided for each outcome, ranging from very low to high. The overall quality of evidence for the main health care question will depend on the lowest-rated critical outcome. To decide on the final recommendation, guideline developers use the overall certainty of the evidence, the balance of desirable and unwanted effects, and the values and preferences of the patients [2,3].
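The rule that the overall rating follows the lowest-rated critical outcome can be illustrated with a minimal Python sketch. The outcome names, the ratings, and the function name `overall_certainty` are hypothetical and are not taken from the study; the intermediate labels "low" and "moderate" are assumed from the standard four-level GRADE scale.

```python
# Ordered GRADE certainty levels, lowest first (standard four-level scale).
LEVELS = ["very low", "low", "moderate", "high"]


def overall_certainty(outcomes):
    """Return the lowest certainty rating among the critical outcomes.

    `outcomes` maps an outcome name to a (rating, is_critical) pair.
    Non-critical outcomes are ignored.
    """
    critical = [rating for rating, is_critical in outcomes.values() if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


# Hypothetical example: two critical outcomes and one non-critical outcome.
example = {
    "mortality": ("moderate", True),
    "hospital admission": ("low", True),
    "mild adverse events": ("very low", False),
}
print(overall_certainty(example))  # low
```

In this sketch, the very low rating of the non-critical outcome does not pull the overall rating down, because only critical outcomes are considered.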

GRADE is not the first nor the only approach for assessing the certainty of evidence and assigning the strength of recommendations. It is, however, the approach used in Cochrane systematic reviews and many clinical practice guidelines. The GRADE approach includes a Summary of Findings (SoF) table to make the process of judging evidence and translating it into a recommendation more accessible for a broader audience, primarily for end users—clinicians, patients, and policy makers. It is a systematic, transparent, and concise report of key information that includes the certainty of evidence and the effect size of an intervention used for each outcome and across outcomes [3]. SoF tables in systematic reviews ease the understanding of the certainty of evidence and the review’s key points [4].

To effectively apply clinical practice guidelines and other summarized formats of evidence, health care providers need the evidence-based medicine (EBM) skills necessary for understanding and applying these guidelines. Aside from occasional specialized courses and a short video series [5], there are no official GRADE educational resources. Research methodology and statistics courses in medical schools provide the basis necessary for understanding GRADE [6].

Web-based education uses web-based technologies for knowledge and skills improvement. It can be asynchronous, in which users can individually access it anytime and progress through it at their own pace. It can also be synchronous, in which users have to access it at certain times, usually in some form of webinars. Web-based educational interventions have shown noninferior results in learning and participant satisfaction outcomes compared to face-to-face learning in medicine, including communication skills and cardiology [7,8]. Asynchronous web-based education was successfully used as supplementary learning in emergency medicine and for knowledge on systematic reviews [9,10]. In addition, for EBM, the cost-effectiveness of web-based education was superior to that of traditional face-to-face learning [11]. In this study, we wanted to test whether the addition of GRADE-focused educational content to a basic EBM course could increase the understanding of SoF tables among medical students and whether the delivery mode of that content influences the learning outcomes. We also assessed how the 2 modes of GRADE training affected the application of the GRADE approach in providing recommendations for clinical practice.

Aim

The aim of this study was to determine the effectiveness of a web-based educational intervention for the GRADE approach to evidence assessment, compared to traditional classroom education, in terms of knowledge and the understanding of the SoF table.

Methods

Trial Design and Participants

This was a parallel-group randomized controlled trial. Participants were third-year medical students in Croatian- and English-language programs at the University of Split School of Medicine. Students were attending a mandatory course on research methodology and EBM, described in a previously published study [6]. To be a part of the study, they had to be 18 years or older and fluent in English, both of which are a part of our School of Medicine requirements for enrollment. All students had to pass 2 previous courses (in the first and second years of their medical studies, respectively) to attend the third-year course and had a similar level of knowledge about research methodology and EBM [6,12].

Intervention

The web-based educational intervention was based on the Cochrane Interactive Learning (CIL) module 7, titled “Interpreting the findings” [13]. This educational module is completely web based and asynchronous. The duration of the module is 90 minutes, and it covers the interpretation of statistical results, risk of bias, and interpretation of levels of evidence using the GRADE approach. The intervention group received access to the web-based module in the classroom at the same time as the control group. Participants used faculty-provided electronic devices to access the module. The control group had a traditional face-to-face seminar in the classroom, taught by a lecturer who has knowledge and experience in GRADE and is a Cochrane systematic review author but had no involvement in the CIL module development. The presentations for the face-to-face seminar had exactly the same content as the web-based module and were equal in graphical design and duration.

After 90 minutes, the participants took the same test, hosted on the SurveyMonkey platform (SurveyMonkey Inc) [14]. There were no time limitations for the test, that is, the participants could spend as much time as they needed to complete the test. During the test, one of the researchers was present in the classroom of each trial group.

The first part of the SurveyMonkey test was a brief sociodemographic questionnaire, which included questions on participants’ age, gender, level of education, current research activities, and authorship of research publications. The participants were also asked to assess their knowledge of the GRADE approach (ranging from 1=little to none to 5=excellent), as well as their familiarity with Cochrane and systematic reviews (from 1=not at all to 5=extremely familiar).

After that, the participants took a test that assessed their knowledge of the GRADE approach. Five multiple-choice questions on statistical terms and their evaluation were taken from the official assessment for the CIL module [13]. The questions covered the topics of confidence interval interpretation using a forest plot, the expression of standardized mean difference, funnel plot interpretation, the assessment of overall certainty of evidence, and the rating of a body of evidence. Some of the questions had only one correct answer, whereas some had multiple correct statements. For those questions, points were awarded only if the participants selected all of the correct statements.
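The all-or-nothing scoring of questions with multiple correct statements can be sketched as follows. This is a minimal illustration, not the scoring code used in the study: the answer key and option labels are hypothetical, and the sketch assumes that selecting an extra incorrect statement also forfeits the point, which the description above does not specify.

```python
def score_question(selected, correct):
    """Award 1 point only if the selected options equal the correct set.

    Assumption: an extra incorrect selection also yields 0 points.
    """
    return 1 if set(selected) == set(correct) else 0


def score_test(responses, answer_key):
    """Sum the points over all questions; unanswered questions score 0."""
    return sum(
        score_question(responses.get(question, ()), correct)
        for question, correct in answer_key.items()
    )


# Hypothetical answer key: Q2 has two correct statements.
answer_key = {"Q1": {"b"}, "Q2": {"a", "c"}, "Q3": {"d"}}

# Hypothetical participant who selected only one of the two correct
# statements for Q2 and therefore receives no point for that question.
responses = {"Q1": {"b"}, "Q2": {"a"}, "Q3": {"d"}}
print(score_test(responses, answer_key))  # 2
```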

The final part of the test evaluated the participants’ understanding of an SoF table, which was evaluated with 4 open-ended questions linked to an SoF example [15]. The participants were asked to (1) give a recommendation for clinical practice based on the SoF table information, (2) determine target patient groups and possible exceptions or exclusion criteria, (3) find sample sizes for specific outcome analyses, and (4) find reasons for grading some of the evidence as very low certainty.

The SoF table and the questions from the test are available in Multimedia Appendix 1 [15].

Outcomes

The primary outcome for this study was the knowledge measured by the 5 questions from the formal CIL module. The knowledge was measured in 2 ways: as the overall scores for the test and the number of students correctly answering each question.

There were 3 secondary outcomes:

The understanding of the SoF table was measured by 4 questions related to an SoF table example. The results were expressed as the number and percentage of students with a correct answer to individual questions and the total score for the whole group.

Participants’ satisfaction with and opinion about the course were measured using 10 questions with Likert-type statements, with scoring ranging from 1=I do not agree at all to 7=I fully agree.

The style of writing of a recommendation for clinical practice in the answer to the first of the 4 questions about the SoF table: we assessed the style according to the National Institute for Health and Care Excellence (NICE) instructions for writing recommendations [16]. These instructions advise that recommendations focus on the action and procedure, reflect the strength of the recommendation, and have clear and precise patient-oriented language. The 3 main categories from the NICE writing recommendations were used:

The focus of the action was assessed according to three elements from the guidebook: (1) the verb use, (2) target population, and (3) context or setting for the recommendation. Each element was graded as 1 if the element was present in the text or 0 if it was not present.

The opinion of the students about the strength of recommendation was graded as 1 or 0, based on the use of verbs for 3 levels of recommendations: “must” or “must not” for interventions that must be considered, “should” or “should not be offered” for interventions that should be considered, and “could offer” or “consider” for interventions that could be considered in clinical practice.

Patient-centered language assessment was guided by the guidebook recommendation to use verbs such as “offer,” “discuss,” and “consider,” instead of “give” and “prescribe.” Responses for this outcome were grouped into three categories: (1) offer (including “offer,” “consider,” “suggest,” “recommend,” “advise,” and “could help”); (2) give and prescribe (including “(do not) give,” “(do not) prescribe,” “supplement with,” and passive voice); and (3) no recommendation (response included no elements of clinical decision-making).
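The grouping of responses into the three language categories could be approximated with simple keyword matching, as in the sketch below. This is only an illustration under stated assumptions: in the study, human assessors rated the responses, whereas this sketch uses the verb lists quoted above, cannot detect passive voice, and will misclassify some sentences (for example, "given that" would match "give"). The function names and example sentences are hypothetical.

```python
import re

# Verb lists taken from the category descriptions above.
OFFER_TERMS = ["offer", "consider", "suggest", "recommend", "advise", "could help"]
GIVE_TERMS = ["give", "prescribe", "supplement with"]


def _contains(text, terms):
    """True if any term occurs at the start of a word in the text."""
    return any(re.search(r"\b" + re.escape(term), text) for term in terms)


def language_categories(response):
    """Return the set of language categories found in a free-text response."""
    text = response.lower()
    categories = set()
    if _contains(text, OFFER_TERMS):
        categories.add("offer")
    if _contains(text, GIVE_TERMS):
        categories.add("give and prescribe")
    # No decision-making verb found: treat as "no recommendation".
    return categories or {"no recommendation"}


# Hypothetical responses.
print(language_categories("Consider offering the treatment to children."))
print(language_categories("Prescribe the treatment daily."))
print(language_categories("The treatment shortened the illness."))
```

Returning a set rather than a single label reflects that one response may contain elements of more than one category.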

Two independent assessors (RT and TPP) rated all of the responses. Inconsistencies in their ratings were resolved with the help of a third author. κ statistics were used to determine the level of agreement for each of the 3 categories.
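Agreement between two assessors can be quantified as in the following minimal sketch, which assumes that the κ statistic is Cohen's kappa for two raters (the text does not name the variant). The ratings shown are hypothetical and are not the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who rated the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical 0/1 ratings of 10 responses by two assessors.
assessor_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
assessor_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(round(cohens_kappa(assessor_1, assessor_2), 2))  # 0.8
```

Here the assessors agree on 9 of 10 items (0.9), chance agreement is 0.5, and κ is therefore (0.9 − 0.5)/(1 − 0.5) = 0.8.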

Sample Size

Based on the primary outcome and the assumption that there would be no significant differences between groups, we calculated the minimal sample size using a web-based calculator [17]. The allocation ratio was set to 1:1, the α value was .05, and the power was 0.8. We calculated that we needed 16 participants per group to detect a 10% difference (out of a maximum of 11 correct statements). We hypothesized that any difference would be smaller than that in our previous studies and nonsignificant, as we did not expect the groups to differ [6,16]. With the predicted attrition rate of 10%, we aimed to recruit 36 participants in total.
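The step from 16 participants per group to a recruitment target of 36 can be reproduced with a short calculation. The formula is an assumption, as the text does not state how attrition was applied: the sketch divides the required total by the expected retention (1 − attrition) and rounds up. The function name is hypothetical.

```python
import math


def total_with_attrition(per_group, groups=2, attrition=0.10):
    """Inflate the required total so that the target remains after dropouts."""
    required = per_group * groups
    return math.ceil(required / (1 - attrition))


print(total_with_attrition(16))  # 36
```

With 16 participants per group, 32 are required in total; 32/0.9 is about 35.6, which rounds up to 36. Multiplying by 1.1 instead (35.2) gives the same target after rounding up.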

Randomization and Masking

We used a simple randomization method [18]. One author prepared the list of the participants and randomized them into 2 groups. The group allocation was posted on the web the day before the intervention took place. Participants did not know beforehand which group would attend the web-based course. Web-based group participants received access to the web-based content (access usernames and passwords) at the very beginning of the intervention. Because of the nature of the intervention, it was not possible to completely mask the participants. The groups received the same treatment, setting, and measurements, aside from the intervention. As we could not control for individual students’ study time with regard to EBM and GRADE, we included the questions on self-assessed knowledge of the GRADE approach and familiarity with Cochrane collaboration in the demographic part of the test. Data analysis was masked: the researchers who analyzed the responses and compared the groups were not aware of group allocation.
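Simple randomization can be sketched as an independent random allocation of each participant to one of the 2 groups. This is a generic illustration and not the tool referenced in [18]; the participant identifiers, the seed, and the function name are hypothetical.

```python
import random


def simple_randomization(participants, seed=None):
    """Allocate each participant independently to one of two groups."""
    rng = random.Random(seed)  # a seed makes the allocation list reproducible
    allocation = {"web-based": [], "face-to-face": []}
    for person in participants:
        arm = rng.choice(["web-based", "face-to-face"])
        allocation[arm].append(person)
    return allocation


# Hypothetical participant identifiers.
participants = [f"P{i:03d}" for i in range(1, 98)]
allocation = simple_randomization(participants, seed=2023)
print({arm: len(members) for arm, members in allocation.items()})
```

Because each allocation is independent, the two groups need not be of equal size, which is in line with unequal group sizes such as 50 and 47.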

Statistical Methods

Sociodemographic characteristics of participants are presented as absolute numbers and percentages. Group results are presented as medians and 95% CIs. The distribution of results was tested using the Kolmogorov-Smirnov test, and group results were compared using the Mann-Whitney U test. The results for separate questions and recommendations were presented as absolute values and percentages of correct answers and compared between groups using the Fisher exact test. To address multiple comparison bias, we performed a sequential Holm-Bonferroni adjustment. Analysis was conducted using MedCalc Statistical Software (version 16.4.3; MedCalc Software bvba) [19] and JASP software (version 0.8.6; JASP Team).
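The sequential Holm-Bonferroni adjustment can be sketched in a few lines of Python. This is a generic step-down implementation, not the MedCalc or JASP procedure; the function name is hypothetical, and the example uses the P values reported for the 5 individual questions in Table 3, with 1.0 standing in for P>.99 as an assumption.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per P value: True if significant after adjustment.

    P values are tested from smallest to largest against alpha / (m - rank);
    testing stops at the first P value that exceeds its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, index in enumerate(order):
        threshold = alpha / (m - rank)
        if p_values[index] <= threshold:
            significant[index] = True
        else:
            break  # all remaining (larger) P values are nonsignificant
    return significant


# P values for the 5 questions in Table 3 (1.0 stands in for P>.99).
table_3 = [1.0, 0.014, 0.461, 0.008, 0.064]
print(holm_bonferroni(table_3))  # [False, False, False, True, False]
```

The smallest P value is compared with .05/5=.010, the next with .05/4=.0125, and so on. For 4 and 7 comparisons, the first-step thresholds are .05/4=.0125 and .05/7≈.007, which correspond to the significance levels given in the table footnotes.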

Ethics Approval

The study was approved by the Ethics Committee of the University of Split School of Medicine (class 003-08/19-03/0003; registration 2181-198-03-04-19-0044). The participants gave informed consent, and the data were kept according to the General Data Protection Regulation.

Results

The participant flow diagram is shown in Figure 1. In all, 50 participants received the web-based intervention, and 47 participants received the face-to-face intervention. Two participants did not attend the allocated intervention due to personal reasons. The median age was 21 (IQR 21-23) years. The groups did not differ in their current research experience or self-assessed GRADE knowledge (Table 1). Satisfaction with the intervention was high in both groups, and both groups reported that they would apply the knowledge in their work and learn more about interpreting and grading the quality of evidence (Table 2).

Figure 1

Flow diagram of the participants in the study.

Table 1

Demographic data, previous research experience, and self-assessed knowledge of the GRADE^a approach^b.

Item	Web-based training (n=50)	Face-to-face training (n=47)
Gender (female; total: n=90, web-based training: n=48, face-to-face training: n=42), n (%)	29 (60)	27 (64)
Age (years; total: n=89, web-based training: n=47, face-to-face training: n=42), median (IQR)	21.0 (21-23)	21.0 (21-23)
Level of completed education (high school; total: n=89, web-based training: n=47, face-to-face training: n=42), n (%)	45 (96)	40 (95)
Are you currently involved in research activities? (yes; total: n=85, web-based training: n=48, face-to-face training: n=37), n (%)	1 (2)	0 (0)
Authorship of a research publication in the last 5 years (yes; total: n=80, web-based training: n=47, face-to-face training: n=33), n (%)	1 (2)	1 (3)
Authorship of a systematic review (yes; total: n=88, web-based training: n=47, face-to-face training: n=41), n (%)	1 (2)	0 (0)
Authorship of a clinical practice guideline (yes; total: n=89, web-based training: n=48, face-to-face training: n=41), n (%)	0 (0)	0 (0)
How familiar are you with Cochrane collaboration? (1=not at all, 5=extremely familiar; n=89), median (95% CI)	2.0 (2-2)	2.0 (2-2)
How would you grade your knowledge of GRADE approach? (1=very low, 5=very high; n=88), median (95% CI)	2.0 (2-2)	2.0 (2-2)

^aGRADE: Grading of Recommendations Assessment, Development and Evaluation.

^bThe numbers in parentheses indicate the number of responses in the questionnaire.

Table 2

Participants’ satisfaction with the training session, presented as median scores with 95% CI^a.

Item	Web-based training (n=50), median (95% CI)	Face-to-face training (n=47), median (95% CI)
Overall, I am satisfied with the course (n=85)	5 (5-5)	5 (5-6)
This course was really useful (n=87)	5 (4-5)	5 (5-5.8)
This is a good way for learning GRADE^b approach for quality of evidence (n=87)	5 (4-5)	5 (4-5)
This course helped me to better understand the concepts related to GRADE (n=87)	5 (4-5)	4 (4-5)
The course covered too much content in a short period of time (n=88)	4 (4-5)	4 (3.2-5)
I think there was sufficient amount of interaction during this course (n=86)	5 (5-6)	5 (4-6)
I would recommend this course to my colleagues (n=85)	5 (4-5.1)	5 (4-5.1)
I did not find this course useful (n=84)	2 (2-4)	2 (2-3)
In future, I will apply what I learned at this course in my work and research (n=85)	5 (5-5)	5 (5-5)
In future, I will learn more about interpreting and grading the quality of evidence (n=84)	5 (4-6)	5 (4.8-5.2)

^aThe numbers in parentheses indicate the number of responses in the questionnaire.

^bGRADE: Grading of Recommendations Assessment, Development and Evaluation.

The groups did not differ in the overall scores for the CIL test (P=.251, Mann-Whitney U test; Table 3). The median number of correct answers for both groups was 2 out of 5 (Table 3). Most students from both groups gave correct answers to the question about rating the body of evidence (35/50, 70% and 24/47, 51% of students from web-based and face-to-face training groups, respectively). The only difference between the 2 groups was in their answers to the question about the overall certainty of evidence, where the face-to-face group had significantly more correct answers than the web-based training group (P=.008, Fisher exact test; Table 3).
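A between-group comparison of correct answers to a single question can be sketched with a pure-Python Fisher exact test. This is a minimal illustration, not the software used in the study: the two-sided P value is computed by summing the probabilities of all tables that are no more likely than the observed one, which is one common convention, so the result may differ slightly from values produced by other software. The counts are those in Table 3 for the question on the overall certainty of the evidence (15/50 vs 27/47 correct answers).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row_1, row_2 = a + b, c + d
    col_1 = a + c
    total_tables = comb(row_1 + row_2, col_1)

    def probability(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row_1, x) * comb(row_2, col_1 - x) / total_tables

    observed = probability(a)
    lowest = max(0, col_1 - row_2)
    highest = min(row_1, col_1)
    # Sum over all tables at most as probable as the observed table
    # (small tolerance guards against floating-point ties).
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )


# Correct vs incorrect answers: web-based 15 vs 35, face-to-face 27 vs 20.
p_value = fisher_exact_two_sided(15, 35, 27, 20)
print(f"P={p_value:.3f}")
```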

Understanding of the SoF table also did not differ significantly between the groups (median of 3 correct answers to 4 questions for both groups; P=.352, Mann-Whitney U test; Table 4). Most correct answers in both groups were to questions about the number of participants in specific trials and the reason for low grading of the quality of evidence for some patient groups (Table 4).

For the analysis of how students phrased their recommendation for practice (Table 5), the κ score for the 2 assessors was 0.68 for the patient-centered language, 0.88 for the strength of recommendation, and 1.0 for the focus of the action category. The students in both groups rarely used active verbs in writing their recommendations, and there was no difference between the groups (P=.118). Very few students in either group addressed the setting (time or context) of their recommendation for practice (P=.151). Students in both groups mostly wrote in patient-centered language, using verbs such as “offer,” “discuss,” and “consider,” with no difference between the groups (P=.683).

Table 3

Number (%) of correct answers and overall score for the 5 questions of the Cochrane Interactive Learning test.

Item	Web-based training (n=50)	Face-to-face training (n=47)	P value^a
Understanding of confidence intervals in the interpretation of results of meta-analysis, n (%)	18 (36)	16 (34)	>.99
Identify ways of re-expressing the standardized mean difference, n (%)	14 (28)	25 (53)	.014
Interpret a funnel plot asymmetry, n (%)	9 (18)	12 (26)	.461
Determine the overall certainty of the evidence, n (%)	15 (30)	27 (57)	.008
Decide on rating up a body of evidence, n (%)	35 (70)	24 (51)	.064
Overall score, median (95% CI)	2 (1.0-2.0)	2 (1.3-3.0)	.251^b

^aFisher exact test. P value for significance was set to .010 after Holm-Bonferroni adjustment.

^bMann-Whitney U test.

Table 4

Number (%) of correct answers and overall score on 4 test questions related to the Summary of Findings table.

Item	Web-based training (n=50)	Face-to-face training (n=47)	P value^a
Based on this information, how would you formulate a recommendation for clinical practice? (would recommend; total: n=83, web-based training: n=48, face-to-face training: n=35), n (%)	27 (56)	15 (43)	.270
Would you consider any subgroups of patients, and if so, how?, n (%)	24 (48)	22 (47)	>.99
How many participants were there in trials that assessed death as an outcome?, n (%)	39 (78)	30 (64)	.178
Why was the quality of evidence for hospitalized children graded as very low?, n (%)	31 (62)	39 (83)	.025
Overall score, median (95% CI)	3 (2-3)	2 (2-2)	.352^b

^aFisher exact test. P value for significance was set to .0125 after Holm-Bonferroni adjustment.

^bMann-Whitney U test.

Table 5

Number (%) of students who used specific writing style in their clinical practice recommendation based on the Summary of Findings table in the test^a.

Category of writing recommendation and element		Web-based training (n=50), n (%)	Face-to-face training (n=47), n (%)	P value^b
Focus of the action
	Active verb	4 (8)	0 (0)	.118
	Target population	24 (48)	10 (21)	.010
	Setting (time or context)	2 (4)	6 (13)	.151
Reflects the strength of recommendation		28 (56)	32 (68)	.296
Patient-centered language
	Offer	31 (62)	27 (57)	.683
	Give and prescribe	10 (20)	1 (2)	.008
	No recommendation	9 (18)	19 (40)	.024

^aStudents’ recommendations written in the answer to the first Summary of Findings question were assessed according to the presence of categories from the National Institute for Health and Care Excellence (NICE) writing recommendations. The number of categories was greater than the number of students as their recommendation could include more than one category element of the writing style.

^bFisher exact test. P value for significance was set to .007 after Holm-Bonferroni adjustment.

Discussion

Principal Findings

Our study showed that web-based education about GRADE methodology may not differ from face-to-face education, as measured by the overall CIL module test results. The face-to-face group was better at assessing evidence using the risk of bias. There were no differences between the groups in the overall understanding of the SoF table. Both groups had high levels of satisfaction with the intervention. These results should be evaluated in the context of additional educational resources in EBM courses for medical school students, taking into account the limited sample size.

Generalizability and Interpretation

The satisfaction with the course was high in both groups, and participants found both educational interventions to be sufficiently interactive. It has been shown that participants’ satisfaction influences their academic scores [20] and learning motivation [21]. One of our concerns about the web-based training was the limitation of direct communication with the instructors. Participants might appreciate the ease of asking a question and getting an immediate reaction and answer in a face-to-face setting. However, newer technologies and new generations of students have superseded those concerns. It has been reported that web-based students ask more complex questions and more time is assigned by teachers for answering them [22].

There were no differences between the groups for the overall test results and SoF table understanding. Although median scores were low for both, they were lower for the 5 questions on methodology. These questions might be too advanced for the third-year medical students. Some of the individual questions had better response rates. The participants in the face-to-face group better answered the question on assessing the overall certainty of the evidence, taking into account the risk of bias, as well as other domains. Previous research in risk-of-bias education involved doctoral students [23], in which students who had a more intense training, including active discussions and feedback from lecturers, had better results. It is likely that our results stem from a similar involvement of participants and the lecturer, who could clarify certain aspects of evidence assessment students might have struggled with.

Both groups had high scores in the understanding of the SoF table, which is consistent with previous research on understanding evidence presented in this way [4,24]. The face-to-face training group was better at recognizing the reasons for assessing the evidence as very low, which is a question of accessibility and the ease of use of SoF tables. It is not known if the current SoF table format influences these outcomes [25], and our sample size limits the generalizability of such conclusions.

Clear and understandable communication of evidence that is important for clinical practice is a previously recognized priority [26]. Even though methodological guides for clinical practice guidelines and systematic reviews have instructions on how to write conclusions and recommendations [16,27,28], there are few, if any, assessments of the effectiveness of those instructions. One study found that using words related to activity and behavior, along with simple language and the avoidance of highly specialized terms and passive verbs, improved end users’ attitudes toward recommendations [29]. Our participants received no instructions on how to write a recommendation in this intervention, so the analysis of their recommendations using the NICE guideline instructions shows how “instructions-naive” third-year medical students with limited clinical experience would write a recommendation.

Students often used scientific language and formed their answers as conclusions without elements of a clinical decision. Such distancing from recommending a clear action or against one might be explained as the result of students’ lack of clinical experience but also as a part of the culture of defensive medicine, in which medical professionals avoid a decisive action due to the fear of complications and responsibility [30]. It might also be a result of the culture of EBM, in which uncertainty and the need for more evidence are sometimes overemphasized [31]. Both of these elements are a part of the hidden medical curriculum that influences all students, and such a curriculum might have a bigger impact in face-to-face education. Our findings are from a small sample and warrant further research. Both groups of students mostly used patient-oriented language, with verbs such as “offer” instead of “give” and “prescribe.” Students should be taught to use more active, personal, and patient-oriented language in recommendations, because it gives patients a greater sense of control over their condition and behavior and improves their intention to follow the recommendations [29].

Limitations

This study included a sample of third-year medical students with limited clinical experience. Clinical experience might alter the perception of outcome importance and the severity of unwanted effects, both of which can influence recommendations for clinical practice. This trial did not include clinically experienced medical students or other health care workers, and its results might not translate to such populations. Another possible limitation of this study is that it did not involve official GRADE training. There are no official criteria or consensus for defining what constitutes a GRADE methodologist, that is, someone who supports the creation of a guideline or helps systematic review authors in evidence assessment [32]. As there is no consensus for GRADE experts, it is not unexpected that there is no consensus on the required education for GRADE end users. As previous research showed that web-based learning is as effective as face-to-face learning in health training in general [33,34], and specifically for EBM teaching to medical students [35], we hypothesized there would be no significant differences between the 2 groups. Furthermore, our study used a single training session, which might seem insufficient for such a complex topic. However, the intervention was embedded in a regular EBM course and was focused on specific issues. We have used this approach in our previous research on EBM education [36,37] and demonstrated that even a single intervention can make a difference in outcomes, at least in short-term knowledge.

Conclusions

EBM skills are necessary for decision-making in health care, but the transfer of this knowledge to practice is often inadequate [38]. Lack of knowledge, time, and access are among the identified barriers to EBM application [39-41]. Our study adds to the body of evidence that shows the effectiveness of web-based EBM education [42-44] for undergraduate and postgraduate students [34,45]. The results of our study can be used as a starting point for future research on GRADE education as a part of EBM training for practicing physicians, perhaps in a clinically integrated manner, which was shown to be effective in improving EBM behavior as well as knowledge [46]. Preclinical medical students might benefit from this education as well. Even though they might lack the nuances of clinical experience, early exposure to EBM principles and critical evidence assessment increases the likelihood that they will use EBM principles in practice later on [38,47]. Our results provide encouraging data on the effectiveness and acceptability of completely web-based, asynchronous educational content for that purpose.

Multimedia Appendix 1

Summary of Findings table and questions from the test.

Multimedia Appendix 2

CONSORT checklist.

Abbreviations

CIL: Cochrane Interactive Learning

EBM: evidence-based medicine

GRADE: Grading of Recommendations Assessment, Development and Evaluation

NICE: National Institute for Health and Care Excellence

SoF: Summary of Findings

This study was funded by the Croatian Science Foundation (Professionalism in Health - Decision making in practice and research [ProDeM]; grant IP-2019-04-4882). The funder had no role in the design of this study, its execution, or data interpretation.

Data Availability

The data sets generated during this study are available from the first author (RT) on reasonable request.

AM and TPP gave the idea for the initial design of the study. RT and TPP developed the protocol. RT and TPP conducted the trial and collected the data. RT conducted statistical analysis, and all authors interpreted the results. RT drafted the manuscript. All authors revised the manuscript critically for important intellectual content and approved the final version of the manuscript for submission.

All authors are members of Cochrane.

Editorial Notice

This randomized study was retrospectively registered. The editor granted an exception from ICMJE rules mandating prospective registration of randomized trials since the trial was not health-related. However, readers are advised to carefully assess the validity of any potential explicit or implicit claims related to primary outcomes or effectiveness, as retrospective registration does not prevent authors from changing their outcome measures retrospectively.

Guyatt

Oxman

Kunz

Falck-Ytter

Vist

Liberati

Schünemann

Holger J

GRADE Working Group

Going from evidence to recommendations

BMJ 2008 05 10 336 7652 1049 51

10.1136/bmj.39493.646875.AE

18467413

336/7652/1049

PMC2376019

Atkins

Best

Briss

Eccles

Falck-Ytter

Flottorp

Guyatt

Harbour

Haugh

Henry

Hill

Jaeschke

Leng

Liberati

Magrini

Mason

Middleton

Mrukowicz

O'Connell

Oxman

Phillips

Schünemann

Holger J

Edejer

Varonen

Vist

Williams

Zaza

GRADE Working Group

Grading quality of evidence and strength of recommendations

BMJ 2004 06 19 328 7454 1490

10.1136/bmj.328.7454.1490

15205295

328/7454/1490

PMC428525

Guyatt

Oxman

Akl

Kunz

Vist

Brozek

Norris

Falck-Ytter

Glasziou

DeBeer

Hans

Jaeschke

Roman

Rind

David

Meerpohl

Joerg

Dahm

Philipp

Schünemann

Holger J

GRADE guidelines: 1. introduction-GRADE evidence profiles and summary of findings tables

J Clin Epidemiol 2011 04 64 4 383 94

10.1016/j.jclinepi.2010.04.026

21195583

S0895-4356(10)00330-6

Rosenbaum

Glenton

Oxman

Summary-of-findings tables in Cochrane reviews improved understanding and rapid retrieval of key information

J Clin Epidemiol 2010 06 63 6 620 6

10.1016/j.jclinepi.2009.12.014

20434024

S0895-4356(10)00028-4

GRADE online learning modules

GRADE Centre McMaster University 2021-04-01

https://cebgrade.mcmaster.ca/

Buljan

Marušić

Matko

Tokalić

Viđak

Peričić

Hren

Marušić

Ana

Cognitive levels in testing knowledge in evidence-based medicine: a cross sectional study

BMC Med Educ 2021 01 07 21 1 25

10.1186/s12909-020-02449-y

33413344

10.1186/s12909-020-02449-y

PMC7791849

Kyaw

Posadzki

Paddock

Car

Campbell

Tudor Car

Effectiveness of digital education on communication skills among medical students: systematic review and meta-analysis by the Digital Health Education Collaboration

J Med Internet Res 2019 08 27 21 8 e12967

10.2196/12967

31456579

v21i8e12967

PMC6764329

Liu

Peng

Zhang

Yan

The effectiveness of blended learning in health professions: systematic review and meta-analysis

J Med Internet Res 2016 01 04 18 1 e2

10.2196/jmir.4807

26729058

v18i1e2

PMC4717286

Tat

Shaukat

Zaveri

Kou

Jarvis

Developing and integrating asynchronous web-based cases for discussing and learning clinical reasoning: repeated cross-sectional study

JMIR Med Educ 2022 12 08 8 4 e38427

10.2196/38427

36480271

v8i4e38427

PMC9782361

Krnic Martinic

Čivljak

Marušić

Ana

Sapunar

Poklepović Peričić

Buljan

Tokalić

Mališa

Snježana

Neuberg

Ivanišević

Kata

Aranza

Skitarelić

Zoranić

Mikšić

Štefica

Čavić

Puljak

Web-based educational intervention to improve knowledge of systematic reviews among health science professionals: randomized controlled trial

J Med Internet Res 2022 08 25 24 8 e37000

10.2196/37000

36006686

v24i8e37000

PMC9459937

Maloney

Nicklen

Rivers

Foo

Ooi

Reeves

Walsh

Ilic

A cost-effectiveness analysis of blended versus face-to-face delivery of evidence-based medicine to medical students

J Med Internet Res 2015 07 21 17 7 e182

10.2196/jmir.4346

26197801

v17i7e182

PMC4527010

Buljan

Jerončić

Malički

Marušić

Matko

Marušić

Ana

How to choose an evidence-based medicine knowledge test for medical students? comparison of three knowledge measures

BMC Med Educ 2018 12 04 18 1 290

10.1186/s12909-018-1391-z

30514288

10.1186/s12909-018-1391-z

PMC6278026

Cochrane Interactive Learning

The Cochrane Collaboration 2018

2021-04-01

http://training.cochrane.org/interactivelearning/

SurveyMonkey 2023-05-25

https://www.surveymonkey.com/

Lazzerini

Wanzira

Oral zinc for treating diarrhoea in children

Cochrane Database Syst Rev 2016 12 20 12 12 CD005436

10.1002/14651858.CD005436.pub5

27996088

PMC5450879

Developing NICE guidelines: the manual—chapter 9: writing the guideline

NICE 2018

2021-04-01

https://www.nice.org.uk/process/pmg20/chapter/writing-the-guideline#wording-the-recommendations

Sample size calculator

ClinCalc 2023-05-25

https://clincalc.com/stats/samplesize.aspx

Dallal

Randomization.com 2023-05-25

http://www.randomization.com/

MedCalc 2023-05-25

https://www.medcalc.org

Doménech-Betoret

Fernando

Abellán-Roselló

Laura

Gómez-Artiga

Amparo

Self-efficacy, satisfaction, and academic achievement: the mediator role of students' expectancy-value beliefs

Front Psychol 2017 07 18 8 1193

10.3389/fpsyg.2017.01193

28769839

PMC5513915

Nortvig

Petersen

Balle

A literature review of the factors influencing e-learning and blended learning in relation to learning outcome, student satisfaction and engagement

The Electronic Journal of e-Learning 2018 2 1 16 1 46 55

Caton

Chung

Adeniji

Hom

Brar

Gallant

Bryant

Hain

Basaviah

Hosamani

Student engagement in the online classroom: comparing preclinical medical student question-asking behaviors in a videoconference versus in-person learning environment

FASEB Bioadv 2021 02 11 3 2 110 117

10.1096/fba.2020-00089

33615156

FBA21185

PMC7876702

da Costa

Beckett

Diaz

Resta

Johnston

Egger

Jüni

Peter

Armijo-Olivo

Effect of standardized training on the reliability of the Cochrane risk of bias assessment tool: a prospective study

Syst Rev 2017 03 03 6 1 44

10.1186/s13643-017-0441-7

28253938

10.1186/s13643-017-0441-7

PMC5335785

Akl

Maroun

Guyatt

Oxman

Alonso-Coello

Vist

Devereaux

Montori

Schünemann

Holger J

Symbols were superior to numbers for presenting strength of recommendations to health care consumers: a randomized trial

J Clin Epidemiol 2007 12 60 12 1298 305

10.1016/j.jclinepi.2007.03.011

17998085

S0895-4356(07)00109-6

Yepes-Nuñez

Juan José

Morgan

Mbuagbaw

Carrasco-Labra

Chang

Hempel

Shekelle

Helfand

Baldeh

Schünemann

Holger J

Two alternatives versus the standard Grading of Recommendations Assessment, Development and Evaluation (GRADE) summary of findings (SoF) tables to improve understanding in the presentation of systematic review results: a three-arm, randomised, controlled, non-inferiority trial

BMJ Open 2018 01 23 8 1 e015623

10.1136/bmjopen-2016-015623

29362242

bmjopen-2016-015623

PMC5786134

Santesso

Glenton

Dahm

Garner

Akl

Alper

Brignardello-Petersen

Carrasco-Labra

De Beer

Hultcrantz

Kuijpers

Meerpohl

Morgan

Mustafa

Skoetz

Sultan

Wiysonge

Guyatt

Schünemann

Holger J

GRADE Working Group

GRADE guidelines 26: informative statements to communicate the findings of systematic reviews of interventions

J Clin Epidemiol 2020 03 119 126 135

10.1016/j.jclinepi.2019.10.014

31711912

S0895-4356(19)30416-0

SIGN 50: a guideline developer's handbook

Scottish Intercollegiate Guidelines Network 2011 11

2021-04-01

https://www.sign.ac.uk/assets/sign50_2011.pdf

Schünemann

Vist

Higgins

JPT

Santesso

Deeks

Glasziou

Akl

Higgins

JPT

Thomas

Chandler

Cumpston

Page

Welch

Chapter 15: interpreting results and drawing conclusions

Cochrane Handbook for Systematic Reviews of Interventions (version 6.2) 2021 2

Hoboken, NJ

The Cochrane Collaboration and John Wiley & Sons Ltd

Michie

Lester

Words matter: increasing the implementation of clinical guidelines

Qual Saf Health Care 2005 10 01 14 5 367 70

10.1136/qshc.2005.014100

16195572

14/5/367

PMC1744083

Sekhar

Vyas

Defensive medicine: a bane to healthcare

Ann Med Health Sci Res 2013 04 3 2 295 6

10.4103/2141-9248.113688

23919211

AMHSR-3-295

PMC3728884

Akl

Guyatt

Irani

Feldstein

Wasi

Shaw

Shaneyfelt

Levine

Schünemann

Holger J

"Might" or "suggest"? no wording approach was clearly superior in conveying the strength of recommendation

J Clin Epidemiol 2012 03 65 3 268 75

10.1016/j.jclinepi.2011.08.001

22075112

S0895-4356(11)00247-2

Norris

Meerpohl

Akl

Schünemann

Holger J

Gartlehner

Chen

Whittington

Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group

The skills and experience of GRADE methodologists can be assessed with a simple tool

J Clin Epidemiol 2016 11 79 150 158.e1

10.1016/j.jclinepi.2016.07.001

27421684

S0895-4356(16)30193-7

McCutcheon

Lohan

Traynor

Martin

A systematic review evaluating the impact of online or blended learning vs. face-to-face learning of clinical skills in undergraduate nurse education

J Adv Nurs 2015 02 19 71 2 255 70

10.1111/jan.12509

25134985

George

Papachristou

Belisario

Wang

Wark

Cotic

Rasmussen

Sluiter

Riboli-Sasco

Tudor Car

Lorainne

Musulanov

Molina

Heng

Zhang

Wheeler

Al Shorbaji

Najeeb

Majeed

Car

Online eLearning for undergraduates in health professions: a systematic review of the impact on knowledge, skills, attitudes and satisfaction

J Glob Health 2014 06 4 1 010406

10.7189/jogh.04.010406

24976965

jogh-04-010406

PMC4073252

Kumaravel

Stewart

Ilic

Face-to-face versus online clinically integrated EBM teaching in an undergraduate medical school: a pilot study

BMJ Evid Based Med 2022 06 11 27 3 162 168

10.1136/bmjebm-2021-111776

34635481

bmjebm-2021-111776

Banožić

Buljan

Malički

Marušić

Matko

Marušić

Ana

Short- and long-term effects of retrieval practice on learning concepts in evidence-based medicine: experimental study

J Eval Clin Pract 2018 02 31 24 1 262 263

10.1111/jep.12740

28370993

Krnic Martinic

Malisa

Aranza

Civljak

Marušić

Ana

Sapunar

Poklepovic Pericic

Buljan

Tokalic

Cavic

Puljak

Creating an online educational intervention to improve knowledge about systematic reviews among healthcare workers: mixed-methods pilot study

BMC Med Educ 2022 10 14 22 1 722

10.1186/s12909-022-03763-3

36242036

10.1186/s12909-022-03763-3

PMC9562058

Hecht

Buhse

Meyer

Effectiveness of training in evidence-based medicine skills for healthcare professionals: a systematic review

BMC Med Educ 2016 04 04 16 1 103

10.1186/s12909-016-0616-2

27044264

10.1186/s12909-016-0616-2

PMC4820973

Halalau

Holmes

Rogers-Snyr

Donisan

Nielsen

Cerqueira

Guyatt

Evidence-based medicine curricula and barriers for physicians in training: a scoping review

Int J Med Educ 2021 05 28 12 101 124

10.5116/ijme.6097.ccc0

34053914

ijme.12.101124

PMC8411338

van Dijk

Hooft

Wieringa-de Waard

What are the barriers to residents' practicing evidence-based medicine? a systematic review

Acad Med 2010 07 85 7 1163 70

10.1097/ACM.0b013e3181d4152f

20186032

Sadeghi-Bazargani

Tabrizi

Azami-Aghdash

Barriers to evidence-based medicine: a systematic review

J Eval Clin Pract 2014 12 20 6 793 802

10.1111/jep.12222

25130323

Maggio

Tannery

Chen

ten Cate

Olle

O'Brien

Bridget

Evidence-based medicine training in undergraduate medical education: a review and critique of the literature published 2006-2011

Acad Med 2013 07 88 7 1022 8

10.1097/ACM.0b013e3182951959

23702528

Schilling

Wiecha

Polineni

Khalil

An interactive web-based curriculum on evidence-based medicine: design and effectiveness

Fam Med 2006 02 38 2 126 32

16450235

Davis

Crabb

Rogers

Zamora

Khan

Computer-based teaching is as good as face to face lecture-based teaching of evidence based medicine: a randomized controlled trial

Med Teach 2008 30 3 302 7

10.1080/01421590701784349

18484458

790929335

George

Zhabenko

Kyaw

Antoniou

Posadzki

Saxena

Semwal

Tudor Car

Zary

Lockwood

Car

Online digital education for postregistration training of medical doctors: systematic review by the Digital Health Education Collaboration

J Med Internet Res 2019 02 25 21 2 e13269

10.2196/13269

30801252

v21i2e13269

PMC6410118

Coomarasamy

Khan

What is the evidence that postgraduate teaching in evidence based medicine changes anything? a systematic review

BMJ 2004 10 30 329 7473 1017

10.1136/bmj.329.7473.1017

15514348

329/7473/1017

PMC524555

Nieman

Cheng

Foxhall

Teaching first-year medical students to apply evidence-based practices to patient care

Fam Med 2009 05 41 5 332 6

19418281