Catalogue Search | MBRL
15,062 result(s) for "Educational Measurement -- methods"
Evaluating podcasts as a tool for OSCE training: a randomized trial using generative AI-powered simulation
by Pers, Yves-Marie; Guerrot, Dominique; Figueres, Lucile
in Adult; Artificial Intelligence; Clinical Competence - standards
2025
Introduction
Objective Structured Clinical Examinations (OSCEs) are critical for assessing clinical competencies in medical education. While traditional teaching methods remain prevalent, this study introduces an innovative approach by evaluating the effectiveness of an OSCE preparation podcast in improving medical students’ OSCE performance using nephrology as a proof of concept. This novel method offers a flexible and accessible format for supplementary learning, potentially revolutionizing medical education.
Methods
A mono-centric randomized controlled trial was conducted among 50 fourth-year medical students. Participants were randomly assigned to either the podcast intervention group or a control group. Both groups completed six nephrology-specific OSCE stations on DocSimulator, a generative AI-powered virtual patient platform. Scores from three baseline and three post-intervention OSCE stations were compared. The primary outcome was the change in OSCE scores. Secondary outcomes included interest in nephrology and students’ self-reported competence in nephrology-related skills.
Results
The baseline OSCE scores did not differ between the two groups (23.8 ± 3.9 vs. 23.3 ± 5.3; p = 0.77). After the intervention, the podcast group demonstrated a significantly higher OSCE score than the control group (27.6 ± 3.6 vs. 23.6 ± 5.0; p = 0.002), with a greater improvement in OSCE scores (+3.52 [0.7, 6.5] vs. -1.22 [-3, 5.5]; p = 0.03). While the podcast did not increase students' intention to specialize in nephrology (4.2% vs. 4.0%; p = 0.99), it significantly improved their confidence in nephrology-related clinical skills (41.7% vs. 16%; p = 0.04). In the podcast group, 68% of students found the OSCE training podcast useful for their OSCE preparation, and 96% reported they would use it again.
Conclusions
The use of an OSCE preparation podcast significantly enhanced students’ performance in AI-based simulations and confidence in nephrology clinical competencies. Podcasts represent a valuable supplementary tool for medical education, providing flexibility and supporting diverse learning styles.
Trial Registration
Not applicable.
Journal Article
Use of the Smartphone App WhatsApp as an E-Learning Method for Medical Residents: Multicenter Controlled Randomized Trial
by Gilles Lebuffe; Vincent Compere; Thomas Clavier
in [SDV]Life Sciences [q-bio]; Adult; anesthesiology
2019
The WhatsApp smartphone app is the most widely used instant messaging app in the world. Recent studies reported the use of WhatsApp for educational purposes, but there is no prospective study comparing WhatsApp's pedagogical effectiveness to that of any other teaching modality.
The main objective of this study was to measure the impact of a learning program via WhatsApp on clinical reasoning in medical residents.
This prospective, randomized, multicenter study was conducted among first- and second-year anesthesiology residents (offline recruitment) from four university hospitals in France. Residents were randomized in two groups of online teaching (WhatsApp and control). The WhatsApp group benefited from daily delivery of teaching documents on the WhatsApp app and a weekly clinical case supervised by a senior physician. In the control group, residents had access to the same documents via a traditional computer electronic learning (e-learning) platform. Medical reasoning was self-assessed online by a script concordance test (SCT; primary parameter), and medical knowledge was assessed using multiple-choice questions (MCQs). The residents also completed an online satisfaction questionnaire.
In this study, 62 residents were randomized (32 to the WhatsApp group and 30 to the control group) and 22 residents in each group answered the online final evaluation. We found a difference between the WhatsApp and control groups for SCTs (60% [SD 9%] vs 68% [SD 11%]; P=.006) but no difference for MCQs (18/30 [SD 4] vs 16/30 [SD 4]; P=.22). Concerning satisfaction, there was a better global satisfaction rate in the WhatsApp group than in the control group (8/10 [interquartile range 8-9] vs 8/10 [interquartile range 8-8]; P=.049).
Compared to traditional e-learning, the use of WhatsApp for teaching residents was associated with worse clinical reasoning despite better global satisfaction. WhatsApp use probably contributes to the attentional dispersion associated with smartphone use. The impact of smartphones on clinical reasoning should be studied further.
Journal Article
Use of very short answer questions compared to multiple choice questions in undergraduate medical students: An external validation study
by Rohling, Jos H. T.; Dekker, Friedo W.; Janse, Roemer J.
in Acceptability; Automation; Biology and Life Sciences
2023
Multiple choice questions (MCQs) offer high reliability and easy machine-marking, but allow for cueing and stimulate recognition-based learning. Very short answer questions (VSAQs), which are open-ended questions requiring a very short answer, may circumvent these limitations. Although VSAQ use in medical assessment is increasing, almost all research on the reliability and validity of VSAQs in medical education has been performed by a single research group with extensive experience in VSAQ development. We therefore aimed to validate previous findings about VSAQ reliability, discrimination, and acceptability in undergraduate medical students and teachers with limited experience in VSAQ development. To validate the results presented in previous studies, we partially replicated a previous study and extended results on student experiences. Dutch undergraduate medical students (n = 375) were randomized to VSAQs first and MCQs second, or vice versa, in a formative exam in two courses, to determine reliability, discrimination, and cueing. Acceptability for teachers (i.e., VSAQ review time) was determined in the summative exam. Reliability (Cronbach's α) was 0.74 for VSAQs and 0.57 for MCQs in one course. In the other course, Cronbach's α was 0.87 for VSAQs and 0.83 for MCQs. Discrimination (average item-rest correlation, R_ir) was 0.27 vs. 0.17 and 0.43 vs. 0.39 for VSAQs vs. MCQs, respectively. Reviewing time of one VSAQ for the entire student cohort was approximately 2 minutes on average. Positive cueing occurred more often in MCQs than in VSAQs (20% vs. 4% and 20.8% vs. 8.3% of questions per person in the two courses). This study validates the positive results regarding VSAQ reliability, discrimination, and acceptability in undergraduate medical students. Furthermore, we demonstrate that VSAQ use is reliable among teachers with limited experience in writing and marking VSAQs.
The short learning curve for teachers, favourable marking time and applicability regardless of the topic suggest that VSAQs might also be valuable beyond medical assessment.
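The reliability figures quoted in this abstract are Cronbach's α, which relates the sum of the item variances to the variance of students' total scores. A minimal sketch of that computation, using made-up example data rather than the study's, might look like this:

```python
# Illustrative sketch (data are hypothetical, not from the study):
# Cronbach's alpha for a scored exam, rows = students, columns = items.
# alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(scores):
    """scores: list of per-student lists of item scores (equal length)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose to per-item columns
    item_var = sum(variance(list(col)) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)

# Tiny example: 4 students, 3 items scored 0/1.
data = [[1, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 0]]
print(round(cronbach_alpha(data), 3))  # → 0.75
```

Values around 0.74–0.87, as reported above, indicate acceptable-to-good internal consistency on this scale.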
Journal Article
Integrating ChatGPT in Orthopedic Education for Medical Undergraduates: Randomized Controlled Trial
2024
ChatGPT is a natural language processing model developed by OpenAI, which can be iteratively updated and optimized to accommodate the changing and complex requirements of human verbal communication.
The study aimed to evaluate ChatGPT's accuracy in answering orthopedics-related multiple-choice questions (MCQs) and assess its short-term effects as a learning aid through a randomized controlled trial. In addition, long-term effects on student performance in other subjects were measured using final examination results.
We first evaluated ChatGPT's accuracy in answering MCQs pertaining to orthopedics across various question formats. Then, 129 undergraduate medical students participated in a randomized controlled study in which the ChatGPT group used ChatGPT as a learning tool, while the control group was prohibited from using artificial intelligence software to support learning. Following a 2-week intervention, the 2 groups' understanding of orthopedics was assessed by an orthopedics test, and variations in the 2 groups' performance in other disciplines were noted through a follow-up at the end of the semester.
ChatGPT-4.0 answered 1051 orthopedics-related MCQs with a 70.60% (742/1051) accuracy rate, including 71.8% (237/330) accuracy for A1 MCQs, 73.7% (330/448) accuracy for A2 MCQs, 70.2% (92/131) accuracy for A3/4 MCQs, and 58.5% (83/142) accuracy for case analysis MCQs. As of April 7, 2023, a total of 129 individuals participated in the experiment. However, 19 individuals withdrew from the experiment at various phases; thus, as of July 1, 2023, a total of 110 individuals accomplished the trial and completed all follow-up work. After we intervened in the learning style of the students in the short term, the ChatGPT group answered more questions correctly than the control group (ChatGPT group: mean 141.20, SD 26.68; control group: mean 130.80, SD 25.56; P=.04) in the orthopedics test, particularly on A1 (ChatGPT group: mean 46.57, SD 8.52; control group: mean 42.18, SD 9.43; P=.01), A2 (ChatGPT group: mean 60.59, SD 10.58; control group: mean 56.66, SD 9.91; P=.047), and A3/4 MCQs (ChatGPT group: mean 19.57, SD 5.48; control group: mean 16.46, SD 4.58; P=.002). At the end of the semester, we found that the ChatGPT group performed better on final examinations in surgery (ChatGPT group: mean 76.54, SD 9.79; control group: mean 72.54, SD 8.11; P=.02) and obstetrics and gynecology (ChatGPT group: mean 75.98, SD 8.94; control group: mean 72.54, SD 8.66; P=.04) than the control group.
ChatGPT answers orthopedics-related MCQs accurately, and students using it excel in both short-term and long-term assessments. Our findings strongly support ChatGPT's integration into medical education, enhancing contemporary instructional methods.
Chinese Clinical Trial Registry ChiCTR2300071774; https://www.chictr.org.cn/hvshowproject.html?id=225740&v=1.0
Journal Article
AI-assisted grading and personalized feedback in large political science classes: Results from randomized controlled trials
by Park, Sanghoon; Chen, Kuan-Wu; Heinrich, Tobias
in Artificial Intelligence; Biology and Life Sciences; Critical thinking
2025
Grading and providing personalized feedback on short-answer questions is time consuming. Professional incentives often push instructors to rely on multiple-choice assessments instead, reducing opportunities for students to develop critical thinking skills. Using large-language-model (LLM) assistance, we augment the productivity of instructors grading short-answer questions in large classes. Through a randomized controlled trial across four undergraduate courses and almost 300 students in 2023/2024, we assess the effectiveness of AI-assisted grading and feedback in comparison to human grading. Our results demonstrate that AI-assisted grading can mimic what an instructor would do in a small class.
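One simple way to quantify whether AI-assisted grades "mimic what an instructor would do", as this abstract claims, is to correlate the two sets of scores. A hedged sketch with entirely hypothetical numbers (the study's own agreement metric is not specified here):

```python
# Illustrative sketch (scores are hypothetical): Pearson correlation
# between instructor-assigned and LLM-assisted grades on the same answers.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

human = [7, 8, 5, 9, 6]   # instructor scores on five short answers
ai    = [6, 8, 5, 9, 7]   # AI-assisted scores for the same answers
print(round(pearson(human, ai), 3))  # → 0.9
```

A correlation near 1 would suggest the AI-assisted grades track the human ranking closely; correlation alone does not detect a constant leniency offset, which would need a mean-difference check as well.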
Journal Article
Spatial ability and 3D model colour-coding affect anatomy performance: a cross-sectional and randomized trial
2023
Photorealistic 3D models (PR3DM) have great potential to supplement anatomy education; however, there is evidence that realism can increase cognitive load and negatively impact anatomy learning, particularly in students with lower spatial ability. These differing viewpoints have made it difficult to incorporate PR3DM when designing anatomy courses. We aimed to determine the effects of spatial ability on anatomy learning and reported intrinsic cognitive load using a drawing assessment, and of a PR3DM versus an artistic colour-coded 3D model (A3DM) on extraneous cognitive load and learning performance. First-year medical students participated in a cross-sectional study (Study 1) and a double-blind randomised controlled trial (Study 2). Pre-tests analysed participants' knowledge of anatomy of the heart (Study 1, N = 50) and liver (Study 2, N = 46). In Study 1, subjects were first divided equally using a mental rotations test (MRT) into low and high spatial ability groups. Participants memorised a 2D-labeled heart valve diagram and sketched it rotated 180°, before self-reporting their intrinsic cognitive load (ICL). For Study 2, participants studied a liver PR3DM or its corresponding A3DM with texture-homogenisation, followed by a liver anatomy post-test, and reported extraneous cognitive load (ECL). All participants reported no prior anatomy experience. Participants with low spatial ability (N = 25) had significantly lower heart drawing scores (p = 0.001) than those with high spatial ability (N = 25), despite no significant differences in reported ICL (p = 0.110). Males had significantly higher MRT scores than females (p = 0.011). Participants who studied the liver A3DM (N = 22) had significantly higher post-test scores than those who studied the liver PR3DM (N = 24) (p = 0.042), despite no significant differences in reported ECL (p = 0.720). This investigation demonstrated that increased spatial ability and colour-coding of 3D models are associated with improved anatomy performance without a significant increase in cognitive load. The findings provide useful insight into the influence of spatial ability and of photorealistic and artistic 3D models on anatomy education, and their applicability to instructional and assessment design in anatomy.
Journal Article
Efficacy of Virtual Reality Simulation in Teaching Basic Life Support and Its Retention at 6 Months
by López, Alejandro; Gallart, Alberto; Rodríguez, Carmen
in Cardiac patients; Cardiopulmonary Resuscitation - education; Clinical Competence
2023
Educational effectiveness is a determining factor in increasing the survival rate of patients with cardiac arrest. Virtual reality (VR) simulation could help to improve the skills of those undergoing basic life support and automated external defibrillation (BLS-AED) training. Our purpose was to evaluate whether BLS-AED training with virtual reality improves the skills and satisfaction of students enrolled in in-person training after completing the course, and their retention of those skills 6 months later. This was an experimental study of first-year university students from a school of health sciences. We compared traditional training (control group, CG) with virtual reality simulation (experimental group, EG). The students were evaluated using a simulated case with three validated instruments after the completion of training and at 6 months. A total of 241 students participated in the study. After the training period, there were no statistically significant differences in knowledge evaluation or in practical skills when assessed using a feedback mannequin. Defibrillation performance, as evaluated by the instructor, was significantly poorer in the EG. Retention at 6 months decreased significantly in both groups. The results of the teaching methodology using VR were similar to those obtained through traditional methodology: there was an increase in skills after training, and retention decreased over time. Defibrillation results were better after traditional learning.
Journal Article
Comparing Virtual Reality–Based and Traditional Physical Objective Structured Clinical Examination (OSCE) Stations for Clinical Competency Assessments: Randomized Controlled Trial
2025
Objective structured clinical examinations (OSCEs) are a widely recognized and accepted method to assess clinical competencies but are often resource-intensive.
This study aimed to evaluate the feasibility and effectiveness of a virtual reality (VR)-based station (VRS) compared with a traditional physical station (PHS) in an already established curricular OSCE.
Fifth-year medical students participated in an OSCE consisting of 10 stations. One of the stations, emergency medicine, was offered in 2 modalities: VRS and PHS. Students were randomly assigned to 1 of the 2 modalities. We used 2 distinct scenarios to prevent content leakage among participants. Student performance and item characteristics were analyzed, comparing the VRS with PHS as well as with 5 other case-based stations. Student perceptions of the VRS were collected through a quantitative and qualitative postexamination online survey, which included a 5-point Likert scale ranging from 1 (minimum) to 5 (maximum), to evaluate the acceptance and usability of the VR system. Organizational and technical feasibility as well as cost-effectiveness were also evaluated.
Following randomization and exclusions of invalid data sets, 57 and 66 participants were assessed for the VRS and PHS, respectively. The feasibility evaluation demonstrated smooth implementation of both VR scenarios (septic and anaphylactic shock) with 93% (53/57) of students using the VR technology without issues. The difficulty levels of the VRS scenarios (septic shock: P=.67; anaphylactic shock: P=.58) were comparable to the average difficulty of all stations (P=.68) and fell within the reference range (0.4-0.8). In contrast, VRS demonstrated above-average values for item discrimination (septic shock: r'=0.40; anaphylactic shock: r'=0.33; overall r'=0.30; with values >0.3 considered good) and discrimination index (septic shock: D=0.25; anaphylactic shock: D=0.26; overall D=0.16, with 0.2-0.3 considered mediocre and <0.2 considered poor). Apart from some hesitancy toward its broader application in future practical assessments (mean 3.07, SD 1.37 for VRS vs mean 3.65, SD 1.18 for PHS; P=.03), there were no other differences in perceptions between VRS and PHS. Thematic analysis highlighted the realistic portrayal of medical emergencies and fair assessment conditions provided by the VRS. Regarding cost-effectiveness, initial development of the VRS can be offset by long-term savings in recurring expenses like standardized patients and consumables.
Integration of the VRS into the current OSCE framework proved feasible both technically and organizationally, even within the strict constraints of short examination phases and schedules. The VRS was accepted and positively received by students across various levels of technological proficiency, including those with no prior VR experience. Notably, the VRS demonstrated comparable or even superior item characteristics, particularly in terms of discrimination power. Although challenges remain, such as technical reliability and some acceptance concerns, VR remains promising in applications of clinical competence assessment.
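The item statistics this abstract leans on, difficulty (proportion correct, the quoted 0.4–0.8 reference range) and the discrimination index D (upper-group minus lower-group proportion correct), can be sketched as follows. The data and the 27% tail split are illustrative assumptions, not taken from the study:

```python
# Illustrative sketch (hypothetical data): classical item difficulty P
# and discrimination index D using upper/lower 27% groups by total score.

def item_stats(correct, totals, tail=0.27):
    """correct: 0/1 flags for one item, aligned with totals (each
    student's overall exam score). Returns (P, D)."""
    n = len(correct)
    P = sum(correct) / n                       # difficulty: proportion correct
    order = sorted(range(n), key=lambda i: totals[i])
    k = max(1, int(n * tail))                  # tail-group size
    lower = sum(correct[i] for i in order[:k]) / k
    upper = sum(correct[i] for i in order[-k:]) / k
    return P, upper - lower                    # D: upper minus lower

# Ten students: who got this item right, and their overall exam scores.
correct = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
totals = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(item_stats(correct, totals))  # → (0.6, 1.0)
```

Here the item is of moderate difficulty (P = 0.6, inside the 0.4–0.8 range) and discriminates perfectly between the strongest and weakest students (D = 1.0), whereas the study's stations showed more typical D values around 0.16–0.26.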
Journal Article
Game-Based E-Learning Is More Effective than a Conventional Instructional Method: A Randomized Controlled Trial with Third-Year Medical Students
2013
When compared with more traditional instructional methods, Game-based e-learning (GbEl) promises higher learner motivation by presenting content in an interactive, rule-based, and competitive way. Recent systematic reviews and meta-analyses of studies on Game-based learning and GbEl in the medical professions have shown limited effects of these instructional methods.
To compare the effectiveness on the learning outcome of a Game-based e-learning (GbEl) instruction with a conventional script-based instruction in the teaching of phase contrast microscopy urinalysis under routine training conditions of undergraduate medical students.
A randomized controlled trial was conducted with 145 medical students in their third year of training in the Department of Urology at the University Medical Center Freiburg, Germany. Of these, 82 subjects were allocated to training with an educational adventure game (GbEl group) and 69 subjects to conventional training with a written script-based approach (script group). Learning outcome was measured with a 34-item single-choice test. Students' attitudes were collected by a questionnaire on enjoyment of the training, motivation to continue the training, and self-assessment of acquired knowledge.
The students in the GbEl group achieved significantly better results in the cognitive knowledge test than the students in the script group: the mean score was 28.6 for the GbEl group and 26.0 for the script group of a total of 34.0 points with a Cohen's d effect size of 0.71 (ITT analysis). Attitudes towards the recent learning experience were significantly more positive with GbEl. Students reported to have more fun while learning with the game when compared to the script-based approach.
Game-based e-learning is more effective than a script-based approach for the training of urinalysis in regard to cognitive learning outcome and has a high positive motivational impact on learning. Game-based e-learning can be used as an effective teaching method for self-instruction.
Journal Article
Introducing Computer-Based Testing in High-Stakes Exams in Higher Education: Results of a Field Experiment
by Boevé, Anja J.; Bosker, Roel J.; Meijer, Rob R.
in Ability tests; Acceptance tests; Achievement tests
2015
The introduction of computer-based testing for high-stakes examining in higher education is developing rather slowly due to institutional barriers (the need for extra facilities, ensuring test security) and teacher and student acceptance. From the existing literature it is unclear whether computer-based exams yield results similar to paper-based exams and whether student acceptance can change as a result of administering computer-based exams. In this study, we compared results from a computer-based and a paper-based exam in a sample of psychology students and found no differences in total scores across the two modes. Furthermore, we investigated student acceptance and change in acceptance of computer-based examining. After taking the computer-based exam, fifty percent of the students preferred paper-and-pencil exams over computer-based exams and about a quarter preferred a computer-based exam. We conclude that computer-based exam total scores are similar to paper-based exam scores, but that for the acceptance of high-stakes computer-based exams it is important that students practice and become familiar with this new mode of test administration.
Journal Article