Catalogue Search | MBRL
Explore the vast range of titles available.
790 result(s) for "Taylor, Richard Andrew"
Minstrels and minstrelsy in late medieval England
by Rastall, Richard, author; Taylor, Andrew, 1958-, author
in Minstrels England History To 1500; Performing Arts
2023
A major new study piecing together the intriguing but fragmentary evidence surrounding the lives of minstrels to highlight how these seemingly peripheral figures were keenly involved with all aspects of late medieval communities.
How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment
by Gilson, Aidan; Safranek, Conrad W; Chi, Ling
in Application programming interface; Chatbots; Cognition & reasoning
2023
Chat Generative Pre-trained Transformer (ChatGPT) is a 175-billion-parameter natural language processing model that can generate conversation-style responses to user input.
This study aimed to evaluate the performance of ChatGPT on questions within the scope of the United States Medical Licensing Examination (USMLE) Step 1 and Step 2 exams, as well as to analyze responses for user interpretability.
We used 2 sets of multiple-choice questions to evaluate ChatGPT's performance, each with questions pertaining to Step 1 and Step 2. The first set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and the performance on an exam relative to the user base. The second set was the National Board of Medical Examiners (NBME) free 120 questions. ChatGPT's performance was compared to 2 other large language models, GPT-3 and InstructGPT. The text output of each ChatGPT response was evaluated across 3 qualitative metrics: logical justification of the answer selected, presence of information internal to the question, and presence of information external to the question.
Of the 4 data sets, AMBOSS-Step1, AMBOSS-Step2, NBME-Free-Step1, and NBME-Free-Step2, ChatGPT achieved accuracies of 44% (44/100), 42% (42/100), 64.4% (56/87), and 57.8% (59/102), respectively. ChatGPT outperformed InstructGPT by 8.15% on average across all data sets, and GPT-3 performed similarly to random chance. The model demonstrated a significant decrease in performance as question difficulty increased (P=.01) within the AMBOSS-Step1 data set. We found that logical justification for ChatGPT's answer selection was present in 100% of outputs of the NBME data sets. Internal information to the question was present in 96.8% (183/189) of all questions. The presence of information external to the question was 44.5% and 27% lower for incorrect answers relative to correct answers on the NBME-Free-Step1 (P<.001) and NBME-Free-Step2 (P=.001) data sets, respectively.
ChatGPT marks a significant improvement in natural language processing models on the tasks of medical question answering. By performing at a greater than 60% threshold on the NBME-Free-Step1 data set, we show that the model achieves the equivalent of a passing score for a third-year medical student. Additionally, we highlight ChatGPT's capacity to provide logic and informational context across the majority of answers. These facts taken together make a compelling case for the potential applications of ChatGPT as an interactive medical education tool to support learning.
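To make the reported figures concrete, here is a minimal sketch of the per-dataset accuracy tally the abstract describes (eg, NBME-Free-Step1: 56/87 = 64.4%). The dataset names come from the abstract, but the record layout and values are illustrative assumptions, not the authors' code or data.

```python
# Hypothetical sketch: tally accuracy per question set from graded model answers.
from collections import defaultdict

graded = [
    # (dataset, model_answer, answer_key) -- made-up placeholder records
    ("NBME-Free-Step1", "C", "C"),
    ("NBME-Free-Step1", "A", "D"),
    ("AMBOSS-Step2",    "B", "B"),
]

totals, hits = defaultdict(int), defaultdict(int)
for dataset, model_ans, key in graded:
    totals[dataset] += 1
    hits[dataset] += (model_ans == key)

for dataset in sorted(totals):
    acc = hits[dataset] / totals[dataset]
    print(f"{dataset}: {hits[dataset]}/{totals[dataset]} = {acc:.1%}")
```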
Journal Article
The Authors Respond to Reader Comment Regarding: “Automated Computation of the HEART Score with the GPT-4 Large Language Model”
by Taylor, Richard Andrew; Wright, Donald S.
in Emergency; Emergency medical care; Large language models
2025
Comparative evaluations between regionally adapted indices such as HASI and LLM-driven tools may indeed help clarify how best to integrate AI in diverse health system contexts. Continuous evaluation will also be needed in these frameworks to ensure that the clinical concept being captured truly maps to the LLM's application of the tool, which may differ systematically from human use of the tool [2,3]. Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Journal Article
Patient-Representing Population's Perceptions of GPT-Generated Versus Standard Emergency Department Discharge Instructions: Randomized Blind Survey Assessment
by Sangal, Rohit B; Dilip, Monisha; Chartash, David
in Adult; Artificial Intelligence; Emergency service
2024
Discharge instructions are a key form of documentation and patient communication in the time of transition from the emergency department (ED) to home. Discharge instructions are time-consuming and often underprioritized, especially in the ED, leading to discharge delays and possibly impersonal patient instructions. Generative artificial intelligence and large language models (LLMs) offer promising methods of creating high-quality and personalized discharge instructions; however, there exists a gap in understanding patient perspectives of LLM-generated discharge instructions.
We aimed to assess the use of LLMs such as ChatGPT in synthesizing accurate and patient-accessible discharge instructions in the ED.
We synthesized 5 unique, fictional ED encounters to emulate real ED encounters that included a diverse set of clinician history, physical notes, and nursing notes. These were passed to GPT-4 in Azure OpenAI Service (Microsoft) to generate the LLM discharge instructions. Standard discharge instructions were also generated for each of the 5 unique ED encounters. All GPT-generated and standard discharge instructions were then formatted into standardized after-visit summary documents. These after-visit summaries, containing either GPT-generated or standard discharge instructions, were randomly and blindly administered through Amazon MTurk Survey Distribution to respondents representing patient populations. Discharge instructions were assessed on metrics of interpretability of significance, understandability, and satisfaction.
Our findings revealed that survey respondents' perspectives were significantly (P=.01) more favorable toward GPT-generated return precautions, and all other sections were considered noninferior to standard discharge instructions. Of the 156 survey respondents, GPT-generated discharge instructions were assigned favorable ratings ("agree" and "strongly agree") more frequently along the metric of interpretability of significance in discharge instruction subsections regarding diagnosis, procedures, treatment, post-ED medications or any changes to medications, and return precautions. Survey respondents found GPT-generated instructions to be more understandable when rating procedures, treatment, post-ED medications or medication changes, post-ED follow-up, and return precautions. Satisfaction with GPT-generated discharge instruction subsections was most favorable for procedures, treatment, post-ED medications or medication changes, and return precautions. A Wilcoxon rank-sum test of Likert responses revealed significant differences (P=.01) in the interpretability of significance of return precautions for GPT-generated discharge instructions compared to standard discharge instructions, but not for the other evaluation metrics and discharge instruction subsections.
This study demonstrates the potential for LLMs such as ChatGPT to act as a method of augmenting current documentation workflows in the ED to reduce the documentation burden of physicians. The ability of LLMs to provide tailored instructions for patients by improving readability and making instructions more applicable to patients could improve upon the methods of communication that currently exist.
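As a brief illustration of the statistical comparison named in the results above, the sketch below runs a Wilcoxon rank-sum test over Likert ratings of GPT-generated versus standard return precautions. The ratings are placeholder values, not study data.

```python
# Hypothetical sketch of a Wilcoxon rank-sum comparison of 5-point Likert ratings.
from scipy.stats import ranksums

gpt_ratings      = [5, 4, 5, 4, 5, 3, 4, 5, 4, 4]   # made-up Likert scores (1-5)
standard_ratings = [3, 4, 3, 2, 4, 3, 3, 4, 2, 3]

stat, p_value = ranksums(gpt_ratings, standard_ratings)
print(f"rank-sum statistic = {stat:.3f}, P = {p_value:.3f}")
```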
Journal Article
Incorporating Domain Knowledge Into Language Models by Using Graph Convolutional Networks for Assessing Semantic Textual Similarity: Model Development and Performance Comparison
2021
Although electronic health record systems have facilitated clinical documentation in health care, they have also introduced new challenges, such as the proliferation of redundant information through the use of copy and paste commands or templates. One approach to trimming down bloated clinical documentation and improving clinical summarization is to identify highly similar text snippets with the goal of removing such text.
We developed a natural language processing system for the task of assessing clinical semantic textual similarity. The system assigns scores to pairs of clinical text snippets based on their clinical semantic similarity.
We leveraged recent advances in natural language processing and graph representation learning to create a model that combines linguistic and domain knowledge information from the MedSTS data set to assess clinical semantic textual similarity. We used bidirectional encoder representation from transformers (BERT)-based models as text encoders for the sentence pairs in the data set and graph convolutional networks (GCNs) as graph encoders for corresponding concept graphs that were constructed based on the sentences. We also explored techniques, including data augmentation, ensembling, and knowledge distillation, to improve the model's performance, as measured by the Pearson correlation coefficient (r).
Fine-tuning the BERT_base and ClinicalBERT models on the MedSTS data set provided a strong baseline (Pearson correlation coefficients: 0.842 and 0.848, respectively) compared to those of the previous year's submissions. Our data augmentation techniques yielded moderate gains in performance, and adding a GCN-based graph encoder to incorporate the concept graphs also boosted performance, especially when the node features were initialized with pretrained knowledge graph embeddings of the concepts (r=0.868). As expected, ensembling improved performance, and performing multisource ensembling by using different language model variants, conducting knowledge distillation with the multisource ensemble model, and taking a final ensemble of the distilled models further improved the system's performance (Pearson correlation coefficients: 0.875, 0.878, and 0.882, respectively).
This study presents a system for the MedSTS clinical semantic textual similarity benchmark task, which was created by combining BERT-based text encoders and GCN-based graph encoders in order to incorporate domain knowledge into the natural language processing pipeline. We also experimented with other techniques involving data augmentation, pretrained concept embeddings, ensembling, and knowledge distillation to further increase our system's performance. Although the task and its benchmark data set are in the early stages of development, this study, as well as the results of the competition, demonstrates the potential of modern language model-based systems to detect redundant information in clinical notes.
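A rough sketch of the fusion idea this abstract describes: a BERT-style sentence vector is combined with a graph-encoder vector over the corresponding concept graph and regressed to a similarity score. The single-layer graph convolution, dimensions, and pooling choices below are simplifying assumptions, not the authors' architecture.

```python
# Illustrative sketch: concatenate a text embedding with a pooled graph embedding
# and regress a clinical semantic similarity score.
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # adj: (n_nodes, n_nodes) adjacency with self-loops; rows are normalized.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin((adj / deg) @ x))

class TextGraphSimilarity(nn.Module):
    def __init__(self, text_dim=768, node_dim=100, hidden=128):
        super().__init__()
        self.gcn = SimpleGCNLayer(node_dim, hidden)
        self.head = nn.Sequential(nn.Linear(text_dim + hidden, hidden),
                                  nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, sent_vec, node_feats, adj):
        graph_vec = self.gcn(node_feats, adj).mean(dim=0)   # pool node embeddings
        return self.head(torch.cat([sent_vec, graph_vec]))  # predicted similarity

# Toy usage with random tensors standing in for a BERT sentence vector and
# pretrained concept-graph node features.
model = TextGraphSimilarity()
score = model(torch.randn(768), torch.randn(5, 100), torch.eye(5))
print(score.item())
```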
Journal Article
Correction: How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment
2024
[This corrects the article DOI: 10.2196/45312.]
Journal Article
Adaptive decision support for addiction treatment to implement initiation of buprenorphine for opioid use disorder in the emergency department: protocol for the ADAPT Multiphase Optimization Strategy trial
by Delgado, Mucio Kit; Oladele, Carol; D’Onofrio, Gail
in Addictions; Buprenorphine - therapeutic use; Clinical decision making
2025
Introduction: Despite the current opioid crisis resulting in tens of thousands of deaths every year, buprenorphine, a medication that can reduce opioid-related mortality, withdrawal, drug use and craving, is still underprescribed in the emergency department (ED) for treatment of opioid use disorder (OUD). The EMergency department-initiated BuprenorphinE for opioid use Disorder (EMBED) trial introduced a clinical decision support (CDS) tool that improved the proportion of ED physicians prescribing buprenorphine but did not affect patient-level rates of buprenorphine initiation. The present trial aims to build on these findings by optimising CDS use through iterative improvements, refined interventions and clinician feedback to enhance OUD treatment initiation in EDs.
Methods and analysis: The Adaptive Decision support for Addiction Treatment (ADAPT) trial employs the Multiphase Optimization Strategy (MOST) framework to refine a multicomponent CDS tool designed to facilitate buprenorphine initiation for OUD in ED settings. Using a pragmatic, learning health system approach in three phases, the trial applies plan–do–study–act cycles for continuous CDS refinement. The CDS will be updated in the preparation phase to reflect new evidence. The optimisation phase will include a 2×2×2 factorial trial testing the impact of various intervention components, followed by rapid, serial randomised usability testing to reduce user errors and enhance CDS workflow efficiency. In the evaluation phase, the optimised CDS package will be tested in a randomised trial to assess its effectiveness in increasing ED initiation of buprenorphine compared with the original EMBED CDS.
Ethics and dissemination: The protocol has received approval from our institution’s institutional review board (protocol #2000038624) with a waiver of informed consent for collecting non-identifiable information only. Given the minimal risk involved in implementing established best practices, an independent study monitor will oversee the study instead of a Data Safety Monitoring Board. Findings will be submitted to ClinicalTrials.gov, published in open-access, peer-reviewed journals, presented at national conferences and shared with clinicians at participating sites through email notification.
Trial registration number: NCT06799117.
Journal Article
Formative evaluation of an emergency department clinical decision support system for agitation symptoms: a study protocol
by Kumar, Anusha; Adapa, Karthik; Faustino, Isaac V
in ACCIDENT & EMERGENCY MEDICINE; Adult; Best practice
2024
Introduction: The burden of mental health-related visits to emergency departments (EDs) is growing, and agitation episodes are prevalent with such visits. Best practice guidance from experts recommends early assessment of at-risk populations and pre-emptive intervention using de-escalation techniques to prevent agitation. Time pressure, fluctuating work demands, and other systems-related factors pose challenges to efficient decision-making and adoption of best practice recommendations during an unfolding behavioural crisis. As such, we propose to design, develop and evaluate a computerised clinical decision support (CDS) system, Early Detection and Treatment to Reduce Events with Agitation Tool (ED-TREAT). We aim to identify patients at risk of agitation and guide ED clinicians through appropriate risk assessment and timely interventions to prevent agitation, with a goal of minimising restraint use and improving patient experience and outcomes.
Methods and analysis: This study describes the formative evaluation of the health record-embedded CDS tool. Under aim 1, the study will collect qualitative data to design and develop ED-TREAT using a contextual design approach and an iterative user-centred design process. Participants will include potential CDS users, that is, ED physicians, nurses and technicians, as well as patients with lived experience of restraint use for behavioural crisis management during an ED visit. We will use purposive sampling to ensure the full spectrum of perspectives until we reach thematic saturation. Next, under aim 2, the study will conduct a pilot, randomised controlled trial of ED-TREAT at two adult ED sites in a regional health system in the Northeast USA to evaluate the feasibility, fidelity and bedside acceptability of ED-TREAT. We aim to recruit a total of at least 26 eligible subjects under the pilot trial.
Ethics and dissemination: Ethical approval by the Yale University Human Investigation Committee was obtained in 2021 (HIC# 2000030893 and 2000030906). All participants will provide informed verbal consent prior to being enrolled in the study. Results will be disseminated through publications in open-access, peer-reviewed journals, via scientific presentations or through direct email notifications.
Trial registration number: NCT04959279; Pre-results.
Journal Article
Identifying Deprescribing Opportunities With Large Language Models in Older Adults: Retrospective Cohort Study
by Chi, Ling; Dien, Christine; Sasidhar Kanaparthy, Naga
in Aged; Aged, 80 and over; Deprescriptions
2025
Polypharmacy, the concurrent use of multiple medications, is prevalent among older adults and associated with increased risks for adverse drug events including falls. Deprescribing, the systematic process of discontinuing potentially inappropriate medications, aims to mitigate these risks. However, the practical application of deprescribing criteria in emergency settings remains limited due to time constraints and criteria complexity.
This study aims to evaluate the performance of a large language model (LLM)-based pipeline in identifying deprescribing opportunities for older emergency department (ED) patients with polypharmacy, using 3 different sets of criteria: Beers, Screening Tool of Older People's Prescriptions, and Geriatric Emergency Medication Safety Recommendations. The study further evaluates LLM confidence calibration and its ability to improve recommendation performance.
We conducted a retrospective cohort study of older adults presenting to an ED in a large academic medical center in the Northeast United States from January 2022 to March 2022. A random sample of 100 patients (712 total oral medications) was selected for detailed analysis. The LLM pipeline consisted of two steps: (1) filtering high-yield deprescribing criteria based on patients' medication lists, and (2) applying these criteria using both structured and unstructured patient data to recommend deprescribing. Model performance was assessed by comparing model recommendations to those of trained medical students, with discrepancies adjudicated by board-certified ED physicians. Selective prediction, a method that allows a model to abstain from low-confidence predictions to improve overall reliability, was applied to assess the model's confidence and decision-making thresholds.
The LLM was significantly more effective in identifying deprescribing criteria (positive predictive value: 0.83; negative predictive value: 0.93; McNemar test for paired proportions: χ²=5.985; P=.02) relative to medical students, but showed limitations in making specific deprescribing recommendations (positive predictive value=0.47; negative predictive value=0.93). Adjudication revealed that while the model excelled at identifying when there was a deprescribing criterion related to one of the patient's medications, it often struggled with determining whether that criterion applied to the specific case due to complex inclusion and exclusion criteria (54.5% of errors) and ambiguous clinical contexts (eg, missing information; 39.3% of errors). Selective prediction only marginally improved LLM performance due to poorly calibrated confidence estimates.
This study highlights the potential of LLMs to support deprescribing decisions in the ED by effectively filtering relevant criteria. However, challenges remain in applying these criteria to complex clinical scenarios, as the LLM demonstrated poor performance on more intricate decision-making tasks, with its reported confidence often failing to align with its actual success in these cases. The findings underscore the need for clearer deprescribing guidelines, improved LLM calibration for real-world use, and better integration of human-artificial intelligence workflows to balance artificial intelligence recommendations with clinician judgment.
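An illustrative sketch of the selective-prediction step this abstract mentions: low-confidence recommendations are withheld, and performance is reported only on the cases the model chose to answer. The threshold, records, and metric choice below are hypothetical, not the study pipeline.

```python
# Hypothetical sketch: abstain below a confidence threshold, then report coverage and
# positive predictive value on the answered subset.
def selective_ppv(predictions, threshold=0.8):
    """predictions: list of (model_says_deprescribe, confidence, reference_label)."""
    answered = [(pred, ref) for pred, conf, ref in predictions if conf >= threshold]
    coverage = len(answered) / len(predictions) if predictions else 0.0
    positives = [(pred, ref) for pred, ref in answered if pred]
    ppv = (sum(ref for _, ref in positives) / len(positives)) if positives else float("nan")
    return coverage, ppv

preds = [
    (True, 0.95, True),   # confident, correct recommendation
    (True, 0.60, False),  # low confidence -> abstained at threshold 0.8
    (False, 0.90, False), # confident true negative
    (True, 0.85, False),  # confident but wrong
]
print(selective_ppv(preds, threshold=0.8))  # -> (0.75, 0.5)
```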
Journal Article
Authors’ Reply to: Variability in Large Language Models’ Responses to Medical Licensing and Certification Examinations
2023
Related Articles: Comment on https://mededu.jmir.org/2023/1/e48305/; Comment on https://mededu.jmir.org/2023/1/e45312
Journal Article