Catalogue Search | MBRL
73,119 result(s) for "clinical decision"
Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study
by Kamineni, Meghana; Prasad, Anoop K; Kim, John
in Accuracy; Artificial; Artificial Intelligence
2023
Large language model (LLM)-based artificial intelligence chatbots direct the power of large training data sets toward successive, related tasks as opposed to single-ask tasks, for which artificial intelligence already achieves impressive performance. The capacity of LLMs to assist in the full scope of iterative clinical reasoning via successive prompting, in effect acting as artificial physicians, has not yet been evaluated.
This study aimed to evaluate ChatGPT's capacity for ongoing clinical decision support via its performance on standardized clinical vignettes.
We inputted all 36 published clinical vignettes from the Merck Sharp & Dohme (MSD) Clinical Manual into ChatGPT and compared its accuracy on differential diagnoses, diagnostic testing, final diagnosis, and management based on patient age, gender, and case acuity. Accuracy was measured by the proportion of correct responses to the questions posed within the clinical vignettes tested, as calculated by human scorers. We further conducted linear regression to assess the contributing factors toward ChatGPT's performance on clinical tasks.
ChatGPT achieved an overall accuracy of 71.7% (95% CI 69.3%-74.1%) across all 36 clinical vignettes. The LLM demonstrated the highest performance in making a final diagnosis with an accuracy of 76.9% (95% CI 67.8%-86.1%) and the lowest performance in generating an initial differential diagnosis with an accuracy of 60.3% (95% CI 54.2%-66.6%). Compared to answering questions about general medical knowledge, ChatGPT demonstrated inferior performance on differential diagnosis (β=-15.8%; P<.001) and clinical management (β=-7.4%; P=.02) question types.
ChatGPT achieves impressive accuracy in clinical decision-making, with increasing strength as it gains more clinical information at its disposal. In particular, ChatGPT demonstrates the greatest accuracy in tasks of final diagnosis as compared to initial diagnosis. Limitations include possible model hallucinations and the unclear composition of ChatGPT's training data set.
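The figures reported above are proportions of correct responses with 95% confidence intervals. As an illustration only (the abstract does not state the interval method, and the counts below are hypothetical, not the study's data), a normal-approximation interval for such a proportion can be computed as in this sketch:

```python
from math import sqrt

def proportion_ci(correct: int, total: int, z: float = 1.96):
    """Wald (normal-approximation) 95% CI for a proportion of correct responses."""
    p = correct / total
    se = sqrt(p * (1 - p) / total)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Hypothetical counts: 70 of 91 final-diagnosis questions scored correct.
acc, lo, hi = proportion_ci(70, 91)
print(f"accuracy {acc:.1%} (95% CI {lo:.1%}-{hi:.1%})")
```

The regression of accuracy on question type reported in the results would be a separate model; this sketch covers only the interval arithmetic.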
Journal Article
Data mining in biomedical imaging, signaling, and systems
\"Data mining has rapidly emerged as an enabling, robust, and scalable technique to analyze data for novel patterns, trends, anomalies, structures, and features that can be employed for a variety of biomedical and clinical domains. Approaching the techniques and challenges of image mining from a multidisciplinary perspective, this book presents data mining techniques, methodologies, algorithms, and strategies to analyze biomedical signals and images. Written by experts, the text addresses data mining paradigms for the development of biomedical systems. It also includes special coverage of knowledge discovery in mammograms and emphasizes both the diagnostic and therapeutic fields of eye imaging\"--Provided by publisher.
Utility of ChatGPT in Clinical Practice
2023
ChatGPT is receiving increasing attention and has a variety of application scenarios in clinical practice. In clinical decision support, ChatGPT has been used to generate accurate differential diagnosis lists, support clinical decision-making, optimize clinical decision support, and provide insights for cancer screening decisions. In addition, ChatGPT has been used for intelligent question-answering to provide reliable information about diseases and medical queries. In terms of medical documentation, ChatGPT has proven effective in generating patient clinical letters, radiology reports, medical notes, and discharge summaries, improving efficiency and accuracy for health care providers. Future research directions include real-time monitoring and predictive analytics, precision medicine and personalized treatment, the role of ChatGPT in telemedicine and remote health care, and integration with existing health care systems. Overall, ChatGPT is a valuable tool that complements the expertise of health care providers and improves clinical decision-making and patient care. However, ChatGPT is a double-edged sword. We need to carefully consider and study the benefits and potential dangers of ChatGPT. In this viewpoint, we discuss recent advances in ChatGPT research in clinical practice and suggest possible risks and challenges of its use. This discussion can help guide and support future research on ChatGPT-like artificial intelligence in health care.
Journal Article
Explainability for artificial intelligence in healthcare: a multidisciplinary perspective
by Vayena, Effy; Blasimme, Alessandro; Frey, Dietmar
in Algorithms; Analysis; Artificial Intelligence
2020
Background
Explainability is one of the most heavily debated topics when it comes to the application of artificial intelligence (AI) in healthcare. Even though AI-driven systems have been shown to outperform humans in certain analytical tasks, the lack of explainability continues to spark criticism. Yet, explainability is not a purely technological issue, instead it invokes a host of medical, legal, ethical, and societal questions that require thorough exploration. This paper provides a comprehensive assessment of the role of explainability in medical AI and makes an ethical evaluation of what explainability means for the adoption of AI-driven tools into clinical practice.
Methods
Taking AI-based clinical decision support systems as a case in point, we adopted a multidisciplinary approach to analyze the relevance of explainability for medical AI from the technological, legal, medical, and patient perspectives. Drawing on the findings of this conceptual analysis, we then conducted an ethical assessment using the “Principles of Biomedical Ethics” by Beauchamp and Childress (autonomy, beneficence, nonmaleficence, and justice) as an analytical framework to determine the need for explainability in medical AI.
Results
Each of the domains highlights a different set of core considerations and values that are relevant for understanding the role of explainability in clinical practice. From the technological point of view, explainability has to be considered both in terms of how it can be achieved and of what is beneficial from a development perspective. When looking at the legal perspective, we identified informed consent, certification and approval as medical devices, and liability as core touchpoints for explainability. Both the medical and patient perspectives emphasize the importance of considering the interplay between human actors and medical AI. We conclude that omitting explainability in clinical decision support systems poses a threat to core ethical values in medicine and may have detrimental consequences for individual and public health.
Conclusions
To ensure that medical AI lives up to its promises, there is a need to sensitize developers, healthcare professionals, and legislators to the challenges and limitations of opaque algorithms in medical AI and to foster multidisciplinary collaboration moving forward.
Journal Article
Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system
by Mauer, Elizabeth; Kaushal, Rainu; Nosal, Sarah
in Adult; Alert fatigue; Alert Fatigue, Health Personnel
2017
Background
Although alert fatigue is blamed for high override rates in contemporary clinical decision support systems, the concept of alert fatigue is poorly defined. We tested hypotheses arising from two possible alert fatigue mechanisms: (A) cognitive overload associated with amount of work, complexity of work, and effort distinguishing informative from uninformative alerts, and (B) desensitization from repeated exposure to the same alert over time.
Methods
Retrospective cohort study using electronic health record data (both drug alerts and clinical practice reminders) from January 2010 through June 2013 from 112 ambulatory primary care clinicians. The cognitive overload hypotheses were that alert acceptance would be lower with higher workload (number of encounters, number of patients), higher work complexity (patient comorbidity, alerts per encounter), and more alerts low in informational value (repeated alerts for the same patient in the same year). The desensitization hypothesis was that, for newly deployed alerts, acceptance rates would decline after an initial peak.
Results
On average, one-quarter of drug alerts received by a primary care clinician, and one-third of clinical reminders, were repeats for the same patient within the same year. Alert acceptance was associated with work complexity and repeated alerts, but not with the amount of work. Likelihood of reminder acceptance dropped by 30% for each additional reminder received per encounter, and by 10% for each five percentage point increase in proportion of repeated reminders. The newly deployed reminders did not show a pattern of declining response rates over time, which would have been consistent with desensitization. Interestingly, nurse practitioners were 4 times as likely to accept drug alerts as physicians.
Conclusions
Clinicians became less likely to accept alerts as they received more of them, particularly more repeated alerts. There was no evidence of an effect of workload per se, or of desensitization over time for a newly deployed alert. Reducing within-patient repeats may be a promising target for reducing alert overrides and alert fatigue.
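As a worked reading of the effect sizes reported in the results (treating the drops as multiplicative per-unit effects is an assumption; the abstract does not state the model form), the declines compound across units, as in this sketch:

```python
# Illustrative arithmetic only; the scenario below is hypothetical, not the study's data.
per_extra_reminder = 0.70      # 30% drop in acceptance per additional reminder per encounter
per_5pt_repeat_share = 0.90    # 10% drop per five-point rise in the share of repeated reminders

# Hypothetical clinician: 2 extra reminders per encounter and a 10-point rise in repeat share.
relative_acceptance = per_extra_reminder**2 * per_5pt_repeat_share**(10 / 5)
print(f"acceptance relative to baseline: {relative_acceptance:.2f}")  # ~0.40
```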
Journal Article
Large Language Models lack essential metacognition for reliable medical reasoning
2025
Large Language Models have demonstrated expert-level accuracy on medical board examinations, suggesting potential for clinical decision support systems. However, their metacognitive abilities, crucial for medical decision-making, remain largely unexplored. To address this gap, we developed MetaMedQA, a benchmark incorporating confidence scores and metacognitive tasks into multiple-choice medical questions. We evaluated twelve models on dimensions including confidence-based accuracy, missing answer recall, and unknown recall. Despite high accuracy on multiple-choice questions, our study revealed significant metacognitive deficiencies across all tested models. Models consistently failed to recognize their knowledge limitations and provided confident answers even when correct options were absent. In this work, we show that current models exhibit a critical disconnect between perceived and actual capabilities in medical reasoning, posing significant risks in clinical settings. Our findings emphasize the need for more robust evaluation frameworks that incorporate metacognitive abilities, essential for developing reliable Large Language Model enhanced clinical decision support systems.
Large Language Models demonstrate expert-level accuracy in medical exams, supporting their potential inclusion in healthcare settings. Here, the authors reveal that their metacognitive abilities are underexplored, showing significant gaps in recognizing knowledge limitations, difficulties in modulating their confidence, and challenges in identifying when a problem cannot be answered due to insufficient information.
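The benchmark's metric names (confidence-based accuracy, missing answer recall, unknown recall) are not defined in this listing. A rough sketch under assumed definitions, run on hypothetical model outputs rather than MetaMedQA data, might look like this:

```python
from dataclasses import dataclass

@dataclass
class Item:
    predicted: str      # option the model chose (or "missing"/"unknown" if it abstained)
    correct: str        # ground truth; "missing"/"unknown" marks trap questions
    confidence: float   # model-reported confidence in [0, 1]

def confidence_weighted_accuracy(items):
    # Assumed definition: accuracy with each item weighted by the model's stated confidence.
    total = sum(i.confidence for i in items)
    return sum(i.confidence for i in items if i.predicted == i.correct) / total if total else 0.0

def recall_of(items, label):
    # Share of items whose true answer is `label` that the model actually flagged as such.
    relevant = [i for i in items if i.correct == label]
    return sum(i.predicted == label for i in relevant) / len(relevant) if relevant else 0.0

# Hypothetical outputs, purely for illustration.
items = [
    Item("B", "B", 0.9), Item("C", "A", 0.8),
    Item("E", "missing", 0.7), Item("missing", "missing", 0.6),
    Item("A", "unknown", 0.9), Item("unknown", "unknown", 0.5),
]
print(confidence_weighted_accuracy(items))
print(recall_of(items, "missing"), recall_of(items, "unknown"))
```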
Journal Article
Diagnosis and management of migraine in ten steps
by Sanchez del Rio, Margarita; Skorobogatykh, Kirill; Mitsikostas, Dimos D
in Clinical decision making; Headaches; Migraine
2021
Migraine is a disabling primary headache disorder that directly affects more than one billion people worldwide. Despite its widespread prevalence, migraine remains under-diagnosed and under-treated. To support clinical decision-making, we convened a European panel of experts to develop a ten-step approach to the diagnosis and management of migraine. Each step was established by expert consensus and supported by a review of current literature, and the Consensus Statement is endorsed by the European Headache Federation and the European Academy of Neurology. In this Consensus Statement, we introduce typical clinical features, diagnostic criteria and differential diagnoses of migraine. We then emphasize the value of patient centricity and patient education to ensure treatment adherence and satisfaction with care provision. Further, we outline best practices for acute and preventive treatment of migraine in various patient populations, including adults, children and adolescents, pregnant and breastfeeding women, and older people. In addition, we provide recommendations for evaluating treatment response and managing treatment failure. Lastly, we discuss the management of complications and comorbidities as well as the importance of planning long-term follow-up.
In this Consensus Statement, which is endorsed by the European Headache Federation and the European Academy of Neurology, an expert panel provides recommendations for the diagnosis and management of migraine to support clinical decision-making by general practitioners, neurologists and headache specialists.
Journal Article
Artificial intelligence, bias and clinical safety
by Denny, Joshua; Edwards, Tom; Tsaneva-Atanasova, Krasimira
in Accuracy; Algorithms; Artificial Intelligence
2019
Introduction
In medicine, artificial intelligence (AI) research is becoming increasingly focused on applying machine learning (ML) techniques to complex problems, and so allowing computers to make predictions from large amounts of patient data, by learning their own associations.1 Estimates of the impact of AI on the wider economy globally vary wildly, with a recent report suggesting a 14% effect on global gross domestic product by 2030, half of which coming from productivity improvements.2 These predictions create political appetite for the rapid development of the AI industry,3 and healthcare is a priority area where this technology has yet to be exploited.2 3 The digital health revolution described by Duggal et al 4 is already in full swing with the potential to ‘disrupt’ healthcare.
Trends in ML research
Clinical decision support systems (DSS) are in widespread use in medicine and have had most impact providing guidance on the safe prescription of medicines,12 guideline adherence, simple risk screening13 or prognostic scoring.14 These systems use predefined rules, which have predictable behaviour and are usually shown to reduce clinical error,12 although sometimes inadvertently introduce safety issues themselves.15 16 Rules-based systems have also been developed to address diagnostic uncertainty17–19 but have struggled to deal with the breadth and variety of information involved in the typical diagnostic process, a problem for which ML systems are potentially better suited. [...] of this gap, the bulk of research into medical applications of ML has focused on diagnostic decision support, often in a specific clinical domain such as radiology, using algorithms that learn to classify from training examples (supervised learning). A similar fail-safe may be needed if the system has insufficient input information or detects an ‘out-of-sample’ situation as described above.46
Medium-term issues: automation complacency
As humans, clinicians are susceptible to a range of cognitive biases which influence their ability to make accurate decisions.47 Particularly relevant is ‘confirmation bias’ in which clinicians give excessive significance to evidence which supports their presumed diagnosis and ignore evidence which refutes it.25 Automation bias48 describes the phenomenon whereby clinicians accept the guidance of an automated system and cease searching for confirmatory evidence (eg, see Tsai et al 49), perhaps transferring responsibility for decision-making onto the machine, an effect reportedly strongest when a machine advises that a case is normal.48 Automation complacency is a related concept48 in which people using imperfect DSS are least likely to catch errors if they are using a system which has been generally reliable, they are loaded with multiple concurrent tasks and they are at the end of their shift.
Journal Article