Catalogue Search | MBRL

Large language models in health care: Development, applications, and challenges

by Yang, Rui , Lu, Wei , Tan, Ting Fang in Artificial intelligence , Chatbots , Data mining

2023

Recently, the emergence of ChatGPT, an artificial intelligence chatbot developed by OpenAI, has attracted significant attention due to its exceptional language comprehension and content generation capabilities, highlighting the immense potential of large language models (LLMs). LLMs have become a burgeoning hotspot across many fields, including health care. Within health care, LLMs may be classified into LLMs for the biomedical domain and LLMs for the clinical domain based on the corpora used for pre‐training. In the last 3 years, these domain‐specific LLMs have demonstrated exceptional performance on multiple natural language processing tasks, surpassing the performance of general LLMs as well. This not only emphasizes the significance of developing dedicated LLMs for the specific domains, but also raises expectations for their applications in health care. We believe that LLMs may be used widely in preconsultation, diagnosis, and management, with appropriate development and supervision. Additionally, LLMs hold tremendous promise in assisting with medical education, medical writing and other related applications. Likewise, health care systems must recognize and address the challenges posed by LLMs.

Journal Article

ChatGPT in ophthalmology: the dawn of a new era?

by Ting, Daniel Shu Wei , Tan, Ting Fang , Ting, Darren Shu Jeng in 692/308 , 692/700 , 706/648/160

2024

Journal Article

Vision-language large learning model, GPT4V, accurately classifies the Boston Bowel Preparation Scale score

by Tan, Chee Kiat , Lim, Daniel Yan Zheng , Ong, Jasmine Chiat Ling in Application programming interface , Artificial Intelligence , Automation

2025

Introduction Large learning models (LLMs) such as GPT are advanced artificial intelligence (AI) models. Originally developed for natural language processing, they have been adapted for multi-modal tasks with vision-language input. One clinically relevant task is scoring the Boston Bowel Preparation Scale (BBPS). While traditional AI techniques use large amounts of data for training, we hypothesise that a vision-language LLM can perform this task with fewer examples. Methods We used the GPT4V vision-language LLM developed by OpenAI, via the OpenAI application programming interface. A standardised prompt instructed the model to grade BBPS with contextual references extracted from the original paper describing the BBPS by Lai et al (GIE 2009). Performance was tested on the HyperKvasir dataset, an open dataset for automated BBPS grading. Results Of 1794 images, GPT4V returned valid results for 1772 (98%). It had an accuracy of 0.84 for two-class classification (BBPS 0–1 vs 2–3) and 0.74 for four-class classification (BBPS 0, 1, 2, 3). Macro-averaged F1 scores were 0.81 and 0.63, respectively. Qualitatively, most errors arose from misclassification of BBPS 1 as 2. These results compare favourably with current methods using large amounts of training data, which achieve an accuracy in the range of 0.8–0.9. Conclusion This study provides proof-of-concept that a vision-language LLM is able to perform BBPS classification accurately, without large training datasets. This represents a paradigm shift in AI classification methods in medicine, where many diseases lack sufficient data to train traditional AI models. An LLM with appropriate examples may be used in such cases.
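As a rough illustration of the metrics quoted in this abstract, the sketch below computes accuracy and macro-averaged F1 for four-class BBPS labels and for the two-class grouping (BBPS 0–1 vs 2–3). The function names and the toy labels are hypothetical and are not taken from the study; this is not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def to_two_class(scores):
    """Collapse BBPS 0-1 into class 0 and BBPS 2-3 into class 1."""
    return [0 if s <= 1 else 1 for s in scores]


# Toy labels (invented for illustration only)
y_true = [0, 1, 1, 2, 2, 3, 3, 3]
y_pred = [0, 1, 2, 2, 2, 3, 3, 2]

print("four-class accuracy:", accuracy(y_true, y_pred))
print("four-class macro F1:", round(macro_f1(y_true, y_pred), 3))
print("two-class accuracy:", accuracy(to_two_class(y_true), to_two_class(y_pred)))
print("two-class macro F1:", round(macro_f1(to_two_class(y_true), to_two_class(y_pred)), 3))
```

Macro averaging weights every class equally, which is one reason a macro F1 can sit well below accuracy when a less frequent class (here, a low BBPS score) is often misclassified.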

Journal Article

Comparison of the Quality of Discharge Letters Written by Large Language Models and Junior Clinicians: Single-Blinded Study

by Lim, Daniel Yan Zheng , Tan, Ting Fang , Ong, Jasmine Chiat Ling in Archives & records , Artificial intelligence , Biopsy

2024

Discharge letters are a critical component in the continuity of care between specialists and primary care providers. However, these letters are time-consuming to write, underprioritized in comparison to direct clinical care, and are often tasked to junior doctors. Prior studies assessing the quality of discharge summaries written for inpatient hospital admissions show inadequacies in many domains. Large language models such as GPT have the ability to summarize large volumes of unstructured free text such as electronic medical records and have the potential to automate such tasks, providing time savings and consistency in quality. The aim of this study was to assess the performance of GPT-4 in generating discharge letters written from urology specialist outpatient clinics to primary care providers and to compare their quality against letters written by junior clinicians. Fictional electronic records were written by physicians simulating 5 common urology outpatient cases with long-term follow-up. Records comprised simulated consultation notes, referral letters and replies, and relevant discharge summaries from inpatient admissions. GPT-4 was tasked to write discharge letters for these cases with a specified target audience of primary care providers who would be continuing the patient's care. Prompts were written for safety, content, and style. Concurrently, junior clinicians were provided with the same case records and instructional prompts. GPT-4 output was assessed for instances of hallucination. A blinded panel of primary care physicians then evaluated the letters using a standardized questionnaire tool. GPT-4 outperformed human counterparts in information provision (mean 4.32, SD 0.95 vs 3.70, SD 1.27; P=.03) and had no instances of hallucination. 
There were no statistically significant differences in the mean clarity (4.16, SD 0.95 vs 3.68, SD 1.24; P=.12), collegiality (4.36, SD 1.00 vs 3.84, SD 1.22; P=.05), conciseness (3.60, SD 1.12 vs 3.64, SD 1.27; P=.71), follow-up recommendations (4.16, SD 1.03 vs 3.72, SD 1.13; P=.08), and overall satisfaction (3.96, SD 1.14 vs 3.62, SD 1.34; P=.36) between the letters generated by GPT-4 and humans, respectively. Discharge letters written by GPT-4 had equivalent quality to those written by junior clinicians, without any hallucinations. This study provides a proof of concept that large language models can be useful and safe tools in clinical documentation.

Journal Article

The next generation of healthcare ecosystem in the metaverse

by Gunasekeran, Dinesh Visva , RaviChandran, Narrendar , Ting, Daniel S W in Avatars , Delivery of Health Care , Ecosystem

2024

The Metaverse has gained wide attention for being the application interface for the next generation of Internet. The potential of the Metaverse is growing, as Web 3·0 development and adoption continues to advance medicine and healthcare. We define the next generation of interoperable healthcare ecosystem in the Metaverse. We examine the existing literature regarding the Metaverse, explain the technology framework to deliver an immersive experience, along with a technical comparison of legacy and novel Metaverse platforms that are publicly released and in active use. The potential applications of different features of the Metaverse, including avatar-based meetings, immersive simulations, and social interactions are examined with different roles from patients to healthcare providers and healthcare organizations. Present challenges in the development of the Metaverse healthcare ecosystem are discussed, along with potential solutions including capabilities requiring technological innovation, use cases requiring regulatory supervision, and sound governance. This proposed concept and framework of the Metaverse could potentially redefine the traditional healthcare system and enhance digital transformation in healthcare. Similar to AI technology at the beginning of this decade, real-world development and implementation of these capabilities are relatively nascent. Further pragmatic research is needed for the development of an interoperable healthcare ecosystem in the Metaverse.

Journal Article

Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study

by Chong, Yu Jeat , Thirunavukarasu, Arun James , Mahmood, Shathar in Accuracy , Benchmarks , Biology and Life Sciences

2024

Large language models (LLMs) underlie remarkable recent advances in natural language processing, and they are beginning to be applied in clinical contexts. We aimed to evaluate the clinical potential of state-of-the-art LLMs in ophthalmology using a more robust benchmark than raw examination scores. We trialled GPT-3.5 and GPT-4 on 347 ophthalmology questions before GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training were trialled on a mock examination of 87 questions. Performance was analysed with respect to question subject and type (first order recall and higher order reasoning). Masked ophthalmologists graded the accuracy, relevance, and overall preference of GPT-3.5 and GPT-4 responses to the same questions. The performance of GPT-4 (69%) was superior to GPT-3.5 (48%), LLaMA (32%), and PaLM 2 (56%). GPT-4 compared favourably with expert ophthalmologists (median 76%, range 64–90%), ophthalmology trainees (median 59%, range 57–63%), and unspecialised junior doctors (median 43%, range 41–44%). Low agreement between LLMs and doctors reflected idiosyncratic differences in knowledge and reasoning with overall consistency across subjects and types (p > 0.05). All ophthalmologists preferred GPT-4 responses over GPT-3.5 and rated the accuracy and relevance of GPT-4 as higher (p < 0.05). LLMs are approaching expert-level knowledge and reasoning skills in ophthalmology. In view of the comparable or superior performance to trainee-grade ophthalmologists and unspecialised junior doctors, state-of-the-art LLMs such as GPT-4 may provide useful medical advice and assistance where access to expert ophthalmologists is limited. Clinical benchmarks provide useful assays of LLM capabilities in healthcare before clinical trials can be designed and conducted.

Journal Article

Lesion detection in age-related macular degeneration with a multi-modal imaging and machine learning approach

by Wong, Damon , Tan, Anna C. S. , Tan, Ting Fang in Ablation , Age related diseases , Aged

2025

Background Age-related macular degeneration is a leading cause of central vision loss, and assessing visual function with microperimetry can be time-consuming and tiring for patients. Targeting regions corresponding to worsening acute-stage retinal lesions may reduce test durations and patient fatigue. Results We developed a machine-learning approach using multi-modal imaging data to differentiate lesional regions from healthy retinal areas. Our dataset included 344,003 regions extracted from color fundus photographs, infrared fundus images, optical coherence tomography, and optical coherence tomography angiography images. A gradient-boosted tree-ensemble model was trained on this data and achieved an area under the receiver operating characteristic curve of 0.95 in detecting end-stage lesions in chronic age-related macular degeneration. Conclusions The proposed method effectively detects lesions associated with age-related macular degeneration using multi-modal imaging and machine learning. This approach offers a potential solution for creating targeted microperimetry test patterns, which can reduce testing time and patient fatigue, thereby enhancing the clinical assessment of visual function in affected patients.

Journal Article

A translational perspective towards clinical AI fairness

by Ong, Marcus Eng Hock , Teixayavong, Salinelat , Ong, Jasmine Chiat Ling in Age groups , Artificial intelligence , Bias

2023

Artificial intelligence (AI) has demonstrated the ability to extract insights from data, but the fairness of such data-driven insights remains a concern in high-stakes fields. Despite extensive developments, issues of AI fairness in clinical contexts have not been adequately addressed. A fair model is normally expected to perform equally across subgroups defined by sensitive variables (e.g., age, gender/sex, race/ethnicity, socio-economic status, etc.). Various fairness measurements have been developed to detect differences between subgroups as evidence of bias, and bias mitigation methods are designed to reduce the differences detected. This perspective of fairness, however, is misaligned with some key considerations in clinical contexts. The set of sensitive variables used in healthcare applications must be carefully examined for relevance and justified by clear clinical motivations. In addition, clinical AI fairness should closely investigate the ethical implications of fairness measurements (e.g., potential conflicts between group- and individual-level fairness) to select suitable and objective metrics. Generally defining AI fairness as “equality” is not necessarily reasonable in clinical settings, as differences may have clinical justifications and do not indicate biases. Instead, “equity” would be an appropriate objective of clinical AI fairness. Moreover, clinical feedback is essential to developing fair and well-performing AI models, and efforts should be made to actively involve clinicians in the process. The adaptation of AI fairness towards healthcare is not self-evident due to misalignments between technical developments and clinical considerations. Multidisciplinary collaboration between AI researchers, clinicians, and ethicists is necessary to bridge the gap and translate AI fairness into real-life benefits.
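The subgroup-difference measurements this abstract refers to can be pictured with a small sketch. It computes the true positive rate per subgroup and the largest gap between subgroups, which is one common "equality"-style check. The function names, the choice of metric, and the toy data are assumptions for illustration and do not come from the article, which argues that such gaps need clinical interpretation before they are read as bias.

```python
from collections import defaultdict


def tpr_by_group(y_true, y_pred, groups):
    """True positive rate (sensitivity) within each subgroup."""
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            positives[g] += 1
            if p == 1:
                true_pos[g] += 1
    # Only groups with at least one positive case have a defined rate
    return {g: true_pos[g] / positives[g] for g in positives}


def max_gap(rates):
    """Largest difference in the metric between any two subgroups."""
    return max(rates.values()) - min(rates.values())


# Toy data (invented): binary outcome, binary prediction, subgroup label
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B"]

rates = tpr_by_group(y_true, y_pred, groups)
print(rates)           # per-group sensitivity
print(max_gap(rates))  # gap that a bias-mitigation method would try to reduce
```

A non-zero gap here is only a prompt for investigation. Whether it reflects bias or a clinically justified difference is the question the article raises.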

Journal Article

Defining the structure–function relationship of specific lesions in early and advanced age-related macular degeneration

by Wong, Damon , Tan, Ting Fang , Peterson, Claire L. in 692/699/3161/3175 , 692/700/1421 , Age related macular degeneration

2024

The objective of this study is to define structure–function relationships of pathological lesions related to age-related macular degeneration (AMD) using microperimetry and multimodal retinal imaging. We conducted a cross-sectional study of 87 patients with AMD (30 eyes with early and intermediate AMD and 110 eyes with advanced AMD), compared to 33 normal controls (66 eyes) recruited from a single tertiary center. All participants had enface and cross-sectional optical coherence tomography (Heidelberg HRA-2), OCT angiography, color and infra-red (IR) fundus and microperimetry (MP) (Nidek MP-3) performed. Multimodal images were graded for specific AMD pathological lesions. A custom marking tool was used to demarcate lesion boundaries on corresponding enface IR images, and subsequently superimposed onto MP color fundus photographs with retinal sensitivity points (RSP). The resulting overlay was used to correlate pathological structural changes to zonal functional changes. Mean age of patients with early/intermediate AMD, advanced AMD and controls were 73 (SD = 8.2), 70.8 (SD = 8), and 65.4 (SD = 7.7) years respectively. Mean retinal sensitivity (MRS) of both early/intermediate (23.1 dB; SD = 5.5) and advanced AMD (18.1 dB; SD = 7.8) eyes were significantly worse than controls (27.8 dB, SD = 4.3) (p < 0.01). Advanced AMD eyes had significantly more unstable fixation (70%; SD = 63.6), larger mean fixation area (3.9 mm²; SD = 3.0), and focal fixation point further away from the fovea (0.7 mm; SD = 0.8), than controls (29%; SD = 43.9; 2.6 mm²; SD = 1.9; 0.4 mm; SD = 0.3) (p ≤ 0.01). Notably, 22 fellow eyes of AMD eyes (25.7 dB; SD = 3.0), with no AMD lesions, still had lower MRS than controls (p = 0.04). For specific AMD-related lesions, end-stage changes such as fibrosis (5.5 dB, SD = 5.4 dB) and atrophy (6.2 dB, SD = 7.0 dB) had the lowest MRS; while drusen and pigment epithelial detachment (17.7 dB, SD = 8.0 dB) had the highest MRS. 
Peri-lesional areas (20.2 dB, SD = 7.6 dB) and surrounding structurally normal areas (22.2 dB, SD = 6.9 dB) of the retina with no AMD lesions still had lower MRS compared to controls (27.8 dB, SD = 4.3 dB) (p < 0.01). Our detailed topographic structure–function correlation identified specific AMD pathological changes associated with a poorer visual function. This can provide an added value to the assessment of visual function to optimize treatment outcomes to existing and potentially future novel therapies.

Journal Article

Novel artificial intelligence algorithms for diabetic retinopathy and diabetic macular edema

by Tan, Tien-En , Yao, Jie , Lim, Joshua in Advances in AI in Ophthalmology , Algorithms , Artificial intelligence

2024

Background Diabetic retinopathy (DR) and diabetic macular edema (DME) are major causes of visual impairment that challenge global vision health. New strategies are needed to tackle these growing global health problems, and the integration of artificial intelligence (AI) into ophthalmology has the potential to revolutionize DR and DME management to meet these challenges. Main text This review discusses the latest AI-driven methodologies in the context of DR and DME in terms of disease identification, patient-specific disease profiling, and short-term and long-term management. This includes current screening and diagnostic systems and their real-world implementation, lesion detection and analysis, disease progression prediction, and treatment response models. It also highlights the technical advancements that have been made in these areas. Despite these advancements, there are obstacles to the widespread adoption of these technologies in clinical settings, including regulatory and privacy concerns, the need for extensive validation, and integration with existing healthcare systems. We also explore the disparity between the potential of AI models and their actual effectiveness in real-world applications. Conclusion AI has the potential to revolutionize the management of DR and DME, offering more efficient and precise tools for healthcare professionals. However, overcoming challenges in deployment, regulatory compliance, and patient privacy is essential for these technologies to realize their full potential. Future research should aim to bridge the gap between technological innovation and clinical application, ensuring AI tools integrate seamlessly into healthcare workflows to enhance patient outcomes.

Journal Article
