Catalogue Search | MBRL

Multimodal Large Language Models in Health Care: Applications, Challenges, and Future Outlook

by Sheikh, Javaid , Renault, Max-Antoine , Damseh, Rafat in Application , Artificial intelligence , Chatbots

2024

In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future research directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data–driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.

Journal Article

Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions

by Sheikh, Javaid , Alhuwail, Dari , Aziz, Sarah in Automation , Case studies , Chatbots

2023

The integration of large language models (LLMs), such as those in the Generative Pre-trained Transformers (GPT) series, into medical education has the potential to transform learning experiences for students and elevate their knowledge, skills, and competence. Drawing on a wealth of professional and academic experience, we propose that LLMs hold promise for revolutionizing medical curriculum development, teaching methodologies, personalized study plans and learning materials, student assessments, and more. However, we also critically examine the challenges that such integration might pose by addressing issues of algorithmic bias, overreliance, plagiarism, misinformation, inequity, privacy, and copyright concerns in medical education. As we navigate the shift from an information-driven educational paradigm to an artificial intelligence (AI)–driven educational paradigm, we argue that it is paramount to understand both the potential and the pitfalls of LLMs in medical education. This paper thus offers our perspective on the opportunities and challenges of using LLMs in this context. We believe that the insights gleaned from this analysis will serve as a foundation for future recommendations and best practices in the field, fostering the responsible and effective use of AI technologies in medical education.

Journal Article

Systematic review and meta-analysis of performance of wearable artificial intelligence in detecting and predicting depression

by Sheikh, Javaid , Aziz, Sarah , Ahmed, Arfan in Accuracy , Artificial intelligence , Digital technology

2023

Given the limitations of traditional approaches, wearable artificial intelligence (AI) is one of the technologies that have been exploited to detect or predict depression. The current review aimed at examining the performance of wearable AI in detecting and predicting depression. The search sources in this systematic review were 8 electronic databases. Study selection, data extraction, and risk of bias assessment were carried out by two reviewers independently. The extracted results were synthesized narratively and statistically. Of the 1314 citations retrieved from the databases, 54 studies were included in this review. The pooled mean of the highest accuracy, sensitivity, specificity, and root mean square error (RMSE) was 0.89, 0.87, 0.93, and 4.55, respectively. The pooled mean of lowest accuracy, sensitivity, specificity, and RMSE was 0.70, 0.61, 0.73, and 3.76, respectively. Subgroup analyses revealed that there is a statistically significant difference in the highest accuracy, lowest accuracy, highest sensitivity, highest specificity, and lowest specificity between algorithms, and there is a statistically significant difference in the lowest sensitivity and lowest specificity between wearable devices. Wearable AI is a promising tool for depression detection and prediction although it is in its infancy and not ready for use in clinical practice. Until further research improves its performance, wearable AI should be used in conjunction with other methods for diagnosing and predicting depression. Further studies are needed to examine the performance of wearable AI based on a combination of wearable device data and neuroimaging data and to distinguish patients with depression from those with other diseases.

Journal Article

Leveraging LLMs and wearables to provide personalized recommendations for enhancing student well-being and academic performance through a proof of concept

by Sheikh, Javaid , Aziz, Sarah , Ahmed, Arfan in 692/700 , 692/700/1719 , Academic achievement

2025

Traditional one-size-fits-all recommendations for student well-being and academic success may not be optimal. Personalized recommendations based on individual data hold promise. This study explores the potential of Large Language Models (LLMs) to generate personalized recommendations for 12 high school students to enhance their well-being and academic performance. We analyzed data from 12 students, including Fitbit data (activity levels, sleep and stress scores), PSQI surveys (sleep quality), and school reports (grades, teacher observations). An LLM was used to analyze these data and create personalized recommendations for each student. Validator scoring assessed the clarity, actionability, and alignment of recommendations with student data. The LLM generated various recommendations based on different student data profiles (e.g., low activity levels, poor sleep quality). Validation results indicated that the recommendations were generally clear and actionable, with high ratings in both areas, though alignment with student data showed more variability, suggesting areas for improvement. This study demonstrates the potential of LLMs to generate personalized recommendations based on student data. Initial validator feedback indicates their value, although further validation is needed. Improvements are also needed at every stage, including enhancing prompts, refining models, and incorporating advanced data analytics and continuous feedback. Future research, particularly with intervention groups and potentially RCT studies, is crucial to establish causal relationships and validate the recommendations’ impact. As this technology evolves, ensuring ethical considerations and data privacy remains essential.

Journal Article

Wearable Artificial Intelligence for Detecting Anxiety: Systematic Review and Meta-Analysis

by Harfouche, Manale , Sheikh, Javaid , Aziz, Sarah in Accuracy , Advertising executives , Algorithms

2023

Anxiety disorders rank among the most prevalent mental disorders worldwide. Anxiety symptoms are typically evaluated using self-assessment surveys or interview-based assessment methods conducted by clinicians, which can be subjective, time-consuming, and challenging to repeat. Therefore, there is an increasing demand for using technologies capable of providing objective and early detection of anxiety. Wearable artificial intelligence (AI), the combination of AI technology and wearable devices, has been widely used to detect and predict anxiety disorders automatically, objectively, and more efficiently. This systematic review and meta-analysis aims to assess the performance of wearable AI in detecting and predicting anxiety. Relevant studies were retrieved by searching 8 electronic databases and backward and forward reference list checking. In total, 2 reviewers independently carried out study selection, data extraction, and risk-of-bias assessment. The included studies were assessed for risk of bias using a modified version of the Quality Assessment of Diagnostic Accuracy Studies-Revised. Evidence was synthesized using a narrative (ie, text and tables) and statistical (ie, meta-analysis) approach as appropriate. Of the 918 records identified, 21 (2.3%) were included in this review. A meta-analysis of results from 81% (17/21) of the studies revealed a pooled mean accuracy of 0.82 (95% CI 0.71-0.89). Meta-analyses of results from 48% (10/21) of the studies showed a pooled mean sensitivity of 0.79 (95% CI 0.57-0.91) and a pooled mean specificity of 0.92 (95% CI 0.68-0.98). Subgroup analyses demonstrated that the performance of wearable AI was not moderated by algorithms, aims of AI, wearable devices used, status of wearable devices, data types, data sources, reference standards, and validation methods. Although wearable AI has the potential to detect anxiety, it is not yet advanced enough for clinical use. 
Until further evidence shows an ideal performance of wearable AI, it should be used along with other clinical assessments. Wearable device companies need to develop devices that can promptly detect anxiety and identify specific time points during the day when anxiety levels are high. Further research is needed to differentiate types of anxiety, compare the performance of different wearable devices, and investigate the impact of the combination of wearable device data and neuroimaging data on the performance of wearable AI. PROSPERO CRD42023387560; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=387560.

Journal Article

PredictPTB: an interpretable preterm birth prediction model using attention-based recurrent neural networks

by Boughorbel, Sabri , Malluhi, Qutaibah , AlSaad, Rawan in Algorithms , Analysis , Attention mechanism

2022

Background: Early identification of pregnant women at risk for preterm birth (PTB), a major cause of infant mortality and morbidity, has a significant potential to improve prenatal care. However, we lack effective predictive models which can accurately forecast PTB and complement these predictions with appropriate interpretations for clinicians. In this work, we introduce a clinical prediction model (PredictPTB) which combines variables (medical codes) readily accessible through the electronic health record (EHR) to accurately predict the risk of preterm birth at 1, 3, 6, and 9 months prior to delivery. Methods: The architecture of PredictPTB employs recurrent neural networks (RNNs) to model the patient’s longitudinal EHR visits and exploits a single code-level attention mechanism to improve the predictive performance, while providing temporal code-level and visit-level explanations for the prediction results. We compare the performance of different combinations of prediction time-points, data modalities, and data windows. We also present a case study of our model’s interpretability illustrating how clinicians can gain some transparency into the predictions. Results: Leveraging a large cohort of 222,436 deliveries, comprising a total of 27,100 unique clinical concepts, our model was able to predict preterm birth with an ROC-AUC of 0.82, 0.79, 0.78, and PR-AUC of 0.40, 0.31, 0.24, at 1, 3, and 6 months prior to delivery, respectively. Results also confirm that observational data modalities (such as diagnoses) are more predictive for preterm birth than interventional data modalities (e.g., medications and procedures). Conclusions: Our results demonstrate that PredictPTB can be utilized to achieve accurate and scalable predictions for preterm birth, complemented by explanations that directly highlight evidence in the patient’s EHR timeline.

Journal Article

Artificial Intelligence in Endometriosis Imaging: A Scoping Review

by Elhenidy, Ali , Farrell, Thomas , Thomas, Rajat in Artificial intelligence , Artificial neural networks , Classification

2026

Endometriosis is a chronic gynecological condition characterized by endometrium-like tissue outside the uterus. In clinical practice, diagnosis and anatomical mapping rely heavily on imaging, yet performance remains operator- and modality-dependent. Artificial intelligence (AI) has been increasingly applied to endometriosis imaging. We conducted a PRISMA-ScR-guided scoping review of primary machine learning and deep learning studies using endometriosis-related imaging. Five databases (MEDLINE, Embase, Scopus, IEEE Xplore, and Google Scholar) were searched from 2015 to 2025. Of 413 records, 32 studies met inclusion and most were single-center, retrospective investigations in reproductive-age cohorts. Ultrasound predominated (50%), followed by laparoscopic imaging (25%) and MRI (22%); ovarian endometrioma and deep infiltrating endometriosis were the most commonly modeled phenotypes. Classification was the dominant AI task (78%), typically using convolutional neural networks (often ResNet-based), whereas segmentation (31%) and object detection (3%) were less explored. Nearly all studies relied on internal validation (97%), most frequently simple hold-out splits with heterogeneous, accuracy-focused performance reporting. The minimal AI-method quality appraisal identified frequent methodological gaps across key domains, including limited reporting of patient-level separation, leakage safeguards, calibration, and data and code availability. Overall, AI-enabled endometriosis imaging is rapidly evolving but remains early-stage; multi-center and prospective validation, standardized reporting, and clinically actionable detection–segmentation pipelines are needed before routine clinical integration.

Journal Article

Interpreting patient-Specific risk prediction using contextual decomposition of BiLSTMs: application to children with asthma

by Janahi, Ibrahim , Boughorbel, Sabri , Malluhi, Qutaibah in Algorithms , Asthma - diagnosis , Asthma - etiology

2019

Background: Predictive modeling with longitudinal electronic health record (EHR) data offers great promise for accelerating personalized medicine and better informing clinical decision-making. Recently, deep learning models have achieved state-of-the-art performance for many healthcare prediction tasks. However, deep models lack interpretability, which is integral to successful decision-making and can lead to better patient care. In this paper, we build upon the contextual decomposition (CD) method, an algorithm for producing importance scores from long short-term memory networks (LSTMs). We extend the method to bidirectional LSTMs (BiLSTMs) and use it in the context of predicting future clinical outcomes using patients’ EHR historical visits. Methods: We use a real EHR dataset comprising 11071 patients to evaluate and compare CD interpretations from LSTM and BiLSTM models. First, we train LSTM and BiLSTM models for the task of predicting which pre-school children with respiratory system-related complications will have asthma at school-age. After that, we conduct quantitative and qualitative analysis to evaluate the CD interpretations produced by the contextual decomposition of the trained models. In addition, we develop an interactive visualization to demonstrate the utility of CD scores in explaining predicted outcomes. Results: Our experimental evaluation demonstrates that whenever a clear visit-level pattern exists, the models learn that pattern and the contextual decomposition can appropriately attribute the prediction to the correct pattern. In addition, the results confirm that the CD scores agree to a large extent with the importance scores generated using logistic regression coefficients. Our main insight was that rather than interpreting the attribution of individual visits to the predicted outcome, we could instead attribute a model’s prediction to a group of visits.
Conclusion: We presented quantitative and qualitative evidence that CD interpretations can explain patient-specific predictions using CD attributions of individual visits or a group of visits.

Journal Article

Serious Games for Learning Among Older Adults With Cognitive Impairment: Systematic Review and Meta-analysis

by Sheikh, Javaid , Aziz, Sarah , Abuelezz, Israa in Adults , Aged , Aging

2023

Learning disabilities are among the major cognitive impairments caused by aging. Among the interventions used to improve learning among older adults are serious games, which are participative electronic games designed for purposes other than entertainment. Although some systematic reviews have examined the effectiveness of serious games on learning, they are undermined by some limitations, such as focusing on older adults without cognitive impairments, focusing on particular types of serious games, and not considering the comparator type in the analysis. This review aimed to evaluate the effectiveness of serious games on verbal and nonverbal learning among older adults with cognitive impairment. Eight electronic databases were searched to retrieve studies relevant to this systematic review and meta-analysis. Furthermore, we went through the studies that cited the included studies and screened the reference lists of the included studies and relevant reviews. Two reviewers independently checked the eligibility of the identified studies, extracted data from the included studies, and appraised their risk of bias and the quality of the evidence. The results of the included studies were summarized using a narrative synthesis or meta-analysis, as appropriate. Of the 559 citations retrieved, 11 (2%) randomized controlled trials (RCTs) ultimately met all eligibility criteria for this review. A meta-analysis of 45% (5/11) of the RCTs revealed that serious games are effective in improving verbal learning among older adults with cognitive impairment in comparison with no or sham interventions (P=.04), and serious games do not have a different effect on verbal learning between patients with mild cognitive impairment and those with Alzheimer disease (P=.89). A meta-analysis of 18% (2/11) of the RCTs revealed that serious games are as effective as conventional exercises in promoting verbal learning (P=.98). 
We also found that serious games outperformed no or sham interventions (4/11, 36%; P=.03) and conventional cognitive training (2/11, 18%; P<.001) in enhancing nonverbal learning. Serious games have the potential to enhance verbal and nonverbal learning among older adults with cognitive impairment. However, our findings remain inconclusive because of the low quality of evidence, the small sample size in most of the meta-analyzed studies (6/8, 75%), and the paucity of studies included in the meta-analyses. Thus, until further convincing proof of their effectiveness is offered, serious games should be used to supplement current interventions for verbal and nonverbal learning rather than replace them entirely. Further studies are needed to compare serious games with conventional cognitive training and conventional exercises, as well as different types of serious games, different platforms, different intervention periods, and different follow-up periods. PROSPERO CRD42022348849; https://tinyurl.com/y6yewwfa.

Journal Article

Predicting mood swings in women of reproductive age using machine learning on metabolic, menstrual, and lifestyle indicators

by El Rayess, Farah , Thomas, Rajat , AlSaad, Rawan in Accuracy , Acne , Algorithms

2026

Mood swings in reproductive-age women arise from interacting hormonal, metabolic, and lifestyle factors, yet scalable screening tools remain limited. Artificial intelligence (AI) and machine learning (ML) approaches offer the potential to integrate diverse predictors and enable early, data-driven risk stratification. This study aimed to evaluate the performance of ML algorithms in predicting mood swings among reproductive-age women using menstrual, metabolic, and lifestyle survey data and to identify the most influential predictors. The study cohort included 465 reproductive-age women, with fifteen survey-derived features categorized into metabolic (e.g., BMI, recent weight gain, polycystic ovary syndrome), menstrual (regular periods, period length), lifestyle (fast-food consumption, daily exercise), symptom burden score, and demographic (age) categories. We compared five ML models (Random Forest, SVM, Gradient Boosting, LightGBM, and CatBoost) using precision, recall, F1, accuracy, and AUCPR metrics. Feature importance was assessed with permutation feature importance (PFI) and Shapley additive explanations (SHAP). Across models, the highest values achieved were precision 0.83, recall 0.91, accuracy 0.74, and AUCPR 0.87. PFI and SHAP converged on symptom burden as the dominant predictor, with additional signal from lifestyle indicators (higher fast-food consumption, lower daily exercise) and metabolic/dermatologic markers. Menstrual regularity/length contributed minimally; age showed a modest inverse association. Low-cost, self-reported features can support ML prediction of mood swings in reproductive-age women with good performance. Findings motivate prospective validation, dynamic prediction with wearables, and evaluation of AI-based approaches for early detection of women's mental health concerns in community and primary care settings.

Journal Article
