Catalogue Search | MBRL

322 Exploring the Iterative Clustering for Subtype Discovery (iKCAT) Algorithm for Robust Computer-Aided Diagnosis of Lung Cancer

by Tchoua, Roselyne , Pinzariu, Adrianna , Raicu, Daniela in Algorithms , Diagnosis , Informatics and Data Science

2024

OBJECTIVES/GOALS: With a growing interest in tailoring disease diagnosis to each individual as opposed to a “one-size-fits-all” approach, our aim is to enhance the robustness of the Iterative Clustering for Subtype Discovery (iKCAT) algorithm in characterizing lung cancer subtypes for individualized treatment. METHODS/STUDY POPULATION: Our method explores the robustness of the previously developed iKCAT algorithm. This iterative clustering method finds robust (homogeneous and differentiable) subtypes of lung nodules through iterative K-means clustering that helps classify them and leaves some data unclustered. This set of unclustered or “hard” data represents images that cannot confidently be assigned to any subtypes and may require more resources (e.g., time or radiologists) to diagnose. We explore the robustness of iKCAT across multiple feature spaces, including designed image features (which are engineered to capture some properties such as level of elongation, eccentricity and circularity), reduced designed image features using Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP). RESULTS/ANTICIPATED RESULTS: When running our experiment on the 64 image features, our results consistently carved out a single pure, homogeneous cluster over the course of 30 iKCAT runs. From an initial dataset of 1490 data points, 1430 points were left unclustered in this feature space. When conducting the 30 iKCAT runs on the PCA feature space with 10 components, we found it did not produce any distinct cluster above the defined homogeneity threshold. The 2D UMAP feature space consistently generated 8 clusters with an average homogeneity of 87.22% over 30 runs, and only left 9 points unclustered. Over 30 iKCAT runs, we identified 8 persistent clusters or subtypes, 3 mostly malignant and 5 mostly benign clusters.
DISCUSSION/SIGNIFICANCE: Through our experiment using the iKCAT algorithm, we found that iKCAT’s clustering functionality produced the most persistent results on the 2D UMAP feature space due to its high average homogeneity scores and consistency in identifying clusters/subtypes, helping improve tailored disease diagnosis.
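The abstract does not define how homogeneity is scored or what threshold is used, so the following is only a minimal sketch of the filtering step under stated assumptions: homogeneity is taken to be the fraction of a cluster's points that share its majority label, and the threshold of 0.8 is a hypothetical value. The function names are illustrative, and the K-means step that would produce `assignments` is omitted.

```python
from collections import Counter


def cluster_homogeneity(labels):
    """Fraction of points in one cluster that carry its majority label (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def split_by_homogeneity(assignments, labels, threshold=0.8):
    """Keep clusters at or above the threshold; return the rest as 'unclustered' indices.

    assignments: cluster id per data point (e.g. from a K-means pass)
    labels:      known class per data point (e.g. "M" malignant, "B" benign)
    threshold:   hypothetical homogeneity cut-off, not taken from the abstract
    """
    members = {}
    for idx, cluster_id in enumerate(assignments):
        members.setdefault(cluster_id, []).append(idx)

    kept, unclustered = {}, []
    for cluster_id, idxs in members.items():
        score = cluster_homogeneity([labels[i] for i in idxs])
        if score >= threshold:
            kept[cluster_id] = idxs
        else:
            unclustered.extend(idxs)
    return kept, sorted(unclustered)
```

In an iterative scheme of the kind described, the indices returned as unclustered would presumably be fed into the next clustering pass, and whatever remains at the end forms the “hard” set.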

Journal Article

Evaluation of Eye Gaze Dynamics During Physician-Patient-Computer Interaction in Federally Qualified Health Centers: Systematic Analysis

by Almansour, Amal , Montague, Enid , Furst, Jacob in Behavior , Cameras , Computer industry

2023

Background: Understanding the communication between physicians and patients can identify areas where they can improve and build stronger relationships. This can lead to better patient outcomes, including increased engagement, enhanced adherence to treatment plans, and a boost in trust. Objective: This study investigates eye gaze directions of physicians, patients, and computers in naturalistic medical encounters at Federally Qualified Health Centers to understand communication patterns given patients’ diverse backgrounds. The aim is to support the building and designing of health information technologies, which will facilitate the improvement of patient outcomes. Methods: Data were obtained from 77 videotaped medical encounters in 2014 from 3 Federally Qualified Health Centers in Chicago, Illinois, that included 11 physicians and 77 patients. Self-reported surveys were collected from physicians and patients. A systematic analysis approach was used to thoroughly examine and analyze the data. The dynamics of eye gazes during interactions between physicians, patients, and computers were evaluated using the lag sequential analysis method. The objective of the study was to identify significant behavior patterns from the 6 predefined patterns initiated by both physicians and patients. The association between eye gaze patterns was examined using the Pearson chi-square test and the Yule Q test. Results: The results of the lag sequential method showed that 3 out of 6 doctor-initiated gaze patterns were followed by patient-response gaze patterns. Moreover, 4 out of 6 patient-initiated patterns were significantly followed by doctor-response gaze patterns. Unlike the findings in previous studies, doctor-initiated eye gaze behavior patterns were not leading patients’ eye gaze.
Moreover, patient-initiated eye gaze behavior patterns were significant in certain circumstances, particularly when interacting with physicians. Conclusions: This study examined several physician-patient-computer interaction patterns in naturalistic settings using lag sequential analysis. The data indicated a significant influence of the patients’ gazes on physicians. The findings revealed that physicians demonstrated a higher tendency to engage with patients by reciprocating the patient’s eye gaze when the patient looked at them. However, the reverse pattern was not observed, suggesting a lack of reciprocal gaze from patients toward physicians and a tendency to not direct their gaze toward a specific object. Furthermore, patients exhibited a preference for the computer when physicians directed their eye gaze toward it.
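For readers unfamiliar with the Yule Q statistic named in the methods, it can be computed from a 2x2 transition table as Q = (ad - bc) / (ad + bc). The sketch below assumes the usual lag sequential layout, which the abstract does not spell out: `a` counts the initiating gaze followed by the target response, `b` the initiating gaze followed by any other response, `c` any other gaze followed by the target response, and `d` any other gaze followed by any other response. The function name is illustrative.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]]; ranges from -1 to 1, 0 means no association."""
    numerator = a * d - b * c
    denominator = a * d + b * c
    if denominator == 0:
        raise ValueError("Yule's Q is undefined when a*d + b*c == 0")
    return numerator / denominator


# Hypothetical counts, for illustration only (not data from the study).
print(yules_q(30, 10, 10, 30))  # 0.8
```

A value near 1 would indicate that the response tends to follow the initiating gaze, a value near -1 that it tends not to.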

Journal Article

Outcomes of ME/CFS following infectious mononucleosis: seven-year follow-up of a prospective study

by Jason, Leonard A. , Katz, Ben Z. , Furst, Jacob in Antigens , Chronic fatigue syndrome , Chronic illnesses

2026

Many individuals with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) report experiencing an infectious illness prior to disease onset. Approximately 30% of cases are linked to Epstein-Barr virus (EBV) infection resulting in Infectious Mononucleosis (IM). We examined the progression of ME/CFS following IM among a cohort of college students who were recruited before they developed the infection. This sample represented a socioeconomically and ethnically diverse population of young adults who were monitored over a 7-year period. Assessments of health status, psychological functioning, and blood biomarkers were conducted at four time points: (1) baseline, when participants were healthy and at least 6 weeks from IM onset; (2) within 6 weeks of IM diagnosis; (3) 6 months post-IM, when participants had either recovered or met criteria for ME/CFS; and (4) the 7-year follow-up. At follow-up, 81% of participants who had initially presented with severe ME/CFS continued to fulfill diagnostic criteria. In contrast, only about one-third of those with moderate or lingering symptoms at 6 months still had ME/CFS 7 years later. These findings indicate that ME/CFS following IM tends to persist over the long term, particularly among those whose illness was more severe at onset.

Journal Article

Outcome risk model development for heterogeneity of treatment effect analyses: a comparison of non-parametric machine learning methods and semi-parametric statistical methods

by Vanghelof, Joseph , Wang, Yiyang , Tchoua, Roselyne in Accuracy , Aged , Aspirin

2024

Background In randomized clinical trials, treatment effects may vary, and this possibility is referred to as heterogeneity of treatment effect (HTE). One way to quantify HTE is to partition participants into subgroups based on individual’s risk of experiencing an outcome, then measuring treatment effect by subgroup. Given the limited availability of externally validated outcome risk prediction models, internal models (created using the same dataset in which heterogeneity of treatment analyses also will be performed) are commonly developed for subgroup identification. We aim to compare different methods for generating internally developed outcome risk prediction models for subject partitioning in HTE analysis. Methods Three approaches were selected for generating subgroups for the 2,441 participants from the United States enrolled in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial. An extant proportional hazards-based outcomes predictive risk model developed on the overall ASPREE cohort of 19,114 participants was identified and was used to partition United States’ participants by risk of experiencing a composite outcome of death, dementia, or persistent physical disability. Next, two supervised non-parametric machine learning outcome classifiers, decision trees and random forests, were used to develop multivariable risk prediction models and partition participants into subgroups with varied risks of experiencing the composite outcome. Then, we assessed how the partitioning from the proportional hazard model compared to those generated by the machine learning models in an HTE analysis of the 5-year absolute risk reduction (ARR) and hazard ratio for aspirin vs. placebo in each subgroup. Cochran’s Q test was used to detect if ARR varied significantly by subgroup. 
Results The proportional hazard model was used to generate 5 subgroups using the quintiles of the estimated risk scores; the decision tree model was used to generate 6 subgroups (6 automatically determined tree leaves); and the random forest model was used to generate 5 subgroups using the quintiles of the prediction probability as risk scores. Using the semi-parametric proportional hazards model, the ARR at 5 years was 15.1% (95% CI 4.0–26.3%) for participants with the highest 20% of predicted risk. Using the random forest model, the ARR at 5 years was 13.7% (95% CI 3.1–24.4%) for participants with the highest 20% of predicted risk. The highest outcome risk group in the decision tree model also exhibited a risk reduction, but the confidence interval was wider (5-year ARR = 17.0%, 95% CI -5.4% to 39.4%). Cochran’s Q test indicated ARR varied significantly only by subgroups created using the proportional hazards model. The hazard ratio for aspirin vs. placebo therapy did not significantly vary by subgroup in any of the models. The highest risk groups for the proportional hazards model and random forest model contained 230 participants each, while the highest risk group in the decision tree model contained 41 participants. Conclusions The choice of technique for internally developed models for outcome risk subgroups influences HTE analyses. The rationale for the use of a particular subgroup determination model in HTE analyses needs to be explicitly defined based on desired levels of explainability (with feature importance), uncertainty of prediction, chances of overfitting, and assumptions regarding the underlying data structure. Replication of these analyses using data from other mid-size clinical trials may help to establish guidance for selecting an outcomes risk prediction modelling technique for HTE analyses.
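To make the subgroup ARR figures concrete, the sketch below computes an absolute risk reduction with a Wald-type 95% confidence interval from simple event proportions. This is an assumption made for illustration: the abstract does not state how the 5-year risks or intervals were estimated, and a trial analysis may well rely on time-to-event estimates instead. The function name and the counts in the usage line are hypothetical, not trial data.

```python
import math


def absolute_risk_reduction(events_control, n_control, events_treated, n_treated, z=1.96):
    """Return (ARR, lower, upper) using a Wald interval on the difference of two proportions."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    arr = risk_control - risk_treated
    # Standard error of a difference between two independent proportions.
    se = math.sqrt(
        risk_control * (1 - risk_control) / n_control
        + risk_treated * (1 - risk_treated) / n_treated
    )
    return arr, arr - z * se, arr + z * se


# Invented counts for one small subgroup, for illustration only.
arr, lower, upper = absolute_risk_reduction(40, 100, 25, 100)
print(f"ARR = {arr:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Because the standard error shrinks with subgroup size, a subgroup of 41 participants will tend to give a much wider interval than one of 230, which is consistent with the wider interval reported for the decision tree's highest-risk group.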

Journal Article

Research evolution of metal organic frameworks: A scientometric approach with human-in-the-loop

by Yuan, An , Hu, Xiaohua , Uribe-Romo, Fernando in Bibliometrics , Data analysis , Data collection

2024

This paper reports on a scientometric analysis bolstered by human-in-the-loop, domain experts, to examine the field of metal-organic frameworks (MOFs) research. Scientometric analyses reveal the intellectual landscape of a field. The study engaged MOF scientists in the design and review of our research workflow. MOF materials are an essential component in next-generation renewable energy storage and biomedical technologies. The research approach demonstrates how engaging experts, via human-in-the-loop processes, can help develop a comprehensive view of a field’s research trends, influential works, and specialized topics. A scientometric analysis was conducted, integrating natural language processing (NLP), topic modeling, and network analysis methods. The analytical approach was enhanced through a human-in-the-loop iterative process involving MOF research scientists at selected intervals. MOF researcher feedback was incorporated into our method. The data sample included 65,209 MOF research articles. Python3 and software tool VOSviewer were used to perform the analysis. The findings demonstrate the value of including domain experts in research workflows, refinement, and interpretation of results. At each stage of the analysis, the MOF researchers contributed to interpreting the results and method refinements targeting our focus on MOF research. This study identified influential works and their themes. Our findings also underscore four main MOF research directions and applications. This study is limited by the sample (articles identified and referenced by the Cambridge Structural Database) that informed our analysis. Our findings contribute to addressing the current gap in fully mapping out the comprehensive landscape of MOF research. Additionally, the results will help domain scientists target future research directions. To the best of our knowledge, the number of publications collected for analysis exceeds those of previous studies. 
This enabled us to explore a more extensive body of MOF research compared to previous studies. Another contribution of our work is the iterative engagement of domain scientists, who brought in-depth, expert interpretation to the data analysis, helping hone the study.

Journal Article

318 Discovering Subgroups with Supervised Machine Learning Models for Heterogeneity of Treatment Effect Analysis

by Vanghelof, Joseph , Tchoua, Roselyne , Shah, Raj in Aspirin , Clinical trials , Dementia disorders

2024

OBJECTIVES/GOALS: The goal of the study is to provide insights into the use of machine learning methods as a means to predict heterogeneity of treatment effect (HTE) in participants of randomized clinical trials. METHODS/STUDY POPULATION: Using data from 2,441 participants enrolled in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial of daily low-dose aspirin vs placebo in the United States, we developed multivariable risk prediction models for the composite outcome of dementia, disability, or death. We used two machine learning techniques, decision trees and random forests, to develop novel non-parametric outcomes classifiers and generate risk-based subgroups. The comparator method was an extant semi-parametric proportional hazards predictive risk model. We then assessed HTE by examining the 5-year absolute risk reduction (ARR) of aspirin vs placebo in each risk subgroup. RESULTS/ANTICIPATED RESULTS: In the random forest classifier, the ARR at 5 years in the highest risk quintile was 13.7% (95% CI 3.1% to 24.4%). For the semi-parametric proportional hazards model, the ARR in the highest risk quintile was 15.1% (95% CI 4.0% to 26.3%). These results were comparable and provide evidence of the viability of internally developed parsimonious non-parametric machine learning models for HTE analysis. The decision tree model results (5-year ARR = 17.0%, 95% CI= -5.4% to 39.4% in the highest risk subgroup) exhibited more uncertainty in the results. DISCUSSION/SIGNIFICANCE: None of the models detected significant HTE on the relative scale; there was substantial HTE on the absolute scale in three of the models. Treatment benefit on the absolute scale may be regarded as bearing greater clinical importance and may be present even in the absence of benefit on the relative scale.

Journal Article

Predicting Myalgic Encephalomyelitis/Chronic Fatigue Syndrome from Early Symptoms of COVID-19 Infection

by Jason, Leonard A , Schwabe, Jennifer , Furst, Jacob in Chronic fatigue syndrome , Cluster analysis , Coronaviruses

2023

It is still unclear why certain individuals after viral infections continue to have severe symptoms. We investigated if predicting myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) development after contracting COVID-19 is possible by analyzing symptoms from the first two weeks of COVID-19 infection. Using participant responses to the 54-item DePaul Symptom Questionnaire, we built predictive models based on a random forest algorithm using the participants’ symptoms from the initial weeks of COVID-19 infection to predict if the participants would go on to meet the criteria for ME/CFS approximately 6 months later. Early symptoms, particularly those assessing post-exertional malaise, did predict the development of ME/CFS, reaching an accuracy of 94.6%. We then investigated a minimal set of eight symptom features that could accurately predict ME/CFS. The feature reduced models reached an accuracy of 93.5%. Our findings indicated that several IOM diagnostic criteria for ME/CFS occurring during the initial weeks after COVID-19 infection predicted Long COVID and the diagnosis of ME/CFS after 6 months.

Journal Article

Computational Methods for Tracking, Quantitative Assessment, and Visualization of C. elegans Locomotory Behavior

by Kim, Hongkyun , Li, Weiyu , Tran, Huu Phuoc in Algorithms , Animal behavior , Animals

2015

The nematode Caenorhabditis elegans provides a unique opportunity to interrogate the neural basis of behavior at single neuron resolution. In C. elegans, neural circuits that control behaviors can be formulated based on its complete neural connection map, and easily assessed by applying advanced genetic tools that allow for modulation in the activity of specific neurons. Importantly, C. elegans exhibits several elaborate behaviors that can be empirically quantified and analyzed, thus providing a means to assess the contribution of specific neural circuits to behavioral output. Particularly, locomotory behavior can be recorded and analyzed with computational and mathematical tools. Here, we describe a robust single worm-tracking system, which is based on the open-source Python programming language, and an analysis system, which implements path-related algorithms. Our tracking system was designed to accommodate worms that explore a large area with frequent turns and reversals at high speeds. As a proof of principle, we used our tracker to record the movements of wild-type animals that were freshly removed from abundant bacterial food, and determined how wild-type animals change locomotory behavior over a long period of time. Consistent with previous findings, we observed that wild-type animals show a transition from area-restricted local search to global search over time. Intriguingly, we found that wild-type animals initially exhibit short, random movements interrupted by infrequent long trajectories. This movement pattern often coincides with local/global search behavior, and visually resembles Lévy flight search, a search behavior conserved across species. Our mathematical analysis showed that while most of the animals exhibited Brownian walks, approximately 20% of the animals exhibited Lévy flights, indicating that C. elegans can use Lévy flights for efficient food search. 
In summary, our tracker and analysis software will help analyze the neural basis of the alteration and transition of C. elegans locomotory behavior in a food-deprived condition.
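The abstract does not say how trajectories were classified as Brownian walks or Lévy flights. One common approach, assumed here purely for illustration, is to fit a power law to the tail of the step-length distribution: Lévy flights are usually characterised by a heavy tail with exponent between 1 and 3. The sketch uses the standard continuous maximum-likelihood estimator, 1 + n / Σ ln(x_i / x_min); the function name and the choice of `x_min` are hypothetical.

```python
import math


def power_law_exponent(step_lengths, x_min):
    """Maximum-likelihood estimate of the power-law exponent for steps >= x_min."""
    tail = [x for x in step_lengths if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one step strictly longer than x_min")
    return 1 + len(tail) / log_sum
```

An estimate alone does not establish that a track is a Lévy flight; in practice it would be compared against alternative fits, such as an exponential step-length distribution, before drawing a conclusion.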

Journal Article

DiiS: A Biomedical Data Access Framework for Aiding Data Driven Research Supporting FAIR Principles

by Rasin, Alexander , Deshpande, Priya , Furst, Jacob in Access control , Algorithms , Artificial intelligence

2019

Vast amounts of clinical and biomedical research data are produced daily. These data can help enable data driven healthcare through novel biomedical discoveries, improved diagnostics processes, epidemiology, and education. However, finding, and gaining access to these data and relevant metadata that are necessary to achieve these goals remains a challenge. Furthermore, data management and enabling widespread, albeit controlled, use poses a major challenge for data producers. These data sources are often geographically distributed, with diverse characteristics, and are controlled by a host of logistical and legal factors that require appropriate governance and access control guarantees. To overcome these obstacles, a set of guiding principles under the term FAIR has been previously introduced. The primary desirable dataset properties are thus that the data should be Findable, Accessible, Interoperable, and Reusable (FAIR). In this paper, we introduce and describe an abstract framework that models these ideal goals, and could be a step toward supporting data driven research. We also develop a system instantiated on our framework called the Data integration and indexing System (DiiS). The system provides an integration model for making healthcare data available on a global scale. Our research work describes the challenges inhibiting data producers, data stewards, and data brokers in achieving FAIR goals for sharing biomedical data. We attempt to address some of the key challenges through the proposed system. We evaluated our framework using the software architecture testing technique and also looked at how different challenges in data integration are addressed by our system. Our evaluation shows that the DiiS framework is a user friendly data integration system that would greatly contribute to biomedical research.

Journal Article

Predicting Radiological Panel Opinions Using a Panel of Machine Learning Classifiers

by Zinovev, Dmitriy , Armato III, Samuel G. , Raicu, Daniela in ensemble learning , LIDC , lung nodule classification

2009

This paper uses an ensemble of classifiers and active learning strategies to predict radiologists’ assessment of the nodules of the Lung Image Database Consortium (LIDC). In particular, the paper presents machine learning classifiers that model agreement among ratings in seven semantic characteristics: spiculation, lobulation, texture, sphericity, margin, subtlety, and malignancy. The ensemble of classifiers (which can be considered as a computer panel of experts) uses 64 image features of the nodules across four categories (shape, intensity, texture, and size) to predict semantic characteristics. The active learning begins the training phase with nodules on which radiologists’ semantic ratings agree, and incrementally learns how to classify nodules on which the radiologists do not agree. Using our proposed approach, the classification accuracy of the ensemble of classifiers is higher than the accuracy of a single classifier. In the long run, our proposed approach can be used to increase consistency among radiological interpretations by providing physicians a “second read”.

Journal Article
