61 result(s) for "Riddle, Patricia"
Medication adherence prediction through temporal modelling in cardiovascular disease management
Background Chronic conditions place a considerable burden on modern healthcare systems. In New Zealand and worldwide, cardiovascular disease (CVD) affects a significant proportion of the population and is the leading cause of death. Like other chronic diseases, the course of cardiovascular disease is usually prolonged and its management necessarily long-term. Although long-term medication is highly effective in reducing CVD risk, non-adherence continues to be a longstanding challenge in healthcare delivery. The study investigates the benefits of integrating patient history and assesses the contribution of explicitly temporal models to medication adherence prediction in the context of lipid-lowering therapy. Methods Data from a CVD risk assessment tool are linked to routinely collected national and regional data sets, including pharmaceutical dispensing, hospitalisation, lab test results and deaths. The study extracts a sub-cohort for analysis from 564,180 patients who had a primary CVD risk assessment. Based on community pharmaceutical dispensing records, a proportion of days covered (PDC) ≥ 80% is used as the threshold for adherence. Two years (8 quarters) of patient history before the CVD risk assessment is used as the observation window to predict patient adherence in the subsequent 5 years (20 quarters). The predictive performance of the temporal deep learning models long short-term memory (LSTM) and simple recurrent neural networks (Simple RNN) is compared against the non-temporal models multilayer perceptron (MLP), ridge classifier (RC) and logistic regression (LR). Further, the study investigates the effect of lengthening the observation window on the task of adherence prediction. Results Temporal models that use sequential data outperform non-temporal models, with LSTM producing the best predictive performance, achieving a ROC AUC of 0.805. A performance gap is observed between models that can discover non-linear interactions between predictor variables and their linear counterparts, with neural network (NN) based models significantly outperforming linear models. Additionally, the predictive advantage of temporal models becomes more pronounced when the length of the observation window is increased. Conclusion The findings of the study provide evidence that using deep temporal models to integrate patient history into adherence prediction is advantageous. In particular, the RNN architecture LSTM significantly outperforms all other model comparators.
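The adherence label above rests on the proportion-of-days-covered measure. As an illustrative sketch (the function name and the record layout are assumptions, not the study's code), PDC over an observation window can be computed from dispensing records like this:

```python
def proportion_of_days_covered(dispensings, window_days):
    """Fraction of days in the window covered by at least one supply.

    dispensings: list of (start_day, days_supplied) tuples, with days
    0-indexed from the start of the observation window.
    """
    covered = set()
    for start, supply in dispensings:
        # Count each covered day once; overlapping supplies don't double-count.
        for day in range(start, min(start + supply, window_days)):
            covered.add(day)
    return len(covered) / window_days

# Two 90-day supplies with a 30-day gap: 180 covered days out of 365.
pdc = proportion_of_days_covered([(0, 90), (120, 90)], 365)
adherent = pdc >= 0.80  # the study's adherence threshold (PDC >= 80%)
```

A patient is then labelled adherent in a period when the computed PDC meets the 80% threshold.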
Combatting over-specialization bias in growing chemical databases
Background Predicting in advance the behavior of new chemical compounds can support the design process of new products by directing research toward the most promising candidates and ruling out others. Such predictive models can be data-driven using machine learning, or based on researchers' experience, and depend on the collection of past results. In either case, models (or researchers) can only make reliable assumptions about compounds that are similar to what they have seen before. Therefore, continued usage of these predictive models shapes the dataset and causes a continuous specialization that shrinks the applicability domain of all models trained on this dataset in the future, increasingly harming model-based exploration of the space. Proposed solution In this paper, we propose Cancels (CounterActiNg Compound spEciaLization biaS), a technique that helps to break the dataset specialization spiral. Aiming for a smooth distribution of the compounds in the dataset, we identify areas in the space that fall short and suggest additional experiments that help bridge the gap. Thereby, we generally improve the dataset quality in an entirely unsupervised manner and create awareness of potential flaws in the data. Cancels does not aim to cover the entire compound space and hence retains a desirable degree of specialization to a specified research domain. Results An extensive set of experiments on the use case of biodegradation pathway prediction not only reveals that the bias spiral can indeed be observed but also that Cancels produces meaningful results. Additionally, we demonstrate that mitigating the observed bias is crucial, as it not only interrupts the continuous specialization process but also significantly improves a predictor's performance while reducing the number of required experiments. Overall, we believe that Cancels can support researchers in their experimentation process, helping them not only to better understand their data and its potential flaws, but also to grow the dataset in a sustainable way. All code is available at github.com/KatDost/Cancels.
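The goal of smoothing the compound distribution can be illustrated with a deliberately simplified sketch: bin compounds into grid cells over two descriptor dimensions and flag sparse cells adjacent to populated ones as candidate regions for new experiments. The actual method is more sophisticated; everything here (function name, grid binning, thresholds) is a hypothetical illustration, not Cancels itself:

```python
from collections import Counter

def suggest_gap_cells(points, cell=1.0, min_count=2):
    """Flag sparsely populated grid cells adjacent to populated ones.

    points: list of (x, y) descriptor coordinates. Cells holding fewer
    than min_count compounds but neighbouring a populated cell are
    returned as suggested regions for additional experiments.
    """
    counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    gaps = set()
    for (cx, cy) in counts:
        # Examine the 3x3 neighbourhood of every populated cell.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nb = (cx + dx, cy + dy)
                if counts.get(nb, 0) < min_count:
                    gaps.add(nb)
    return sorted(gaps)
```

In a real pipeline the returned cells would be mapped back to descriptor ranges and translated into concrete compound suggestions.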
Forecasting severe respiratory disease hospitalizations using machine learning algorithms
Background Forecasting models predicting trends in hospitalization rates have the potential to inform hospital management during seasonal epidemics of respiratory diseases and the associated surges caused by acute hospital admissions. Hospital bed requirements for elective surgery could be better planned if it were possible to foresee upcoming peaks in severe respiratory illness admissions. Forecasting models can also guide the use of intervention strategies to decrease the spread of respiratory pathogens and thus prevent local health system overload. In this study, we explore the capability of forecasting models to predict the number of hospital admissions in Auckland, New Zealand, within a three-week time horizon. Furthermore, we evaluate probabilistic forecasts and the impact on model performance of integrating laboratory data describing the circulation of respiratory viruses. Methods The dataset used for this exploration results from active hospital surveillance, in which the World Health Organization Severe Acute Respiratory Infection (SARI) case definition was consistently used. This research-nurse-led surveillance has been implemented in two public hospitals in Auckland and provides systematic laboratory testing of SARI patients for nine respiratory viruses, including influenza, respiratory syncytial virus, and rhinovirus. The forecasting strategies used comprise automatic machine learning, one of the most recent generative pre-trained transformers, and established artificial neural network algorithms capable of univariate and multivariate forecasting. Results We found that machine learning models compute more accurate forecasts than naïve seasonal models. Furthermore, we analyzed the impact of reducing the temporal resolution of forecasts, which decreased the model error of point forecasts and made probabilistic forecasting more reliable. An additional analysis that used the laboratory data revealed strong season-to-season variations in the incidence of respiratory viruses and in how this correlates with total hospitalization cases. These variations could explain why it was not possible to improve forecasts by integrating these data. Conclusions Active SARI surveillance and consistent data collection over time enable these data to be used to predict hospital bed utilization. These findings show the potential of machine learning to support information systems for proactive hospital management.
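Two ingredients of the evaluation above, coarsening the temporal resolution and the naïve seasonal baseline, can be sketched minimally. Function names and the weekly window are illustrative assumptions, not the study's code:

```python
def coarsen(daily, window=7):
    """Aggregate a daily count series into coarser (e.g. weekly) totals;
    a trailing partial bin is dropped."""
    usable = len(daily) - len(daily) % window
    return [sum(daily[i:i + window]) for i in range(0, usable, window)]

def seasonal_naive_forecast(history, season, horizon):
    """Naive seasonal baseline: repeat the last fully observed season."""
    last = history[-season:]
    return [last[i % season] for i in range(horizon)]
```

Point forecasts on the coarsened series are compared against this baseline; errors tend to shrink because week-level totals average out day-to-day noise.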
Solving the traveling tournament problem with iterative-deepening A*
This work presents an iterative-deepening A* (IDA*) based approach to the traveling tournament problem (TTP). The TTP is a combinatorial optimization problem which abstracts the Major League Baseball schedule. IDA* is able to find optimal solutions to this problem, with performance improvements coming from the incorporation of various past concepts, including disjoint pattern databases, symmetry breaking, and parallelization, along with the new ideas of subtree skipping, forced deepening, and elite paths, which help to reduce the search space. The results of this work show that an IDA*-based approach can find known optimal solutions to most TTP instances much faster than past approaches. More importantly, it has been able to optimally solve two larger instances that had been unsolved since the problem's introduction in 2001. In addition, a new problem set called GALAXY is introduced, using a 3D space to create a challenging problem set.
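The core of iterative-deepening A* is independent of the TTP specifics: a depth-first search bounded by f = g + h, where the bound is raised to the smallest f-value that exceeded it on the previous pass. A generic sketch, without the paper's pattern databases, symmetry breaking, or parallelization:

```python
import math

def ida_star(start, h, neighbours, is_goal):
    """Generic IDA*: repeated depth-first searches bounded by f = g + h.

    h(node) must be an admissible heuristic; neighbours(node) yields
    (cost, next_node) pairs. Returns an optimal-cost path or None.
    """
    bound = h(start)
    path = [start]

    def search(g):
        node = path[-1]
        f = g + h(node)
        if f > bound:
            return f          # report the overflowing f-value upward
        if is_goal(node):
            return True
        minimum = math.inf
        for cost, nxt in neighbours(node):
            if nxt in path:   # cheap cycle check
                continue
            path.append(nxt)
            t = search(g + cost)
            if t is True:
                return True
            minimum = min(minimum, t)
            path.pop()
        return minimum

    while True:
        t = search(0)
        if t is True:
            return path
        if t is math.inf:
            return None       # search space exhausted, no solution
        bound = t             # raise bound to the smallest exceeding f
```

On a toy graph where A reaches C either directly (cost 3) or via B (cost 2), the search returns the cheaper path through B.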
Analyzing and repairing concept drift adaptation in data stream classification
Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model might capture these changes; however, this is usually not practical. Data stream methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system achieves higher performance than baseline methods, while also better capturing the dynamics of the stream. Evaluation on an air quality inference task shows that AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptations made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptations made by AiRStream and changes in meteorological conditions.
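The adaptation-error problem can be made concrete with a toy sketch of context reuse: after a detected drift, pick the stored model that best fits the most recent labelled window, falling back to training a new one. This is a hypothetical illustration of the general idea, not AiRStream's repair algorithm:

```python
def choose_model(pool, recent_window, train_new):
    """After a detected drift, prefer the stored model with the lowest
    error on the most recent labelled window; otherwise train fresh.

    pool: stored classifiers (callables x -> label).
    recent_window: list of (x, y) labelled observations.
    train_new: callable that fits a new model on the window.
    """
    def err(model):
        return sum(model(x) != y for x, y in recent_window) / len(recent_window)

    best = min(pool, key=err, default=None)
    # Reuse only when the stored model beats chance on a binary task;
    # a misidentified context would fail this check and trigger retraining.
    if best is not None and err(best) < 0.5:
        return best
    return train_new(recent_window)
```

Choosing the wrong stored model here is exactly the kind of adaptation error a repair algorithm must detect and correct.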
Towards Knowledgeable Supervised Lifelong Learning Systems
Learning a sequence of tasks is a long-standing challenge in machine learning. This setting applies to learning systems that observe examples of a range of tasks at different points in time. A learning system should become more knowledgeable as more related tasks are learned. Although the problem of learning sequentially was acknowledged for the first time decades ago, research in this area has been rather limited. Research in transfer learning, multitask learning, metalearning and deep learning has studied some of the challenges of these kinds of systems. Recent research in lifelong machine learning and continual learning has revived interest in this problem. We propose Proficiente, a full framework for long-term learning systems. Proficiente relies on knowledge transferred between hypotheses learned with Support Vector Machines. The first component of the framework focuses on transferring forward selectively from a set of existing hypotheses, or functions representing knowledge acquired during previous tasks, to a new target task. The second component of Proficiente focuses on transferring backward, a novel ability of long-term learning systems that aims to exploit knowledge derived from recent tasks to encourage the refinement of existing knowledge. We propose a method that transfers selectively from a recently learned task to existing hypotheses representing previous tasks, encouraging the retention of existing knowledge whilst refining it. We analyse the theoretical properties of the proposed framework. Proficiente is accompanied by an agnostic metric that can be used to determine whether a long-term learning system is becoming more knowledgeable. We evaluate Proficiente on both synthetic and real-world datasets, and demonstrate scenarios where knowledgeable supervised learning systems can be achieved by means of transfer.
Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency
A data stream is a sequence of observations produced by a generating process which may evolve over time. In such a time-varying stream, the relationship between input features and labels, or concepts, can change. Adapting to changes in concept is most often done by destroying and incrementally rebuilding the current classifier. Many systems additionally store and reuse previously built models to adapt more efficiently when stream conditions drift to a previously seen state. Reusing a model offers increased classification performance over rebuilding, and provides an indicator of, or transparency into, the hidden state of the generating process. When only a subset of past models can be stored for reuse, for example due to memory constraints, the choice of which models to store for optimal future reuse is an important problem. Current methods of evaluating which models to store use valuation policies based on, for example, age, time since last use, accuracy and diversity. These policies are often not optimal, losing predictive performance by undervaluing complex models. We propose a new valuation policy based on advantage, the misclassifications avoided by reusing a model rather than training a new one, which more accurately reflects the true value of model storage. We evaluate our method on synthetic and real-world data, including a real-world air pollution dataset. Our results show accuracy increases of up to 6% using our valuation policy, while preserving transparency.
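The advantage policy described above can be sketched directly: a stored model's value is the number of misclassifications avoided by reusing it instead of a freshly trained model on the same window. The function names and the eviction scheme are illustrative assumptions, not the paper's implementation:

```python
def advantage(stored_model, fresh_model, window):
    """Misclassifications avoided by reusing `stored_model` instead of
    `fresh_model` on the same labelled window (higher is better)."""
    reuse_errors = sum(stored_model(x) != y for x, y in window)
    fresh_errors = sum(fresh_model(x) != y for x, y in window)
    return fresh_errors - reuse_errors

def evict(pool, fresh_model, window, capacity):
    """Under a memory budget, keep only the `capacity` stored models
    with the highest advantage."""
    ranked = sorted(pool, key=lambda m: advantage(m, fresh_model, window),
                    reverse=True)
    return ranked[:capacity]
```

Unlike age- or recency-based policies, this directly scores what storage buys: errors a retrained model would have made.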
Scaling Up Inductive Logic Programming: An Evolutionary Wrapper Approach
Inductive logic programming (ILP) algorithms are classification algorithms that construct classifiers represented as logic programs. ILP algorithms have a number of attractive features, notably the ability to make use of declarative background (user-supplied) knowledge. However, ILP algorithms deal poorly with large data sets (>10^4 examples) and their widespread use of the greedy set-covering algorithm renders them susceptible to local maxima in the space of logic programs. This paper presents a novel approach to address these problems based on combining the local search properties of an inductive logic programming algorithm with the global search properties of an evolutionary algorithm. The proposed algorithm may be viewed as an evolutionary wrapper around a population of ILP algorithms. The evolutionary wrapper approach is evaluated on two domains. The chess-endgame (KRK) problem is an artificial domain that is a widely used benchmark in inductive logic programming, and Part-of-Speech Tagging is a real-world problem from the field of Natural Language Processing. In the latter domain, data originates from excerpts of the Wall Street Journal. Results indicate that significant improvements in predictive accuracy can be achieved over a conventional ILP approach when data is plentiful and noisy.
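The evolutionary-wrapper idea, a global search driving a population of local learners, can be sketched with a minimal generational genetic algorithm. This generic skeleton (tournament selection, one-point-style crossover, elitism) stands in for the paper's wrapper around ILP algorithms; all names and operators are illustrative:

```python
import random

def evolutionary_wrapper(init_pop, fitness, mutate, crossover,
                         generations=30, seed=0):
    """Minimal generational GA: evolves a population of genomes, keeping
    the best individual each generation (elitism) so best fitness never
    decreases."""
    rng = random.Random(seed)
    pop = list(init_pop)

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = [max(pop, key=fitness)]            # elitism
        while len(nxt) < len(pop):
            child = crossover(tournament(), tournament())
            nxt.append(mutate(child))
        pop = nxt
    return max(pop, key=fitness)
```

In the wrapper setting, a genome would encode an ILP configuration or seed-example set, and fitness would be the predictive accuracy of the resulting logic program; here a OneMax-style bitstring stands in.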