Catalogue Search | MBRL
6,523 result(s) for "Watson, David S."
Interpretable machine learning for genomics
2022
High-throughput technologies such as next-generation sequencing allow biologists to observe cell function with unprecedented resolution, but the resulting datasets are too large and complicated for humans to understand without the aid of advanced statistical methods. Machine learning (ML) algorithms, which are designed to automatically find patterns in data, are well suited to this task. Yet these models are often so complex as to be opaque, leaving researchers with few clues about underlying mechanisms. Interpretable machine learning (iML) is a burgeoning subdiscipline of computational statistics devoted to making the predictions of ML models more intelligible to end users. This article is a gentle and critical introduction to iML, with an emphasis on genomic applications. I define relevant concepts, motivate leading methodologies, and provide a simple typology of existing approaches. I survey recent examples of iML in genomics, demonstrating how such techniques are increasingly integrated into research workflows. I argue that iML solutions are required to realize the promise of precision medicine. However, several open challenges remain. I examine the limitations of current state-of-the-art tools and propose a number of directions for future research. While the horizon for iML in genomics is wide and bright, continued progress requires close collaboration across disciplines.
Journal Article
Conceptual challenges for interpretable machine learning
2022
As machine learning has gradually entered into ever more sectors of public and private life, there has been a growing demand for algorithmic explainability. How can we make the predictions of complex statistical models more intelligible to end users? A subdiscipline of computer science known as interpretable machine learning (IML) has emerged to address this urgent question. Numerous influential methods have been proposed, from local linear approximations to rule lists and counterfactuals. In this article, I highlight three conceptual challenges that are largely overlooked by authors in this area. I argue that the vast majority of IML algorithms are plagued by (1) ambiguity with respect to their true target; (2) a disregard for error rates and severe testing; and (3) an emphasis on product over process. Each point is developed at length, drawing on relevant debates in epistemology and philosophy of science. Examples and counterexamples from IML are considered, demonstrating how failure to acknowledge these problems can result in counterintuitive and potentially misleading explanations. Without greater care for the conceptual foundations of IML, future work in this area is doomed to repeat the same mistakes.
Journal Article
Testing conditional independence in supervised learning algorithms
2021
We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (J R Stat Soc Ser B 80:551–577, 2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss function. The CPI can be efficiently computed for high-dimensional data without any sparsity constraints. We demonstrate convergence criteria for the CPI and develop statistical inference procedures for evaluating its magnitude, significance, and precision. These tests aid in feature and model selection, extending traditional frequentist and Bayesian techniques to general supervised learning tasks. The CPI may also be applied in causal discovery to identify underlying multivariate graph structures. We test our method using various algorithms, including linear regression, neural networks, random forests, and support vector machines. Empirical results show that the CPI compares favorably to alternative variable importance measures and other nonparametric tests of conditional independence on a diverse array of real and synthetic datasets. Simulations confirm that our inference procedures successfully control Type I error with competitive power in a range of settings. Our method has been implemented in an R package, cpi, which can be downloaded from https://github.com/dswatson/cpi.
Journal Article
The explanation game
2021
We propose a formal framework for interpretable machine learning. Combining elements from statistical learning, causal interventionism, and decision theory, we design an idealised explanation game in which players collaborate to find the best explanation(s) for a given algorithmic prediction. Through an iterative procedure of questions and answers, the players establish a three-dimensional Pareto frontier that describes the optimal trade-offs between explanatory accuracy, simplicity, and relevance. Multiple rounds are played at different levels of abstraction, allowing the players to explore overlapping causal patterns of variable granularity and scope. We characterise the conditions under which such a game is almost surely guaranteed to converge on a (conditionally) optimal explanation surface in polynomial time, and highlight obstacles that will tend to prevent the players from advancing beyond certain explanatory thresholds. The game serves a descriptive and a normative function, establishing a conceptual space in which to analyse and compare existing proposals, as well as design new and improved solutions.
Journal Article
Clinical applications of machine learning algorithms: beyond the black box
by McInnes, Iain B; Krutzinna, Jenny; Barnes, Michael R
in Algorithms, Artificial intelligence, Attitude of Health Personnel
2019
To maximise the clinical benefits of machine learning algorithms, we need to rethink our approach to explanation, argue David Watson and colleagues
Journal Article
Transcriptomic profiling and machine learning uncover gene signatures of psoriasis endotypes and disease severity
2026
Background
Despite increased understanding of psoriasis pathogenesis, molecular classification of clinical phenotypes and disease severity is poorly defined. Knowledge gaps include whether molecular endotypes of psoriasis underlie distinct clinical phenotypes and the positive and negative molecular regulators of disease severity across tissue compartments.
Methods
We performed comprehensive RNA sequencing of skin and blood (n = 718) from prospectively-recruited, deeply-phenotyped discovery and replication cohorts of 146 subjects with moderate-to-severe chronic plaque psoriasis initiating TNF-inhibitor (adalimumab) or IL-12/23-inhibitor (ustekinumab) therapy.
Results
Here we show, using two complementary dimensionality reduction methods, that co-expressed gene modules and factors within skin and blood are significantly associated with psoriasis phenotypes and disease severity. We identify a 14-gene signature negatively associated with BMI in nonlesional skin and with disease severity in lesional skin. Genotype integration reveals that HLA-DQA1*01 and HLA-DRB1*15 genotypes are positively associated with baseline psoriasis severity. Using explainable machine learning models, we define two disease severity-associated gene modules in lesional skin - one positive, one negatively-associated - and a 9-gene signature in lesional skin predictive of disease severity. Disease severity signatures in blood are only seen following adalimumab exposure, suggesting greater systemic impact of adalimumab compared to ustekinumab, in line with its side effect profile. In contrast, a gene signature in blood linked to HLA-C*06:02 status is independent of disease severity or drug.
Conclusions
These findings delineate gene-environmental and genetic effects on the psoriasis transcriptome linked to disease severity.
Plain language summary
Psoriasis is a common and debilitating skin disease, linked to other inflammatory conditions. A lot is known about what causes psoriasis and the factors that influence it, but doctors still cannot offer personalised treatments. This is because it has been difficult to understand what makes psoriasis more or less severe, why people respond differently to treatment, or why some people develop related diseases. To help address this, we collected skin and blood samples and personal information from people with severe psoriasis across the United Kingdom. Using computer-based methods, we found shared biological processes that link the disease with obesity and help predict its severity.
Rider, Grantham, Smith, Watson et al. integrate multiomic data from patients with psoriasis using dimensionality reduction and machine learning techniques. This approach identifies biological relationships between genetic background, clinical features and disease severity, providing insight into disease variability across individuals.
Journal Article
On the Philosophy of Unsupervised Learning
2023
Unsupervised learning algorithms are widely used for many important statistical tasks with numerous applications in science and industry. Yet despite their prevalence, they have attracted remarkably little philosophical scrutiny to date. This stands in stark contrast to supervised and reinforcement learning algorithms, which have been widely studied and critically evaluated, often with an emphasis on ethical concerns. In this article, I analyze three canonical unsupervised learning problems: clustering, abstraction, and generative modeling. I argue that these methods raise unique epistemological and ontological questions, providing data-driven tools for discovering natural kinds and distinguishing essence from contingency. This analysis goes some way toward filling the lacuna in contemporary philosophical discourse on unsupervised learning, as well as bringing conceptual unity to a heterogeneous field more often described by what it is not (i.e., supervised or reinforcement learning) than by what it is. I submit that unsupervised learning is not just a legitimate subject of philosophical inquiry but perhaps the most fundamental branch of all AI. However, an uncritical overreliance on unsupervised methods poses major epistemic and ethical risks. I conclude by advocating for a pragmatic, error-statistical approach that embraces the opportunities and mitigates the challenges posed by this powerful class of algorithms.
Journal Article
Interferon-α-mediated therapeutic resistance in early rheumatoid arthritis implicates epigenetic reprogramming
by Cope, Andrew P; Lindholm, Catharina; Gozzard, Neil
in Adaptive immunology, antirheumatic agents, arthritis, rheumatoid
2022
Objectives
An interferon (IFN) gene signature (IGS) is present in approximately 50% of early, treatment naive rheumatoid arthritis (eRA) patients where it has been shown to negatively impact initial response to treatment. We wished to validate this effect and explore potential mechanisms of action.
Methods
In a multicentre inception cohort of eRA patients (n=191), we examined the whole blood IGS (MxA, IFI44L, OAS1, IFI6, ISG15) with reference to circulating IFN proteins, clinical outcomes and epigenetic influences on circulating CD19+ B and CD4+ T lymphocytes.
Results
We reproduced our previous findings demonstrating a raised baseline IGS. We additionally showed, for the first time, that the IGS in eRA reflects circulating IFN-α protein. Paired longitudinal analysis demonstrated a significant reduction between baseline and 6-month IGS and IFN-α levels (p<0.0001 for both). Despite this fall, a raised baseline IGS predicted worse 6-month clinical outcomes such as increased disease activity score (DAS-28, p=0.025) and lower likelihood of a good EULAR clinical response (p=0.034), which was independent of other conventional predictors of disease activity and clinical response. Molecular analysis of CD4+ T cells and CD19+ B cells demonstrated differentially methylated CPG sites and dysregulated expression of disease relevant genes, including PARP9, STAT1, and EPSTI1, associated with baseline IGS/IFNα levels. Differentially methylated CPG sites implicated altered transcription factor binding in B cells (GATA3, ETSI, NFATC2, EZH2) and T cells (p300, HIF1α).
Conclusions
Our data suggest that, in eRA, IFN-α can cause a sustained, epigenetically mediated, pathogenic increase in lymphocyte activation and proliferation, and that the IGS is, therefore, a robust prognostic biomarker. Its persistent harmful effects provide a rationale for the initial therapeutic targeting of IFN-α in selected patients with eRA.
Journal Article