Catalogue Search | MBRL

A deep learning model to predict RNA-Seq expression of tumours from whole slide images

by Toldo, Sylvain , Moarii, Matahi , Schmauch, Benoît in 631/114/1305 , 631/67 , Algorithms

2020

Deep learning methods for digital pathology analysis are an effective way to address multiple clinical questions, from diagnosis to prediction of treatment outcomes. These methods have also been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We show that HE2RNA, a model based on the integration of multiple data modes, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without expert annotation. Through its interpretable design, HE2RNA provides virtual spatialization of gene expression, as validated by CD3- and CD20-staining on an independent dataset. The transcriptomic representation learned by HE2RNA can also be transferred to other datasets, even small ones, to increase prediction performance for specific molecular phenotypes. We illustrate the use of this approach for clinical diagnosis purposes such as the identification of tumors with microsatellite instability. RNA-sequencing of tumour tissue can provide important diagnostic and prognostic information, but it is costly and not routinely performed in all clinical settings. Here, the authors show that whole-slide histology images, which are part of routine care, can be used to predict RNA-sequencing data and thus reduce the need for additional analyses.

Journal Article

Validation of MSIntuit as an AI-based pre-screening tool for MSI detection from colorectal cancer histology slides

by Loiseau, Nicolas , Carpentier, Séverine , Garcia, Thierry in 13/51 , 14/63 , 45/77

2023

Mismatch Repair Deficiency (dMMR)/Microsatellite Instability (MSI) is a key biomarker in colorectal cancer (CRC). Universal screening of CRC patients for MSI status is now recommended, but contributes to increased workload for pathologists and delayed therapeutic decisions. Deep learning has the potential to ease dMMR/MSI testing and accelerate oncologist decision making in clinical practice, yet no comprehensive validation of a clinically approved tool has been conducted. We developed MSIntuit, a clinically approved artificial intelligence (AI) based pre-screening tool for MSI detection from haematoxylin-eosin (H&E) stained slides. After training on samples from The Cancer Genome Atlas (TCGA), a blind validation is performed on an independent dataset of 600 consecutive CRC patients. Inter-scanner reliability is studied by digitising each slide using two different scanners. MSIntuit yields a sensitivity of 0.96–0.98, a specificity of 0.47–0.46, and an excellent inter-scanner agreement (Cohen’s κ: 0.82). By reaching high sensitivity comparable to gold standard methods while ruling out almost half of the non-MSI population, we show that MSIntuit can effectively serve as a pre-screening tool to alleviate MSI testing burden in clinical practice. Microsatellite instability is a known risk factor for colorectal cancer development and treatment response. Here, the authors utilise deep learning to develop MSIntuit, a pre-screening tool to detect MSI from H&E stained slides.
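For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how sensitivity, specificity, and Cohen's κ are computed from binary labels. The function names and the toy labels are illustrative assumptions, not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = MSI, 0 = non-MSI."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa between two sets of calls on the same cases,
    e.g. the same slides digitised on two scanners."""
    a, b = list(calls_a), list(calls_b)
    n = len(a)
    # Observed agreement: fraction of cases where both calls match.
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement: product of marginal frequencies, summed over labels.
    labels = set(a) | set(b)
    p_chance = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_chance == 1.0:
        return 1.0  # both raters always give the same single label
    return (p_observed - p_chance) / (1 - p_chance)


# Toy labels, invented for illustration only.
truth = [1, 1, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

In a pre-screening setting such as the one described, a high sensitivity means few MSI cases are missed, while the specificity gives the fraction of non-MSI cases that can be ruled out without further testing.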

Journal Article

Pacpaint: a histology-based deep learning model uncovers the extensive intratumor molecular heterogeneity of pancreatic adenocarcinoma

by Moindrot, Olivier , Emile, Jean Francois , Bachet, Jean Baptiste in 14/1 , 38/91 , 631/114/1305

2023

Two tumor (Classical/Basal) and stroma (Inactive/Active) subtypes of pancreatic adenocarcinoma (PDAC) with prognostic and theragnostic implications have been described. These molecular subtypes were defined by RNAseq, a costly technique sensitive to sample quality and cellularity, not used in routine practice. To allow rapid PDAC molecular subtyping and study PDAC heterogeneity, we develop PACpAInt, a multi-step deep learning model. PACpAInt is trained on a multicentric cohort (n = 202) and validated on 4 independent cohorts including biopsies (surgical cohorts n = 148; 97; 126 / biopsy cohort n = 25), all with transcriptomic data (n = 598) to predict tumor tissue, tumor cells from stroma, and their transcriptomic molecular subtypes, either at the whole slide or tile level (112 µm squares). PACpAInt correctly predicts tumor subtypes at the whole slide level on surgical and biopsy specimens and independently predicts survival. PACpAInt highlights the presence of a minor aggressive Basal contingent that negatively impacts survival in 39% of RNA-defined classical cases. Tile-level analysis (> 6 million tiles) redefines PDAC microheterogeneity, showing codependencies in the distribution of tumor and stroma subtypes, and demonstrates that, in addition to the Classical and Basal tumors, there are Hybrid tumors that combine the latter subtypes, and Intermediate tumors that may represent a transition state during PDAC evolution. Rapid and effective molecular subtyping of pancreatic adenocarcinoma (PDAC) is important for prognosis and treatment. Here, the authors develop PACpAInt, a deep learning model for PDAC molecular subtyping from whole-slide histological imaging that enables the analysis of heterogeneity and prognostic predictions.

Journal Article

AI allows pre-screening of FGFR3 mutational status using routine histology slides of muscle-invasive bladder cancer

by Hartmann, Arndt , Sikic, Danijel , Klümper, Niklas in 38/23 , 38/91 , 631/114/1305

2024

Pathogenic activating mutations in the fibroblast growth factor receptor 3 (FGFR3) drive disease maintenance and progression in urothelial cancer. Between 10 and 15% of muscle-invasive and metastatic urothelial cancers (MIBC/mUC) are FGFR3-mutant. Selective targeting of FGFR3 hotspot mutations with tyrosine kinase inhibitors (e.g., erdafitinib) is approved for mUC and requires FGFR3 mutational testing. However, current testing assays (polymerase chain reaction or next-generation sequencing) necessitate high tissue quality, have long turnaround times, and are expensive. To overcome these limitations, we develop a deep-learning model that detects FGFR3 mutations using routine hematoxylin-eosin slides. Encompassing 1222 cases, our study is a large-scale validation of a model prescreening FGFR3 mutations for MIBC and mUC patients. In this work, we demonstrate that our model achieves high sensitivity (>93%) on advanced and metastatic cases while reducing molecular testing by 40% on average, thereby offering a cost-effective and rapid pre-screening tool for identifying patients eligible for FGFR3 targeted therapies. Detecting FGFR3-mutant muscle-invasive and metastatic urothelial cancers (MIBC/mUC) for targeted therapy remains challenging, but clinically important. Here, the authors develop a deep-learning model to detect FGFR3 mutations in MIBC/mUC from routine histopathology slides, allowing for highly sensitive, rapid, and cost-effective pre-screening.

Journal Article

Deep learning assessment of metastatic relapse risk from digitized breast cancer histological slides

by Gaury, V. , Guillou, L. , Sefta, M. in 14/63 , 631/67/1347 , 631/67/1857

2025

Accurate risk stratification is critical for guiding treatment decisions in early breast cancer. We present an artificial intelligence (AI)-based tool that analyzes digitized tumor slides to predict 5-year metastasis-free survival (MFS) in patients with estrogen receptor-positive, HER2-negative (ER+/HER2−) early breast cancer (EBC). Our deep learning model, RlapsRisk BC, independently predicts MFS and provides significant prognostic value beyond traditional clinico-pathological variables (C-index 0.81 vs 0.76, p < 0.05). Applying a 5% MFS event probability threshold stratifies patients into low- and high-risk groups. After dichotomization, combining RlapsRisk BC with clinico-pathological factors increases cumulative sensitivity (0.69 vs 0.63) and dynamic specificity (0.80 vs 0.76) compared to clinical factors alone. Expert analysis of high-impact regions identified by the model highlights well-established morphological features, supporting its interpretability and biological relevance. Early breast cancer is often responsive to treatment; however, long-term prognosis is variable. Here, the authors utilise a deep learning model to predict metastasis-free survival using digitised tumour slides.

Journal Article

Transcriptomic learning for digital pathology

by Toldo, Sylvain , Moarii, Matahi , Schmauch, Benoît in Bioinformatics , Deep learning , Gene expression

2019

Deep learning methods for digital pathology analysis have proved an effective way to address multiple clinical questions, from diagnosis to prognosis and even to prediction of treatment outcomes. They have also recently been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We propose a novel approach based on the integration of multiple data modes, and show that our deep learning model, HE2RNA, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without the need for expert annotation. HE2RNA is interpretable by design, opening up new opportunities for virtual staining. In fact, it provides virtual spatialization of gene expression, as validated by double-staining on an independent dataset. Moreover, the transcriptomic representation learned by HE2RNA can be transferred to improve predictive performance for other tasks, particularly for small datasets. As an example of a task with direct clinical impact, we studied the prediction of microsatellite instability from hematoxylin & eosin stained images and our results show that better performance can be achieved in this setting.

Paper

Phikon-v2, A large and public feature extractor for biomarker prediction

by Jacob, Paul , Saillard, Charlie , Alice Mac Kain in Biomarkers , Datasets , Feature extraction

2024

Gathering histopathology slides from over 100 publicly available cohorts, we compile a diverse dataset of 460 million pathology tiles covering more than 30 cancer sites. Using this dataset, we train a large self-supervised vision transformer using DINOv2 and publicly release one iteration of this model for further experimentation, coined Phikon-v2. While trained on publicly available histology slides, Phikon-v2 surpasses our previously released model (Phikon) and performs on par with other histopathology foundation models (FM) trained on proprietary data. Our benchmarks include eight slide-level tasks with results reported on external validation cohorts, avoiding any data contamination between pre-training and evaluation datasets. Our downstream training procedure follows a simple yet robust ensembling strategy yielding a +1.75 AUC increase across tasks and models compared to one-shot retraining (p<0.001). We compare Phikon (ViT-B) and Phikon-v2 (ViT-L) against 14 different histology feature extractors, making our evaluation the most comprehensive to date. Our results support the evidence that DINOv2 handles joint model and data scaling better than iBOT. We also show that recent scaling efforts are overall beneficial to downstream performance in the context of biomarker prediction, with GigaPath and H-Optimus-0 (two ViT-g with 1.1B parameters each) standing out. However, the statistical margins between the latest top-performing FMs remain mostly non-significant; some even underperform on specific indications or tasks such as MSI prediction, where they are outperformed by a 13x smaller model developed internally. While the latest foundation models may exhibit limitations for clinical deployment, they nonetheless offer excellent grounds for the development of more specialized and cost-efficient histology encoders fueling AI-guided diagnostic tools.

Paper

Distilling foundation models for robust and efficient models in digital pathology

by Scalbert, Marin , Saillard, Charlie , Dop, Nicolas in Benchmarks , Computing costs , Datasets

2025

In recent years, the advent of foundation models (FM) for digital pathology has relied heavily on scaling the pre-training datasets and the model size, yielding large and powerful models. While this improved performance on diverse downstream tasks, it also increased computational cost and inference time. In this work, we explore the distillation of a large foundation model into a smaller one, reducing the number of parameters by several orders of magnitude. Leveraging distillation techniques, our distilled model, H0-mini, achieves nearly comparable performance to large FMs at a significantly reduced inference cost. It is evaluated on several public benchmarks, achieving 3rd place on the HEST benchmark and 5th place on the EVA benchmark. Additionally, a robustness analysis conducted on the PLISM dataset demonstrates that our distilled model reaches excellent robustness to variations in staining and scanning conditions, significantly outperforming other state-of-the-art models. This opens new perspectives to design lightweight and robust models for digital pathology, without compromising on performance.

Paper

Federated Survival Analysis with Discrete-Time Cox Models

by Saillard, Charlie , Menuet, Romuald , Andre Manoel in Datasets , Machine learning , Privacy

2020

Building machine learning models from decentralized datasets located in different centers with federated learning (FL) is a promising approach to circumvent local data scarcity while preserving privacy. However, the prominent Cox proportional hazards (PH) model, used for survival analysis, does not fit the FL framework, as its loss function is non-separable with respect to the samples. The naïve method to bypass this non-separability consists of calculating the losses per center and minimizing their sum as an approximation of the true loss. We show that the resulting model may suffer from substantial performance loss in some adverse settings. Instead, we leverage the discrete-time extension of the Cox PH model to formulate survival analysis as a classification problem with a separable loss function. Using this approach, we train survival models using standard FL techniques on synthetic data, as well as real-world datasets from The Cancer Genome Atlas (TCGA), showing similar performance to a Cox PH model trained on aggregated data. Compared to previous works, the proposed method is more communication-efficient, more generic, and more amenable to using privacy-preserving techniques.
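To make the idea of a separable survival loss concrete, here is a minimal NumPy sketch of a generic discrete-time hazard likelihood written as per-bin binary classification. It is an illustration under stated assumptions, not the authors' implementation: the function name `discrete_time_nll`, the time binning, and the convention that a censored sample counts as surviving through its last bin are all choices made for this example.

```python
import numpy as np


def discrete_time_nll(logits, last_bin, observed):
    """Per-sample negative log-likelihood of a discrete-time hazard model.

    logits:   (n_samples, n_bins) scores; sigmoid(logit) is the hazard in a bin
    last_bin: (n_samples,) index of the bin holding the event or censoring time
    observed: (n_samples,) 1 if the event was observed, 0 if censored
    """
    logits = np.asarray(logits, dtype=float)
    last_bin = np.asarray(last_bin)
    observed = np.asarray(observed)
    bins = np.arange(logits.shape[1])
    # A sample is at risk in every bin up to and including its last bin.
    at_risk = (bins[None, :] <= last_bin[:, None]).astype(float)
    # The binary target is 1 only in the event bin of an uncensored sample.
    target = ((bins[None, :] == last_bin[:, None])
              & (observed[:, None] == 1)).astype(float)
    # Numerically stable binary cross-entropy on logits.
    bce = (np.maximum(logits, 0.0) - logits * target
           + np.log1p(np.exp(-np.abs(logits))))
    return (bce * at_risk).sum(axis=1)


# Synthetic example: the pooled loss equals the sum of per-center losses.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4))
last_bin = np.array([0, 3, 1, 2, 3, 1])
observed = np.array([1, 0, 1, 1, 0, 1])
pooled = discrete_time_nll(logits, last_bin, observed).sum()
center_a = discrete_time_nll(logits[:3], last_bin[:3], observed[:3]).sum()
center_b = discrete_time_nll(logits[3:], last_bin[3:], observed[3:]).sum()
```

Because each sample's term depends only on its own scores and labels, the total is an exact sum over centers. The Cox partial likelihood lacks this property, since each event's term involves the risk set of all samples still at risk, which may span several centers.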

Paper

Self supervised learning improves dMMR/MSI detection from histology slides across multiple cancers

by Saillard, Charlie , Moindrot, Olivier , Dehaene, Olivier in Cancer , Datasets , Deep learning

2021

Microsatellite instability (MSI) is a tumor phenotype whose diagnosis largely impacts patient care in colorectal cancers (CRC), and is associated with response to immunotherapy in all solid tumors. Deep learning models detecting MSI tumors directly from H&E stained slides have shown promise in improving diagnosis of MSI patients. Prior deep learning models for MSI detection have relied on neural networks pretrained on the ImageNet dataset, which does not contain any medical images. In this study, we leverage recent advances in self-supervised learning by training neural networks on histology images from the TCGA dataset using MoCo V2. We show that these networks consistently outperform their counterparts pretrained using ImageNet and obtain state-of-the-art results for MSI detection with AUCs of 0.92 and 0.83 for CRC and gastric tumors, respectively. These models generalize well on an external CRC cohort (0.97 AUC on PAIP) and improve transfer from one organ to another. Finally, we show that predictive image regions exhibit meaningful histological patterns, and that the use of MoCo features highlighted more relevant patterns according to an expert pathologist.

Paper
