Catalogue Search | MBRL

Interpretable multiple instance learning for hematologic diagnosis from peripheral blood smears

by McCash, Samuel I.; McVoy, Lauren; Paulsen, Sean

2026

Accurate diagnosis of hematologic malignancies from peripheral blood smears (PBSs) requires integrating cellular morphology and composition across numerous white blood cells. Existing computational approaches predominantly automate single-cell classifications and do not provide holistic, slide-level diagnostic predictions. We present a framework that employs a high-performance cell-based encoder (DeepHeme) for feature extraction, integrated with our weakly supervised, attention-based multiple instance learning (MIL) model, termed CAREMIL (Cell AggRegation, Explainable, Multiple Instance Learning). Through comprehensive evaluations of leading image encoders and MIL architectures, the combination of DeepHeme and CAREMIL demonstrated superior performance on disease classification tasks. CAREMIL functions as a robust aggregation mechanism, consistently outperforming established slide-level MIL methods (gated MIL and Dual-stream MIL Network) across multiple encoder types. The most pronounced performance gains were observed with out-of-domain encoders, including ImageNet-pretrained and open-source pathology foundation models (UNI2 and Virchow2). CAREMIL combined with DeepHeme achieves the highest diagnostic accuracy across acute myeloid leukemia (AML), myelodysplastic syndromes (MDS), and hairy cell leukemia (HCL), with AUROCs of 0.999, 0.891, and 0.945, respectively, and successfully identifies AML even in cases with minimal or absent circulating blasts. Attention values assigned by CAREMIL highlight diagnostically relevant cells and reveal disease-specific morphometric patterns, enabling biological interpretability and case-level insights. The framework remains resilient to individual cell misclassifications and does not require explicit cell-level supervision. These findings establish CAREMIL as an effective and interpretable MIL framework for hematologic slide diagnosis, extendable to bone marrow aspirates, cytology, and other liquid biopsy specimens, supporting a shift toward quantitative, morphology-informed hematologic diagnostics.
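
As a rough illustration of what attention-based MIL aggregation means here, the following minimal sketch pools per-cell feature vectors into one slide-level vector using softmax attention weights. It is a generic attention-pooling example, not the CAREMIL model itself: the function name `attention_pool`, the projection matrix `V`, the scoring vector `w`, and the toy feature values are all assumptions made for illustration.

```python
import math

def attention_pool(instances, V, w):
    """Pool per-cell feature vectors into one slide-level vector.

    instances: list of feature vectors (one per cell), each of length d
    V: hypothetical projection matrix, given as m rows of length d
    w: hypothetical scoring vector of length m
    Returns (attention_weights, pooled_vector).
    """
    scores = []
    for h in instances:
        # Hidden representation tanh(V h), then a scalar score w . tanh(V h).
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Softmax over cells (subtract the max score for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the cell features.
    d = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(d)]
    return weights, pooled

# Toy bag of three "cells" with two features each (made-up numbers).
cells = [[0.1, 0.2], [2.0, 1.5], [0.0, 0.1]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [1.0, 1.0]
weights, slide_vector = attention_pool(cells, V, w)
```

In a trained model `V` and `w` would be learned from slide-level labels only; the per-cell weights are what make it possible to inspect which cells drove a prediction.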

Journal Article

GOLDMARK: Governed Outcome-Linked Diagnostic Model Assessment Reference Kit

by Kumar, Neeraj; Singi, Siddharth; Campanella, Gabriele

2026

Computational biomarkers (CBs) are histopathology-derived patterns extracted from hematoxylin-eosin (H&E) whole-slide images (WSIs) using artificial intelligence (AI) to predict therapeutic response or prognosis. Recently, slide-level multiple instance learning (MIL) with pathology foundation models (PFMs) has become the standard baseline for CB development. While these methods, with architectural and optimization advances, have improved predictive performance, computational pathology lacks standardized intermediate data formats, provenance tracking, checkpointing conventions, and reproducible evaluation metrics required for clinical-grade deployment. Consequently, discipline-level standardization, including data representation, model versioning, evaluation protocols, and auditability, is essential to enable reliable, scalable, and regulatory-ready clinical translation of CBs. We introduce GOLDMARK (www.artificialintelligencepathology.org), a standardized benchmarking framework built on a curated TCGA cohort with clinically anchored OncoKB level 1-3 biomarker labels. GOLDMARK distributes structured intermediate outputs, including tile coordinates, per-slide feature embeddings from canonical PFMs, embedding-level quality-control metadata, trained slide-level weights, and reference code. Multiple publicly available PFMs are benchmarked under a unified attention-based MIL head using predefined patient-level splits. Models are trained on TCGA and evaluated on an independent MSKCC cohort with reciprocal testing. We evaluated 33 tumor-biomarker tasks; aggregate summaries over the 33 tasks with complete reciprocal metric coverage yielded mean AUROC of 0.689 (TCGA) and 0.630 (MSKCC). Restricting analysis to the eight highest-performing tasks yielded mean AUROCs of 0.831 and 0.801, respectively. These tasks correspond to established morphologic-genomic associations (e.g., LGG , COAD MSI/ , THCA / , BLCA , UCEC ) and showed the most stable cross-site performance. Differences between canonical encoders were modest relative to task-specific variability. Computational pathology is entering a translational phase in which reproducibility, transparency, and cross-institutional robustness are prerequisites for clinical trust. GOLDMARK establishes a reference framework that separates dataset curation from model evaluation and introduces structured intermediate artifacts, quality-control metadata, and symmetric cross-dataset testing as core components of benchmarking. Such infrastructure is essential for transforming computational biomarkers from research demonstrations into reproducible, clinically trusted workflows.

Journal Article

GOLDMARK: Governed Outcome-Linked Diagnostic Model Assessment Reference Kit

by Amir Momeni Boroujeni; Kumar, Neeraj; Singi, Siddharth in Artificial intelligence, Benchmarks, Biomarkers

2026

Paper

Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis

by Kumar, Neeraj; Jie-Fu, Chen; Goldgof, Gregory M in Adaptation, Attention, Image analysis

2025

Pathology foundation models (PFMs) have emerged as powerful tools for analyzing whole slide images (WSIs). However, adapting these pretrained PFMs for specific clinical tasks presents considerable challenges, primarily because only weak (WSI-level) labels are available for gigapixel images, which necessitates a multiple instance learning (MIL) paradigm for effective WSI analysis. This paper proposes a novel approach for single-GPU Task Adaptation of PFMs (TAPFM) that uses vision transformer (ViT) attention for MIL aggregation while optimizing both feature representations and attention weights. The proposed approach maintains separate computational graphs for the MIL aggregator and the PFM to create stable training dynamics that align with downstream task objectives during end-to-end adaptation. Evaluated on mutation prediction tasks for bladder cancer and lung adenocarcinoma across institutional and TCGA cohorts, TAPFM consistently outperforms conventional approaches, with H-Optimus-0 (TAPFM) outperforming the benchmarks. TAPFM also effectively handles multi-label classification of actionable mutations. Thus, TAPFM makes adaptation of powerful pretrained PFMs practical on standard hardware for various clinical applications.

Paper

Decision Making for Human-in-the-loop Robotic Agents via Uncertainty-Aware Reinforcement Learning

by He, Zhanpeng; Robinson Piramuthu; Singi, Siddharth in Confidence intervals, Decision making, Training

2023

In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed. However, knowing when to request such assistance is critical: too few requests can lead to the robot making mistakes, but too many requests can overload the expert. In this paper, we present a Reinforcement Learning based approach to this problem, where a semi-autonomous agent asks for external assistance when it has low confidence in the eventual success of the task. The confidence level is computed by estimating the variance of the return from the current state. We show that this estimate can be iteratively improved during training using a Bellman-like recursion. On discrete navigation problems with both fully- and partially-observable state information, we show that our method makes effective use of a limited budget of expert calls at run-time, despite having no access to the expert at training time.
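
The idea of estimating the variance of the return with a Bellman-like recursion can be sketched on a tiny Markov reward process with a known model. This is an illustrative policy-evaluation example under stated assumptions, not the paper's learning algorithm: it uses the standard second-moment recursion M(s) = E[r² + 2γ r V(s') + γ² M(s')] with Var(s) = M(s) − V(s)², and the function name, the three-state toy model, and the help threshold are all hypothetical.

```python
def return_mean_and_variance(transitions, gamma=1.0, sweeps=100):
    """Iteratively evaluate the mean and variance of the return per state.

    transitions[s] is a list of (probability, reward, next_state) tuples;
    next_state is None when the episode terminates.
    """
    n = len(transitions)
    V = [0.0] * n  # first moment of the return, E[G | s]
    M = [0.0] * n  # second moment of the return, E[G^2 | s]
    for _ in range(sweeps):
        new_V, new_M = [], []
        for s in range(n):
            v = m = 0.0
            for p, r, nxt in transitions[s]:
                v_next = V[nxt] if nxt is not None else 0.0
                m_next = M[nxt] if nxt is not None else 0.0
                v += p * (r + gamma * v_next)
                m += p * (r * r + 2 * gamma * r * v_next + gamma * gamma * m_next)
            new_V.append(v)
            new_M.append(m)
        V, M = new_V, new_M
    variance = [m - v * v for v, m in zip(V, M)]
    return V, variance

# Hypothetical toy model: state 0 branches to a safe state (1) or a risky state (2).
model = [
    [(0.5, 0.0, 1), (0.5, 0.0, 2)],        # state 0: no reward, random branch
    [(1.0, 1.0, None)],                    # state 1: certain success
    [(0.5, 1.0, None), (0.5, 0.0, None)],  # state 2: success half the time
]
values, variances = return_mean_and_variance(model)

# Hypothetical rule: request expert help where the return variance is high.
HELP_THRESHOLD = 0.2
ask_for_help = [var > HELP_THRESHOLD for var in variances]
```

With this toy model only the risky state exceeds the threshold, which mirrors the intuition in the abstract: a limited budget of expert calls is spent where the eventual outcome is least certain.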

Paper
