Catalogue Search | MBRL
Explore the vast range of titles available.
20 result(s) for "Haarburger, Christoph"
A multimodal comparison of latent denoising diffusion probabilistic models and generative adversarial networks for medical image synthesis
by Niehues, Jan Moritz; Kuhl, Christiane; Arasteh, Soroosh Tayebi
in 692/308, 692/308/575, Datasets
2023
Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have recently been addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 versus 0.19 on the AIROGS dataset, 0.41 versus 0.02 (cGAN) and 0.24 (StyleGAN-3) on the CRCDX dataset, and 0.32 versus 0.17 (ProGAN) and 0.08 (StyleGAN-3) on the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, with improved diversity and fewer artifacts in the generated images.
Journal Article
Radiomics feature reproducibility under inter-rater variability in segmentations of CT images
by Kuhl, Christiane; Merhof, Dorit; Müller-Franzes, Gustav
in 631/67/1857, 631/67/2321, Automation
2020
Identifying image features that are robust to segmentation variability is a difficult challenge in radiomics. So far, this problem has mainly been tackled in test–retest analyses. In this work we analyse radiomics feature reproducibility in two phases: first with manual segmentations provided by four expert readers, and second with probabilistic automated segmentations generated by a recently developed neural network (PHiSeg). We test feature reproducibility on three publicly available datasets of lung, kidney and liver lesions. We find consistent results across both manual and automated segmentations in all three datasets and show that some radiomic features are robust against segmentation variability while others are prone to poor reproducibility under differing segmentations. By providing a detailed analysis of the robustness of the most common radiomics features across several datasets, we envision that more reliable and reproducible radiomic models can be built in the future based on this work.
Journal Article
Denoising diffusion probabilistic models for 3D medical image generation
by Kuhl, Christiane; Engelhardt, Sandy; Khader, Firas
in 639/705/117, 692/700/1421/1628, 692/700/1421/1846/2771
2023
Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen, and Stable Diffusion. However, their use in medicine, where imaging data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. We show that diffusion probabilistic models can synthesize high-quality medical data for magnetic resonance imaging (MRI) and computed tomography (CT). For quantitative evaluation, two radiologists rated the quality of the synthesized images regarding "realistic image appearance", "anatomical correctness", and "consistency between slices". Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice scores, 0.91 [without synthetic data], 0.95 [with synthetic data]).
Journal Article
Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization
2021
Unmasking the decision-making process of machine learning models is essential for implementing diagnostic support systems in clinical practice. Here, we demonstrate that adversarially trained models can significantly enhance the usability of pathology detection compared to their standard counterparts. We let six experienced radiologists rate the interpretability of saliency maps in datasets of X-rays, computed tomography, and magnetic resonance imaging scans. Significant improvements are found for our adversarial models, which are further improved by the application of dual batch normalization. Contrary to previous research on adversarially trained models, we find that the accuracy of such models equals that of standard models when sufficiently large datasets and dual batch normalization training are used. To ensure transferability, we additionally validate our results on an external test set of 22,433 X-rays. These findings elucidate that different paths for adversarial and real images are needed during training to achieve state-of-the-art results with superior clinical interpretability.
Journal Article
Medical transformer for multimodal survival prediction in intensive care: integration of imaging and non-imaging data
by Kuhl, Christiane; Khader, Firas; Hamesch, Karim
in 639/705/117, 692/700/1421, 692/700/1421/1770
2023
When clinicians assess the prognosis of patients in intensive care, they take imaging and non-imaging data into account. In contrast, many traditional machine learning models rely on only one of these modalities, limiting their potential in medical applications. This work proposes and evaluates a transformer-based neural network as a novel AI architecture that integrates multimodal patient data, i.e., imaging data (chest radiographs) and non-imaging data (clinical data). We evaluate the performance of our model in a retrospective study with 6,125 patients in intensive care. We show that the combined model (area under the receiver operating characteristic curve [AUROC] of 0.863) is superior to the radiographs-only model (AUROC = 0.811, p < 0.001) and the clinical data-only model (AUROC = 0.785, p < 0.001) when tasked with predicting in-hospital survival per patient. Furthermore, we demonstrate that our proposed model is robust in cases where not all (clinical) data points are available.
Journal Article
Robustness of Radiomics for Survival Prediction of Brain Tumor Patients Depending on Resection Status
by Haarburger, Christoph; Merhof, Dorit; Weninger, Leon
in Algorithms, Artificial intelligence, Brain cancer
2019
Prediction of overall survival based on multimodal MRI of brain tumor patients is a difficult problem. Although survival also depends on factors that cannot be assessed via preoperative MRI, such as surgical outcome, encouraging results for MRI-based survival analysis have been published for different datasets. We assess whether and how established radiomic approaches as well as novel methods can predict overall survival of brain tumor patients on the BraTS challenge dataset. This dataset consists of multimodal preoperative images of 211 glioblastoma patients from several institutions with reported resection status and known survival. In the official challenge setting, only patients with a reported gross total resection are taken into account. We therefore evaluated previously published methods as well as different machine learning approaches on the BraTS dataset. For different types of resection status, these approaches are compared to a baseline: a linear regression on patient age only. This naive approach placed 3rd out of 26 participants in the BraTS survival prediction challenge 2018. Previously published radiomic signatures show significant correlations with, and predictiveness of, patient survival for patients with a reported subtotal resection. However, for patients with reported gross total resection, none of the evaluated approaches was able to outperform the age-only baseline in a cross-validation setting, explaining the poor performance of radiomics-based approaches in the BraTS challenge 2018.
Journal Article
Medical large language models are susceptible to targeted misinformation attacks
2024
Large language models (LLMs) have broad medical knowledge and can reason about medical information across many domains, holding promising potential for diverse medical applications in the near future. In this study, we demonstrate a concerning vulnerability of LLMs in medicine. Through targeted manipulation of just 1.1% of the weights of the LLM, we can deliberately inject incorrect biomedical facts. The erroneous information is then propagated in the model’s output while maintaining performance on other biomedical tasks. We validate our findings in a set of 1025 incorrect biomedical facts. This peculiar susceptibility raises serious security and trustworthiness concerns for the application of LLMs in healthcare settings. It accentuates the need for robust protective measures, thorough verification mechanisms, and stringent management of access to these models, ensuring their reliable and safe use in medical practice.
Journal Article
Multiphase CT-based prediction of Child-Pugh classification: a machine learning approach
by Kuhl, Christiane K.; Thüring, Johannes; Merhof, Dorit
in Aged, Algorithms, Artificial intelligence
2020
Background
To evaluate whether machine learning algorithms allow the prediction of Child-Pugh classification on clinical multiphase computed tomography (CT).
Methods
A total of 259 patients who underwent diagnostic abdominal CT (unenhanced, contrast-enhanced arterial, and venous phases) were included in this retrospective study. Child-Pugh scores were determined based on laboratory and clinical parameters. Linear regression (LR), Random Forest (RF), and convolutional neural network (CNN) algorithms were used to predict the Child-Pugh class. Their performances were compared to the prediction of experienced radiologists (ERs). Spearman correlation coefficients and accuracy were assessed for all predictive models. Additionally, a binary classification in low disease severity (Child-Pugh class A) and advanced disease severity (Child-Pugh class ≥ B) was performed.
Results
Eleven imaging features exhibited a significant correlation with Child-Pugh class after adjustment for multiple comparisons. Significant correlations between predicted and measured Child-Pugh classes were observed (ρLR = 0.35, ρRF = 0.32, ρCNN = 0.51, ρERs = 0.60; p < 0.001). Significantly better accuracies for the prediction of Child-Pugh classes versus the no-information rate were found for CNN and ERs (p ≤ 0.034), but not for LR and RF (p ≥ 0.384). For binary severity classification, the area under the curve at receiver operating characteristic analysis was significantly lower (p ≤ 0.042) for LR (0.71) and RF (0.69) than for CNN (0.80) and ERs (0.76), with no significant difference between CNN and ERs (p = 0.144).
Conclusions
The performance of a CNN in assessing Child-Pugh class based on multiphase abdominal CT images is comparable to that of ERs.
Journal Article
Reliability as a Precondition for Trust—Segmentation Reliability Analysis of Radiomic Features Improves Survival Prediction
2022
Machine learning results based on radiomic analysis are often not transferable. A potential reason for this is the variability of radiomic features due to varying human-made segmentations. The aim of this study was therefore to provide a comprehensive inter-reader reliability analysis of radiomic features in five clinical image datasets and to assess the association between inter-reader reliability and survival prediction. We analyzed 4598 tumor segmentations in both computed tomography and magnetic resonance imaging data. We used a neural network to generate 100 additional segmentation outlines for each tumor and performed a reliability analysis of the radiomic features. To demonstrate clinical utility, we predicted patient survival based on all features and on the most reliable features only. Survival prediction models for both computed tomography and magnetic resonance imaging datasets showed less statistical spread and superior survival prediction when based on the most reliable features. Mean concordance indices were Cmean = 0.58 [most reliable] vs. Cmean = 0.56 [all] (p < 0.001, CT) and Cmean = 0.58 vs. Cmean = 0.57 (p = 0.23, MRI). Thus, a preceding reliability analysis and selection of the most reliable radiomic features improves the underlying model's ability to predict patient survival across clinical imaging modalities and tumor entities.
Journal Article