Catalogue Search | MBRL
178 result(s) for "Bischl, Bernd"
Deep learning for survival analysis: a review
by Bender, Andreas; Sonabend, Raphael; Kopper, Philipp
in Artificial Intelligence, Computer Science, Data
2024
The influx of deep learning (DL) techniques into the field of survival analysis in recent years has led to substantial methodological progress; for instance, learning from unstructured or high-dimensional data such as images, text, or omics data. In this work, we conduct a comprehensive systematic review of DL-based methods for time-to-event analysis, characterizing them according to both survival- and DL-related attributes. In summary, the reviewed methods often address only a small subset of tasks relevant to time-to-event data, e.g., single-risk right-censored data, and neglect to incorporate more complex settings. Our findings are summarized in an editable, open-source, interactive table: https://survival-org.github.io/DL4Survival. As this research area is advancing rapidly, we encourage community contribution in order to keep this database up to date.
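As a concrete illustration of the single-risk right-censored setting the review highlights, the sketch below (ours, not code from the paper) computes the negative log Cox partial likelihood that many DL survival methods use as a training loss; the function name and the vectorized log-sum-exp form are our own choices, and ties in event times are handled naively.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood for single-risk right-censored data.
    risk_scores: model outputs (log-hazards); events: 1 = observed, 0 = censored."""
    order = np.argsort(-np.asarray(times, dtype=float))  # sort by descending time
    r = np.asarray(risk_scores, dtype=float)[order]
    e = np.asarray(events, dtype=bool)[order]
    # running log-sum-exp over the descending-time prefix = log of the
    # risk-set denominator at each subject's time
    log_cum = np.logaddexp.accumulate(r)
    # only observed events contribute a term; censored subjects only
    # appear in the denominators of earlier events
    return float(-np.sum(r[e] - log_cum[e]))
```

With two subjects, equal risk scores, and both events observed, the loss reduces to log 2, which is a quick sanity check on the risk-set construction.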
Journal Article
Grouped feature importance and combined features effect plot
2022
Interpretable machine learning has become a very active area of research due to the rising popularity of machine learning algorithms and their inherently challenging interpretability. Most work in this area has been focused on the interpretation of single features in a model. However, for researchers and practitioners, it is often equally important to quantify the importance or visualize the effect of feature groups. To address this research gap, we provide a comprehensive overview of how existing model-agnostic techniques can be defined for feature groups to assess the grouped feature importance, focusing on permutation-based, refitting, and Shapley-based methods. We also introduce an importance-based sequential procedure that identifies a stable and well-performing combination of features in the grouped feature space. Furthermore, we introduce the combined features effect plot, which is a technique to visualize the effect of a group of features based on a sparse, interpretable linear combination of features. We used simulation studies and real data examples to analyze, compare, and discuss these methods.
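To make the permutation-based grouped importance idea concrete, here is a minimal sketch (our own illustration, not the paper's implementation): all columns of a group are permuted jointly with the same row order, and the resulting drop in performance relative to the unpermuted baseline is averaged over repetitions. The function and parameter names are our assumptions; `X` is assumed to be a NumPy array.

```python
import numpy as np

def grouped_permutation_importance(model, X, y, groups, metric, n_repeats=5, rng=None):
    """Permutation importance for feature *groups*: permute all columns of a
    group together and measure the increase in the error metric."""
    rng = np.random.default_rng(rng)
    baseline = metric(y, model.predict(X))
    importances = {}
    for name, cols in groups.items():
        scores = []
        for _ in range(n_repeats):
            Xp = X.copy()
            idx = rng.permutation(len(X))
            Xp[:, cols] = X[idx][:, cols]  # same row order for the whole group
            scores.append(metric(y, model.predict(Xp)) - baseline)
        importances[name] = float(np.mean(scores))
    return importances
```

Permuting the columns jointly preserves the dependence structure *within* the group, which is what distinguishes grouped importance from summing single-feature importances.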
Journal Article
High-Resolution Motor State Detection in Parkinson’s Disease Using Convolutional Neural Networks
by Um, Terry Taewoong; Fietzek, Urban M.; Abedinpour, Kian
in 631/378/116/2392, 692/617/375/346/1718, Aged
2020
Patients with advanced Parkinson’s disease regularly experience unstable motor states. Objective and reliable monitoring of these fluctuations is an unmet need. We used deep learning to classify motion data from a single wrist-worn IMU sensor recording in unscripted environments. For validation purposes, patients were accompanied by a movement disorder expert, and their motor state was passively evaluated every minute. We acquired a dataset of 8,661 minutes of IMU data from 30 patients, with annotations about the motor state (OFF, ON, DYSKINETIC) based on the MDS-UPDRS global bradykinesia item and the AIMS upper limb dyskinesia item. Using a 1-minute window size as an input for a convolutional neural network trained on data from a subset of patients, we achieved a three-class balanced accuracy of 0.654 on data from previously unseen subjects. This corresponds to detecting the OFF, ON, or DYSKINETIC motor state at a sensitivity/specificity of 0.64/0.89, 0.67/0.67, and 0.64/0.89, respectively. On average, the model outputs were highly correlated with the annotation on a per-subject scale (r = 0.83/0.84; p < 0.0001), and remained so for the highly resolved time windows of 1 minute (r = 0.64/0.70; p < 0.0001). Thus, we demonstrate the feasibility of long-term motor-state detection in a free-living setting with deep learning, using motion data from a single IMU.
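The three-class balanced accuracy reported above is simply the mean per-class recall; a minimal sketch (ours, not the authors' evaluation code):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: each class contributes equally, regardless of
    how many minutes of data fall into it."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))
```

This metric is the natural choice here because motor states such as DYSKINETIC are typically much rarer than ON, so plain accuracy would overstate performance.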
Journal Article
Monitoring Forest Health Using Hyperspectral Imagery: Does Feature Selection Improve the Performance of Machine-Learning Techniques?
by Schratz, Patrick; Brenning, Alexander; Muenchow, Jannes
in Algorithms, Constraint modelling, data collection
2021
This study analyzed highly correlated, feature-rich datasets from hyperspectral remote sensing data using multiple statistical and machine-learning methods. The effect of filter-based feature selection methods on predictive performance was compared. In addition, the effect of multiple expert-based and data-driven feature sets, derived from the reflectance data, was investigated. Defoliation of trees (%), derived from in situ measurements from fall 2016, was modeled as a function of reflectance. Variable importance was assessed using permutation-based feature importance. Overall, the support vector machine (SVM) outperformed other algorithms, such as random forest (RF), extreme gradient boosting (XGBoost), and lasso (L1) and ridge (L2) regressions by at least three percentage points. The combination of certain feature sets showed small increases in predictive performance, while no substantial differences between individual feature sets were observed. For some combinations of learners and feature sets, filter methods achieved better predictive performances than using no feature selection. Ensemble filters did not have a substantial impact on performance. The most important features were located around the red edge. Additional features in the near-infrared region (800–1000 nm) were also essential to achieve the overall best performances. Filter methods have the potential to be helpful in high-dimensional situations and are able to improve the interpretation of feature effects in fitted models, which is an essential constraint in environmental modeling studies. Nevertheless, more training data and replication in similar benchmarking studies are needed to be able to generalize the results.
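Filter-based feature selection of the kind benchmarked here scores each feature independently of any learner. As one illustrative example (our sketch, not the study's implementation), a Pearson-correlation filter keeps the k spectral bands most correlated with defoliation:

```python
import numpy as np

def correlation_filter(X, y, k):
    """Rank features by absolute Pearson correlation with the target and
    return the column indices of the top k (learner-independent filter)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # guard against zero-variance columns (score becomes 0 there)
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0.0, 1.0, denom)
    return np.argsort(scores)[::-1][:k]
```

Because the score ignores the downstream learner, such filters are cheap in high-dimensional hyperspectral settings, which is exactly the regime in which the study finds them potentially helpful.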
Journal Article
Distributed non-disclosive validation of predictive models by a modified ROC-GLM
by Hoffmann, Verena S.; Bischl, Bernd; Rehms, Raphael
in Algorithms, Area Under Curve, Area under the ROC curve
2024
Background
Distributed statistical analyses provide a promising approach for privacy protection when analyzing data distributed over several databases. Instead of directly operating on data, the analyst receives anonymous summary statistics, which are combined into an aggregated result. Further, in the development of discrimination models (prognosis, diagnosis, etc.), it is key to evaluate a trained model with respect to its prognostic or predictive performance on new independent data. For binary classification, discrimination is quantified using the receiver operating characteristic (ROC) curve and its area under the curve (AUC) as an aggregate measure. We aim to calculate both, as well as basic indicators of calibration-in-the-large, for a binary classification task using a distributed and privacy-preserving approach.
Methods
We employ DataSHIELD as the technology to carry out distributed analyses, and we use a newly developed algorithm to validate the prediction score by conducting distributed and privacy-preserving ROC analysis. Calibration curves are constructed from mean values over sites. The determination of the ROC curve and its AUC is based on a generalized linear model (GLM) approximation of the true ROC curve, the ROC-GLM, as well as on ideas of differential privacy (DP). DP adds noise (quantified by the ℓ₂-sensitivity Δ₂(f̂)) to the data and enables a global handling of placement numbers. The impact of the DP parameters was studied by simulations.
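The DP step described above adds noise scaled to the ℓ₂-sensitivity. As a minimal sketch (our illustration of the classic analytic Gaussian mechanism, not the paper's actual placement-number handling, which is more involved):

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng=None):
    """(epsilon, delta)-differentially private release of a numeric query:
    Gaussian noise whose scale grows with the l2-sensitivity of the query.
    The classic calibration below is valid for epsilon <= 1."""
    rng = np.random.default_rng(rng)
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))
```

The trade-off the abstract reports follows directly from the formula: larger sensitivity or stricter (smaller) epsilon means a larger sigma, and hence a noisier distributed AUC estimate.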
Results
In our simulation scenario, the true and distributed AUC measures differ by ΔAUC < 0.01, depending heavily on the choice of the differential privacy parameters. It is recommended to check the accuracy of the distributed AUC estimator in specific simulation scenarios along with a reasonable choice of DP parameters. Here, the accuracy of the distributed AUC estimator may be impaired by too much artificial noise added from DP.
Conclusions
The applicability of our algorithms depends on the ℓ₂-sensitivity Δ₂(f̂) of the underlying statistical/predictive model. The simulations carried out have shown that the approximation error is acceptable for the majority of simulated cases. For models with a high Δ₂(f̂), the privacy parameters must be set accordingly higher to ensure sufficient privacy protection, which affects the approximation error. This work shows that complex measures, such as the AUC, are applicable for validation in distributed setups while preserving an individual’s privacy.
Journal Article
A self-supervised deep learning method for data-efficient training in genomics
2023
Deep learning in bioinformatics is often limited to problems where extensive amounts of labeled data are available for supervised classification. By exploiting unlabeled data, self-supervised learning techniques can improve the performance of machine learning models in the presence of limited labeled data. Although many self-supervised learning methods have been suggested before, they have failed to exploit the unique characteristics of genomic data. Therefore, we introduce Self-GenomeNet, a self-supervised learning technique that is custom-tailored for genomic data. Self-GenomeNet leverages reverse-complement sequences and effectively learns short- and long-term dependencies by predicting targets of different lengths. Self-GenomeNet performs better than other self-supervised methods in data-scarce genomic tasks and outperforms standard supervised training with ~10 times fewer labeled training data. Furthermore, the learned representations generalize well to new datasets and tasks. These findings suggest that Self-GenomeNet is well suited for large-scale, unlabeled genomic datasets and could substantially improve the performance of genomic models.
Self-GenomeNet, a self-supervised learning technique, is trained by predicting unlabeled reverse-complement genome sequences of different lengths and improves the performance of models substantially when a limited amount of labeled data is available.
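The reverse-complement pairing that Self-GenomeNet exploits as a self-supervised signal is easy to state in code; a minimal sketch (ours, not the paper's pipeline):

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement of a DNA sequence: complement each base
    (A<->T, C<->G), then reverse the whole string."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]
```

Because double-stranded DNA encodes the same information on both strands, a sequence and its reverse complement form a natural "free" target pair for self-supervised pretraining, with no labels required.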
Journal Article
Predicting instructed simulation and dissimulation when screening for depressive symptoms
by Stachl, Clemens; Bühner, Markus; Sarubin, Nina
in Algorithms, Artificial intelligence, Clinical trials
2020
The intentional distortion of test results presents a fundamental problem for self-report-based psychiatric assessment, such as screening for depressive symptoms. The first objective of the study was to clarify whether depressed patients, like healthy controls, possess both the cognitive ability and the motivation to deliberately influence results of commonly used screening measures. The second objective was the construction of a method, derived directly from within the test takers’ responses, to systematically detect faking behavior. Supervised machine learning algorithms offer the potential to empirically learn the implicit interconnections between responses, which shape detectable faking patterns. In a standardized design, faking bad and faking good were experimentally induced in a matched sample of 150 depressed and 150 healthy subjects. Participants completed commonly used questionnaires to detect depressive and associated symptoms. Group differences throughout experimental conditions were evaluated using linear mixed models. Machine learning algorithms were trained on the test results and compared regarding their capacity to systematically predict distortions in response behavior in two scenarios: (1) differentiation of authentic patient responses from simulated responses of healthy participants; (2) differentiation of authentic patient responses from dissimulated patient responses. Statistically significant convergence of the test scores in both faking conditions suggests that both depressive patients and healthy controls have the cognitive ability as well as the motivational compliance to alter their test results. Evaluation of the algorithmic capability to detect faking behavior yielded ideal predictive accuracies of up to 89%. Implications of the findings, as well as future research objectives, are discussed. Trial registration: The study was pre-registered at the German registry for clinical trials (Deutsches Register klinischer Studien, DRKS; DRKS00007708).
Journal Article
Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach
by König, Gunnar; Molnar, Christoph; Casalicchio, Giuseppe
in Extrapolation, Machine learning, Permutations
2024
The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation. A possible remedy lies in more advanced conditional PFI approaches that enable the assessment of feature importance conditional on all other features. Due to this shift in perspective and in order to enable correct interpretations, it is beneficial if the conditioning is transparent and comprehensible. In this paper, we propose a new sampling mechanism for the conditional distribution based on permutations in conditional subgroups. As these subgroups are constructed using tree-based methods such as transformation trees, the conditioning becomes inherently interpretable. This not only provides a simple and effective estimator of conditional PFI, but also local PFI estimates within the subgroups. In addition, we apply the conditional subgroups approach to partial dependence plots, a popular method for describing feature effects that can also suffer from extrapolation when features are dependent and interactions are present in the model. In simulations and a real-world application, we demonstrate the advantages of the conditional subgroup approach over existing methods: It allows computing conditional PFI that is more faithful to the data than existing proposals, and it enables a fine-grained interpretation of feature effects and importance within the conditional subgroups.
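The core sampling idea can be sketched in a few lines (our illustration; in the paper the subgroups are derived from transformation trees, whereas here they are passed in as precomputed labels): the feature is permuted only *within* each subgroup, so its permuted values remain consistent with the features it depends on.

```python
import numpy as np

def conditional_permutation(x, group_ids, rng=None):
    """Permute feature values only within subgroups, avoiding the
    extrapolation that a global permutation would cause for dependent features."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x).copy()
    for g in np.unique(group_ids):
        idx = np.where(group_ids == g)[0]
        x[idx] = x[idx[rng.permutation(len(idx))]]  # shuffle inside the group only
    return x
```

Replacing the global permutation in PFI with this conditional one yields the conditional PFI estimator; applying it per subgroup additionally gives the local importances the abstract mentions.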
Journal Article
Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features
by Pargent, Florian; Pfisterer, Florian; Bischl, Bernd
in Algorithms, Best practice, Data analysis
2022
Since most machine learning (ML) algorithms are designed for numerical inputs, efficiently encoding categorical variables is a crucial aspect of data analysis. A common problem is high-cardinality features, i.e., unordered categorical predictor variables with a high number of levels. We study techniques that yield numeric representations of categorical variables which can then be used in subsequent ML applications. We focus on the impact of these techniques on a subsequent algorithm’s predictive performance, and, if possible, derive best practices on when to use which technique. We conducted a large-scale benchmark experiment, where we compared different encoding strategies together with five ML algorithms (lasso, random forest, gradient boosting, k-nearest neighbors, support vector machine) using datasets from regression, binary-, and multiclass-classification settings. In our study, regularized versions of target encoding (i.e., using target predictions based on the feature levels in the training set as a new numerical feature) consistently provided the best results. Traditionally widely used encodings that make unreasonable assumptions to map levels to integers (e.g., integer encoding) or to reduce the number of levels (possibly based on target information, e.g., leaf encoding) before creating binary indicator variables (one-hot or dummy encoding) were not as effective in comparison.
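A minimal sketch of regularized (smoothed) target encoding as described above (our illustration, not the benchmark's code): each level is mapped to its training-set target mean, shrunken toward the global mean so that rare levels are regularized rather than memorized. The `smoothing` parameter name is our assumption.

```python
import numpy as np

def target_encode(levels, y, smoothing=10.0):
    """Map each categorical level to a shrunken target mean:
    (sum of y in level + smoothing * global mean) / (count + smoothing)."""
    levels = np.asarray(levels)
    y = np.asarray(y, dtype=float)
    global_mean = y.mean()
    encoding = {}
    for lv in np.unique(levels):
        mask = levels == lv
        n = mask.sum()
        encoding[lv] = (y[mask].sum() + smoothing * global_mean) / (n + smoothing)
    # unseen levels at prediction time fall back to the global mean
    return np.array([encoding.get(lv, global_mean) for lv in levels]), encoding
```

In practice the level means should be estimated out-of-fold (or with cross-validation) to avoid target leakage, which is part of what "regularized" covers in the study.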
Journal Article
Optimized model architectures for deep learning on genomic data
2024
The success of deep learning in various applications depends on task-specific architecture design choices, including the types, hyperparameters, and number of layers. In computational biology, there is no consensus on the optimal architecture design, and decisions are often made using insights from more well-established fields such as computer vision. These may not consider the domain-specific characteristics of genome sequences, potentially limiting performance. Here, we present GenomeNet-Architect, a neural architecture design framework that automatically optimizes deep learning models for genome sequence data. It optimizes the overall layout of the architecture, with a search space specifically designed for genomics. Additionally, it optimizes hyperparameters of individual layers and the model training procedure. On a viral classification task, GenomeNet-Architect reduced the read-level misclassification rate by 19%, with 67% faster inference and 83% fewer parameters, and achieved similar contig-level accuracy with ~100 times fewer parameters compared to the best-performing deep learning baselines.
Introducing GenomeNet-Architect, a neural architecture design framework that automatically optimizes the overall layout of the architecture, the hyperparameters, and the training procedure of deep learning models for genome sequence data.
Journal Article