Improving random forest predictions in small datasets from two-phase sampling designs
by Fong, Youyi; Han, Sunwoo; Williamson, Brian D.
in Algorithms / Antibodies / Balancing / Biological markers / Biomarkers / Body mass index / Case–control design / Class imbalance / Classification / Datasets / Gene expression / Generalized linear models / Health aspects / Health Informatics / HIV / HIV vaccine / Human immunodeficiency virus / Humans / Infections / Information Systems and Communication Service / Learning algorithms / Lymphocytes / Machine Learning / Management of Computing and Information Systems / Markers / Medicine / Medicine & Public Health / Methods / Performance prediction / Predictions / Probability / Research Article / Sampling / Sampling designs / Screening / Selection bias / Stacking / Statistical models / Testing / Tuning / Vaccine Efficacy / Vaccines / Variable screening / Weighting
2021
Journal Article
Overview
Background
While random forests are one of the most successful machine learning methods, it is necessary to optimize their performance for use with datasets resulting from a two-phase sampling design with a small number of cases—a common situation in biomedical studies, which often have rare outcomes and covariates whose measurement is resource-intensive.
Methods
Using an immunologic marker dataset from a phase III HIV vaccine efficacy trial, we seek to optimize random forest prediction performance using combinations of variable screening, class balancing, weighting, and hyperparameter tuning.
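As an illustration of one of the techniques named above, class balancing can be done by undersampling the majority class. The abstract does not say which balancing scheme the authors used, so the following is only a minimal sketch under that assumption. The function name `undersample_controls`, the `ratio` parameter, and the toy labels are hypothetical.

```python
import random


def undersample_controls(labels, ratio=1, seed=0):
    """Return sorted indices that keep every case (label 1) and at most
    `ratio` randomly chosen controls (label 0) per case.

    Assumption: balancing by undersampling; the source does not specify
    the scheme actually used.
    """
    rng = random.Random(seed)  # local generator, so the result is reproducible
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    n_keep = min(len(controls), ratio * len(cases))
    kept_controls = rng.sample(controls, n_keep)
    return sorted(cases + kept_controls)


# Hypothetical toy outcome vector: 2 cases among 12 participants.
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
idx = undersample_controls(labels, ratio=1, seed=0)
balanced = [labels[i] for i in idx]
```

Here `balanced` contains both cases and two controls. A learner would then be trained on the rows selected by `idx`.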
Results
Our experiments show that class balancing improves random forest prediction performance when variable screening is not applied but degrades it when variable screening is applied. The impact of weighting similarly depends on whether variable screening is applied. Hyperparameter tuning is ineffective in situations with small sample sizes. We further show that random forests underperform generalized linear models for some subsets of markers, that prediction performance on this dataset can be improved by stacking random forests and generalized linear models trained on different subsets of predictors, and that the extent of improvement depends critically on the dissimilarities between candidate learner predictions.
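The dependence of stacking gains on dissimilarity between learners can be seen in a small sketch. It assumes the simplest form of stacking, a convex combination of two learners' predicted probabilities with the mixing weight chosen by grid search on the Brier score. The abstract does not describe the stacking procedure, so this is an assumed variant, and the predictions below are made-up numbers rather than results from the paper.

```python
def brier(preds, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def stack_two(p_a, p_b, y, steps=100):
    """Grid-search alpha in [0, 1] minimizing the Brier score of
    alpha * p_a + (1 - alpha) * p_b. Returns (best_alpha, best_loss)."""
    best_alpha, best_loss = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        blend = [alpha * a + (1 - alpha) * b for a, b in zip(p_a, p_b)]
        loss = brier(blend, y)
        if loss < best_loss:
            best_alpha, best_loss = alpha, loss
    return best_alpha, best_loss


# Hypothetical held-out outcomes and predictions from two learners.
y = [1, 0, 1, 0]
p_rf = [0.9, 0.4, 0.5, 0.1]   # stand-in for random forest predictions
p_glm = [0.5, 0.1, 0.9, 0.4]  # stand-in for generalized linear model predictions

alpha, stacked_loss = stack_two(p_rf, p_glm, y)
# Stacking a learner with itself cannot improve on that learner.
_, self_loss = stack_two(p_rf, p_rf, y)
```

In this toy case the two learners err on different observations, so the blend scores better than either alone, while blending a learner with itself changes nothing. In practice the mixing weight would be fitted on cross-validated predictions to avoid overfitting.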
Conclusion
In small datasets from two-phase sampling designs, variable screening and inverse sampling probability weighting are important for achieving good prediction performance of random forests. In addition, stacking random forests and simple linear models can offer improvements over random forests.
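Inverse sampling probability weighting can be sketched as follows, assuming a design in which phase-two sampling is stratified and each stratum's sampling probability is estimated as the fraction of its phase-one members that were sampled. The helper `ipw_weights`, the stratum names, and the counts are hypothetical and are not taken from the trial.

```python
def ipw_weights(phase1_counts, phase2_counts):
    """Map each stratum to N_stratum / n_stratum, the inverse of its
    estimated phase-two sampling probability."""
    weights = {}
    for stratum, n_phase1 in phase1_counts.items():
        n_phase2 = phase2_counts[stratum]
        if n_phase2 == 0:
            raise ValueError(f"no phase-two samples in stratum {stratum!r}")
        weights[stratum] = n_phase1 / n_phase2
    return weights


# Hypothetical counts: every case is sampled, 1 in 10 controls is sampled.
phase1 = {"case": 40, "control": 2000}
phase2 = {"case": 40, "control": 200}

w = ipw_weights(phase1, phase2)
# The weighted phase-two sample recovers the phase-one cohort size.
weighted_total = sum(w[s] * phase2[s] for s in phase2)
```

Each sampled control then stands in for ten cohort members and each case for one. Such weights could be supplied as per-observation weights to any learner that accepts them.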
Publisher
BioMed Central, BioMed Central Ltd, Springer Nature B.V., BMC