LOCAL CASE-CONTROL SAMPLING: EFFICIENT SUBSAMPLING IN IMBALANCED DATA SETS
by Hastie, Trevor; Fithian, William
in 62D05 / 62F10 / case-control sampling / Cereal foods / Classification / Conditional probabilities / Consistent estimators / Datasets / Estimators / Logistic regression / Mathematical problems / Modeling / Parameter estimation / Regression analysis / Sampling / Sampling bias / Statistical variance / Studies / subsampling
2014
Journal Article
Overview
For classification problems with significant class imbalance, subsampling can reduce computational costs at the price of inflated variance in estimating model parameters. We propose a method for subsampling efficiently for logistic regression by adjusting the class balance locally in feature space via an accept-reject scheme. Our method generalizes standard case-control sampling, using a pilot estimate to preferentially select examples whose responses are conditionally rare given their features. The biased subsampling is corrected by a post-hoc analytic adjustment to the parameters. The method is simple and requires one parallelizable scan over the full data set. Standard case-control sampling is inconsistent under model misspecification for the population risk-minimizing coefficients θ*. By contrast, our estimator is consistent for θ* provided that the pilot estimate is. Moreover, under correct specification and with a consistent, independent pilot estimate, our estimator has exactly twice the asymptotic variance of the full-sample MLE, even if the selected subsample comprises a minuscule fraction of the full data set, as happens when the original data are severely imbalanced. The factor of two improves to $1 + \frac{1}{c}$ if we multiply the baseline acceptance probabilities by c > 1 (and weight points with acceptance probability greater than 1), taking roughly $\frac{1 + c}{2}$ times as many data points into the subsample. Experiments on simulated and real data show that our method can substantially outperform standard case-control subsampling.
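The accept-reject scheme and the post-hoc adjustment described above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: each point (x, y) is accepted with probability |y − p̃(x)|, where p̃ is the pilot model's fitted probability, and the adjustment adds the pilot coefficients back to the coefficients fitted on the subsample. The helper names (`fit_logistic`, `local_case_control`, `simulate`), the simulated data, and the choice of an independent pilot batch are illustrative and not taken from the article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson fit of an unpenalized logistic regression (illustrative helper)."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (y - p)
        # Tiny jitter keeps the linear solve stable; the gradient is unpenalized,
        # so the fixed point is still the maximum-likelihood estimate.
        hess = (X * (p * (1.0 - p))[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        theta = theta + np.linalg.solve(hess, grad)
    return theta

def local_case_control(X, y, theta_pilot, rng):
    """One pass over (X, y): accept-reject subsampling, fit, then adjust.

    Assumed acceptance probability: |y - p_pilot(x)|, so points whose label is
    surprising under the pilot model are kept preferentially.
    """
    p_pilot = sigmoid(X @ theta_pilot)
    accept_prob = np.abs(y - p_pilot)
    keep = rng.random(len(y)) < accept_prob      # independent accept-reject draws
    theta_sub = fit_logistic(X[keep], y[keep])   # fit on the subsample only
    return theta_sub + theta_pilot, keep         # assumed post-hoc adjustment

def simulate(n, theta, rng):
    """Draw data from a correctly specified logistic model with an intercept."""
    X = np.column_stack([np.ones(n), rng.standard_normal((n, len(theta) - 1))])
    y = (rng.random(n) < sigmoid(X @ theta)).astype(float)
    return X, y

rng = np.random.default_rng(0)
theta_true = np.array([-4.0, 1.0, -0.5])         # strongly imbalanced classes

# Pilot estimate from a small independent batch (an assumption of this sketch).
X_pilot, y_pilot = simulate(20_000, theta_true, rng)
theta_pilot = fit_logistic(X_pilot, y_pilot)

# Full data set: scanned once, with only the accepted points kept for fitting.
X, y = simulate(200_000, theta_true, rng)
theta_hat, keep = local_case_control(X, y, theta_pilot, rng)

print("fraction of data kept:", keep.mean())
print("estimate:", theta_hat)
```

Because each acceptance decision depends only on the point itself and the pilot coefficients, the scan can be split across workers. In this simulation the retained fraction is small since most points are well predicted by the pilot model and are therefore rarely accepted.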