Catalogue Search | MBRL
390 result(s) for "two-sample test"
Simulation data for the analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research
2020
Objectives
The data presented herein represents the simulated datasets of a recently conducted larger study which investigated the behaviour of Bayesian indices of significance and effect size as alternatives to traditional p-values. The study considered the setting of Student’s and Welch’s two-sample t-test often used in medical research. It investigated the influence of the sample size, noise, the selected prior hyperparameters and the sensitivity to type I errors. The posterior indices used included the Bayes factor, the region of practical equivalence, the probability of direction, the MAP-based p-value and the e-value in the Full Bayesian Significance Test. The simulation study was conducted in the statistical programming language R.
Data description
The R script files for simulation of the datasets used in the study are presented in this article. These script files can both simulate the raw datasets and run the analyses. As researchers may be faced with different effect sizes, noise levels or priors in their domain than the ones studied in the original paper, the scripts extend the original results by allowing users to recreate all analyses of interest in different contexts. Therefore, they should be relevant to other researchers.
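As a point of reference for the frequentist test named in this abstract, the following is a minimal Python sketch of Welch's two-sample t statistic and its Welch-Satterthwaite degrees of freedom. It is not taken from the authors' R scripts; the function name `welch_t` is an illustrative assumption, and none of the Bayesian indices are computed here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Illustrative helper (hypothetical name); not part of the catalogued study.
    """
    n_x, n_y = len(x), len(y)
    # Squared standard error of each group mean, from unbiased sample variances
    se2_x = variance(x) / n_x
    se2_y = variance(y) / n_y
    # Test statistic: difference of means over the unpooled standard error
    t = (mean(x) - mean(y)) / sqrt(se2_x + se2_y)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2_x + se2_y) ** 2 / (
        se2_x ** 2 / (n_x - 1) + se2_y ** 2 / (n_y - 1)
    )
    return t, df
```

Unlike Student's version, the statistic does not pool the two variances, which is why the degrees of freedom are generally not an integer.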
Journal Article
Weighted Logrank Permutation Tests for Randomly Right Censored Life Science Data
by Brendel, Michael; Janssen, Arnold; Mayer, Claus-Dieter
in Asymptotic properties, Biomedical research, Censored data
2014
In biomedical research, weighted logrank tests are frequently applied to compare two samples of randomly right censored survival times. We address the question of how to combine a number of weighted logrank statistics to achieve good power of the corresponding survival test for a whole linear space or cone of alternatives, which are given by hazard rates. This leads to a new class of semiparametric projection tests that are motivated by likelihood ratio tests for an asymptotic model. We show that these tests can be carried out as permutation tests and discuss their asymptotic properties. A simulation study together with the analysis of a classical data set illustrates the advantages.
Journal Article
Lack-of-fit tests for quantile regression models
2019
In a novel approach, the paper transforms lack-of-fit tests for parametric quantile regression models into checking the equality of two conditional distributions of covariates. Accordingly, by applying some successful two-sample test statistics in the literature, two tests are constructed to check the lack of fit for low and high dimensional quantile regression models. The low dimensional test works well when the number of covariates is moderate, whereas the high dimensional test can maintain the power when the number of covariates exceeds the sample size. The null distribution of the high dimensional test has an explicit form, and the p-values or critical values can then be calculated directly. The finite sample performance of the tests proposed is examined by simulation studies, and their usefulness is further illustrated by two real examples.
Journal Article
Direction-Projection-Permutation for High-Dimensional Hypothesis Tests
by Wei, Susan; Wichers, Lindsay; Lee, Chihoon
in Assignment problem, Asymptotic methods, Classification
2016
High-dimensional low sample size (HDLSS) data are becoming increasingly common in statistical applications. When the data can be partitioned into two classes, a basic task is to construct a classifier that can assign objects to the correct class. Binary linear classifiers have been shown to be especially useful in HDLSS settings and preferable to more complicated classifiers because of their ease of interpretability. We propose a computational tool called direction-projection-permutation (DiProPerm), which rigorously assesses whether a binary linear classifier is detecting statistically significant differences between two high-dimensional distributions. The basic idea behind DiProPerm involves working directly with the one-dimensional projections of the data induced by a binary linear classifier. Theoretical properties of DiProPerm are studied under the HDLSS asymptotic regime whereby dimension diverges to infinity while sample size remains fixed. We show that certain variations of DiProPerm are consistent and that consistency is a nontrivial property of tests in the HDLSS asymptotic regime. The practical utility of DiProPerm is demonstrated on HDLSS gene expression microarray datasets. Finally, an empirical power study is conducted comparing DiProPerm to several alternative two-sample HDLSS tests to understand the advantages and disadvantages of each method.
Journal Article
Analysis of type I and II error rates of Bayesian and frequentist parametric and nonparametric two-sample hypothesis tests under preliminary assessment of normality
2021
Testing for differences between two groups is among the most frequently carried out statistical methods in empirical research. The traditional frequentist approach is to make use of null hypothesis significance tests which use p values to reject a null hypothesis. Recently, a lot of research has emerged which proposes Bayesian versions of the most common parametric and nonparametric frequentist two-sample tests. These proposals include Student’s two-sample t-test and its nonparametric counterpart, the Mann–Whitney U test. In this paper, the underlying assumptions, models and their implications for practical research of recently proposed Bayesian two-sample tests are explored and contrasted with the frequentist solutions. An extensive simulation study is provided, the results of which demonstrate that the proposed Bayesian tests achieve better type I error control at slightly increased type II error rates. These results are important, because balancing the type I and II errors is a crucial goal in a variety of research, and shifting towards the Bayesian two-sample tests while simultaneously increasing the sample size yields smaller type I error rates. What is more, the results highlight that the differences in type II error rates between frequentist and Bayesian two-sample tests depend on the magnitude of the underlying effect.
Journal Article
An Automatic Test for the Umbrella Alternatives
2016
The paper proposes a new test for detecting the umbrella pattern under a general non-parametric scheme. The alternative asserts that the umbrella ordering holds while the hypothesis is its complement. The main focus is put on controlling the power function of the test outside the alternative. As a result, the asymptotic error of the first kind of the constructed solution is smaller than or equal to the fixed significance level α on the whole set where the umbrella ordering does not hold. Also, under finite sample sizes, this error is controlled to a satisfactory extent. A simulation study shows, among other things, that the new test improves upon the solution widely recommended in the literature of the subject. A routine, written in R, is attached as the Supporting Information file.
Journal Article
Non-Parametric Change-Point Tests for Long-Range Dependent Data
by Dehling, Herold; Rooch, Aeneas; Taqqu, Murad S.
in Asymptotic properties, Change-point problem, Critical values
2013
We propose a non-parametric change-point test for long-range dependent data, which is based on the Wilcoxon two-sample test. We derive the asymptotic distribution of the test statistic under the null hypothesis that no change occurred. In a simulation study, we compare the power of our test with the power of a test which is based on differences of means. The results of the simulation study show that in the case of Gaussian data, our test has only slightly smaller power than the 'difference-of-means' test. For heavy-tailed data, our test outperforms the 'difference-of-means' test.
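To make the idea of a Wilcoxon-based change-point statistic concrete, here is a rough Python sketch. It assumes the statistic takes the form of the maximum over split points k of the absolute centred two-sample Wilcoxon count between the first k and the remaining observations; that form, the function name `wilcoxon_changepoint`, and the absence of any normalisation are assumptions of this sketch. The normalisation and critical values under long-range dependence, which the paper derives, are not reproduced.

```python
def wilcoxon_changepoint(xs):
    """Scan all split points and return (best_k, statistic).

    For each k, counts pairs (i, j) with i < k <= j and xs[i] <= xs[j],
    centred by 1/2 per pair. Unnormalised; illustrative only.
    """
    n = len(xs)
    best_k, best_stat = None, -1.0
    for k in range(1, n):
        # Centred Wilcoxon count between xs[:k] and xs[k:]
        s = sum(
            (1.0 if xs[i] <= xs[j] else 0.0) - 0.5
            for i in range(k)
            for j in range(k, n)
        )
        if abs(s) > best_stat:
            best_k, best_stat = k, abs(s)
    return best_k, best_stat
```

Because the statistic depends on the data only through pairwise comparisons, a single extreme observation cannot dominate it, which is consistent with the abstract's report of better performance on heavy-tailed data. The triple loop is cubic in the series length and is meant for illustration, not for long series.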
Journal Article
Fast Approximation of Small p-Values in Permutation Tests by Partitioning the Permutations
2018
Researchers in genetics and other life sciences commonly use permutation tests to evaluate differences between groups. Permutation tests have desirable properties, including exactness if data are exchangeable, and are applicable even when the distribution of the test statistic is analytically intractable. However, permutation tests can be computationally intensive. We propose both an asymptotic approximation and a resampling algorithm for quickly estimating small permutation p-values (e.g., <10⁻⁷) for the difference and ratio of means in two-sample tests. Our methods are based on the distribution of test statistics within and across partitions of the permutations, which we define. In this article, we present our methods and demonstrate their use through simulations and an application to cancer genomic data. Through simulations, we find that our resampling algorithm is more computationally efficient than another leading alternative, particularly for extremely small p-values (e.g., <10⁻³⁰). Through application to cancer genomic data, we find that our methods can successfully identify up- and down-regulated genes. While we focus on the difference and ratio of means, we speculate that our approaches may work in other settings.
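For context, the baseline that this abstract describes as computationally intensive can be sketched as a plain Monte Carlo permutation test for the difference of means. This is the standard approach, not the authors' partitioning method; the function name `permutation_p_value` and its parameters are illustrative assumptions.

```python
import random
from statistics import mean


def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the difference of means.

    Plain baseline (hypothetical helper); not the partitioning algorithm.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)
    observed = abs(mean(x) - mean(y))
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the group labels by shuffling the pooled sample
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_perm + 1)
```

The smallest value this estimator can return is 1 / (n_perm + 1), so resolving a p-value near 10⁻⁷ would need on the order of ten million shuffles per test. That cost is what motivates faster approximations of the kind the abstract proposes.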
Journal Article
Two-sample homogeneity tests based on divergence measures
2016
The concept of f-divergences introduced by Ali and Silvey (J R Stat Soc (B) 28:131–142, 1966) provides a rich set of distance-like measures between pairs of distributions. Divergences do not focus on certain moments of random variables, but rather consider discrepancies between the corresponding probability density functions. Thus, two-sample tests based on these measures can detect arbitrary alternatives when testing the equality of the distributions. We treat the problem of divergence estimation as well as the subsequent testing for the homogeneity of two samples. In particular, we propose a nonparametric estimator for f-divergences in the case of continuous distributions, which is based on kernel density estimation and spline smoothing. As we show in extensive simulations, the new method performs stably and quite well in comparison to several existing non- and semiparametric divergence estimators. Furthermore, we tackle the two-sample homogeneity problem using permutation tests based on various divergence estimators. The methods are compared to an asymptotic divergence test as well as to several traditional parametric and nonparametric procedures under different distributional assumptions and alternatives in simulations. It turns out that divergence-based methods detect discrepancies between distributions more often than traditional methods if the distributions do not differ in location only. The findings are illustrated on ion mobility spectrometry data.
Journal Article
Two-Sample Hypothesis Testing for Inhomogeneous Random Graphs
by von Luxburg, Ulrike; Carpentier, Alexandra; Gutzeit, Maurilio
in Apexes, Graph theory, Graphs
2020
The study of networks leads to a wide range of high-dimensional inference problems. In many practical applications, one needs to draw inference from one or few large sparse networks. The present paper studies hypothesis testing of graphs in this high-dimensional regime, where the goal is to test between two populations of inhomogeneous random graphs defined on the same set of n vertices. The size of each population m is much smaller than n, and can even be a constant as small as 1. The critical question in this context is whether the problem is solvable for small m.
We answer this question from a minimax testing perspective. Let P, Q be the population adjacencies of two sparse inhomogeneous random graph models, and d be a suitably defined distance function. Given a population of m graphs from each model, we derive minimax separation rates for the problem of testing P = Q against d(P, Q) > ρ. We observe that if m is small, then the minimax separation is too large for some popular choices of d, including total variation distance between corresponding distributions. This implies that some models that are widely separated in d cannot be distinguished for small m, and hence, the testing problem is generally not solvable in these cases.
We also show that if m > 1, then the minimax separation is relatively small if d is the Frobenius norm or operator norm distance between P and Q. For m = 1, only the latter distance provides small minimax separation. Thus, for these distances, the problem is solvable for small m. We also present near-optimal two-sample tests in both cases, where tests are adaptive with respect to the sparsity level of the graphs.
Journal Article