Some non-asymptotic results on resampling in high dimension, I: Confidence regions, II: Multiple tests

1 LIENS - Laboratoire d'informatique de l'école normale supérieure
2 WILLOW - Models of visual object recognition and scene understanding; DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
3 WIAS - Weierstrass Institute for Applied Analysis and Stochastics
4 FHG FIRST.IDA - Fraunhofer FIRST, IDA group
5 LPMA - Laboratoire de Probabilités et Modèles Aléatoires

Abstract: We study generalized bootstrap confidence regions for the mean of a random vector whose coordinates have an unknown dependency structure. The random vector is assumed either to be Gaussian or to have a symmetric and bounded distribution. The dimensionality of the vector can be much larger than the number of observations, and we focus on a non-asymptotic control of the confidence level, following ideas inspired by recent results in learning theory. We consider two approaches: the first is based on a concentration principle valid for a large class of resampling weights, and the second on a direct resampled quantile, specifically using Rademacher weights. Several intermediate results established in the concentration-based approach are of independent interest. We also discuss the question of accuracy when using Monte-Carlo approximations of the resampled quantities. We present an application of these results to the one-sided and two-sided multiple testing problem, in which we derive several resampling-based step-down procedures providing a non-asymptotic FWER control. We compare our different procedures in a simulation study, and we show that they can outperform Bonferroni's or Holm's procedures as soon as the observed vector has sufficiently correlated coordinates.
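To make the resampled-quantile idea concrete, the following minimal Python sketch estimates, by Monte-Carlo, a quantile of the sup-norm of the Rademacher-reweighted mean of the centered observations. The function name `rademacher_sup_quantile`, its parameters, and the use of NumPy are assumptions made for illustration and are not taken from the paper. The sketch also omits any correction terms that a non-asymptotic guarantee may require, so it should be read as a plain illustration of the resampling step rather than as the authors' procedure.

```python
import numpy as np


def rademacher_sup_quantile(Y, alpha=0.05, n_draws=1000, seed=0):
    """Monte-Carlo estimate of the (1 - alpha) quantile of the sup-norm
    of the Rademacher-reweighted mean of the centered sample.

    Y is an (n, K) array: n observations of a K-dimensional vector.
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, _ = Y.shape
    # Center the observations around the empirical mean.
    centered = Y - Y.mean(axis=0)
    # Each row holds one draw of n independent +/-1 weights.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # Reweighted means, one K-dimensional vector per draw.
    resampled_means = signs @ centered / n
    # Sup-norm over the K coordinates for each draw.
    sup_norms = np.abs(resampled_means).max(axis=1)
    return float(np.quantile(sup_norms, 1.0 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, K = 30, 500
    # Strongly correlated coordinates: one shared factor plus small noise.
    shared = rng.normal(size=(n, 1))
    Y = shared + 0.1 * rng.normal(size=(n, K))
    print(rademacher_sup_quantile(Y, alpha=0.05))
```

Because the sign flips are applied to whole observation vectors, the dependence between coordinates is preserved in every draw, which is the intuition behind the gain over coordinate-wise corrections when coordinates are strongly correlated. The number of draws `n_draws` governs the Monte-Carlo approximation error mentioned in the abstract.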

Keywords: resampling, resampled quantile, confidence regions, family-wise error, multiple testing, high dimensional data, non-asymptotic error control, cross-validation, concentration inequalities

Authors: Sylvain Arlot, Gilles Blanchard, Etienne Roquain


