Optimal properties of centroid-based classifiers for very high-dimensional data - Mathematics > Statistics Theory





Abstract: We show that scale-adjusted versions of the centroid-based classifier enjoy optimal properties when used to discriminate between two very high-dimensional populations where the principal differences are in location. The scale adjustment removes the tendency of scale differences to confound differences in means. Certain other distance-based methods, for example, those founded on nearest-neighbor distance, do not have optimal performance in the sense that we propose. Our results permit varying degrees of sparsity and signal strength to be treated, and require only mild conditions on dependence of vector components. Additionally, we permit the marginal distributions of vector components to vary extensively. In addition to providing theory we explore numerical properties of a centroid-based classifier, and show that these features reflect theoretical accounts of performance.
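
The abstract does not state the classifier's formula, so the following is only a minimal sketch of what a scale-adjusted centroid rule might look like. It assumes one particular form of adjustment: when the two centroids are estimated from samples of sizes m and n, the raw statistic ||z - ybar||^2 - ||z - xbar||^2 carries a bias of tr(Sigma_Y)/n - tr(Sigma_X)/m, and the sketch subtracts a sample estimate of that bias. All function names (`centroid`, `total_variance`, `scale_adjusted_score`, `classify`) are hypothetical and not taken from the paper.

```python
def centroid(sample):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(sample)
    return [sum(col) / n for col in zip(*sample)]


def total_variance(sample):
    """Sum over components of the unbiased sample variance,
    i.e. the trace of the sample covariance matrix."""
    n = len(sample)
    mean = centroid(sample)
    return sum(
        sum((row[j] - mean[j]) ** 2 for row in sample) / (n - 1)
        for j in range(len(mean))
    )


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def scale_adjusted_score(z, xs, ys):
    """Positive score favours the population of xs, negative favours ys.

    ASSUMPTION: the adjustment is a bias correction of the form
    tr(S_Y)/n - tr(S_X)/m; the abstract does not specify it.
    """
    raw = sq_dist(z, centroid(ys)) - sq_dist(z, centroid(xs))
    bias = total_variance(ys) / len(ys) - total_variance(xs) / len(xs)
    return raw - bias


def classify(z, xs, ys):
    """Assign z to 'X' or 'Y' using the scale-adjusted centroid score."""
    return "X" if scale_adjusted_score(z, xs, ys) > 0 else "Y"
```

Without the correction term, the population with the larger total variance or smaller sample inflates one of the two distances, so a difference in scale can mimic or mask a difference in means; subtracting the estimated bias is one simple way to separate the two effects.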



Authors: Peter Hall, Tung Pham

Source: https://arxiv.org/






