A Bayesian reassessment of nearest-neighbour classification - Statistics > Computation

Abstract: The k-nearest-neighbour procedure is a well-known deterministic method used in supervised classification. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous analysis of this method undertaken by Holmes and Adams (2002, 2003), and evaluated by Manocha and Girolami (2007), where the underlying probabilistic model is not completely well-defined. Once a clear probabilistic basis for the k-nearest-neighbour procedure is established, we derive computational tools for conducting Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations of an intractable normalising constant, and propose a perfect sampling strategy to implement a correct MCMC sampler associated with our model. If perfect sampling is not available, we suggest using a Gibbs sampling approximation. Illustrations of the performance of the corresponding Bayesian classifier are provided for several benchmark datasets, demonstrating in particular the limitations of the pseudo-likelihood approximation in this set-up.
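
For orientation, the deterministic k-nearest-neighbour rule that the abstract takes as its starting point can be sketched as a majority vote among the k closest training points. This is only a minimal illustration of the classical procedure, not the probabilistic model or the samplers proposed in the paper; the function name `knn_predict`, the Euclidean distance, the tie-breaking behaviour and the toy data are all assumptions made here for the example.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    # Count the class labels of the k closest points.
    votes = Counter(train_y[i] for i in order[:k])
    # Return the most frequent label (ties go to the label seen first).
    return votes.most_common(1)[0][0]


# Toy two-class data set (made up for this sketch).
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, (0.15, 0.15), k=3))
```

The rule returns a single label with no measure of uncertainty and treats k as fixed, which is the gap a probabilistic formulation with inference on the model parameters is meant to address.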

Authors: Lionel Cucala, Jean-Michel Marin, Christian Robert, Mike Titterington

Source: https://arxiv.org/

