Information Distance in Multiples - Computer Science > Computer Vision and Pattern Recognition





Abstract: Information distance is a parameter-free similarity measure based on compression, used in pattern recognition, data mining, phylogeny, clustering, and classification. The notion of information distance is extended from pairs to multiples (finite lists). We study maximal overlap, metricity, universality, minimal overlap, additivity, and normalized information distance in multiples. We use the theoretical notion of Kolmogorov complexity, which for practical purposes is approximated by the length of the compressed version of the file involved, using a real-world compression program.

Index Terms: Information distance, multiples, pattern recognition, data mining, similarity, Kolmogorov complexity
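To make the compression-based approximation concrete, here is a minimal Python sketch. It is not taken from the paper: the choice of zlib as the "real-world compression program", the helper names `clen`, `ncd`, and `list_distance`, and the sample data are all assumptions made for illustration. `ncd` uses the commonly cited pairwise normalized compression distance, and `list_distance` is one plausible unnormalized reading for a finite list (compressed length of the concatenation minus the smallest individual compressed length); the paper's exact definitions may differ.

```python
import hashlib
import zlib


def clen(data: bytes) -> int:
    """Compressed length in bytes, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Pairwise normalized compression distance (standard formula, zlib assumed)."""
    cx, cy = clen(x), clen(y)
    return (clen(x + y) - min(cx, cy)) / max(cx, cy)


def list_distance(items: list[bytes]) -> int:
    """Assumed list analogue: C(concatenation) minus the smallest single C(item)."""
    return clen(b"".join(items)) - min(clen(item) for item in items)


# Hypothetical sample data: a highly repetitive text and a pseudo-random blob.
a = b"the quick brown fox jumps over the lazy dog " * 20
c = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(30))

if __name__ == "__main__":
    print("ncd(a, a) =", ncd(a, a))   # similar inputs: small value expected
    print("ncd(a, c) =", ncd(a, c))   # unrelated inputs: larger value expected
    print("list_distance([a, a, c]) =", list_distance([a, a, c]))
```

The results depend on the compressor: zlib has a limited window, so very long inputs may need a different compressor for the approximation to remain meaningful.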



Author: Paul M.B. Vitanyi

Source: https://arxiv.org/






