Bag of Naïve Bayes: biomarker selection and classification from genome-wide SNP data





BMC Bioinformatics, 13:S2

First Online: 07 September 2012

Abstract

Background

Multifactorial diseases arise from complex patterns of interaction between a set of genetic traits and the environment. To fully capture the genetic biomarkers that jointly explain the heritability component of a disease, all SNPs from a genome-wide association study should therefore be analyzed simultaneously.

Results

In this paper, we present Bag of Naïve Bayes (BoNB), an algorithm for genetic biomarker selection and subject classification from the simultaneous analysis of genome-wide SNP data. BoNB is based on the Naïve Bayes classification framework, enriched by three main features: bootstrap aggregating of an ensemble of Naïve Bayes classifiers; a novel strategy for ranking and selecting the attributes used by each classifier in the ensemble; and a permutation-based procedure for selecting significant biomarkers, based on their marginal utility in the classification process. BoNB is tested on the Wellcome Trust Case-Control study on Type 1 Diabetes, and its performance is compared with that of both a standard Naïve Bayes algorithm and HyperLASSO, a state-of-the-art penalized logistic regression algorithm for simultaneous genome-wide data analysis.
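The bootstrap-aggregating component mentioned above can be illustrated with a minimal, generic sketch. This is not the authors' released implementation: it covers only bagging of Naïve Bayes classifiers with out-of-bag (OOB) evaluation and omits BoNB's attribute ranking and its permutation-based biomarker selection. The genotype encoding (0, 1, 2 minor-allele counts), the Laplace smoothing, the majority vote, the toy data and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter

GENOTYPES = (0, 1, 2)  # assumed encoding: minor-allele count per SNP


def train_nb(X, y, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing."""
    log_prior, log_lik = {}, {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(X))
        denom = len(rows) + alpha * len(GENOTYPES)
        log_lik[c] = []
        for j in range(len(X[0])):
            counts = Counter(row[j] for row in rows)
            log_lik[c].append(
                {g: math.log((counts[g] + alpha) / denom) for g in GENOTYPES}
            )
    return log_prior, log_lik


def predict_nb(model, x):
    """Return the class with the highest posterior log-score for subject x."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][j][g] for j, g in enumerate(x))
        for c in log_prior
    }
    return max(scores, key=scores.get)


def bagged_nb(X, y, n_models=25, seed=0):
    """Train one classifier per bootstrap sample; remember its OOB subjects."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]
        model = train_nb([X[i] for i in idx], [y[i] for i in idx])
        ensemble.append((model, oob))
    return ensemble


def predict_bagged(ensemble, x):
    """Majority vote over all classifiers in the ensemble."""
    votes = Counter(predict_nb(model, x) for model, _ in ensemble)
    return votes.most_common(1)[0][0]


def oob_accuracy(ensemble, X, y):
    """Score each subject using only the classifiers that did not train on it."""
    votes = {}
    for model, oob in ensemble:
        for i in oob:
            votes.setdefault(i, Counter())[predict_nb(model, X[i])] += 1
    if not votes:
        raise ValueError("no subject was out-of-bag for any classifier")
    hits = sum(v.most_common(1)[0][0] == y[i] for i, v in votes.items())
    return hits / len(votes)


# Synthetic toy data (not from the study): SNPs 0 and 1 separate cases (1)
# from controls (0); SNP 2 is uninformative noise.
X = [[2, 2, g] for g in (0, 1, 2)] * 2 + [[0, 0, g] for g in (0, 1, 2)] * 2
y = [1] * 6 + [0] * 6

ensemble = bagged_nb(X, y)
print(predict_bagged(ensemble, [2, 2, 1]), oob_accuracy(ensemble, X, y))
```

Because each classifier is evaluated only on subjects left out of its bootstrap sample, the OOB estimate gives a view of generalization without a separate hold-out set, which is one common reason to pair bagging with OOB scoring.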

Conclusions

The significantly higher classification accuracy obtained by BoNB, together with the significance of the biomarkers identified from the Type 1 Diabetes dataset, demonstrates the effectiveness of BoNB as an algorithm for both classification and biomarker selection from genome-wide SNP data.

Availability

Source code of the BoNB algorithm is released under the GNU General Public Licence and is available at http://www.dei.unipd.it/~sambofra/bonb.html.

List of abbreviations

SNP: Single Nucleotide Polymorphism

GWAS: Genome-Wide Association Study

NBC: Naïve Bayes Classifier

OOB: Out-of-Bag

MCC: Matthews Correlation Coefficient

MU: Marginal Utility




Authors: Francesco Sambo, Emanuele Trifoglio, Barbara Di Camillo, Gianna M Toffolo, Claudio Cobelli

Source: https://link.springer.com/


