Selecting age-related functional characteristics in the human gut microbiome





Microbiome, 1:2

First Online: 09 January 2013
Received: 13 April 2012
Accepted: 23 August 2012

Abstract

Background

Human gut microbial functions are often associated with various diseases and host physiologies. Aging, a less explored factor, is also suspected to affect or be affected by microbiome alterations. By combining functional feature selection with supervised classification, we aim to facilitate identification of age-related functional characteristics in metagenomes from several human gut microbiome studies (MetaHIT, MicroAge, MicroObes, and the datasets of Kurokawa et al. and Gill et al.).

Results

We apply two feature selection methods, term frequency-inverse document frequency (TF-iDF) and minimum-redundancy maximum-relevancy (mRMR), to identify functional signatures that differentiate metagenomes by age. After the features are reduced, we use a support vector machine (SVM) to predict the host age of new metagenomes. Functional features are drawn from protein families (Pfams), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, KEGG ontologies and the Gene Ontology (GO) database. Initial investigations demonstrate that ordination of the functional principal components shows great overlap between different age groups. However, when feature selection is applied, mRMR tightens the ordination cluster for each age group, and TF-iDF offers better linear separation. Both TF-iDF and mRMR were used in conjunction with an SVM classifier and achieved areas under receiver operating characteristic curves (AUCs) 10 to 15% above chance when classifying individuals as above or below mid-ages (about 38 to 43 years old) using Pfams. Better performance around mid-ages is also observed when using other functional categories and an age-balanced dataset. Using another feature selection method, linear discriminant analysis effect size (LEfSe), on an age-balanced dataset, we also identified some age-related Pfams that improved age discrimination at age 65. The selected functional characteristics identify a broad range of age-relevant metabolisms, such as reduced vitamin B12 synthesis, reduced activity of reductases, increased DNA damage, occurrences of stress responses and immune system compromise, and upregulated glycosyltransferases in the aging population.
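To make the TF-iDF and AUC ideas above concrete, here is a minimal Python sketch using only the standard library. It is not the authors' pipeline: the abstract does not say how TF-iDF scores were aggregated or thresholded, so summing scores across samples and keeping the top k features is an assumption. The feature names (`PF_A`, `PF_B`, `PF_C`), the toy counts, and the helper names `tfidf_scores`, `top_features`, and `auc` are all hypothetical. The SVM step is omitted; `auc` simply evaluates whatever decision values a classifier would produce.

```python
import math


def tfidf_scores(samples):
    """Sum TF-iDF over samples for each feature.

    samples: list of dicts mapping feature name -> count in one metagenome.
    Aggregating by summation is an assumption made for this sketch.
    """
    n_samples = len(samples)

    # Document frequency: number of samples in which each feature occurs.
    doc_freq = {}
    for sample in samples:
        for feature, count in sample.items():
            if count > 0:
                doc_freq[feature] = doc_freq.get(feature, 0) + 1

    scores = {}
    for sample in samples:
        total = sum(sample.values())
        if total == 0:
            continue
        for feature, count in sample.items():
            if count > 0:
                tf = count / total                        # term frequency
                idf = math.log(n_samples / doc_freq[feature])  # inverse document frequency
                scores[feature] = scores.get(feature, 0.0) + tf * idf
    return scores


def top_features(samples, k):
    """Return the k features with the highest summed TF-iDF (ties broken by name)."""
    scores = tfidf_scores(samples)
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


def auc(labels, decision_values):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the fraction of (positive, negative) pairs in which the positive
    sample gets the higher decision value; ties count as half.
    """
    pos = [s for y, s in zip(labels, decision_values) if y == 1]
    neg = [s for y, s in zip(labels, decision_values) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three metagenomes with counts for three made-up families.
samples = [
    {"PF_A": 10, "PF_B": 5, "PF_C": 0},
    {"PF_A": 8, "PF_B": 0, "PF_C": 2},
    {"PF_A": 9, "PF_B": 0, "PF_C": 0},
]

selected = top_features(samples, 2)

# Hypothetical labels (1 = above the age cutoff) and classifier decision values.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy example `PF_A` occurs in every sample, so its inverse document frequency is zero and it is ranked last, which is the behaviour that lets TF-iDF discard ubiquitous functions. An AUC of 0.5 corresponds to chance, so "10 to 15% above chance" reads naturally as AUCs of roughly 0.60 to 0.65, although the abstract does not spell this out.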

Conclusions

Feature selection can yield biologically meaningful results when used in conjunction with classification, and makes age classification of new human gut metagenomes feasible. While we demonstrate the promise of this approach, the data-dependent prediction performance could be further improved. We hypothesize that while the Qin et al. dataset is the most comprehensive to date, even deeper sampling is needed to better characterize and predict the microbiomes’ functional content.

Keywords

Metagenomics, KEGG, Pfam, SVM, Supervised classification

Abbreviations

AUC: area under ROC curve
BLAST: Basic Local Alignment Search Tool
COGs: Clusters of Orthologous Groups of proteins
GA: gathering threshold
GO: Gene Ontology
IBD: inflammatory bowel disease
KEGG: Kyoto Encyclopedia of Genes and Genomes
LEfSe: linear discriminant analysis effect size
mRMR: minimum-redundancy maximum-relevancy
PCA: principal component analysis
Pfam: protein family
ROC: receiver operating characteristic
SVM: support vector machine
tbPCA: transformation-based principal component analysis
TF-iDF: term frequency-inverse document frequency
UniProt: Universal Protein Resource databases

Electronic supplementary material

The online version of this article (doi:10.1186/2049-2618-1-2) contains supplementary material, which is available to authorized users.




Authors: Yemin Lan, Andres Kriete, Gail L Rosen

Source: https://link.springer.com/article/10.1186/2049-2618-1-2






