On the stability of the Bayenv method in assessing human SNP-environment associations





Human Genomics, 8:1

First Online: 09 January 2014; Received: 21 November 2013; Accepted: 17 December 2013

Abstract

Background: Phenotypic variation along environmental gradients has been documented among and within many species, and in some cases, genetic variation has been shown to be associated with these gradients. Bayenv is a relatively new method developed to detect patterns of polymorphisms associated with environmental gradients. Using a Bayesian Markov chain Monte Carlo (MCMC) approach, Bayenv evaluates whether a linear model relating population allele frequencies to environmental variables is more probable than a null model based on observed frequencies of neutral markers. Although this method has been used to detect environmental adaptation in a number of species, including humans, plants, fish, and mosquitoes, the stability of results between independent runs of this MCMC algorithm has not been characterized. In this paper, we explore the variability of results between runs and the factors contributing to it.

Results: Independent runs of the Bayenv program were carried out using genome-wide single-nucleotide polymorphism (SNP) data from samples of 60 worldwide human populations, following previous applications of the Bayenv method. To assess factors contributing to the method's stability, we used varying numbers of MCMC iterations and also analyzed a second, modified data set that excluded two Siberian populations with extreme climate variables. Between any two runs, correlations between Bayes factors and the overlap of SNPs in the empirical p value tails were surprisingly low. Enrichments of genic versus non-genic SNPs in the empirical tails were more robust than the empirical p values; however, the significance of the enrichments for some environmental variables still varied among runs, contradicting previously published conclusions. Increasing the number of MCMC iterations slightly reduced run-to-run variability, and excluding the Siberian populations did not have a large effect on the stability of the runs.
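The two run-to-run comparisons described above (correlation between Bayes factors and overlap of SNPs in the tails) can be illustrated with a minimal sketch. It assumes each run yields one Bayes factor per SNP, in the same SNP order. The choice of a Spearman rank correlation, the definition of the tail as the top fraction of SNPs ranked by Bayes factor, and the function names `spearman` and `tail_overlap` are illustrative assumptions, not the authors' procedure or part of the Bayenv program.

```python
def ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(bfs_a, bfs_b):
    """Rank correlation between the Bayes factors of two runs."""
    return pearson(ranks(bfs_a), ranks(bfs_b))


def tail_indices(bfs, fraction):
    """Indices of the SNPs in the top `fraction` when ranked by Bayes factor."""
    k = max(1, int(len(bfs) * fraction))
    order = sorted(range(len(bfs)), key=lambda i: bfs[i], reverse=True)
    return set(order[:k])


def tail_overlap(bfs_a, bfs_b, fraction=0.01):
    """Share of run A's tail SNPs that also fall in run B's tail."""
    tail_a = tail_indices(bfs_a, fraction)
    tail_b = tail_indices(bfs_b, fraction)
    return len(tail_a & tail_b) / len(tail_a)
```

Under these assumptions, two perfectly reproducible runs would give a correlation of 1 and a tail overlap of 1; lower values of either indicate run-to-run instability.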

Conclusions: Because of high run-to-run variability, we advise against drawing conclusions about genome-wide patterns of adaptation from a single run of the Bayenv algorithm and recommend caution in interpreting previous studies that used only one run. Moving forward, we suggest carrying out multiple independent runs of Bayenv and averaging Bayes factors across runs to produce more stable and reliable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.
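The recommendation to average Bayes factors across runs could be sketched as follows. This is a minimal illustration that assumes each run has already been parsed into a mapping from SNP identifier to Bayes factor. The function name `average_bayes_factors`, the SNP identifiers, and the use of a plain arithmetic mean are assumptions made here; the abstract does not specify the averaging scale or any file format.

```python
def average_bayes_factors(runs):
    """Average per-SNP Bayes factors over several independent runs.

    `runs` is a non-empty list of dicts mapping SNP id -> Bayes factor.
    Only SNPs present in every run are averaged (an assumption of this sketch).
    """
    if not runs:
        raise ValueError("at least one run is required")
    # Keep only the SNP ids shared by all runs.
    shared = set(runs[0]).intersection(*runs[1:])
    return {
        snp: sum(run[snp] for run in runs) / len(runs)
        for snp in sorted(shared)
    }


# Hypothetical Bayes factors from three independent runs.
run_1 = {"snp_a": 2.0, "snp_b": 40.0}
run_2 = {"snp_a": 4.0, "snp_b": 10.0}
run_3 = {"snp_a": 3.0, "snp_b": 10.0}
averaged = average_bayes_factors([run_1, run_2, run_3])
```

Averaging on the raw scale lets a single very large Bayes factor dominate the mean; whether a log-scale average would be preferable is a design choice this sketch leaves open.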

Keywords: Environmental adaptation; Positive selection; Genome-wide scans; Human adaptation; Markov chain Monte Carlo; Natural selection

Abbreviations: BF, Bayes factor; MCMC, Markov chain Monte Carlo; SNP, single-nucleotide polymorphism.




Authors: Lily M Blair, Julie M Granka, Marcus W Feldman

Source: https://link.springer.com/


