Multiple genome alignment for identifying the core structure among moderately related microbial genomes





BMC Genomics, 9:515

First Online: 31 October 2008; Received: 05 August 2008; Accepted: 31 October 2008

Abstract

Background

Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes, where horizontal gene transfers are common. Although core genome identification appears straightforward among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure to be a set of sufficiently long segments in which gene orders are conserved, so that they are likely to have been inherited mainly through vertical transfer. We developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders.
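The idea of keeping sufficiently long segments with conserved gene order can be illustrated with a much simplified sketch. It is not the method of the paper: it assumes only two genomes, each given as a linear list of OG identifiers with at most one gene per OG, and it substitutes a textbook longest-common-subsequence dynamic program for the multiple-genome procedure. The function names and the parameters `max_gap` and `min_len` are hypothetical, chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def lcs_positions(a: Sequence[str], b: Sequence[str]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover one optimal set of matches.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def conserved_segments(a: Sequence[str], b: Sequence[str],
                       max_gap: int = 2, min_len: int = 3) -> List[List[str]]:
    """Split the order-preserving matches into segments and keep the long ones.

    max_gap and min_len are illustrative thresholds (assumptions), not values
    taken from the paper.
    """
    segments: List[List[str]] = []
    current: List[Tuple[int, int]] = []
    for i, j in lcs_positions(a, b):
        if current:
            prev_i, prev_j = current[-1]
            # Start a new segment when the matches are too far apart in either genome.
            if i - prev_i > max_gap or j - prev_j > max_gap:
                if len(current) >= min_len:
                    segments.append([a[k] for k, _ in current])
                current = []
        current.append((i, j))
    if len(current) >= min_len:
        segments.append([a[k] for k, _ in current])
    return segments


# Toy genomes as ordered lists of OG identifiers (made-up data).
genome_a = ["og1", "og2", "og3", "x1", "og4", "og5", "og6", "og7"]
genome_b = ["og1", "og2", "og3", "y1", "y2", "y3", "og7", "og4", "og5", "og6"]
print(conserved_segments(genome_a, genome_b))
# → [['og1', 'og2', 'og3'], ['og4', 'og5', 'og6']]
```

In this toy case `og7` is shared by both genomes but sits in a different relative position, so it drops out of the order-preserving set. A real implementation would also need to handle strand, circular chromosomes, paralogs, and more than two genomes.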

Results

The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets, comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes, more than the non-core genes, exhibited the expected characteristics of being indigenous and sharing the same history.
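For contrast, the simpler approach based only on gene conservation can be sketched as a presence count over genomes, and the intersection of two core sets as a plain set intersection. This is a minimal illustration under stated assumptions: each genome is reduced to the set of OGs it contains, and the function name `conservation_core`, the parameter `min_fraction`, and the toy data are hypothetical rather than taken from the paper.

```python
from typing import Dict, Set


def conservation_core(genomes: Dict[str, Set[str]], min_fraction: float = 1.0) -> Set[str]:
    """Return OGs present in at least min_fraction of the genomes.

    Gene order is ignored entirely; only presence or absence counts.
    """
    counts: Dict[str, int] = {}
    for ogs in genomes.values():
        for og in ogs:
            counts[og] = counts.get(og, 0) + 1
    threshold = min_fraction * len(genomes)
    return {og for og, count in counts.items() if count >= threshold}


# Made-up OG content for two small families of genomes.
family_x = {"x1": {"a", "b", "c"}, "x2": {"a", "b"}, "x3": {"a", "b", "d"}}
family_y = {"y1": {"a", "c", "e"}, "y2": {"a", "c"}}

core_x = conservation_core(family_x)  # OGs found in every genome of family_x
core_y = conservation_core(family_y)  # OGs found in every genome of family_y
shared_core = core_x & core_y         # OGs in both conservation-based cores
print(sorted(core_x), sorted(core_y), sorted(shared_core))
# → ['a', 'b'] ['a', 'c'] ['a']
```

A presence-only criterion like this cannot tell a gene inherited in place from one that was acquired or relocated, which is the distinction the order-based definition is meant to capture.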

Conclusion

The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.

Abbreviations

OG: orthologous group
HGT: horizontal gene transfer
DP: dynamic programming
ORF: open reading frame
GC3: G+C content of the third codon positions
NJ: neighbor joining
ML: maximum likelihood
MDS: multidimensional scaling
SH test: Shimodaira-Hasegawa test
MBGD: Microbial Genome Database for Comparative Analysis
MST: minimum spanning tree
DAG: directed acyclic graph
PHX gene: putative highly expressed gene

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-9-515) contains supplementary material, which is available to authorized users.




Author: Ikuo Uchiyama

Source: https://link.springer.com/article/10.1186/1471-2164-9-515






