On the Study of Constructing Genome Trees of Prokaryotes Based on Overlapping Genes
Chin Lung Lu
|Keywords:||bioinformatics; algorithm; genome tree; ortholog; overlapping gene|
As more and more complete genomes become available, inferring phylogenetic trees by comparing whole genomes can help reconstruct the evolutionary relationships among species. In addition to sequence-based phylogenomic approaches, methods based on whole-genome features, such as gene content and gene order, can be used to construct more precise and robust phylogenetic trees. However, it has been reported in the literature that genome trees constructed from gene content or gene order alone may not be suitable for microbial genomes. To address this problem, Luo et al. [6, 7] recently proposed an alternative way to reconstruct genome trees of bacteria, using a measure based on the presence and absence of overlapping genes. Overlapping genes (OGs) are defined as adjacent genes whose coding sequences overlap partially or entirely. OGs are ubiquitous in microbial genomes and more conserved between species than non-overlapping genes, which implies that OGs can serve as better phylogenetic characters than non-overlapping genes for reconstructing the evolutionary relationships among microbial genomes. During evolution, however, genomes are subject to rearrangements that alter the order and orientation of their genes. As a result, the order of orthologous genes, and hence the order of orthologous OG pairs, may not be conserved even between two closely related species. This suggests that both OG content and orthologous OG order should be considered when reconstructing the genome trees of prokaryotic species. In this thesis, we therefore define a new distance measure between two genomes, called the overlapping-gene (OG) distance, based on a combination of OG content and OG order in their whole genomes. We then use UPGMA, as well as NJ and FM (Fitch-Margoliash), to build the genome tree of prokaryotic genomes from their pairwise OG distances.
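To make the definition of overlapping genes concrete, the following minimal sketch flags adjacent gene pairs whose coordinates overlap partially or entirely. It is an illustration only, not the procedure used in the thesis: the function name `overlapping_pairs`, the `(name, start, end)` tuple layout, and the example coordinates are all assumptions, and strand and genome circularity are ignored.

```python
def overlapping_pairs(genes):
    """Return (gene1, gene2, kind) for adjacent genes whose coordinates overlap.

    `genes` is a list of (name, start, end) tuples with start <= end
    (hypothetical layout; inclusive coordinates on a linear genome).
    """
    # Order genes along the genome so that neighbours in the list are adjacent.
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs = []
    for (n1, s1, e1), (n2, s2, e2) in zip(ordered, ordered[1:]):
        if s2 <= e1:  # the second gene starts before the first one ends
            # "entire": the second gene lies completely inside the first.
            kind = "entire" if e2 <= e1 else "partial"
            pairs.append((n1, n2, kind))
    return pairs


# Made-up coordinates for illustration.
example = [("geneA", 1, 100), ("geneB", 90, 200),
           ("geneC", 250, 400), ("geneD", 300, 350)]
print(overlapping_pairs(example))
```

Only neighbouring genes in coordinate order are compared, which matches the "adjacent genes" wording of the definition but would miss a gene that overlaps more than one downstream neighbour.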
Based on the method described above, we have implemented a web-based tool, called OGtree, that constructs genome trees of prokaryotes from the OG distances between their complete genomes. We have also tested OGtree on several complete Proteobacteria genomes to assess the quality of its genome tree reconstruction. Compared with the phylogenetic trees produced by Luo et al. [6, 7], the genome trees constructed by OGtree are quite consistent with the reference trees reconstructed from 16S rRNAs and from a concatenation of multiple proteins. These results suggest that OGtree can serve as a useful tool for constructing more precise and robust genome trees for prokaryotic genomes.
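The tree-building step named in the abstract can be illustrated with a small UPGMA sketch that turns a pairwise distance matrix into a Newick topology. This is a generic textbook version, not OGtree's implementation: the function name `upgma`, the genome labels, and the distance values are hypothetical, and branch lengths are omitted for brevity.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return the topology as a Newick string.

    `dist` is a symmetric matrix (list of lists) of pairwise distances.
    """
    # Each active cluster id maps to (newick_label, number_of_leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of active clusters
        (label_a, size_a), (label_b, size_b) = clusters.pop(a), clusters.pop(b)
        # Distance to the merged cluster is the leaf-count-weighted average.
        merged = {}
        for c in clusters:
            dac = d[tuple(sorted((a, c)))]
            dbc = d[tuple(sorted((b, c)))]
            merged[c] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        # Drop every entry that involves one of the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for c, v in merged.items():
            d[(c, next_id)] = v  # c is always smaller than next_id
        clusters[next_id] = (f"({label_a},{label_b})", size_a + size_b)
        next_id += 1
    (label, _), = clusters.values()
    return label + ";"


# Made-up pairwise distances for four hypothetical genomes.
genomes = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
print(upgma(genomes, matrix))
```

UPGMA assumes a roughly constant evolutionary rate across lineages; NJ and Fitch-Margoliash relax that assumption, which is one reason to compare trees built by all three methods.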
|Appears in Collections:||Thesis|