Genome classification by gene distribution: an overlapping subspace clustering approach.

Abstract:

BACKGROUND: The genomes of lower organisms show a large amount of horizontal gene transfer, which complicates the study of their evolution. Bacteriophage genomes are a typical example. One recent approach to this problem is the unsupervised clustering of genomes based on gene order and genome position, which helps to reveal species relationships that may not be apparent from traditional phylogenetic methods.

RESULTS: We propose the use of an overlapping subspace clustering algorithm for such genome classification problems. The advantage of subspace clustering over traditional clustering is that it can associate clusters with gene arrangement patterns, preserving genomic information in the clusters produced. Additionally, overlapping capability is desirable for the discovery of multiple conserved patterns within a single genome, such as those acquired from different species via horizontal gene transfer. The proposed method involves a novel strategy to vectorize genomes based on their gene distribution. A number of existing subspace clustering and biclustering algorithms were evaluated to identify the best framework upon which to develop our algorithm; we extended a generic subspace clustering algorithm called HARP to incorporate overlapping capability. The proposed algorithm was assessed on and applied to bacteriophage genomes. The phage grouping results are consistent overall with the Phage Proteomic Tree and showed common genomic characteristics among the TP901-like, Sfi21-like and sk1-like phage groups. Among 441 phage genomes, we identified four significantly conserved distribution patterns structured by the terminase, portal, integrase, holin and lysin genes. We also observed a subgroup of Sfi21-like phages with a distinctive divergent genome organization and identified nine new members of the Sfi21-like genus: Staphylococcus 71, phiPVL108, Listeria A118, 2389, Lactobacillus phi AT3, A2, Clostridium phi3626, Geobacillus GBSV1, and Listeria monocytogenes PSA.

CONCLUSION: The method described in this paper can assist evolutionary study by objectively classifying genomes based on their resemblance in gene order, gene content and gene positions. The method is suitable for genomes with high genetic exchange and varied conserved gene arrangements, as demonstrated through our application to phages.

journal_name

BMC Evol Biol

journal_title

BMC evolutionary biology

authors

Li J,Halgamuge SK,Tang SL

doi

10.1186/1471-2148-8-116

pub_date

2008-04-23 00:00:00

pages

116

issn

1471-2148

pii

1471-2148-8-116

journal_volume

8

pub_type

Journal article
  • Anamorphic development and extended parental care in a 520 million-year-old stem-group euarthropod from China.

    abstract:BACKGROUND:Extended parental care is a complex reproductive strategy in which progenitors actively look after their offspring up to - or beyond - the first juvenile stage in order to maximize their fitness. Although the euarthropod fossil record has produced several examples of brood-care, the appearance of extended pa...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-018-1262-6

    authors: Fu D,Ortega-Hernández J,Daley AC,Zhang X,Shu D

    updated:2018-09-29 00:00:00

  • Whole chloroplast genome and gene locus phylogenies reveal the taxonomic placement and relationship of Tripidium (Panicoideae: Andropogoneae) to sugarcane.

    abstract:BACKGROUND:For over 50 years, attempts have been made to introgress agronomically useful traits from Erianthus sect. Ripidium (Tripidium) species into sugarcane based on both genera being part of the 'Saccharum Complex', an interbreeding group of species believed to be involved in the origins of sugarcane. However, rec...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-019-1356-9

    authors: Lloyd Evans D,Joshi SV,Wang J

    updated:2019-01-25 00:00:00

  • Timeframe of speciation inferred from secondary contact zones in the European tree frog radiation (Hyla arborea group).

    abstract:BACKGROUND:Hybridization between incipient species is expected to become progressively limited as their genetic divergence increases and reproductive isolation proceeds. Amphibian radiations and their secondary contact zones are useful models to infer the timeframes of speciation, but empirical data from natural system...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-015-0385-2

    authors: Dufresnes C,Brelsford A,Crnobrnja-Isailović J,Tzankov N,Lymberakis P,Perrin N

    updated:2015-08-08 00:00:00

  • Mitochondrial matR sequences help to resolve deep phylogenetic relationships in rosids.

    abstract:BACKGROUND:Rosids are a major clade in the angiosperms containing 13 orders and about one-third of angiosperm species. Recent molecular analyses recognized two major groups (i.e., fabids with seven orders and malvids with three orders). However, phylogenetic relationships within the two groups and among fabids, malvids...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-7-217

    authors: Zhu XY,Chase MW,Qiu YL,Kong HZ,Dilcher DL,Li JH,Chen ZD

    updated:2007-11-10 00:00:00

  • A mitogenomic phylogeny of chitons (Mollusca: Polyplacophora).

    abstract:BACKGROUND:Polyplacophora, or chitons, have long fascinated malacologists for their distinct and rather conserved morphology and lifestyle compared to other mollusk classes. However, key aspects of their phylogeny and evolution remain unclear due to the few morphological, molecular, or combined phylogenetic analyses, p...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-019-1573-2

    authors: Irisarri I,Uribe JE,Eernisse DJ,Zardoya R

    updated:2020-02-05 00:00:00

  • Evolutionary analysis of the kinesin light chain genes in the yellow fever mosquito Aedes aegypti: gene duplication as a source for novel early zygotic genes.

    abstract:BACKGROUND:The maternal zygotic transition marks the time at which transcription from the zygotic genome is initiated and a subset of maternal RNAs are progressively degraded in the developing embryo. A number of early zygotic genes have been identified in Drosophila melanogaster and comparisons to sequenced mosquito g...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-10-206

    authors: Biedler JK,Tu Z

    updated:2010-07-08 00:00:00

  • Disjunct distribution and distinct intraspecific diversification of Eothenomys melanogaster in South China.

    abstract:BACKGROUND:South China encompasses complex and diverse landforms, giving rise to high biological diversity and endemism from the Hengduan Mountains to Taiwan Island. Many species are widely distributed across South China with similar disjunct distribution patterns. To explore the causes of these disjunct distribution p...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-018-1168-3

    authors: Lv X,Cheng J,Meng Y,Chang Y,Xia L,Wen Z,Ge D,Liu S,Yang Q

    updated:2018-04-10 00:00:00

  • Genomic sequence analyses of classical and non-classical lamprey progesterone receptor genes and the inference of homologous gene evolution in metazoans.

    abstract:BACKGROUND:Nuclear progesterone receptor (nPR) is an evolutionary innovation in vertebrates that mediates genomic responses to progesterone. Vertebrates also respond to progesterone via membrane progesterone receptors (mPRs) or membrane associated progesterone receptors (MAPRs) through rapid nongenomic mechanisms. Lamp...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-019-1463-7

    authors: Ren J,Chung-Davidson YW,Jia L,Li W

    updated:2019-07-01 00:00:00

  • Molecular evolution of the vertebrate TLR1 gene family--a complex history of gene duplication, gene conversion, positive selection and co-evolution.

    abstract:BACKGROUND:The Toll-like receptors represent a large superfamily of type I transmembrane glycoproteins, some common to a wide range of species and others are more restricted in their distribution. Most members of the Toll-like receptor superfamily have few paralogues; the exception is the TLR1 gene family with four clo...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-11-149

    authors: Huang Y,Temperley ND,Ren L,Smith J,Li N,Burt DW

    updated:2011-05-28 00:00:00

  • Reassortment patterns of avian influenza virus internal segments among different subtypes.

    abstract:BACKGROUND:The segmented RNA genome of avian Influenza viruses (AIV) allows genetic reassortment between co-infecting viruses, providing an evolutionary pathway to generate genetic innovation. The genetic diversity (16 haemagglutinin and 9 neuraminidase subtypes) of AIV indicates an extensive reservoir of influenza vir...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-14-16

    authors: Lu L,Lycett SJ,Leigh Brown AJ

    updated:2014-01-24 00:00:00

  • Population structure of Venturia inaequalis, a causal agent of apple scab, in response to heterogeneous apple tree cultivation.

    abstract:BACKGROUND:Tracking newly emergent virulent populations in agroecosystems provides an opportunity to increase our understanding of the co-evolution dynamics of pathogens and their hosts. On the one hand host plants exert selective pressure on pathogen populations, thus dividing them into subpopulations of different vir...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-018-1122-4

    authors: Michalecka M,Masny S,Leroy T,Puławska J

    updated:2018-01-19 00:00:00

  • The complete mitochondrial genome of Scutopus ventrolineatus (Mollusca: Chaetodermomorpha) supports the Aculifera hypothesis.

    abstract:BACKGROUND:With more than 100000 living species, mollusks are the second most diverse metazoan phylum. The current taxonomic classification of mollusks recognizes eight classes (Neomeniomorpha, Chaetodermomorpha, Polyplacophora, Monoplacophora, Cephalopoda, Gastropoda, Bivalvia, and Scaphopoda) that exhibit very distin...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-014-0197-9

    authors: Osca D,Irisarri I,Todt C,Grande C,Zardoya R

    updated:2014-09-25 00:00:00

  • Phylogenetic relationships of typical antbirds (Thamnophilidae) and test of incongruence based on Bayes factors.

    abstract:BACKGROUND:The typical antbirds (Thamnophilidae) form a monophyletic and diverse family of suboscine passerines that inhabit neotropical forests. However, the phylogenetic relationships within this assemblage are poorly understood. Herein, we present a hypothesis of the generic relationships of this group based on Baye...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-4-23

    authors: Irestedt M,Fjeldså J,Nylander JA,Ericson PG

    updated:2004-07-30 00:00:00

  • Multi-model seascape genomics identifies distinct environmental drivers of selection among sympatric marine species.

    abstract:BACKGROUND:As global change and anthropogenic pressures continue to increase, conservation and management increasingly needs to consider species' potential to adapt to novel environmental conditions. Therefore, it is imperative to characterise the main selective forces acting on ecosystems, and how these may influence ...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-020-01679-4

    authors: Nielsen ES,Henriques R,Beger M,Toonen RJ,von der Heyden S

    updated:2020-09-16 00:00:00

  • Natural selection drove metabolic specialization of the chromatophore in Paulinella chromatophora.

    abstract:BACKGROUND:Genome degradation of host-restricted mutualistic endosymbionts has been attributed to inactivating mutations and genetic drift while genes coding for host-relevant functions are conserved by purifying selection. Unlike their free-living relatives, the metabolism of mutualistic endosymbionts and endosymbiont...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-017-0947-6

    authors: Valadez-Cano C,Olivares-Hernández R,Resendis-Antonio O,DeLuna A,Delaye L

    updated:2017-04-14 00:00:00

  • Molecular phylogeny of the subfamily Stevardiinae Gill, 1858 (Characiformes: Characidae): classification and the evolution of reproductive traits.

    abstract:BACKGROUND:The subfamily Stevardiinae is a diverse and widely distributed clade of freshwater fishes from South and Central America, commonly known as "tetras" (Characidae). The group was named "clade A" when first proposed as a monophyletic unit of Characidae and later designated as a subfamily. Stevardiinae includes ...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-015-0403-4

    authors: Thomaz AT,Arcila D,Ortí G,Malabarba LR

    updated:2015-07-21 00:00:00

  • Montmorillonite protection of an UV-irradiated hairpin ribozyme: evolution of the RNA world in a mineral environment.

    abstract:BACKGROUND:The hypothesis of an RNA-based origin of life, known as the "RNA world", is strongly affected by the hostile environmental conditions probably present in the early Earth. In particular, strong UV and X-ray radiations could have been a major obstacle to the formation and evolution of the first biomolecules. I...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-7-S2-S2

    authors: Biondi E,Branciamore S,Maurel MC,Gallori E

    updated:2007-08-16 00:00:00

  • A web-database of mammalian morphology and a reanalysis of placental phylogeny.

    abstract:BACKGROUND:Recent publications concerning the interordinal phylogeny of placental mammals have converged on a common signal, consisting of four major radiations with some ambiguity regarding the placental root. The DNA data with which these relationships have been reconstructed are easily accessible from public databas...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-7-108

    authors: Asher RJ

    updated:2007-07-03 00:00:00

  • Theoretical analysis of the evolution of immune memory.

    abstract:BACKGROUND:The ability of an immune system to remember pathogens improves the chance of the host to survive a second exposure to the same pathogen. This immunological memory has evolved in response to the pathogen environment of the hosts. In vertebrates, the memory of previous infection is physiologically accomplished...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-10-380

    authors: Graw F,Magnus C,Regoes RR

    updated:2010-12-08 00:00:00

  • Utility of characters evolving at diverse rates of evolution to resolve quartet trees with unequal branch lengths: analytical predictions of long-branch effects.

    abstract:BACKGROUND:The detection and avoidance of "long-branch effects" in phylogenetic inference represents a longstanding challenge for molecular phylogenetic investigations. A consequence of parallelism and convergence, long-branch effects arise in phylogenetic inference when there is unequal molecular divergence among line...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-015-0364-7

    authors: Su Z,Townsend JP

    updated:2015-05-14 00:00:00

  • Population size may shape the accumulation of functional mutations following domestication.

    abstract:BACKGROUND:Population genetics theory predicts an important role of differences in the effective population size (Ne) among species on shaping the accumulation of functional mutations by regulating the selection efficiency. However, this correlation has never been tested in domesticated animals. RESULTS:Here, we syn...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-018-1120-6

    authors: Chen J,Ni P,Li X,Han J,Jakovlić I,Zhang C,Zhao S

    updated:2018-01-19 00:00:00

  • Ecological, genetic and evolutionary drivers of regional genetic differentiation in Arabidopsis thaliana.

    abstract:BACKGROUND:Disentangling the drivers of genetic differentiation is one of the cornerstones in evolution. This is because genetic diversity, and the way in which it is partitioned within and among populations across space, is an important asset for the ability of populations to adapt and persist in changing environments...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-020-01635-2

    authors: Castilla AR,Méndez-Vigo B,Marcer A,Martínez-Minaya J,Conesa D,Picó FX,Alonso-Blanco C

    updated:2020-06-22 00:00:00

  • Resolving ambiguity in the phylogenetic relationship of genotypes A, B, and C of hepatitis B virus.

    abstract:BACKGROUND:Hepatitis B virus (HBV) is an important infectious agent that causes widespread concern because billions of people are infected by at least 8 different HBV genotypes worldwide. However, reconstruction of the phylogenetic relationship between HBV genotypes is difficult. Specifically, the phylogenetic relation...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-13-120

    authors: Jiang Y,Wang M,Zheng H,Wang WR,Jin L,He Y

    updated:2013-06-11 00:00:00

  • Translational machinery of the chaetognath Spadella cephaloptera: a transcriptomic approach to the analysis of cytosolic ribosomal protein genes and their expression.

    abstract:BACKGROUND:Chaetognaths, or arrow worms, are small marine, bilaterally symmetrical metazoans. The objective of this study was to analyse ribosomal protein (RP) coding sequences from a published collection of expressed sequence tags (ESTs) from a chaetognath (Spadella cephaloptera) and to use them in phylogenetic studie...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-7-146

    authors: Barthélémy RM,Chenuil A,Blanquart S,Casanova JP,Faure E

    updated:2007-08-28 00:00:00

  • Phylogeography of the Italian vairone (Telestes muticellus, Bonaparte 1837) inferred by microsatellite markers: evolutionary history of a freshwater fish species with a restricted and fragmented distribution.

    abstract:BACKGROUND:Owing to its independence from the main Central European drainage systems, the Italian freshwater fauna is characterized by a high degree of endemicity. Three main ichthyogeographic districts have been proposed in Italy. Yet, the validity of these regions has not been confirmed by phylogenetic and population...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-10-111

    authors: Marchetto F,Zaccara S,Muenzel FM,Salzburger W

    updated:2010-04-27 00:00:00

  • Duplications and functional divergence of ADP-glucose pyrophosphorylase genes in plants.

    abstract:BACKGROUND:ADP-glucose pyrophosphorylase (AGPase), which catalyses a rate limiting step in starch synthesis, is a heterotetramer comprised of two identical large and two identical small subunits in plants. Although the large and small subunits are equally sensitive to activity-altering amino acid changes when expressed...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-8-232

    authors: Georgelis N,Braun EL,Hannah LC

    updated:2008-08-12 00:00:00

  • The mitochondrial phylogeny of an ancient lineage of ray-finned fishes (Polypteridae) with implications for the evolution of body elongation, pelvic fin loss, and craniofacial morphology in Osteichthyes.

    abstract:BACKGROUND:The family Polypteridae, commonly known as "bichirs", is a lineage that diverged early in the evolutionary history of Actinopterygii (ray-finned fish), but has been the subject of far less evolutionary study than other members of that clade. Uncovering patterns of morphological change within Polypteridae pro...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-10-21

    authors: Suzuki D,Brandley MC,Tokita M

    updated:2010-01-25 00:00:00

  • Unravelling the role of host plant expansion in the diversification of a Neotropical butterfly genus.

    abstract:BACKGROUND:Understanding the processes underlying diversification is a central question in evolutionary biology. For butterflies, access to new host plants provides opportunities for adaptive speciation. On the one hand, locally abundant host species can generate ecologically significant selection pressure. But a diver...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/s12862-016-0701-5

    authors: McClure M,Elias M

    updated:2016-06-16 00:00:00

  • Biophysical and structural considerations for protein sequence evolution.

    abstract:BACKGROUND:Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolutions do not take this into account. Recent attempts to integrate the influence of structure and bi...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-11-361

    authors: Grahnen JA,Nandakumar P,Kubelka J,Liberles DA

    updated:2011-12-16 00:00:00

  • SmTRC1, a novel Schistosoma mansoni DNA transposon, discloses new families of animal and fungi transposons belonging to the CACTA superfamily.

    abstract:BACKGROUND:The CACTA (also called En/Spm) superfamily of DNA-only transposons contain the core sequence CACTA in their Terminal Inverted Repeats (TIRs) and so far have only been described in plants. Large transcriptome and genome sequence data have recently become publicly available for Schistosoma mansoni, a digenetic...

    journal_title:BMC evolutionary biology

    pub_type: Journal article

    doi:10.1186/1471-2148-6-89

    authors: DeMarco R,Venancio TM,Verjovski-Almeida S

    updated:2006-11-07 00:00:00