Orthonome - a new pipeline for predicting high quality orthologue gene sets applicable to complete and draft genomes.

Abstract:

BACKGROUND: Distinguishing orthologous and paralogous relationships between genes across multiple species is essential for comparative genomic analyses. Various computational approaches have been developed to resolve these evolutionary relationships, but strong trade-offs between precision and recall of orthologue prediction remain an ongoing challenge.
RESULTS: Here we present Orthonome, an orthologue prediction pipeline designed to reduce the trade-off between orthologue capture rates (recall) and accuracy of multi-species orthologue prediction. The pipeline compares sequence domains and then forms sequence-similar clusters before using phylogenetic comparisons to identify inparalogues. It then corrects sequence similarity metrics for fragment and gene length bias using a novel scoring metric that captures relationships between full-length as well as fragmented genes. The remaining genes are then brought together for the identification of orthologues within a phylogenetic framework. The orthologue predictions are further calibrated, along with inparalogues and gene births, using synteny to identify novel orthologous relationships. We use 12 high-quality Drosophila genomes to show that, compared to other orthologue prediction pipelines, Orthonome provides orthogroups with minimal error but high recall. Furthermore, Orthonome is resilient to suboptimal assembly/annotation quality: the inclusion of draft genomes from eight additional Drosophila species still provides >6500 1:1 orthologues across all twenty species while retaining a better combination of accuracy and recall than other pipelines. Orthonome is implemented as a searchable database and query tool along with multiple-sequence alignment browsers for all sets of orthologues. The underlying documentation and database are accessible at http://www.orthonome.com.
CONCLUSION: We demonstrate that Orthonome provides a superior combination of orthologue capture rates and accuracy on complete and draft drosophilid genomes when tested alongside previously published pipelines. The study also highlights a greater degree of evolutionary conservation across drosophilid species than earlier thought.
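
The precision/recall trade-off named in the abstract can be made concrete with a minimal Python sketch. This is not Orthonome's own evaluation code. The function name `precision_recall` and the gene identifiers are hypothetical, and it assumes predictions and a trusted reference set are both available as unordered gene pairs.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted orthologue pairs.

    Orthology is symmetric, so each pair is normalised to a frozenset
    and (a, b) is treated the same as (b, a).
    """
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    true_pos = len(pred & ref)
    # Precision: fraction of predicted pairs that are in the reference.
    precision = true_pos / len(pred) if pred else 0.0
    # Recall: fraction of reference pairs that were predicted.
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
reference = [
    ("dmel_g1", "dsim_g1"),
    ("dmel_g2", "dsim_g2"),
    ("dmel_g3", "dsim_g3"),
    ("dmel_g4", "dsim_g4"),
]
predicted = [
    ("dsim_g1", "dmel_g1"),  # correct, order reversed
    ("dmel_g2", "dsim_g2"),  # correct
    ("dmel_g3", "dsim_g9"),  # incorrect pairing
]
precision, recall = precision_recall(predicted, reference)
```

In this toy case two of the three predictions are correct and two of the four reference pairs are recovered. A stricter predictor would typically raise the first ratio at the expense of the second, which is the trade-off the pipeline aims to reduce.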

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Rane RV,Oakeshott JG,Nguyen T,Hoffmann AA,Lee SF

doi

10.1186/s12864-017-4079-6

subject

Has Abstract

pub_date

2017-08-31 00:00:00

pages

673

issue

1

issn

1471-2164

pii

10.1186/s12864-017-4079-6

journal_volume

18

pub_type

Journal Article
  • Comparative analysis of surface-exposed virulence factors of Acinetobacter baumannii.

    abstract:BACKGROUND:Acinetobacter baumannii is a significant hospital pathogen, particularly due to the dissemination of highly multidrug resistant isolates. Genome data have revealed that A. baumannii is highly genetically diverse, which correlates with major variations seen at the phenotypic level. Thus far, comparative genom...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-1020

    authors: Eijkelkamp BA,Stroeher UH,Hassan KA,Paulsen IT,Brown MH

    update_date:2014-11-25 00:00:00

  • Distinct gene loci control the host response to influenza H1N1 virus infection in a time-dependent manner.

    abstract:BACKGROUND:There is strong but mostly circumstantial evidence that genetic factors modulate the severity of influenza infection in humans. Using genetically diverse but fully inbred strains of mice it has been shown that host sequence variants have a strong influence on the severity of influenza A disease progression. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-411

    authors: Nedelko T,Kollmus H,Klawonn F,Spijker S,Lu L,Heßman M,Alberts R,Williams RW,Schughart K

    update_date:2012-08-20 00:00:00

  • Biosynthesis of the active compounds of Isatis indigotica based on transcriptome sequencing and metabolites profiling.

    abstract:BACKGROUND:Isatis indigotica is a widely used herb for the clinical treatment of colds, fever, and influenza in Traditional Chinese Medicine (TCM). Various structural classes of compounds have been identified as effective ingredients. However, little is known at genetics level about these active metabolites. In the pre...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-857

    authors: Chen J,Dong X,Li Q,Zhou X,Gao S,Chen R,Sun L,Zhang L,Chen W

    update_date:2013-12-05 00:00:00

  • Mixed evolutionary origins of endogenous biomass-depolymerizing enzymes in animals.

    abstract:BACKGROUND:Animals are thought to achieve lignocellulose digestion via symbiotic associations with gut microbes; this view leads to significant focus on bacteria and fungi for lignocellulolytic systems. The presence of biomass conversion systems hardwired into animal genomes has not yet been unequivocally demonstrated....

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4861-0

    authors: Chang WH,Lai AG

    update_date:2018-06-20 00:00:00

  • Antagonistic, overlapping and distinct responses to biotic stress in rice (Oryza sativa) and interactions with abiotic stress.

    abstract:BACKGROUND:Every year, substantial crop loss occurs globally, as a result of bacterial, fungal, parasite and viral infections in rice. Here, we present an in-depth investigation of the transcriptomic response to infection with the destructive bacterial pathogen Xanthomonas oryzae pv. oryzae(Xoo) in both resistant and s...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-93

    authors: Narsai R,Wang C,Chen J,Wu J,Shou H,Whelan J

    update_date:2013-02-12 00:00:00

  • MS2CNN: predicting MS/MS spectrum based on protein sequence using deep convolutional neural networks.

    abstract:BACKGROUND:Tandem mass spectrometry allows biologists to identify and quantify protein samples in the form of digested peptide sequences. When performing peptide identification, spectral library search is more sensitive than traditional database search but is limited to peptides that have been previously identified. An...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6297-6

    authors: Lin YM,Chen CT,Chang JM

    update_date:2019-12-24 00:00:00

  • Selective inhibition of yeast regulons by daunorubicin: a transcriptome-wide analysis.

    abstract:BACKGROUND:The antitumor drug daunorubicin exerts some of its cytotoxic effects by binding to DNA and inhibiting the transcription of different genes. We analysed this effect in vivo at the transcriptome level using the budding yeast Saccharomyces cerevisiae as a model and sublethal (IC40) concentrations of the drug to...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-358

    authors: Rojas M,Casado M,Portugal J,Piña B

    update_date:2008-07-30 00:00:00

  • Comparative venom gland transcriptome analysis of the scorpion Lychas mucronatus reveals intraspecific toxic gene diversity and new venomous components.

    abstract:BACKGROUND:Lychas mucronatus is a scorpion species widely distributed in Southeast Asia and southern China. Hardly anything is known about its venom components, despite the fact that it can often cause human accidents. In this work, we performed a venomous gland transcriptome analysis by constructing and screening th...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-452

    authors: Ruiming Z,Yibao M,Yawen H,Zhiyong D,Yingliang W,Zhijian C,Wenxin L

    update_date:2010-07-28 00:00:00

  • Transcriptome analysis reveals mechanism underlying the differential intestinal functionality of laying hens in the late phase and peak phase of production.

    abstract:BACKGROUND:The compromised performance of laying hens in the late phase of production relative to the peak production was thought to be associated with the impairment of intestinal functionality, which plays essential roles in contributing to their overall health and production performance. In the present study, RNA se...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6320-y

    authors: Wang WW,Wang J,Zhang HJ,Wu SG,Qi GH

    update_date:2019-12-12 00:00:00

  • Comparative genomics of Eucalyptus and Corymbia reveals low rates of genome structural rearrangement.

    abstract:BACKGROUND:Previous studies suggest genome structure is largely conserved between Eucalyptus species. However, it is unknown if this conservation extends to more divergent eucalypt taxa. We performed comparative genomics between the eucalypt genera Eucalyptus and Corymbia. Our results will facilitate transfer of genomi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3782-7

    authors: Butler JB,Vaillancourt RE,Potts BM,Lee DJ,King GJ,Baten A,Shepherd M,Freeman JS

    update_date:2017-05-22 00:00:00

  • Transcriptome analysis reveals differentially expressed genes associated with germ cell and gonad development in the Southern bluefin tuna (Thunnus maccoyii).

    abstract:BACKGROUND:Controlling and managing the breeding of bluefin tuna (Thunnus spp.) in captivity is an imperative step towards obtaining a sustainable supply of these fish in aquaculture production systems. Germ cell transplantation (GCT) is an innovative technology for the production of inter-species surrogates, by transp...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2397-8

    authors: Bar I,Cummins S,Elizur A

    update_date:2016-03-10 00:00:00

  • GiSAO.db: a database for ageing research.

    abstract:BACKGROUND:Age-related gene expression patterns of Homo sapiens as well as of model organisms such as Mus musculus, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster are a basis for understanding the genetic mechanisms of ageing. For an effective analysis and interpretation of expression prof...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-262

    authors: Hofer E,Laschober GT,Hackl M,Thallinger GG,Lepperdinger G,Grillari J,Jansen-Dürr P,Trajanoski Z

    update_date:2011-05-24 00:00:00

  • Transcriptomic response to heat stress among ecologically divergent populations of redband trout.

    abstract:BACKGROUND:As ectothermic organisms have evolved to differing aquatic climates, the molecular basis of thermal adaptation is a key area of research. In this study, we tested for differential transcriptional response of ecologically divergent populations of redband trout (Oncorhynchus mykiss gairdneri) that have evolved...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1246-5

    authors: Narum SR,Campbell NR

    update_date:2015-02-21 00:00:00

  • Transcriptome map of mouse isochores.

    abstract:BACKGROUND:The availability of fully sequenced genomes and the implementation of transcriptome technologies have increased the studies investigating the expression profiles for a variety of tissues, conditions, and species. In this study, using RNA-seq data for three distinct tissues (brain, liver, and muscle), we inve...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-511

    authors: Arhondakis S,Frousios K,Iliopoulos CS,Pissis SP,Tischler G,Kossida S

    update_date:2011-10-17 00:00:00

  • Applicability of DNA pools on 500 K SNP microarrays for cost-effective initial screens in genomewide association studies.

    abstract:BACKGROUND:Genetic influences underpinning complex traits are thought to involve multiple quantitative trait loci (QTLs) of small effect size. Detection of such QTL associations requires systematic screening of large numbers of DNA markers within large sample populations. Using pooled DNA on SNP microarrays to screen f...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-214

    authors: Docherty SJ,Butcher LM,Schalkwyk LC,Plomin R

    update_date:2007-07-04 00:00:00

  • The scale and evolutionary significance of horizontal gene transfer in the choanoflagellate Monosiga brevicollis.

    abstract:BACKGROUND:It is generally agreed that horizontal gene transfer (HGT) is common in phagotrophic protists. However, the overall scale of HGT and the cumulative impact of acquired genes on the evolution of these organisms remain largely unknown. RESULTS:Choanoflagellates are phagotrophs and the closest living relatives ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-729

    authors: Yue J,Sun G,Hu X,Huang J

    update_date:2013-10-25 00:00:00

  • G-spots cause incorrect expression measurement in Affymetrix microarrays.

    abstract:BACKGROUND:High Density Oligonucleotide arrays (HDONAs), such as the Affymetrix HG-U133A GeneChip, use sets of probes chosen to match specified genes, with the expectation that if a particular gene is highly expressed then all the probes in that gene's probe set will provide a consistent message signifying the gene's p...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-613

    authors: Upton GJ,Langdon WB,Harrison AP

    update_date:2008-12-18 00:00:00

  • "Does replication groups scoring reduce false positive rate in SNP interaction discovery? Response".

    abstract:BACKGROUND:The genomewide evaluation of genetic epistasis is a computationally demanding task, and a current challenge in Genetics. HFCC (Hypothesis-Free Clinical Cloning) is one of the methods that have been suggested for genomewide epistasis analysis. In order to perform an exhaustive search of epistasis, HFCC has im...

    journal_title:BMC genomics

    pub_type: Comment, Journal Article

    doi:10.1186/1471-2164-11-403

    authors: Gayán J,González-Pérez A,Ruiz A

    update_date:2010-06-24 00:00:00

  • Strand-specific transcriptomes of Enterohemorrhagic Escherichia coli in response to interactions with ground beef microbiota: interactions between microorganisms in raw meat.

    abstract:BACKGROUND:Enterohemorrhagic Escherichia coli (EHEC) are zoonotic agents associated with outbreaks worldwide. Growth of EHEC strains in ground beef could be inhibited by background microbiota that is present initially at levels greater than that of the pathogen E. coli. However, how the microbiota outcompetes the patho...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3957-2

    authors: Galia W,Leriche F,Cruveiller S,Garnier C,Navratil V,Dubost A,Blanquet-Diot S,Thevenot-Sergentet D

    update_date:2017-08-03 00:00:00

  • Large synteny blocks revealed between Caenorhabditis elegans and Caenorhabditis briggsae genomes using OrthoCluster.

    abstract:BACKGROUND:Accurate identification of synteny blocks is an important step in comparative genomics towards the understanding of genome architecture and expression. Most computer programs developed in the last decade for identifying synteny blocks have limitations. To address these limitations, we recently developed a ro...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-516

    authors: Vergara IA,Chen N

    update_date:2010-09-24 00:00:00

  • Changes in Bacillus anthracis CodY regulation under host-specific environmental factor deprived conditions.

    abstract:BACKGROUND:Host-specific environmental factors induce changes in Bacillus anthracis gene transcription during infection. A global transcription regulator, CodY, plays a pivotal role in regulating central metabolism, biosynthesis, and virulence in B. anthracis. In this study, we utilized RNA-sequencing to assess changes...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3004-8

    authors: Kim SK,Jung KH,Chai YG

    update_date:2016-08-17 00:00:00

  • Avoidance of recognition sites of restriction-modification systems is a widespread but not universal anti-restriction strategy of prokaryotic viruses.

    abstract:BACKGROUND:Restriction-modification (R-M) systems protect bacteria and archaea from attacks by bacteriophages and archaeal viruses. An R-M system specifically recognizes short sites in foreign DNA and cleaves it, while such sites in the host DNA are protected by methylation. Prokaryotic viruses have developed a number ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5324-3

    authors: Rusinov IS,Ershova AS,Karyagina AS,Spirin SA,Alexeevski AV

    update_date:2018-12-07 00:00:00

  • Transcriptome analysis of porcine PBMCs after in vitro stimulation by LPS or PMA/ionomycin using an expression array targeting the pig immune response.

    abstract:BACKGROUND:Designing sustainable animal production systems that better balance productivity and resistance to disease is a major concern. In order to address questions related to immunity and resistance to disease in pig, it is necessary to increase knowledge on its immune system and to produce efficient tools dedicate...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-292

    authors: Gao Y,Flori L,Lecardonnel J,Esquerré D,Hu ZL,Teillaud A,Lemonnier G,Lefèvre F,Oswald IP,Rogel-Gaillard C

    update_date:2010-05-11 00:00:00

  • The genetic regulation of size variation in the transcriptome of the cerebrum in the chicken and its role in domestication and brain size evolution.

    abstract:BACKGROUND:Large differences in cerebrum size exist between avian species and between populations of the same species, and are believed to reflect differences in processing power, i.e. in the speed and efficiency of processing information in this brain region. During domestication chickens developed a larger cerebrum compared to ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06908-0

    authors: Höglund A,Strempfl K,Fogelholm J,Wright D,Henriksen R

    update_date:2020-07-29 00:00:00

  • Variant detection and runs of homozygosity in next generation sequencing data elucidate the genetic background of Lundehund syndrome.

    abstract:BACKGROUND:The Lundehund is a highly specialized breed characterized by a unique flexibility of the joints and polydactyly in all four limbs. The extremely small population size and high inbreeding has promoted a high frequency of diseased dogs affected by the Lundehund syndrome (LS), a severe gastro-enteropathic disea...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2844-6

    authors: Metzger J,Pfahler S,Distl O

    update_date:2016-08-02 00:00:00

  • Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples.

    abstract:BACKGROUND:Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. RESULTS:We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogast...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-509

    authors: Liu Z,Venkatesh SS,Maley CC

    update_date:2008-10-30 00:00:00

  • Genome-wide association study of eating and cooking qualities in different subpopulations of rice (Oryza sativa L.).

    abstract:BACKGROUND:Starch and protein are two major components of polished rice, and the amylose and protein contents affect eating and cooking qualities (ECQs). In the present study, genome-wide association study with high-quality re-sequencing data was performed for 10 ECQs in a panel of 227 non-glutinous rice accessions and...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3000-z

    authors: Xu F,Bao J,He Q,Park YJ

    update_date:2016-08-20 00:00:00

  • Functional and gene network analyses of transcriptional signatures characterizing pre-weaned bovine mammary parenchyma or fat pad uncovered novel inter-tissue signaling networks during development.

    abstract:BACKGROUND:The neonatal bovine mammary fat pad (MFP) surrounding the mammary parenchyma (PAR) is thought to exert proliferative effects on the PAR through secretion of local modulators of growth induced by systemic hormones. We used bioinformatics to characterize transcriptomics differences between PAR and MFP from app...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-331

    authors: Piantoni P,Bionaz M,Graugnard DE,Daniels KM,Everts RE,Rodriguez-Zas SL,Lewin HA,Hurley HL,Akers M,Loor JJ

    update_date:2010-05-26 00:00:00

  • Combinatorial Conflicting Homozygosity (CCH) analysis enables the rapid identification of shared genomic regions in the presence of multiple phenocopies.

    abstract:BACKGROUND:The ability to identify regions of the genome inherited with a dominant trait in one or more families has become increasingly valuable with the wide availability of high throughput sequencing technology. While a number of methods exist for mapping of homozygous variants segregating with recessive traits in c...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1360-4

    authors: Levine AP,Connor TM,Oygar DD,Neild GH,Segal AW,Maxwell PH,Gale DP

    update_date:2015-03-10 00:00:00

  • Unique aspects of fiber degradation by the ruminal ethanologen Ruminococcus albus 7 revealed by physiological and transcriptomic analysis.

    abstract:BACKGROUND:Bacteria in the genus Ruminococcus are ubiquitous members of the mammalian gastrointestinal tract. In particular, they are important in ruminants where they digest a wide range of plant cell wall polysaccharides. For example, Ruminococcus albus 7 is a primary cellulose degrader that produces acetate usable b...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-1066

    authors: Christopherson MR,Dawson JA,Stevenson DM,Cunningham AC,Bramhacharya S,Weimer PJ,Kendziorski C,Suen G

    update_date:2014-12-04 00:00:00