dnAQET: a framework to compute a consolidated metric for benchmarking quality of de novo assemblies.

Abstract:

BACKGROUND: Accurate de novo genome assembly has become a reality with advances in sequencing technology. With the ever-increasing number of de novo genome assembly tools, assessing the quality of assemblies has become of great importance in genome research. Although many quality metrics have been proposed and software tools for calculating those metrics have been developed, the existing tools do not produce a unified measure that reflects the overall quality of an assembly.
RESULTS: To address this issue, we developed the de novo Assembly Quality Evaluation Tool (dnAQET), which generates a unified metric for benchmarking the quality assessment of assemblies. Our framework first calculates individual quality scores for the scaffolds/contigs of an assembly by aligning them to a reference genome. Next, it computes a quality score for the assembly using its overall reference genome coverage, the quality score distribution of its scaffolds and the redundancy identified in it. Using synthetic assemblies randomly generated from the latest human genome build, various builds of the reference genomes for five organisms and six de novo assemblies for sample NA24385, we tested dnAQET to assess its capability for benchmarking quality evaluation of genome assemblies. For synthetic data, our quality score increased with a decreasing number of misassemblies and decreasing redundancy, and with increasing average contig length and coverage, as expected. For genome builds, the dnAQET quality score calculated for a more recent reference genome was better than the score for an older version. To compare with some of the most frequently used measures, 13 other quality measures were calculated. The quality score from dnAQET was found to be better than all other measures in terms of consistency with the known quality of the reference genomes, indicating that dnAQET is reliable for benchmarking quality assessment of de novo genome assemblies.
CONCLUSIONS: dnAQET is a scalable framework designed to evaluate a de novo genome assembly based on the aggregated quality of its scaffolds (or contigs). Our results demonstrated that the dnAQET quality score is reliable for benchmarking quality assessment of genome assemblies. dnAQET can help researchers identify the most suitable assembly tools and select high-quality assemblies from those generated.
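
The abstract names three inputs to the assembly-level score: reference genome coverage, the distribution of per-scaffold quality scores, and redundancy. The sketch below is a toy illustration of how such inputs could be folded into one number. It is not the published dnAQET formula, which the abstract does not give. The `Scaffold` class, the `toy_assembly_score` function, the [0, 1] scales and the multiplicative combination are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Scaffold:
    length: int      # scaffold length in bases
    quality: float   # per-scaffold quality score in [0, 1] (hypothetical scale)


def toy_assembly_score(scaffolds: Sequence[Scaffold],
                       reference_coverage: float,
                       redundancy: float) -> float:
    """Illustrative consolidated score in [0, 1]; NOT the dnAQET formula.

    reference_coverage: fraction of the reference covered, in [0, 1].
    redundancy: fraction of aligned bases that are redundant, in [0, 1].
    """
    total_length = sum(s.length for s in scaffolds)
    if total_length == 0:
        return 0.0
    # Length-weighted mean of the per-scaffold quality scores (assumed summary
    # of the score distribution).
    mean_quality = sum(s.length * s.quality for s in scaffolds) / total_length
    # Reward coverage and scaffold quality, penalize redundancy (assumed form).
    return reference_coverage * mean_quality * (1.0 - redundancy)
```

Under these assumptions the toy score rises with coverage and scaffold quality and falls with redundancy, which matches the qualitative behaviour the abstract reports for synthetic data.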

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Yavas G,Hong H,Xiao W

doi

10.1186/s12864-019-6070-x

subject

Has Abstract

pub_date

2019-09-11 00:00:00

pages

706

issue

1

issn

1471-2164

pii

10.1186/s12864-019-6070-x

journal_volume

20

pub_type

Journal Article
  • The I2 resistance gene homologues in Solanum have complex evolutionary patterns and are targeted by miRNAs.

    abstract:BACKGROUND:Several resistance traits, including the I2 resistance against tomato fusarium wilt, were mapped to the long arm of chromosome 11 of Solanum. However, the structure and evolution of this locus remain poorly understood. RESULTS:Comparative analysis showed that the structure and evolutionary patterns of the I...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-743

    authors: Wei C,Kuang H,Li F,Chen J

    update_date:2014-08-30 00:00:00

  • Genome-wide metabolic (re-) annotation of Kluyveromyces lactis.

    abstract:BACKGROUND:Even before having its genome sequence published in 2004, Kluyveromyces lactis had long been considered a model organism for studies in genetics and physiology. Research on Kluyveromyces lactis is quite advanced and this yeast species is one of the few with which it is possible to perform formal genetic anal...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-517

    authors: Dias O,Gombert AK,Ferreira EC,Rocha I

    update_date:2012-10-01 00:00:00

  • Reconstruction of ancient homeobox gene linkages inferred from a new high-quality assembly of the Hong Kong oyster (Magallana hongkongensis) genome.

    abstract:BACKGROUND:Homeobox-containing genes encode crucial transcription factors involved in animal, plant and fungal development, and changes to homeobox genes have been linked to the evolution of novel body plans and morphologies. In animals, some homeobox genes are clustered together in the genome, either as remnants from ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07027-6

    authors: Li Y,Nong W,Baril T,Yip HY,Swale T,Hayward A,Ferrier DEK,Hui JHL

    update_date:2020-10-15 00:00:00

  • Genome-wide identification and comparison of differentially expressed profiles of miRNAs and lncRNAs with associated ceRNA networks in the gonads of Chinese soft-shelled turtle, Pelodiscus sinensis.

    abstract:BACKGROUND:The gonad is the major factor affecting animal reproduction. The regulatory mechanism of the expression of protein-coding genes involved in reproduction still remains to be elucidated. Increasing evidence has shown that ncRNAs play key regulatory roles in gene expression in many life processes. The roles of ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06826-1

    authors: Ma X,Cen S,Wang L,Zhang C,Wu L,Tian X,Wu Q,Li X,Wang X

    update_date:2020-06-29 00:00:00

  • Three novel Pseudomonas phages isolated from composting provide insights into the evolution and diversity of tailed phages.

    abstract:BACKGROUND:Among viruses, bacteriophages are a group of special interest due to their capacity of infecting bacteria that are important for biotechnology and human health. Composting is a microbial-driven process in which complex organic matter is converted into humus-like substances. In thermophilic composting, the de...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3729-z

    authors: Amgarten D,Martins LF,Lombardi KC,Antunes LP,de Souza APS,Nicastro GG,Kitajima EW,Quaggio RB,Upton C,Setubal JC,da Silva AM

    update_date:2017-05-04 00:00:00

  • Identification of chromosomal alpha-proteobacterial small RNAs by comparative genome analysis and detection in Sinorhizobium meliloti strain 1021.

    abstract:BACKGROUND:Small untranslated RNAs (sRNAs) seem to be far more abundant than previously believed. The number of sRNAs confirmed in E. coli through various approaches is above 70, with several hundred more sRNA candidate genes under biological validation. Although the total number of sRNAs in any one species is still un...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-467

    authors: Ulvé VM,Sevin EW,Chéron A,Barloy-Hubler F

    update_date:2007-12-19 00:00:00

  • DAIRYdb: a manually curated reference database for improved taxonomy annotation of 16S rRNA gene sequences from dairy products.

    abstract:BACKGROUND:Reads assignment to taxonomic units is a key step in microbiome analysis pipelines. To date, accurate taxonomy annotation of 16S reads, particularly at species rank, is still challenging due to the short size of read sequences and differently curated classification databases. The close phylogenetic relations...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5914-8

    authors: Meola M,Rifa E,Shani N,Delbès C,Berthoud H,Chassard C

    update_date:2019-07-08 00:00:00

  • Multiple genetic loci define Ca++ utilization by bloodstream malaria parasites.

    abstract:BACKGROUND:Bloodstream malaria parasites require Ca++ for their development, but the sites and mechanisms of Ca++ utilization are not well understood. We hypothesized that there may be differences in Ca++ uptake or utilization by genetically distinct lines of P. falciparum. These differences, if identified, may provide...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5418-y

    authors: Apolis L,Olivas J,Srinivasan P,Kushwaha AK,Desai SA

    update_date:2019-01-16 00:00:00

  • Transcriptome of the adult female malaria mosquito vector Anopheles albimanus.

    abstract:BACKGROUND:Human Malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree tran...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-207

    authors: Martínez-Barnetche J,Gómez-Barreto RE,Ovilla-Muñoz M,Téllez-Sosa J,García López DE,Dinglasan RR,Ubaida Mohien C,MacCallum RM,Redmond SN,Gibbons JG,Rokas A,Machado CA,Cazares-Raga FE,González-Cerón L,Hernández-Martínez S,Rod

    update_date:2012-05-30 00:00:00

  • Target genes of myostatin loss-of-function in muscles of late bovine fetuses.

    abstract:BACKGROUND:Myostatin, a muscle-specific member of the Transforming Growth Factor beta family, negatively regulates muscle development. Double-muscled (DM) cattle have a loss-of-function mutation in their myostatin gene responsible for the hypermuscular phenotype. Thus, these animals are a good model for understanding t...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-63

    authors: Cassar-Malek I,Passelaigue F,Bernard C,Léger J,Hocquette JF

    update_date:2007-03-01 00:00:00

  • Small non-coding RNA profiling and the role of piRNA pathway genes in the protection of chicken primordial germ cells.

    abstract:BACKGROUND:Genes, RNAs, and proteins play important roles during germline development. However, the functions of non-coding RNAs (ncRNAs) on germline development remain unclear in avian species. Recent high-throughput techniques have identified several classes of ncRNAs, including micro RNAs (miRNAs), small-interfering...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-757

    authors: Rengaraj D,Lee SI,Park TS,Lee HJ,Kim YM,Sohn YA,Jung M,Noh SJ,Jung H,Han JY

    update_date:2014-09-04 00:00:00

  • The phenylalanine ammonia lyase (PAL) gene family shows a gymnosperm-specific lineage.

    abstract:BACKGROUND:Phenylalanine ammonia lyase (PAL) is a key enzyme of the phenylpropanoid pathway that catalyzes the deamination of phenylalanine to trans-cinnamic acid, a precursor for the lignin and flavonoid biosynthetic pathways. To date, PAL genes have been less extensively studied in gymnosperms than in angiosperms. Ou...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S3-S1

    authors: Bagal UR,Leebens-Mack JH,Lorenz WW,Dean JF

    update_date:2012-06-11 00:00:00

  • Systems toxicology identifies mechanistic impacts of 2-amino-4,6-dinitrotoluene (2A-DNT) exposure in Northern Bobwhite.

    abstract:BACKGROUND:A systems toxicology investigation comparing and integrating transcriptomic and proteomic results was conducted to develop holistic effects characterizations for the wildlife bird model, Northern bobwhite (Colinus virginianus) dosed with the explosives degradation product 2-amino-4,6-dinitrotoluene (2A-DNT)....

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1798-4

    authors: Gust KA,Nanduri B,Rawat A,Wilbanks MS,Ang CY,Johnson DR,Pendarvis K,Chen X,Quinn MJ Jr,Johnson MS,Burgess SC,Perkins EJ

    update_date:2015-08-07 00:00:00

  • Mobilization of retrotransposons as a cause of chromosomal diversification and rapid speciation: the case for the Antarctic teleost genus Trematomus.

    abstract:BACKGROUND:The importance of transposable elements (TEs) in the genomic remodeling and chromosomal rearrangements that accompany lineage diversification in vertebrates remains the subject of debate. The major impediment to understanding the roles of TEs in genome evolution is the lack of comparative and integrative ana...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4714-x

    authors: Auvinet J,Graça P,Belkadi L,Petit L,Bonnivard E,Dettaï A,Detrich WH 3rd,Ozouf-Costaz C,Higuet D

    update_date:2018-05-09 00:00:00

  • Comparative transcriptomics of early petal development across four diverse species of Aquilegia reveal few genes consistently associated with nectar spur development.

    abstract:BACKGROUND:Petal nectar spurs, which facilitate pollination through animal attraction and pollen placement, represent a key innovation promoting diversification in the genus Aquilegia (Ranunculaceae). Identifying the genetic components that contribute to the development of these three-dimensional structures will inform...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6002-9

    authors: Ballerini ES,Kramer EM,Hodges SA

    update_date:2019-08-22 00:00:00

  • Identification of "pathologs" (disease-related genes) from the RIKEN mouse cDNA dataset using human curation plus FACTS, a new biological information extraction system.

    abstract:BACKGROUND:A major goal in the post-genomic era is to identify and characterise disease susceptibility genes and to apply this knowledge to disease prevention and treatment. Rodents and humans have remarkably similar genomes and share closely related biochemical, physiological and pathological pathways. In this work we...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-5-28

    authors: Silva DG,Schönbach C,Brusic V,Socha LA,Nagashima T,Petrovsky N

    update_date:2004-04-29 00:00:00

  • In silico and in vivo splicing analysis of MLH1 and MSH2 missense mutations shows exon- and tissue-specific effects.

    abstract:BACKGROUND:Abnormalities of pre-mRNA splicing are increasingly recognized as an important mechanism through which gene mutations cause disease. However, apart from the mutations in the donor and acceptor sites, the effects on splicing of other sequence variations are difficult to predict. Loosely defined exonic and int...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-243

    authors: Lastella P,Surdo NC,Resta N,Guanti G,Stella A

    update_date:2006-09-22 00:00:00

  • Microarray analysis of tumor necrosis factor alpha induced gene expression in U373 human glioblastoma cells.

    abstract:BACKGROUND:Tumor necrosis factor alpha (TNF) is able to induce a variety of biological responses in the nervous system including inflammation and neuroprotection. Human astrocytoma cells U373 have been widely used as a model for inflammatory cytokine actions in the nervous system. Here we used cDNA microarrays to analy...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-4-46

    authors: Schwamborn J,Lindecke A,Elvers M,Horejschi V,Kerick M,Rafigh M,Pfeiffer J,Prüllage M,Kaltschmidt B,Kaltschmidt C

    update_date:2003-11-25 00:00:00

  • Transcriptome profiling of resistance response to Meloidogyne chitwoodi introgressed from wild species Solanum bulbocastanum into cultivated potato.

    abstract:BACKGROUND:Meloidogyne chitwoodi commonly known as Columbia root-knot nematode or CRKN is one of the most devastating pests of potato in the Pacific Northwest of the United States of America. In addition to the roots, it infects potato tubers causing internal as well as external defects, thereby reducing the market val...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6257-1

    authors: Bali S,Vining K,Gleason C,Majtahedi H,Brown CR,Sathuvalli V

    update_date:2019-11-28 00:00:00

  • Medicago truncatula transporter database: a comprehensive database resource for M. truncatula transporters.

    abstract:BACKGROUND:Medicago truncatula has been chosen as a model species for genomic studies. It is closely related to an important legume, alfalfa. Transporters are a large group of membrane-spanning proteins. They deliver essential nutrients, eject waste products, and assist the cell in sensing environmental conditions by f...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-60

    authors: Miao Z,Li D,Zhang Z,Dong J,Su Z,Wang T

    update_date:2012-02-06 00:00:00

  • An advanced bioinformatics approach for analyzing RNA-seq data reveals sigma H-dependent regulation of competence genes in Listeria monocytogenes.

    abstract:BACKGROUND:Alternative σ factors are important transcriptional regulators in bacteria. While σ(B) has been shown to control a large regulon and play important roles in stress response and virulence in the pathogen Listeria monocytogenes, the function of σ(H) has not yet been well defined in Listeria, even though σ(H) c...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2432-9

    authors: Liu Y,Orsi RH,Boor KJ,Wiedmann M,Guariglia-Oropeza V

    update_date:2016-02-16 00:00:00

  • CRISPR/Cas9-mediated precise genome modification by a long ssDNA template in zebrafish.

    abstract:BACKGROUND:Gene targeting by homology-directed repair (HDR) can precisely edit the genome and is a versatile tool for biomedical research. However, the efficiency of HDR-based modification is still low in many model organisms including zebrafish. Recently, long single-stranded DNA (lssDNA) molecules have been developed...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6493-4

    authors: Bai H,Liu L,An K,Lu X,Harrison M,Zhao Y,Yan R,Lu Z,Li S,Lin S,Liang F,Qin W

    update_date:2020-01-21 00:00:00

  • In silico secretome analysis approach for next generation sequencing transcriptomic data.

    abstract:BACKGROUND:Excretory/secretory proteins (ESPs) play a major role in parasitic infection as they are present at the host-parasite interface and regulate host immune system. In case of parasitic helminths, transcriptomics has been used extensively to understand the molecular basis of parasitism and for developing novel t...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-S3-S14

    authors: Garg G,Ranganathan S

    update_date:2011-11-30 00:00:00

  • HumCFS: a database of fragile sites in human chromosomes.

    abstract:BACKGROUND:Fragile sites are the chromosomal regions that are susceptible to breakage, and their frequency varies among the human population. Based on the frequency of fragile site induction, they are categorized as common and rare fragile sites. Common fragile sites are sensitive to replication stress and often rearra...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5330-5

    authors: Kumar R,Nagpal G,Kumar V,Usmani SS,Agrawal P,Raghava GPS

    update_date:2019-04-18 00:00:00

  • Dose-dependent effects of small-molecule antagonists on the genomic landscape of androgen receptor binding.

    abstract:BACKGROUND:The androgen receptor plays a critical role throughout the progression of prostate cancer and is an important drug target for this disease. While chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) is becoming an essential tool for studying transcription and chromatin modifica...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-355

    authors: Zhu Z,Shi M,Hu W,Estrella H,Engebretsen J,Nichols T,Briere D,Hosea N,Los G,Rejto PA,Fanjul A

    update_date:2012-07-31 00:00:00

  • Highly-multiplexed SNP genotyping for genetic mapping and germplasm diversity studies in pea.

    abstract:BACKGROUND:Single Nucleotide Polymorphisms (SNPs) can be used as genetic markers for applications such as genetic diversity studies or genetic mapping. New technologies now allow genotyping hundreds to thousands of SNPs in a single reaction. In order to evaluate the potential of these technologies in pea, we selected a ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-468

    authors: Deulvot C,Charrel H,Marty A,Jacquin F,Donnadieu C,Lejeune-Hénaut I,Burstin J,Aubert G

    update_date:2010-08-11 00:00:00

  • Tandem repeats derived from centromeric retrotransposons.

    abstract:BACKGROUND:Tandem repeats are ubiquitous and abundant in higher eukaryotic genomes and constitute, along with transposable elements, much of DNA underlying centromeres and other heterochromatic domains. In maize, centromeric satellite repeat (CentC) and centromeric retrotransposons (CR), a class of Ty3/gypsy retrotrans...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-142

    authors: Sharma A,Wolfgruber TK,Presting GG

    update_date:2013-03-04 00:00:00

  • A systematic evaluation of expression of HERV-W elements; influence of genomic context, viral structure and orientation.

    abstract:BACKGROUND:One member of the W family of human endogenous retroviruses (HERV) appears to have been functionally adopted by the human host. Nevertheless, a highly diversified and regulated transcription from a range of HERV-W elements has been observed in human tissues and cells. Aberrant expression of members of this f...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-22

    authors: Li F,Nellåker C,Yolken RH,Karlsson H

    update_date:2011-01-12 00:00:00

  • Exposure to maternal obesity alters gene expression in the preimplantation ovine conceptus.

    abstract:BACKGROUND:Embryonic and fetal exposure to maternal obesity causes several maladaptive morphological and epigenetic changes in exposed offspring. The timing of these events is unclear, but changes can be observed even after a short exposure to maternal obesity around the time of conception. The hypothesis of this work ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5120-0

    authors: McCoski SR,Vailes MT,Owens CE,Cockrum RR,Ealy AD

    update_date:2018-10-11 00:00:00

  • Sequence comparison of prefrontal cortical brain transcriptome from a tame and an aggressive silver fox (Vulpes vulpes).

    abstract:BACKGROUND:Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain a...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-482

    authors: Kukekova AV,Johnson JL,Teiling C,Li L,Oskina IN,Kharlamova AV,Gulevich RG,Padte R,Dubreuil MM,Vladimirova AV,Shepeleva DV,Shikhevich SG,Sun Q,Ponnala L,Temnykh SV,Trut LN,Acland GM

    update_date:2011-10-03 00:00:00