Comparing de novo assemblers for 454 transcriptome data.

Abstract:

BACKGROUND: Roche 454 pyrosequencing has become a method of choice for generating transcriptome data from non-model organisms. Once the tens to hundreds of thousands of short (250-450 base) reads have been produced, it is important to correctly assemble these to estimate the sequence of all the transcripts. Most transcriptome assembly projects use only one program for assembling 454 pyrosequencing reads, but there is no evidence that the programs used to date are optimal. We have carried out a systematic comparison of five assemblers (CAP3, MIRA, Newbler, SeqMan and CLC) to establish best practices for transcriptome assemblies, using a new dataset from the parasitic nematode Litomosoides sigmodontis.

RESULTS: Although no single assembler performed best on all our criteria, Newbler 2.5 gave longer contigs, better alignments to some reference sequences, and was fast and easy to use. SeqMan assemblies performed best on the criterion of recapitulating known transcripts, and had more novel sequence than the other assemblers, but generated an excess of small, redundant contigs. The remaining assemblers all performed almost as well, with the exception of Newbler 2.3 (the version currently used by most assembly projects), which generated assemblies that had significantly lower total length. As different assemblers use different underlying algorithms to generate contigs, we also explored merging of assemblies and found that the merged datasets not only aligned better to reference sequences than individual assemblies, but were also more consistent in the number and size of contigs.

CONCLUSIONS: Transcriptome assemblies are smaller than genome assemblies and thus should be more computationally tractable, but are often harder because individual contigs can have highly variable read coverage. Comparing single assemblers, Newbler 2.5 performed best on our trial data set, but other assemblers were closely comparable. Combining differently optimal assemblies from different programs, however, gave a more credible final product, and this strategy is recommended.
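
The abstract compares assemblies by contig number, contig size and total assembly length. As a minimal sketch of how such summary figures could be computed from an assembler's FASTA output, the Python below tallies contig count, total length, longest contig and N50. The function names (contig_lengths, n50, summarize) are hypothetical, and N50 is a commonly used contiguity statistic added here as an assumption; the abstract does not say which exact metrics or scripts the authors used.

```python
def contig_lengths(fasta_text):
    """Return the length of each sequence in a FASTA-formatted string."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header starts a new record; store the previous one first.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def summarize(fasta_text):
    """Summary figures for one assembly (an illustrative choice of metrics)."""
    lengths = contig_lengths(fasta_text)
    return {
        "contigs": len(lengths),
        "total_length": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Toy input, not data from the study.
    demo = ">c1\nACGTAC\n>c2\nACG\n>c3\nA\n"
    print(summarize(demo))
```

Running summarize on the output of each assembler (and on a merged assembly) would give directly comparable rows for a table of this kind; alignment-based criteria such as recovery of known transcripts need a separate alignment step that this sketch does not cover.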

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Kumar S,Blaxter ML

doi

10.1186/1471-2164-11-571

pub_date

2010-10-16 00:00:00

pages

571

issn

1471-2164

pii

1471-2164-11-571

journal_volume

11

pub_type

Journal Article
  • Deep sequencing of the uterine immune response to bacteria during the equine oestrous cycle.

    abstract:BACKGROUND:The steroid hormone environment in healthy horses seems to have a significant impact on the efficiency of their uterine immune response. The objective of this study was to characterize the changes in gene expression in the equine endometrium in response to the introduction of bacterial pathogens and the infl...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2139-3

    authors: Marth CD,Young ND,Glenton LY,Noden DM,Browning GF,Krekeler N

    updated:2015-11-14 00:00:00

  • The phenylalanine ammonia lyase (PAL) gene family shows a gymnosperm-specific lineage.

    abstract:BACKGROUND:Phenylalanine ammonia lyase (PAL) is a key enzyme of the phenylpropanoid pathway that catalyzes the deamination of phenylalanine to trans-cinnamic acid, a precursor for the lignin and flavonoid biosynthetic pathways. To date, PAL genes have been less extensively studied in gymnosperms than in angiosperms. Ou...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S3-S1

    authors: Bagal UR,Leebens-Mack JH,Lorenz WW,Dean JF

    updated:2012-06-11 00:00:00

  • Correction to: Comparative transcriptomics reveals PrrAB-mediated control of metabolic, respiration, energy-generating, and dormancy pathways in Mycobacterium smegmatis.

    abstract:Following the publication of the original article [1], the authors reported an error in Fig. 2 of the PDF version of their article. ...

    journal_title:BMC genomics

    pub_type: Journal Article, Published Erratum

    doi:10.1186/s12864-019-6419-1

    authors: Maarsingh JD,Yang S,Park JG,Haydel SE

    updated:2019-12-31 00:00:00

  • Repeats and EST analysis for new organisms.

    abstract:BACKGROUND:Repeat masking is an important step in the EST analysis pipeline. For new species, genomic knowledge is scarce and good repeat libraries are typically unavailable. In these cases it is common practice to mask against known repeats from other species (i.e., model organisms). There are few studies that investi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-23

    authors: Malde K,Jonassen I

    updated:2008-01-18 00:00:00

  • Effect of CAR activation on selected metabolic pathways in normal and hyperlipidemic mouse livers.

    abstract:BACKGROUND:Detoxification in the liver involves activation of nuclear receptors, such as the constitutive androstane receptor (CAR), which regulate downstream genes of xenobiotic metabolism. Frequently, the metabolism of endobiotics is also modulated, resulting in potentially harmful effects. We therefore used 1,4-Bis ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-384

    authors: Rezen T,Tamasi V,Lövgren-Sandblom A,Björkhem I,Meyer UA,Rozman D

    updated:2009-08-19 00:00:00

  • Genomic tools for durum wheat breeding: de novo assembly of Svevo transcriptome and SNP discovery in elite germplasm.

    abstract:BACKGROUND:The tetraploid durum wheat (Triticum turgidum L. ssp. durum Desf. Husnot) is an important crop which provides the raw material for pasta production and a valuable source of genetic diversity for breeding hexaploid wheat (Triticum aestivum L.). Future breeding efforts to enhance yield potential and climate re...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5645-x

    authors: Vendramin V,Ormanbekova D,Scalabrin S,Scaglione D,Maccaferri M,Martelli P,Salvi S,Jurman I,Casadio R,Cattonaro F,Tuberosa R,Massi A,Morgante M

    updated:2019-04-10 00:00:00

  • Comparative genomics of European avian pathogenic E. Coli (APEC).

    abstract:BACKGROUND:Avian pathogenic Escherichia coli (APEC) causes colibacillosis, which results in significant economic losses to the poultry industry worldwide. However, the diversity between isolates remains poorly understood. Here, a total of 272 APEC isolates collected from the United Kingdom (UK), Italy and Germany were ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3289-7

    authors: Cordoni G,Woodward MJ,Wu H,Alanazi M,Wallis T,La Ragione RM

    updated:2016-11-22 00:00:00

  • Developing high throughput genotyped chromosome segment substitution lines based on population whole-genome re-sequencing in rice (Oryza sativa L.).

    abstract:BACKGROUND:Genetic populations provide the basis for a wide range of genetic and genomic studies and have been widely used in genetic mapping, gene discovery and genomics-assisted breeding. Chromosome segment substitution lines (CSSLs) are the most powerful tools for the detection and precise mapping of quantitative tr...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-656

    authors: Xu J,Zhao Q,Du P,Xu C,Wang B,Feng Q,Liu Q,Tang S,Gu M,Han B,Liang G

    updated:2010-11-24 00:00:00

  • Boolean modeling and fault diagnosis in oxidative stress response.

    abstract:BACKGROUND:Oxidative stress is a consequence of normal and abnormal cellular metabolism and is linked to the development of human diseases. The effective functioning of the pathway responding to oxidative stress protects the cellular DNA against oxidative damage; conversely the failure of the oxidative stress response ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S6-S4

    authors: Sridharan S,Layek R,Datta A,Venkatraj J

    updated:2012-01-01 00:00:00

  • Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation.

    abstract:BACKGROUND:Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most mic...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-649

    authors: Richard AC,Lyons PA,Peters JE,Biasci D,Flint SM,Lee JC,McKinney EF,Siegel RM,Smith KG

    updated:2014-08-04 00:00:00

  • The complete mitochondrial genomes for three Toxocara species of human and animal health significance.

    abstract:BACKGROUND:Studying mitochondrial (mt) genomics has important implications for various fundamental areas, including mt biochemistry, physiology and molecular biology. In addition, mt genome sequences have provided useful markers for investigating population genetic structures, systematics and phylogenetics of organisms...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-224

    authors: Li MW,Lin RQ,Song HQ,Wu XY,Zhu XQ

    updated:2008-05-16 00:00:00

  • Transfer of clinically relevant gene expression signatures in breast cancer: from Affymetrix microarray to Illumina RNA-Sequencing technology.

    abstract:BACKGROUND:Microarrays have revolutionized breast cancer (BC) research by enabling studies of gene expression on a transcriptome-wide scale. Recently, RNA-Sequencing (RNA-Seq) has emerged as an alternative for precise readouts of the transcriptome. To date, no study has compared the ability of the two technologies to q...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-1008

    authors: Fumagalli D,Blanchet-Cohen A,Brown D,Desmedt C,Gacquer D,Michiels S,Rothé F,Majjaj S,Salgado R,Larsimont D,Ignatiadis M,Maetens M,Piccart M,Detours V,Sotiriou C,Haibe-Kains B

    updated:2014-11-21 00:00:00

  • Evidence for a non-canonical JAK/STAT signaling pathway in the synthesis of the brain's major ion channels and neurotransmitter receptors.

    abstract:BACKGROUND:Brain-derived neurotrophic factor (BDNF) is a major signaling molecule that the brain uses to control a vast network of intracellular cascades fundamental to properties of learning and memory, and cognition. While much is known about BDNF signaling in the healthy nervous system where it controls the mitogen ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6033-2

    authors: Hixson KM,Cogswell M,Brooks-Kayal AR,Russek SJ

    updated:2019-08-28 00:00:00

  • The wheat pathogen Zymoseptoria tritici senses and responds to different wavelengths of light.

    abstract:BACKGROUND:The ascomycete fungus Zymoseptoria tritici (synonyms: Mycosphaerella graminicola, Septoria tritici) is a major pathogen of wheat that causes the economically important foliar disease Septoria tritici blotch. Despite its importance as a pathogen, little is known about the reaction of this fungus to light. To ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06899-y

    authors: McCorison CB,Goodwin SB

    updated:2020-07-25 00:00:00

  • Transcriptomic analyses reveal physiological changes in sweet orange roots affected by citrus blight.

    abstract:BACKGROUND:Citrus blight is a very important progressive decline disease of commercial citrus. The etiology is unknown, although the disease can be transmitted by root grafts, suggesting a viral etiology. Diagnosis is made by demonstrating physical blockage of xylem cells that prevents the movement of water. This test ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6339-0

    authors: Fu S,Shao J,Roy A,Brlansky RH,Zhou C,Hartung JS

    updated:2019-12-11 00:00:00

  • The initial deficiency of protein processing and flavonoids biosynthesis were the main mechanisms for the male sterility induced by SX-1 in Brassica napus.

    abstract:BACKGROUND:Rapeseed (Brassica napus) is an important oil seed crop in the Brassicaceae family. Chemical induced male sterility (CIMS) is one of the widely used method to produce the hybrids in B. napus. Identification of the key genes and pathways that involved in CIMS were important to understand the underlying molecu...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5203-y

    authors: Ning L,Lin Z,Gu J,Gan L,Li Y,Wang H,Miao L,Zhang L,Wang B,Li M

    updated:2018-11-07 00:00:00

  • Lung transcriptomic clock predicts premature aging in cigarette smoke-exposed mice.

    abstract:BACKGROUND:Lung aging is characterized by a number of structural alterations including fibrosis, chronic inflammation and the alteration of inflammatory cell composition. Chronic exposure to cigarette smoke (CS) is known to induce similar alterations and may contribute to premature lung aging. Additionally, aging and C...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6712-z

    authors: Choukrallah MA,Hoeng J,Peitsch MC,Martin F

    updated:2020-04-09 00:00:00

  • A database of circadian and diel rhythmic gene expression in the yellow fever mosquito Aedes aegypti.

    abstract:BACKGROUND:The mosquito species Aedes aegypti is the primary vector of many arboviral diseases, including dengue and yellow fevers, that are responsible for a large worldwide health burden. The biological rhythms of mosquitoes regulate many of the physiological processes and behaviors that influence the transmission of...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-1128

    authors: Leming MT,Rund SS,Behura SK,Duffield GE,O'Tousa JE

    updated:2014-12-17 00:00:00

  • Alpha tubulin genes from Leishmania braziliensis: genomic organization, gene structure and insights on their expression.

    abstract:BACKGROUND:Alpha tubulin is a fundamental component of the cytoskeleton which is responsible for cell shape and is involved in cell division, ciliary and flagellar motility and intracellular transport. Alpha tubulin gene expression varies according to the morphological changes suffered by Leishmania in its life cycle. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-454

    authors: Ramírez CA,Requena JM,Puerta CJ

    updated:2013-07-06 00:00:00

  • Multi-tissue transcriptome analysis using hybrid-sequencing reveals potential genes and biological pathways associated with azadirachtin A biosynthesis in neem (azadirachta indica).

    abstract:BACKGROUND:Azadirachtin A is a triterpenoid from neem tree exhibiting excellent activities against over 600 insect species in agriculture. The production of azadirachtin A depends on extraction from neem tissues, which is not an eco-friendly and sustainable process. The low yield and discontinuous supply of azadirachti...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07124-6

    authors: Wang H,Wang N,Huo Y

    updated:2020-10-28 00:00:00

  • ISMapper: identifying transposase insertion sites in bacterial genomes from short read sequence data.

    abstract:BACKGROUND:Insertion sequences (IS) are small transposable elements, commonly found in bacterial genomes. Identifying the location of IS in bacterial genomes can be useful for a variety of purposes including epidemiological tracking and predicting antibiotic resistance. However IS are commonly present in multiple copie...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1860-2

    authors: Hawkey J,Hamidian M,Wick RR,Edwards DJ,Billman-Jacobe H,Hall RM,Holt KE

    updated:2015-09-03 00:00:00

  • Paralog analyses reveal gene duplication events and genes under positive selection in Ixodes scapularis and other ixodid ticks.

    abstract:BACKGROUND:Hard ticks (family Ixodidae) are obligatory hematophagous ectoparasites of worldwide medical and veterinary importance. The haploid genomes of multiple species of ixodid ticks exceed 1 Gbp, prompting questions regarding gene, segmental and whole genome duplication in this phyletic group. The availability of ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2350-2

    authors: Van Zee JP,Schlueter JA,Schlueter S,Dixon P,Sierra CA,Hill CA

    updated:2016-03-16 00:00:00

  • Characterization of the small RNA component of the transcriptome from grain and sweet sorghum stems.

    abstract:BACKGROUND:Sorghum belongs to the tribe of the Andropogoneae that includes potential biofuel crops like switchgrass, Miscanthus and successful biofuel crops like corn and sugarcane. However, from a genomics point of view sorghum has compared to these other species a simpler genome because it lacks the additional rounds...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-356

    authors: Calviño M,Bruggmann R,Messing J

    updated:2011-07-08 00:00:00

  • Inferring microbial interaction networks from metagenomic data using SgLV-EKF algorithm.

    abstract:BACKGROUND:Inferring the microbial interaction networks (MINs) and modeling their dynamics are critical in understanding the mechanisms of the bacterial ecosystem and designing antibiotic and/or probiotic therapies. Recently, several approaches were proposed to infer MINs using the generalized Lotka-Volterra (gLV) mode...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3605-x

    authors: Alshawaqfeh M,Serpedin E,Younes AB

    updated:2017-03-27 00:00:00

  • Antagonism between Staphylococcus epidermidis and Propionibacterium acnes and its genomic basis.

    abstract:BACKGROUND:Propionibacterium acnes and Staphylococcus epidermidis live in close proximity on human skin, and both bacterial species can be isolated from normal and acne vulgaris-affected skin sites. The antagonistic interactions between the two species are poorly understood, as well as the potential significance of bac...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2489-5

    authors: Christensen GJ,Scholz CF,Enghild J,Rohde H,Kilian M,Thürmer A,Brzuszkiewicz E,Lomholt HB,Brüggemann H

    updated:2016-02-29 00:00:00

  • Identification of dysfunctional modules and disease genes in congenital heart disease by a network-based approach.

    abstract:BACKGROUND:The incidence of congenital heart disease (CHD) is continuously increasing among infants born alive nowadays, making it one of the leading causes of infant morbidity worldwide. Various studies suggest that both genetic and environmental factors lead to CHD, and therefore identifying its candidate genes and d...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-592

    authors: He D,Liu ZP,Chen L

    updated:2011-12-02 00:00:00

  • Global gene expression profile progression in Gaucher disease mouse models.

    abstract:BACKGROUND:Gaucher disease is caused by defective glucocerebrosidase activity and the consequent accumulation of glucosylceramide. The pathogenic pathways resulting from lipid laden macrophages (Gaucher cells) in visceral organs and their abnormal functions are obscure. RESULTS:To elucidate this pathogenic pathway, de...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-20

    authors: Xu YH,Jia L,Quinn B,Zamzow M,Stringer K,Aronow B,Sun Y,Zhang W,Setchell KD,Grabowski GA

    updated:2011-01-11 00:00:00

  • Culture-independent genomic characterisation of Candidatus Chlamydia sanzinia, a novel uncultivated bacterium infecting snakes.

    abstract:BACKGROUND:Recent molecular studies have revealed considerably more diversity in the phylum Chlamydiae than was previously thought. Evidence is growing that many of these novel chlamydiae may be important pathogens in humans and animals. A significant barrier to characterising these novel chlamydiae is the requirement ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3055-x

    authors: Taylor-Brown A,Bachmann NL,Borel N,Polkinghorne A

    updated:2016-09-05 00:00:00

  • Codon usage patterns in Chinese bayberry (Myrica rubra) based on RNA-Seq data.

    abstract:BACKGROUND:Codon usage analysis has been a classical topic for decades and has significances for studies of evolution, mRNA translation, and new gene discovery, etc. While the codon usage varies among different members of the plant kingdom, indicating the necessity for species-specific study, this work has mostly been ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-732

    authors: Feng C,Xu CJ,Wang Y,Liu WL,Yin XR,Li X,Chen M,Chen KS

    updated:2013-10-25 00:00:00

  • A novel analytical method, Birth Date Selection Mapping, detects response of the Angus (Bos taurus) genome to selection on complex traits.

    abstract:BACKGROUND:Several methods have recently been developed to identify regions of the genome that have been exposed to strong selection. However, recent theoretical and empirical work suggests that polygenic models are required to identify the genomic regions that are more moderately responding to ongoing selection on com...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-606

    authors: Decker JE,Vasco DA,McKay SD,McClure MC,Rolf MM,Kim J,Northcutt SL,Bauck S,Woodward BW,Schnabel RD,Taylor JF

    updated:2012-11-09 00:00:00