Abstract:
BACKGROUND: The technological revolution in next-generation sequencing has brought unprecedented opportunities to study any organism of interest at the genomic or transcriptomic level. Transcriptome assembly is a crucial first step for studying the molecular basis of phenotypes of interest using RNA-Sequencing (RNA-Seq). However, the optimal strategy for assembling vast amounts of short RNA-Seq reads remains unresolved, especially for organisms without a sequenced genome. This study compared four transcriptome assembly methods, including a widely used de novo assembler (Trinity), two transcriptome re-assembly strategies utilizing proteomic and genomic resources from closely related species (reference-based re-assembly and TransPS) and a genome-guided assembler (Cufflinks).
RESULTS: These four assembly strategies were compared using a comprehensive transcriptomic database of Aedes albopictus, for which a genome sequence has recently been completed. The quality of the various assemblies was assessed by the number of contigs generated, contig length distribution, percent paired-end read mapping, and gene model representation via BLASTX. Our results reveal that de novo assembly generates a similar number of gene models relative to genome-guided assembly with a fragmented reference, but produces the highest level of redundancy and requires the most computational power. Using a closely related reference genome to guide transcriptome assembly can generate biased contig sequences. Increasing the number of reads used in the transcriptome assembly tends to increase the redundancy within the assembly and decrease both median contig length and percent identity between contigs and reference protein sequences.
CONCLUSIONS: This study provides general guidance for transcriptome assembly of RNA-Seq data from organisms with or without a sequenced genome. The optimal transcriptome assembly strategy will depend upon the subsequent downstream analyses. However, our results emphasize the efficacy of de novo assembly, which can be as effective as genome-guided assembly when the reference genome assembly is fragmented. If a genome assembly and sufficient computational resources are available, it can be beneficial to combine de novo and genome-guided assemblies. Caution should be taken when using a closely related reference genome to guide transcriptome assembly. The quantity of read pairs used in the transcriptome assembly does not necessarily correlate with the quality of the assembly.
journal_name: BMC Genomics
journal_title: BMC genomics
authors: Huang X, Chen XG, Armbruster PA
doi: 10.1186/s12864-016-2923-8
subject: Has Abstract
pub_date: 2016-07-27 00:00:00
pages: 523
issn: 1471-2164
pii: 10.1186/s12864-016-2923-8
journal_volume: 17
pub_type: Journal article
Related literature
BMC Genomics literature collection
abstract:BACKGROUND:Excretory/secretory proteins (ESPs) play a major role in parasitic infection as they are present at the host-parasite interface and regulate the host immune system. In case of parasitic helminths, transcriptomics has been used extensively to understand the molecular basis of parasitism and for developing novel t...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-S3-S14
update_date:2011-11-30 00:00:00
abstract:BACKGROUND:Anthocyanins are a group of flavonoid compounds. As a group of important secondary metabolites, they perform several key biological functions in plants. Anthocyanins also play beneficial health roles as potentially protective factors against cancer and heart disease. To elucidate the anthocyanin biosynthetic...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-426
update_date:2014-06-04 00:00:00
abstract:BACKGROUND:Both male and female pigeons have the ability to produce a nutrient solution in their crop for the nourishment of their young. The production of the nutrient solution has been likened to lactation in mammals, and hence the product has been called pigeon 'milk'. It has been shown that pigeon 'milk' is essenti...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-452
update_date:2011-09-19 00:00:00
abstract:BACKGROUND:Blood-borne miRNA signatures have recently been reported for various tumor diseases. Here, we compared the miRNA signature in Wilms tumor patients before and after preoperative chemotherapy according to SIOP protocol 2001. RESULTS:We did not find a significant difference between miRNA signature of both groups...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-379
update_date:2012-08-07 00:00:00
abstract:BACKGROUND:The microsporidian Encephalitozoon cuniculi is an obligate intracellular eukaryotic pathogen with a small nuclear genome (2.9 Mbp) consisting of 11 chromosomes. Although each chromosome end is known to contain a single rDNA unit, the incomplete assembly of subtelomeric regions following sequencing of the gen...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1920-7
update_date:2016-01-07 00:00:00
abstract:BACKGROUND:Long non-coding RNAs (lncRNAs) exhibit remarkable cell-type specificity and disease association. LncRNA's functional versatility includes epigenetic modification, nuclear domain organization, transcriptional control, regulation of RNA splicing and translation, and modulation of protein activity. However, mos...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-5497-4
update_date:2019-02-15 00:00:00
abstract:BACKGROUND:Sub-optimal developmental diets often have adverse effects on long-term fitness and health. One hypothesis is that such effects are caused by mismatches between the developmental and adult environment, and may be mediated by persistent changes in gene expression. However, there are few experimental tests of ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-3968-z
update_date:2017-08-22 00:00:00
abstract:BACKGROUND:Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-74
update_date:2011-01-27 00:00:00
abstract:BACKGROUND:Metagenomic sequencing is a powerful technology for studying the mixture of microbes or the microbiomes on humans and in the environment. One basic task of analyzing metagenomic data is to identify the component genomes in the community. This task is challenging due to the complexity of microbiome composition...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-5467-x
update_date:2019-04-04 00:00:00
abstract:BACKGROUND:Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in on...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-7-48
update_date:2006-03-13 00:00:00
abstract:BACKGROUND:The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM's reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting inserti...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-516
update_date:2014-06-24 00:00:00
abstract:BACKGROUND:Bordetella petrii is the only environmental species hitherto found among the otherwise host-restricted and pathogenic members of the genus Bordetella. Phylogenetically, it connects the pathogenic Bordetellae and environmental bacteria of the genera Achromobacter and Alcaligenes, which are opportunistic patho...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-449
update_date:2008-09-30 00:00:00
abstract:BACKGROUND:Starvation not only affects the nutritional and health status of the animals, but also the microbial composition in the host's intestine. Next-generation sequencing provides a unique opportunity to explore gut microbial communities and their interactions with hosts. However, studies on gut microbiomes have b...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-266
update_date:2014-04-05 00:00:00
abstract:BACKGROUND:Infectious laryngotracheitis virus (ILTV; gallid herpesvirus 1) infection causes high mortality and huge economic losses in the poultry industry. To protect chickens against ILTV infection, chicken-embryo origin (CEO) and tissue-culture origin (TCO) vaccines have been used. However, the transmission of vacci...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-143
update_date:2012-04-24 00:00:00
abstract:BACKGROUND:Succinate is produced petrochemically from maleic anhydride to satisfy a small specialty chemical market. If succinate could be produced fermentatively at a price competitive with that of maleic anhydride, though, it could replace maleic anhydride as the precursor of many bulk chemicals, transforming a multi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-680
update_date:2010-11-30 00:00:00
abstract:BACKGROUND:Simple sequence repeats (SSRs) have become widely used as molecular markers in plant genetic studies due to their abundance, high allelic variation at each locus and simplicity to analyze using conventional PCR amplification. To study plants with unknown genome sequence, SSR markers from Expressed Sequence T...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-016-3328-4
update_date:2016-12-22 00:00:00
abstract:BACKGROUND:MicroRNAs (miRNAs) are post-transcriptional regulators of gene expression implicated in multiple cellular processes. Cyclic stretch of alveoli is characteristic of mechanical ventilation, and is postulated to be partly responsible for the lung injury and inflammation in ventilator-induced lung injury. We pro...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-154
update_date:2012-04-26 00:00:00
abstract:BACKGROUND:Single cell transcriptome sequencing has become an increasingly valuable technology for dissecting complex biology at a resolution impossible with bulk sequencing. However, the gap between the technical expertise required to effectively work with the resultant high dimensional data and the biological experti...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-6053-y
update_date:2019-08-27 00:00:00
abstract:BACKGROUND:Adaptive divergence driven by environmental heterogeneity has long been a fascinating topic in ecology and evolutionary biology. The study of the genetic basis of adaptive divergence has, however, been greatly hampered by a lack of genomic information. The recent development of transcriptome sequencing provi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-149
update_date:2012-04-24 00:00:00
abstract:BACKGROUND:Celiac disease (CD) is caused by an uncontrolled immune response to gluten, a heterogeneous mixture of wheat storage proteins. The CD-toxicity of these proteins and their derived peptides depends on the presence of specific T-cell epitopes (9-mer peptides; CD epitopes) that mediate the stimulation of HL...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-277
update_date:2012-06-22 00:00:00
abstract:BACKGROUND:Spinach downy mildew caused by the oomycete Peronospora effusa is a significant burden on the expanding spinach production industry, especially for organic farms where synthetic fungicides cannot be deployed to control the pathogen. P. effusa is highly variable and 15 new races have been recognized in the pa...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-5214-8
update_date:2018-11-29 00:00:00
abstract:BACKGROUND:Understanding polyphenism, the ability of a single genome to express multiple morphologically and behaviourally distinct phenotypes, is an important goal for evolutionary and developmental biology. Polyphenism has been key to the evolution of the Hymenoptera, and particularly the social Hymenoptera where the...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-623
update_date:2011-12-20 00:00:00
abstract:BACKGROUND:Respiratory syncytial virus (RSV) is an important cause of lower respiratory tract infection in young children. The degree of disease severity is determined by the host response to infection. Lung macrophages play an important early role in the host response to infection and we have used a systems-based appr...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-190
update_date:2013-03-18 00:00:00
abstract:BACKGROUND:The recently developed RNA interference (RNAi) technology has created an unprecedented opportunity which allows the function of individual genes in whole organisms or cell lines to be interrogated at genome-wide scale. However, multiple issues, such as off-target effects or low efficacies in knocking down ce...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-220
update_date:2009-05-12 00:00:00
abstract:BACKGROUND:Wine produced at low temperature is often considered to improve sensory qualities. However, there are certain drawbacks to low temperature fermentations: e.g. low growth rate, long lag phase, and sluggish or stuck fermentations. Selection and development of new Saccharomyces cerevisiae strains well adapted a...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1755-2
update_date:2015-07-22 00:00:00
abstract:BACKGROUND:Sacha Inchi (Plukenetia volubilis L., Euphorbiaceae) is a potential oilseed crop because the seeds of this plant are rich in unsaturated fatty acids (FAs). In particular, the fatty acid composition of its seed oil differs markedly in containing large quantities of α-linolenic acid (18C:3, a kind of ω-3 FAs)....
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-716
update_date:2012-12-20 00:00:00
abstract:BACKGROUND:Pseudogenes are ubiquitous genetic elements that derive from functional genes after mutational inactivation. Characterization of pseudogenes is important to understand genome dynamics and evolution, and its significance increases when several genomes of related organisms can be compared. Among yeasts, only t...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-11-260
update_date:2010-04-22 00:00:00
abstract:BACKGROUND:Matrix attachment regions (MAR) are the sites on genomic DNA that interact with the nuclear matrix. There is increasing evidence for the involvement of MAR in regulation of gene expression. The unsuitability of experimental detection of MAR for genome-wide analyses has led to the development of computational...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-8-418
update_date:2007-11-15 00:00:00
abstract:BACKGROUND:Pancreatic cancer is a deadly disease with a five-year survival of less than 5%. A better understanding of the underlying biology may suggest novel therapeutic targets. Recent surveys of the pancreatic cancer genome have uncovered numerous new alterations; yet systematic functional characterization of candid...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-624
update_date:2013-09-16 00:00:00
abstract:BACKGROUND:Transmembrane β-barrel proteins are a special class of transmembrane proteins which play several key roles in the human body and diseases. Due to experimental difficulties, the number of transmembrane β-barrel proteins with known structures is very small. Over the years, a number of learning-based methods have b...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-S2-S5
update_date:2012-04-12 00:00:00