Abstract:
BACKGROUND: RNA-seq and small RNA-seq are powerful, quantitative tools to study gene regulation and function. Common high-throughput sequencing methods rely on polymerase chain reaction (PCR) to expand the starting material, but not every molecule amplifies equally, causing some to be overrepresented. Unique molecular identifiers (UMIs) can be used to distinguish undesirable PCR duplicates derived from a single molecule and identical but biologically meaningful reads from different molecules.
RESULTS: We have incorporated UMIs into RNA-seq and small RNA-seq protocols and developed tools to analyze the resulting data. Our UMIs contain stretches of random nucleotides whose lengths sufficiently capture diverse molecule species in both RNA-seq and small RNA-seq libraries generated from mouse testis. Our approach yields high-quality data while allowing unique tagging of all molecules in high-depth libraries.
CONCLUSIONS: Using simulated and real datasets, we demonstrate that our methods increase the reproducibility of RNA-seq and small RNA-seq data. Notably, we find that the amount of starting material and sequencing depth, but not the number of PCR cycles, determine PCR duplicate frequency. Finally, we show that computational removal of PCR duplicates based only on their mapping coordinates introduces substantial bias into data analysis.
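The contrast the abstract draws between coordinate-only and UMI-aware duplicate removal can be illustrated with a minimal Python sketch. It is not the authors' tool: the read representation (dictionaries with `chrom`, `pos`, `strand` and `umi` keys), the function names and the toy reads are all assumptions made for illustration, and UMIs are compared by exact match only, with no handling of sequencing errors in the UMI.

```python
def dedup_by_coordinate(reads):
    """Keep the first read seen at each (chrom, pos, strand).

    This is the coordinate-only strategy: reads from different molecules
    that happen to map to the same place are collapsed together.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def dedup_by_umi(reads):
    """Keep the first read seen for each (chrom, pos, strand, umi).

    Reads sharing coordinates but carrying different UMIs are treated as
    distinct molecules; only exact UMI matches count as PCR duplicates
    (a simplifying assumption of this sketch).
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Hypothetical toy data: three reads at one locus from two molecules,
# plus two reads at another locus from a single molecule.
toy_reads = [
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "AAAA"},
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "AAAA"},
    {"chrom": "chr1", "pos": 100, "strand": "+", "umi": "CCGT"},
    {"chrom": "chr2", "pos": 500, "strand": "-", "umi": "GGTA"},
    {"chrom": "chr2", "pos": 500, "strand": "-", "umi": "GGTA"},
]

coordinate_only = dedup_by_coordinate(toy_reads)
umi_aware = dedup_by_umi(toy_reads)
print(len(coordinate_only), len(umi_aware))
```

In this toy case the coordinate-only pass discards the `CCGT` read at chr1:100 even though it comes from a different molecule, which is the kind of loss the abstract describes as bias.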
journal_name: BMC Genomics
journal_title: BMC genomics
authors: Fu Y, Wu PH, Beane T, Zamore PD, Weng Z
doi: 10.1186/s12864-018-4933-1
subject: Has Abstract
pub_date: 2018-07-13 00:00:00
pages: 531
issue: 1
issn: 1471-2164
pii: 10.1186/s12864-018-4933-1
journal_volume: 19
pub_type: journal article
Related literature
BMC Genomics literature collection
abstract:BACKGROUND:Regulation of bacterial gene expression by small RNAs (sRNAs) has proved to be important for many biological processes. Francisella tularensis is a highly pathogenic Gram-negative bacterium that causes the disease tularaemia in humans and animals. Relatively little is known about the regulatory networks exi...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-11-625
update_date:2010-11-10 00:00:00
abstract:BACKGROUND:Progress in the fields of protein separation and identification technologies has accelerated research into biofluids proteomics for protein biomarker discovery. Urine has become an ideal and rich source of biomarkers in clinical proteomics. Here we performed a proteomic analysis of urine samples from pregnan...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-14-777
update_date:2013-11-11 00:00:00
abstract:BACKGROUND:Abnormalities of pre-mRNA splicing are increasingly recognized as an important mechanism through which gene mutations cause disease. However, apart from the mutations in the donor and acceptor sites, the effects on splicing of other sequence variations are difficult to predict. Loosely defined exonic and int...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-7-243
update_date:2006-09-22 00:00:00
abstract:BACKGROUND:Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene d...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-015-1259-0
update_date:2015-02-14 00:00:00
abstract:BACKGROUND:Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host's genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-018-4485-4
update_date:2018-02-01 00:00:00
abstract:BACKGROUND:Magnaporthe oryzae (anamorph Pyricularia oryzae) is the causal agent of blast disease of Poaceae crops and their wild relatives. To understand the genetic mechanisms that drive host specialization of M. oryzae, we carried out whole genome resequencing of four M. oryzae isolates from rice (Oryza sativa), one ...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-016-2690-6
update_date:2016-05-18 00:00:00
abstract:BACKGROUND:We previously described the first respiratory Saccharomyces cerevisiae strain, KOY.TM6*P, by integrating the gene encoding a chimeric hexose transporter, Tm6*, into the genome of an hxt null yeast. Subsequently we transferred this respiratory phenotype in the presence of up to 50 g/L glucose to a yeast strai...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-9-365
update_date:2008-07-31 00:00:00
abstract:BACKGROUND:The steadily increasing number of prokaryotic genomes has accelerated the study of genome evolution; in particular, the availability of sets of genomes from closely related bacteria has facilitated the exploration of the mechanisms underlying genome plasticity. The family Vibrionaceae is found in the Gammapr...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-10-S1-S11
update_date:2009-07-07 00:00:00
abstract:BACKGROUND:Membrane proteins constitute up to 30% of the human proteome. These proteins have special properties because the transmembrane segments are embedded into lipid bilayer while extramembranous parts are in different environments. Membrane proteins have several functions and are involved in numerous diseases. A ...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-019-5865-0
update_date:2019-07-16 00:00:00
abstract:BACKGROUND:Human herpesvirus-6A and -6B (HHV-6) are betaherpesviruses that reach > 90% seroprevalence in the adult population. Unique among human herpesviruses, HHV-6 can integrate into the subtelomeric regions of human chromosomes; when this occurs in germ line cells it causes a condition called inherited chromosomall...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-018-4604-2
update_date:2018-03-20 00:00:00
abstract:BACKGROUND:Genomic inversion is one type of structural variations (SVs) and is known to play an important biological role. An established problem in sequence data analysis is calling inversions from high-throughput sequence data. It is more difficult to detect inversions because they are surrounded by duplication or ot...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-020-6585-1
update_date:2020-03-05 00:00:00
abstract:BACKGROUND:Modern broiler chickens exhibit very rapid growth and high feed efficiency compared to unselected chicken breeds. The improved production efficiency in modern broiler chickens was achieved by the intensive genetic selection for meat production. This study was designed to investigate the genetic alterations a...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-016-3471-y
update_date:2017-01-13 00:00:00
abstract:BACKGROUND:The Gram-negative bacterium Chlamydia pneumoniae (Cpn) is the leading intracellular human pathogen responsible for respiratory infections such as pneumonia and bronchitis. Basic and applied research in pathogen biology, especially the elaboration of new mechanism-based anti-pathogen strategies, target discov...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-13-632
update_date:2012-11-16 00:00:00
abstract:BACKGROUND:Understanding how plants and pathogens modulate gene expression during the host-pathogen interaction is key to uncovering the molecular mechanisms that regulate disease progression. Recent advances in sequencing technologies have provided new opportunities to decode the complexity of such interactions. In th...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-016-2684-4
update_date:2016-05-20 00:00:00
abstract:BACKGROUND:There is growing recognition that horizontal DNA transfer, a process known to be common in prokaryotes, is also a significant source of genomic variation in eukaryotes. Horizontal transfer of transposable elements (HTT) may be especially prevalent in eukaryotes given the inherent mobility, widespread occurre...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-14-134
update_date:2013-02-27 00:00:00
abstract:BACKGROUND:Reading disability (RD) is a common syndrome with a large genetic component. Chromosome 6 has been identified in several linkage studies as playing a significant role. A more recent study identified a peak of transmission disequilibrium to marker JA04 (G72384) on chromosome 6p22.3, suggesting that a gene is ...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-4-25
update_date:2003-06-30 00:00:00
abstract:BACKGROUND:RNA editing is an important mechanism that expands the diversity and complexity of genetic codes. The conversions of adenosine (A) to inosine (I) and cytosine (C) to uridine (U) are two prominent types of RNA editing in animals. The roles of RNA editing events have been implicated in important biological pat...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-017-4330-1
update_date:2018-01-19 00:00:00
abstract:BACKGROUND:Triptolide is a therapeutic diterpenoid derived from the Chinese herb Tripterygium wilfordii Hook f. Triptolide has been shown to induce apoptosis by activation of pro-apoptotic proteins, inhibiting NFkB and c-KIT pathways, suppressing the Jak2 transcription, activating MAPK8/JNK signaling and modulating the...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-015-1614-1
update_date:2015-06-30 00:00:00
abstract:BACKGROUND:Fusarium graminearum virus 1 strain-DK21 (FgV1-DK21) is a mycovirus that confers hypovirulence to F. graminearum, which is the primary phytopathogenic fungus that causes Fusarium head blight (FHB) disease in many cereals. Understanding the interaction between mycoviruses and plant pathogenic fungi is necessa...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-13-173
update_date:2012-05-06 00:00:00
abstract:BACKGROUND:Daphnia (Crustacea: Cladocera) plays a central role in standing aquatic ecosystems, has a well known ecology and is widely used in population studies and environmental risk assessments. Daphnia magna is, especially in Europe, intensively used to study stress responses of natural populations to pollutants, cl...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-12-309
update_date:2011-06-13 00:00:00
abstract:BACKGROUND:Several global transcriptomic and proteomic approaches have been applied in order to obtain new molecular insights on skeletal myogenesis, but none has generated any specific data on glycogenome expression, and thus on the role of glycan structures in this process, despite the involvement of glycoconjugates ...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-10-483
update_date:2009-10-20 00:00:00
abstract:BACKGROUND:The disease caused by Haemonchus contortus, a blood-feeding nematode of small ruminants, is of major economic importance worldwide. The infective third-stage larva (L3) of this gastric nematode is enclosed in a cuticle (sheath) and, once ingested with herbage by the host, undergoes an exsheathment process th...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-11-266
update_date:2010-04-27 00:00:00
abstract:BACKGROUND:The autism spectrum encompasses a set of complex multigenic developmental disorders that severely impact the development of language, non-verbal communication, and social skills, and are associated with odd, stereotyped, repetitive behavior and restricted interests. To date, diagnosis of these neurologically...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-7-118
update_date:2006-05-18 00:00:00
abstract:BACKGROUND:Eucalyptus is one of the most important sources of industrial cellulose. Three species of this botanical group are intensively used in breeding programs: E. globulus, E. grandis and E. urophylla. E. globulus is adapted to subtropical/temperate areas and is considered a source of high-quality cellulose; E. gr...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-14-201
update_date:2013-03-22 00:00:00
abstract:BACKGROUND:Variation in gene expression between two Drosophila melanogaster strains, as revealed by transcriptional profiling, seldom corresponded to variation in proximal promoter sequence for 34 genes analyzed. Two sets of protein-coding genes were selected from pre-existing microarray data: (1) those whose expressio...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-6-110
update_date:2005-08-17 00:00:00
abstract:BACKGROUND:Much of the morphological diversity in eukaryotes results from differential regulation of gene expression in which transcription factors (TFs) play a central role. The nematode Caenorhabditis elegans is an established model organism for the study of the roles of TFs in controlling the spatiotemporal pattern ...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-9-399
update_date:2008-08-27 00:00:00
abstract:BACKGROUND:Teladorsagia circumcincta (order Strongylida) is an economically important parasitic nematode of small ruminants (including sheep and goats) in temperate climatic regions of the world. Improved insights into the molecular biology of this parasite could underpin alternative methods required to control this an...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-13-S7-S10
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:The purpose of this research was to develop a novel information theoretic method and an efficient algorithm for analyzing the gene-gene (GGI) and gene-environmental interactions (GEI) associated with quantitative traits (QT). The method is built on two information-theoretic metrics, the k-way interaction inf...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-10-509
update_date:2009-11-04 00:00:00
abstract:BACKGROUND:ATP binding cassette (ABC) systems are responsible for the import and export of a wide variety of molecules across cell membranes and comprise one of the largest protein superfamilies found in prokarya, eukarya and archaea. ABC systems play important roles in bacterial lifestyle, virulence and survival. In this s...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/1471-2164-8-83
update_date:2007-03-28 00:00:00
abstract:BACKGROUND:Currently, diabetes has become one of the leading causes of death worldwide. Fasting plasma glucose (FPG) levels that are higher than optimal, even if below the diagnostic threshold of diabetes, can also lead to increased morbidity and mortality. Here we intend to study the magnitude of the genetic influence...
journal_title:BMC genomics
pub_type: journal article
doi:10.1186/s12864-020-06898-z
update_date:2020-07-18 00:00:00