Abstract:
The advances of next-generation sequencing technology have facilitated metagenomics research that attempts to determine directly the whole collection of genetic material within an environmental sample (i.e. the metagenome). Identification of genes directly from short reads has become an important yet challenging problem in annotating metagenomes, since the assembly of metagenomes is often not available. Gene predictors developed for whole genomes (e.g. Glimmer) and those recently developed for metagenomic sequences (e.g. MetaGene) show a significant decrease in performance as the sequencing error rates increase, or as reads get shorter. We have developed a novel gene prediction method, FragGeneScan, which combines sequencing error models and codon usages in a hidden Markov model to improve the prediction of protein-coding regions in short reads. The performance of FragGeneScan was comparable to Glimmer and MetaGene for complete genomes. For short reads, however, FragGeneScan consistently outperformed MetaGene (accuracy improved ∼62% for reads of 400 bases with 1% sequencing errors, and ∼18% for short reads of 100 bases that are error free). When applied to metagenomes, FragGeneScan recovered substantially more genes than MetaGene predicted (>90% of the genes identified by homology search), and many novel genes with no homologs in current protein sequence databases.
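As a rough illustration of the general idea of scoring codon usage inside a hidden Markov model, the sketch below decodes a read with a toy two-state model (coding vs. noncoding) using Viterbi decoding. It is not FragGeneScan's actual model: the state set, all probabilities, the `PREFERRED` codon set and the function names are hypothetical values chosen for illustration, and no sequencing-error (insertion/deletion) model is included.

```python
import math

STATES = ("coding", "noncoding")

# All numbers below are hypothetical, chosen only for illustration.
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
# Toy codon-usage table: five "preferred" codons share half of the
# probability mass in the coding state; the other 59 share the rest.
PREFERRED = {"ATG", "GCC", "AAA", "GAA", "CTG"}


def emission(state, codon):
    """Probability of observing `codon` in `state` (toy values)."""
    if state == "coding":
        return 0.1 if codon in PREFERRED else 0.5 / 59
    return 1.0 / 64  # noncoding: all 64 codons equally likely


def split_codons(read):
    """Split a read into consecutive triplets, dropping trailing bases."""
    usable = len(read) - len(read) % 3
    return [read[i:i + 3] for i in range(0, usable, 3)]


def viterbi(codons):
    """Return the most probable state path for a list of codons."""
    if not codons:
        return []
    # Log-probabilities avoid numerical underflow on long reads.
    scores = [{s: math.log(START[s]) + math.log(emission(s, codons[0]))
               for s in STATES}]
    back = []
    for codon in codons[1:]:
        prev = scores[-1]
        col, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            col[s] = (prev[best] + math.log(TRANS[best][s])
                      + math.log(emission(s, codon)))
            ptr[s] = best
        scores.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi(split_codons("ATGGCCAAAGAACTG")))
```

A real short-read gene finder would additionally need states for all six reading frames and for insertions and deletions; this sketch only shows how codon frequencies can drive the coding/noncoding decision.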
journal_name: Nucleic Acids Res
journal_title: Nucleic acids research
authors: Rho M, Tang H, Ye Y
doi: 10.1093/nar/gkq747
subject: Has Abstract
pub_date: 2010-11-01 00:00:00
pages: e191
issue: 20
eissn: 0305-1048
issn: 1362-4962
pii: gkq747
journal_volume: 38
pub_type: journal article
abstract::Sexual differentiation in Drosophila is regulated through alternative splicing of doublesex. Female-specific splicing is activated through the activity of splicing enhancer complexes assembled on multiple repeat elements. Each of these repeats serves as a binding platform for the cooperative assembly of a heterotrimer...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkl984
update_date:2006-01-01 00:00:00
abstract::A set of programs is presented for the reconstruction of a DNA sequence from data generated by the M13 shotgun sequencing technique. Once the sequence has been established and stored other programs are used for its analysis. The programs have been written for the Apple II microcomputer. A minimum investment is require...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/10.1.39
update_date:1982-01-11 00:00:00
abstract::Three Rat1/Xrn2 homologues exist in Arabidopsis thaliana: nuclear AtXRN2 and AtXRN3, and cytoplasmic AtXRN4. The latter has a role in degrading 3' products of miRNA-mediated mRNA cleavage, whereas all three proteins act as endogenous post-transcriptional gene silencing suppressors. Here we show that, similar to yeast ...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkq172
update_date:2010-07-01 00:00:00
abstract::The SRPDB (signal recognition particle database) provides aligned SRP RNA and protein sequences, annotated and phylogenetically ordered. This release includes 82 SRP RNAs (including 22 bacterial and 9 archaeal homologs) and a total of 20 protein sequences representing SRP9, SRP14, SRP19, SRP54, SRP68, and SRP72. The o...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/22.17.3483
update_date:1994-09-01 00:00:00
abstract::Cells of the central nervous system (CNS) are prone to the devastating consequences of trinucleotide repeat (TNR) expansion. Some CNS cells, including astrocytes, show substantial TNR instability in affected individuals. Since astrocyte enrichment occurs in brain regions sensitive to neurodegeneration and somatic TNR ...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkl614
update_date:2006-01-01 00:00:00
abstract::The Bovine Genome Database (BGD) (http://bovinegenome.org) has been the key community bovine genomics database for more than a decade. To accommodate the increasing amount and complexity of bovine genomics data, BGD continues to advance its practices in data acquisition, curation, integration and efficient data retrie...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkz944
update_date:2020-01-08 00:00:00
abstract::Large-scale sequencing studies discovered substantial genetic variants occurring in enhancers which regulate genes via long range chromatin interactions. Importantly, such variants could affect enhancer regulation by changing transcription factor bindings or enhancer hijacking, and in turn, make an essential contribut...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkx920
update_date:2018-01-04 00:00:00
abstract::A cDNA library was constructed from the mRNA of the Ig lambda producing Burkitt's lymphoma cell line, EB4. Overlapping clones encompassing the coding sequence of the Ig lambda mRNA were isolated and sequenced. The predicted amino acid sequence shows a short hydrophobic leader peptide and a mature polypeptide of 217 re...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/13.8.2931
update_date:1985-04-25 00:00:00
abstract::The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. It is partially derived from the SCOP database of protein domains, and it includes sequences for each domain as well as other resources useful for studying these seq...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/30.1.260
update_date:2002-01-01 00:00:00
abstract::Using total internal reflection fluorescence microscopy, we directly visualize in real-time, the 1D Brownian motion and transcription elongation of T7 RNA polymerase along aligned DNA molecules bound to substrates by molecular combing. We fluorescently label T7 RNA polymerase with antibodies and use flow to convect th...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkm332
update_date:2007-01-01 00:00:00
abstract::VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/29.1.133
update_date:2001-01-01 00:00:00
abstract::Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not availab...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkt215
update_date:2013-05-01 00:00:00
abstract::We have sequenced the 480 base pair (bp) repeating unit of the 5S RNA genes of the Dipteran fly Calliphora erythrocephala and compared this sequence to the three known 5S RNA gene sequences from the Dipteran Genus Drosophila (1,2). A striking series of five perfectly conserved homologies identically positioned within ...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/12.21.8193
update_date:1984-11-12 00:00:00
abstract::Advances in single-cell RNA-sequencing techniques reveal the existence of distinct cell subpopulations. Identification of transcription factors (TFs) that define the identity of these subpopulations poses a challenge. Here, we postulate that identity depends on background subpopulations, and is determined by a synergi...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkz147
update_date:2019-04-23 00:00:00
abstract::Guanine-rich DNA sequences tend to form four-stranded G-quadruplex structures. Characteristic glycosidic conformational patterns along the G-strands, such as the 5'-syn-anti-syn-anti pattern observed with the Oxytricha nova telomeric G-quadruplexes, have been well documented. However, an explanation for these featured...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkr031
update_date:2011-05-01 00:00:00
abstract::GenBank is a comprehensive database that contains publicly available DNA sequences for more than 140 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the BankIt (web) or Sequin program an...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkh045
update_date:2004-01-01 00:00:00
abstract::Laser microirradiation is a powerful tool for real-time single-cell analysis of the DNA damage response (DDR). It is often found, however, that factor recruitment or modification profiles vary depending on the laser system employed. This is likely due to an incomplete understanding of how laser conditions/dosages affe...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkv976
update_date:2016-02-18 00:00:00
abstract::The phenotypic adjustments of Mycobacterium tuberculosis are commonly inferred from the analysis of transcript abundance. While mechanisms of transcriptional regulation have been extensively analysed in mycobacteria, little is known about mechanisms that shape the transcriptome by regulating RNA decay rates. The aim o...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkz251
update_date:2019-06-20 00:00:00
abstract::We have studied the effect of selenium on the expression of a cellular glutathione peroxidase, GSHPx-1, in transfected MCF-7 cells and in doxorubicin-resistant (Adrr) MCF-7 cells. A GSHPx-1 cDNA with a Rous Sarcoma virus promoter was transfected into a human mammary carcinoma cell line, MCF-7, which has very low endog...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/18.6.1531
update_date:1990-03-25 00:00:00
abstract::The Chinese hamster cell line, DC-3F, is heterozygous at the DHFR locus, and each allele can be distinguished on the basis of a unique DNA restriction pattern, protein isoelectric profile and in the abundancy of the DHFR mRNAs it expresses. Although each allele produces four transcripts, 1000, 1650 and 2150 nucleotide...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/20.24.6597
update_date:1992-12-25 00:00:00
abstract::APOBEC3G (A3G), a host protein that inhibits HIV-1 reverse transcription and replication in the absence of Vif, displays cytidine deaminase and single-stranded (ss) nucleic acid binding activities. HIV-1 nucleocapsid protein (NC) also binds nucleic acids and has a unique property, nucleic acid chaperone activity, whic...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkm750
update_date:2007-01-01 00:00:00
abstract::A feature of Haemophilus influenzae genomes is the presence of several loci containing tracts of six or more identical tetranucleotide repeat units. These repeat tracts are unstable and mediate high frequency, reversible alterations in the expression of surface antigens. This process, termed phase variation (PV), enab...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gki180
update_date:2005-01-14 00:00:00
abstract::The X chromosome-linked inhibitor of apoptosis protein (XIAP) is the most potent intrinsic caspase inhibitor and plays an important role in the maintenance of intestinal epithelial integrity. The RNA binding protein, HuR, regulates the stability and translation of many target transcripts. Here, we report that HuR asso...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkp755
update_date:2009-12-01 00:00:00
abstract::Human Protein Reference Database (HPRD) (http://www.hprd.org) was developed to serve as a comprehensive collection of protein features, post-translational modifications (PTMs) and protein-protein interactions. Since the original report, this database has increased to >20 000 proteins entries and has become the largest...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkj141
update_date:2006-01-01 00:00:00
abstract::Studies involving ribozyme-directed inactivation of targeted RNA molecules have met with mixed success, making clear the importance of methods to measure and optimize ribozyme activity within cells. The interpretation of biochemical assays for determining ribozyme activity in the cellular environment have been complic...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/26.15.3494
update_date:1998-08-01 00:00:00
abstract::The cells of the bronchial epithelium of man are targets for benzo(a)pyrene carcinogenesis. When cultures of these cells, and of non-target fibroblasts, are exposed to [3H]-benzo(a)pyrene, we find that the epithelial cells metabolise and bind to DNA far greater amounts of benzpyrene than do fibroblasts. By analysis of...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/10.5.1547
update_date:1982-03-11 00:00:00
abstract::Polymers of random 14 mer oligonucleotides are shown to detect discrete loci in the human genome. Eighteen different synthetic tandem repeats of random 14 base-pair units (STRs) have been generated and all of them turn out to detect polymorphic loci on southern blots of human DNA samples, presumably corresponding to a...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/17.19.7623
update_date:1989-10-11 00:00:00
abstract::Most E2F-binding sites repress transcription through the recruitment of Retinoblastoma (RB) family members until the end of the G1 cell-cycle phase. Although the MYB promoter contains an E2F-binding site, its transcription is activated shortly after the exit from quiescence, before RB family members inactivation, by u...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/gkx641
update_date:2017-09-29 00:00:00
abstract::We have constructed and tested several new vectors for P element-mediated gene transfer. These vectors contain restriction sites for cloning a wide variety of DNA fragments within a small, non-autonomous P element and can be used to efficiently transduce microinjected DNA sequences into the germ line chromosomes of D....
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/11.18.6341
update_date:1983-09-24 00:00:00
abstract::The mouse c-Ki-ras protooncogene promoter contains a homopurine-homopyrimidine domain that exhibits S1 nuclease sensitivity in vitro. We have studied the structure of this DNA region in a supercoiled state using a number of chemical probes for non-B DNA conformations including diethyl pyrocarbonate, osmium tetroxide, ...
journal_title:Nucleic acids research
pub_type: journal article
doi:10.1093/nar/19.23.6527
update_date:1991-12-11 00:00:00