UnSplicer: mapping spliced RNA-Seq reads in compact genomes and filtering noisy splicing.

Abstract:

Accurate mapping of spliced RNA-Seq reads to genomic DNA has been known as a challenging problem. Despite significant efforts invested in developing efficient algorithms, with the human genome as a primary focus, the best solution is still not known. A recently introduced tool, TrueSight, has demonstrated better performance compared with earlier developed algorithms such as TopHat and MapSplice. To improve detection of splice junctions, TrueSight uses information on statistical patterns of nucleotide ordering in intronic and exonic DNA. This line of research led to yet another new algorithm, UnSplicer, designed for eukaryotic species with compact genomes where functional alternative splicing is likely to be dominated by splicing noise. Genome-specific parameters of the new algorithm are generated by GeneMark-ES, an ab initio gene prediction algorithm based on unsupervised training. UnSplicer shares several components with TrueSight; the difference lies in the training strategy and the classification algorithm. We tested UnSplicer on RNA-Seq data sets of Arabidopsis thaliana, Caenorhabditis elegans, Cryptococcus neoformans and Drosophila melanogaster. We have shown that splice junctions inferred by UnSplicer are in better agreement with knowledge accumulated on these well-studied genomes than predictions made by earlier developed tools.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Burns PD,Li Y,Ma J,Borodovsky M

doi

10.1093/nar/gkt1141

subject

Has Abstract

pub_date

2014-02-01 00:00:00

pages

e25

issue

4

eissn

1362-4962

issn

0305-1048

pii

gkt1141

journal_volume

42

pub_type

Journal Article
  • uORFdb--a comprehensive literature database on eukaryotic uORF biology.

    abstract::Approximately half of all human transcripts contain at least one upstream translational initiation site that precedes the main coding sequence (CDS) and gives rise to an upstream open reading frame (uORF). We generated uORFdb, publicly available at http://cbdm.mdc-berlin.de/tools/uorfdb, to serve as a comprehensive li...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt952

    authors: Wethmar K,Barbosa-Silva A,Andrade-Navarro MA,Leutz A

    update_date: 2014-01-01 00:00:00

  • The enigmatic mitochondrial ORF ymf39 codes for ATP synthase chain b.

    abstract::ymf39 is a conserved hypothetical protein-coding gene found in mitochondrial genomes of land plants and certain protists. We speculated earlier, based on a weak sequence similarity between Ymf39 from a green alga and the atpF gene product from Bradyrhizobium, that ymf39 might code for subunit b of mitochondrial F(0)F(...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg326

    authors: Burger G,Lang BF,Braun HP,Marx S

    update_date: 2003-05-01 00:00:00

  • DNA sequencing: bench to bedside and beyond.

    abstract::Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage φX174), whi...

    journal_title:Nucleic acids research

    pub_type: Historical Article, Journal Article

    doi:10.1093/nar/gkm688

    authors: Hutchison CA 3rd

    update_date: 2007-01-01 00:00:00

  • Summary: the modified nucleosides of RNA.

    abstract::A comprehensive listing is made of posttranscriptionally modified nucleosides from RNA reported in the literature through mid-1994. Included are chemical structures, common names, symbols, Chemical Abstracts registry numbers (for ribonucleoside and corresponding base), Chemical Abstracts Index Name, phylogenetic sourc...

    journal_title:Nucleic acids research

    pub_type: Journal Article, Review

    doi:10.1093/nar/22.12.2183

    authors: Limbach PA,Crain PF,McCloskey JA

    update_date: 1994-06-25 00:00:00

  • Two classes of EF1-family translational GTPases encoded by giant viruses.

    abstract::Giant viruses have extraordinarily large dsDNA genomes, and exceptionally, they encode various components of the translation apparatus, including tRNAs, aminoacyl-tRNA synthetases and translation factors. Here, we focused on the elongation factor 1 (EF1) family of viral translational GTPases (trGTPases), using computa...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz296

    authors: Zinoviev A,Kuroha K,Pestova TV,Hellen CUT

    update_date: 2019-06-20 00:00:00

  • Single locus microsatellites isolated using 5' anchored PCR.

    abstract::Microsatellites are widely used as genetic markers because they are co-dominant, multiallelic, easily scored and highly polymorphic. A major drawback of microsatellite markers is the time and cost required to characterise them. We have developed a novel technique to reduce this cost by producing a microsatellite-rich ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/24.21.4369

    authors: Fisher PJ,Gardner RC,Richardson TE

    update_date: 1996-11-01 00:00:00

  • Genomic organization and transcription of the alpha beta heat shock DNA in Drosophila melanogaster.

    abstract::Previous studies have shown that (i) several RNAs induced by heat shock of Drosophila melanogaster cells are homologous to tandemly repeated alpha beta units found in cloned segments of D. melanogaster DNA, and (ii) the alpha beta sequences are present both at a major heat shock locus, 87C1, and the chromocenter of po...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.20.5297

    authors: Lis JT,Ish-Horowicz D,Pinchin SM

    update_date: 1981-10-24 00:00:00

  • General method of rapid Smith/Birnstiel mapping adds for gap closure in shotgun microbial genome sequencing projects: application to Pseudomonas putida KT2440.

    abstract::A physical mapping strategy has been developed to verify and accelerate the assembly and gap closure phase of a microbial genome shotgun-sequencing project. The protocol was worked out during the ongoing Pseudomonas putida KT2440 genome project. A macro-restriction map was constructed by linking probe hybridisation of...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/29.22.e110

    authors: Weinel C,Tümmler B,Hilbert H,Nelson KE,Kiewitz C

    update_date: 2001-11-15 00:00:00

  • Phosphorylation of Kruppel-like factor 5 (KLF5/IKLF) at the CBP interaction region enhances its transactivation function.

    abstract::The Kruppel-like factor 5 (KLF5/IKLF) belongs to the Kruppel family of genes which bind GC-rich DNA elements and activate or repress their target genes in a promoter context and/or cellular environment-dependent manner. In the present study, we used the Gal4 fusion assay system to characterize the mechanism of transac...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg310

    authors: Zhang Z,Teng CT

    update_date: 2003-04-15 00:00:00

  • jpHMM: improving the reliability of recombination prediction in HIV-1.

    abstract::Previously, we developed jumping profile hidden Markov model (jpHMM), a new method to detect recombinations in HIV-1 genomes. The jpHMM predicts recombination breakpoints in a query sequence and assigns to each position of the sequence one of the major HIV-1 subtypes. Since incorrect subtype assignment or recombinatio...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp371

    authors: Schultz AK,Zhang M,Bulla I,Leitner T,Korber B,Morgenstern B,Stanke M

    update_date: 2009-07-01 00:00:00

  • Abf1 and other general regulatory factors control ribosome biogenesis gene expression in budding yeast.

    abstract::Ribosome biogenesis in Saccharomyces cerevisiae involves a regulon of >200 genes (Ribi genes) coordinately regulated in response to nutrient availability and cellular growth rate. Two cis-acting elements called PAC and RRPE are known to mediate Ribi gene repression in response to nutritional downshift. Here, we show t...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx058

    authors: Bosio MC,Fermi B,Spagnoli G,Levati E,Rubbi L,Ferrari R,Pellegrini M,Dieci G

    update_date: 2017-05-05 00:00:00

  • Polymerization retardation isothermal amplification (PRIA): a strategy enables sensitively quantify genome-wide 5-methylcytosine oxides rapidly on handy instruments with nanoscale sample input.

    abstract::The current methods for quantifying genome-wide 5-methylcytosine (5mC) oxides are still scarce, mostly restricted with two limitations: assay sensitivity is seriously compromised with cost, assay time and sample input; epigenetic information is irreproducible during polymerase chain reaction (PCR) amplification withou...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz704

    authors: Chen D,Wang Y,Mo M,Zhang J,Zhang Y,Xu Y,Liu SY,Chen J,Ma Y,Zhang L,Dai Z,Cai C,Zou X

    update_date: 2019-11-04 00:00:00

  • Plasmodium falciparum heterochromatin protein 1 binds to tri-methylated histone 3 lysine 9 and is linked to mutually exclusive expression of var genes.

    abstract::Increasing experimental evidence shows a prominent role of histone modifications in the coordinated control of gene expression in the human malaria parasite Plasmodium falciparum. The search for the histone-mark-reading machinery that translates histone modifications into biological processes, such as formation of het...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp115

    authors: Pérez-Toledo K,Rojas-Meza AP,Mancio-Silva L,Hernández-Cuevas NA,Delgadillo DM,Vargas M,Martínez-Calvillo S,Scherf A,Hernandez-Rivas R

    update_date: 2009-05-01 00:00:00

  • Relief of triple-helix-mediated promoter inhibition by elongating RNA polymerases.

    abstract::We have characterized triple-helix-mediated inhibition of an artificial bacteriophage promoter with respect to relief of inhibition by incoming RNA polymerases that initiate upstream or downstream from the operator sequence. Whereas oligonucleotide-directed triple-helix formation inhibits the test promoter, promoter a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/21.17.4055

    authors: Skoog JU,Maher LJ 3rd

    update_date: 1993-08-25 00:00:00

  • Microamplification of specific chromosome sequences; an improved method for genome analysis.

    abstract::An improved method was developed for microdissection and cloning of metaphase as well as pachytene chromosomes. The protocol incorporates efficient ligation of chromosomal DNA with linker adaptors, abolishment of microcloning steps and the reduction of micromanipulation. The threshold for amplifying genomic DNA templa...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/25.17.3555

    authors: Ponelies N,Stein N,Weber G

    update_date: 1997-09-01 00:00:00

  • Structural studies of type I topoisomerases.

    abstract::Topoisomerases are ubiquitous proteins found in all three domains of life. They change the topology of DNA via transient breaks on either one or two of the DNA strands to allow passage of another single or double DNA strand through the break. Topoisomerases are classified into two types: type I enzymes cleave one DNA ...

    journal_title:Nucleic acids research

    pub_type: Journal Article, Review

    doi:10.1093/nar/gkn1009

    authors: Baker NM,Rajan R,Mondragón A

    update_date: 2009-02-01 00:00:00

  • ENdb: a manually curated database of experimentally supported enhancers for human and mouse.

    abstract::Enhancers are a class of cis-regulatory elements that can increase gene transcription by forming loops in intergenic regions, introns and exons. Enhancers, as well as their associated target genes, and transcription factors (TFs) that bind to them, are highly associated with human disease and biological processes. Alt...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz973

    authors: Bai X,Shi S,Ai B,Jiang Y,Liu Y,Han X,Xu M,Pan Q,Wang F,Wang Q,Zhang J,Li X,Feng C,Li Y,Wang Y,Song Y,Feng K,Li C

    update_date: 2020-01-08 00:00:00

  • Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157.

    abstract::We have sequenced the genome of Shigella flexneri serotype 2a, the most prevalent species and serotype that causes bacillary dysentery or shigellosis in man. The whole genome is composed of a 4 607 203 bp chromosome and a 221 618 bp virulence plasmid, designated pCP301. While the plasmid shows minor divergence from th...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkf566

    authors: Jin Q,Yuan Z,Xu J,Wang Y,Shen Y,Lu W,Wang J,Liu H,Yang J,Yang F,Zhang X,Zhang J,Yang G,Wu H,Qu D,Dong J,Sun L,Xue Y,Zhao A,Gao Y,Zhu J,Kan B,Ding K,Chen S,Cheng H,Yao Z,He B,Chen R,Ma D,Qiang B,

    update_date: 2002-10-15 00:00:00

  • Specificities of human, rat and E. coli O6-methylguanine-DNA methyltransferases towards the repair of O6-methyl and O6-ethylguanine in DNA.

    abstract::The behaviour of highly purified bacterial expressed rat O6-methylguanine-DNA methyltransferase (MGMT) towards the repair of CGCm6GAGCTCGCG and CGCe6GAGCTCGCG (km6G/ke6G = 1.45, where k is the second order repair rate constant determined, m6G and e6G are O6-methyl and O6-ethylguanine) is similar to that of E. coli 39k...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.9.1613

    authors: Liem LK,Lim A,Li BF

    update_date: 1994-05-11 00:00:00

  • Transcription of cloned Moloney murine leukemia proviral DNA injected into Xenopus laevis oocytes.

    abstract::We have microinjected genomic DNA clones containing the Moloney murine leukemia virus (M-MuLV) proviral genome and flanking mouse sequences from Mov-3, Mov-7 and Mov-10 mice into Xenopus laevis oocytes and analyzed the virus-specific transcription and translation products. These mouse strains carry a proviral genome c...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/11.12.3989

    authors: Breindl M,Kalthoff H,Jaenisch R

    update_date: 1983-06-25 00:00:00

  • Human transcription factor IIIC contains a polypeptide of 55 kDa specifically binding to Pol III genes.

    abstract::Human transcription factor IIIC contains a 55 kDa polypeptide which specifically interacts with the Adenovirus 2 VAI gene promoter and which mimics most of the DNA binding properties of the entire factor. The specificity and affinity of this protein:DNA interaction was demonstrated by: (i) Separation of purified fract...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.16.4743

    authors: Schneider HR,Waldschmidt R,Seifart KH

    update_date: 1990-08-25 00:00:00

  • Dynamics of genetic variation in transcription factors and its implications for the evolution of regulatory networks in Bacteria.

    abstract::The evolution of regulatory networks in Bacteria has largely been explained at macroevolutionary scales through lateral gene transfer and gene duplication. Transcription factors (TF) have been found to be less conserved across species than their target genes (TG). This would be expected if TFs accumulate mutations fas...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa162

    authors: Ali F,Seshasayee ASN

    update_date: 2020-05-07 00:00:00

  • Expanding the structural and functional diversity of RNA: analog uridine triphosphates as candidates for in vitro selection of nucleic acids.

    abstract::Two analog uridine triphosphates tethering additional functionality, one a primary amino group and the second a mercapto group, were prepared and tested for their compatibility with in vitro RNA selection procedures. 5-(3-Aminopropyl)uridine triphosphate (UNH(2)) as a uridine substitute was a more effective substrate ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/28.17.3316

    authors: Vaish NK,Fraley AW,Szostak JW,McLaughlin LW

    update_date: 2000-09-01 00:00:00

  • DNA targeting by Clostridium cellulolyticum CRISPR-Cas9 Type II-C system.

    abstract::Type II CRISPR-Cas9 RNA-guided nucleases are widely used for genome engineering. Type II-A SpCas9 protein from Streptococcus pyogenes is the most investigated and highly used enzyme of its class. Nevertheless, it has some drawbacks, including a relatively big size, imperfect specificity and restriction to DNA targets ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz1225

    authors: Fedorova I,Arseniev A,Selkova P,Pobegalov G,Goryanin I,Vasileva A,Musharova O,Abramova M,Kazalov M,Zyubko T,Artamonova T,Artamonova D,Shmakov S,Khodorkovskii M,Severinov K

    update_date: 2020-02-28 00:00:00

  • GenDiS: Genomic Distribution of protein structural domain Superfamilies.

    abstract::Several proteins that have substantially diverged during evolution retain similar three-dimensional structures and biological function in spite of poor sequence identity. The database on Genomic Distribution of protein structural domain Superfamilies (GenDiS) provides record for the distribution of 4001 protein domains...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gki087

    authors: Pugalenthi G,Bhaduri A,Sowdhamini R

    update_date: 2005-01-01 00:00:00

  • Allotopic expression of mitochondrial-encoded genes in mammals: achieved goal, undemonstrated mechanism or impossible task?

    abstract::Mitochondrial-DNA diseases have no effective treatments. Allotopic expression (synthesis of a wild-type version of the mutated protein in the nuclear-cytosolic compartment and its importation into mitochondria) has been proposed as a gene-therapy approach. Allotopic expression has been successfully demonstrated in yeast...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq769

    authors: Perales-Clemente E,Fernández-Silva P,Acín-Pérez R,Pérez-Martos A,Enríquez JA

    update_date: 2011-01-01 00:00:00

  • Identification, cloning and characterization of a new DNA-binding protein from the hyperthermophilic methanogen Methanopyrus kandleri.

    abstract::Three novel DNA-binding proteins with apparent molecular masses of 7, 10 and 30 kDa have been isolated from the hyperthermophilic methanogen Methanopyrus kandleri. The proteins were identified using a blot overlay assay that was modified to emulate the high ionic strength intracellular environment of M.kandleri protei...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/30.3.685

    authors: Pavlov NA,Cherny DI,Nazimov IV,Slesarev AI,Subramaniam V

    update_date: 2002-02-01 00:00:00

  • The nucleotide sequence of the chick cytoplasmic beta-actin gene.

    abstract::The nucleotide sequence of the chick beta-actin gene was determined. The gene contains 5 introns; 4 interrupt the translated region at codons 41/42, 120/122, 267, 327/328 and a large intron occurs in the 5' untranslated region. The gene has a 97 nucleotide 5'-untranslated region and a 594 nucleotide 3'-untranslated re...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/11.23.8287

    authors: Kost TA,Theodorakis N,Hughes SH

    update_date: 1983-12-10 00:00:00

  • Mouse DNA 'fingerprints': analysis of chromosome localization and germ-line stability of hypervariable loci in recombinant inbred strains.

    abstract::Human minisatellite probes cross-hybridize to mouse DNA and detect multiple variable loci. The resulting DNA "fingerprints" vary substantially between inbred strains but relatively little within an inbred strain. By studying the segregation of variable DNA fragments in BXD recombinant inbred strains of mice, at least ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/15.7.2823

    authors: Jeffreys AJ,Wilson V,Kelly R,Taylor BA,Bulfield G

    update_date: 1987-04-10 00:00:00

  • The mouse genome database: genotypes, phenotypes, and models of human disease.

    abstract::The laboratory mouse is the premier animal model for studying human biology because all life stages can be accessed experimentally, a completely sequenced reference genome is publicly available and there exists a myriad of genomic tools for comparative and experimental research. In the current era of genome scale, dat...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gks1115

    authors: Bult CJ,Eppig JT,Blake JA,Kadin JA,Richardson JE,Mouse Genome Database Group.

    update_date: 2013-01-01 00:00:00