CNIT: a fast and accurate web tool for identifying protein-coding and long non-coding transcripts based on intrinsic sequence composition.

Abstract:

Although next-generation sequencing produces ever more high-throughput data, classifying RNA transcripts as protein-coding or non-coding remains a challenge, especially for poorly annotated species. We upgraded our original coding potential calculator, CNCI (Coding-Non-Coding Index), to CNIT (Coding-Non-Coding Identifying Tool), which evaluates the coding ability of RNA transcripts faster and more accurately. CNIT runs ∼200 times faster than CNCI and is more accurate (0.98 versus 0.94 for human, 0.95 versus 0.93 for mouse, 0.93 versus 0.92 for zebrafish, 0.93 versus 0.92 for fruit fly, 0.92 versus 0.88 for worm, and 0.98 versus 0.85 for Arabidopsis transcripts). Moreover, the AUC values for 11 animal species and 27 plant species showed that CNIT can produce relatively accurate identification results for almost all eukaryotic transcripts. In addition, a mobile-friendly web server is freely available at http://cnit.noncode.org/CNIT.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Guo JC,Fang SS,Wu Y,Zhang JH,Chen Y,Liu J,Wu B,Wu JR,Li EM,Xu LY,Sun L,Zhao Y

doi

10.1093/nar/gkz400

subject

Has Abstract

pub_date

2019-07-02 00:00:00

pages

W516-W522

issue

W1

eissn

1362-4962

issn

0305-1048

pii

5506859

journal_volume

47

pub_type

Journal Article
  • Role of PCNA-dependent stimulation of 3'-phosphodiesterase and 3'-5' exonuclease activities of human Ape2 in repair of oxidative DNA damage.

    abstract::Human Ape2 protein has 3' phosphodiesterase activity for processing 3'-damaged DNA termini, 3'-5' exonuclease activity that supports removal of mismatched nucleotides from the 3'-end of DNA, and a somewhat weak AP-endonuclease activity. However, very little is known about the role of Ape2 in DNA repair processes. Here...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp357

    authors: Burkovics P,Hajdú I,Szukacsov V,Unk I,Haracska L

    update_date:2009-07-01 00:00:00

  • Conformation of DNA in chromatin reconstituted from poly [d(A-T)] and the core histones.

    abstract::Present results provide direct evidence of the nature of a conformational change in DNA when nucleosomes are formed from core histones and poly [d(A-T)]. First, we have found some features which have characteristic aspects of the A like conformation of DNA. Thus, an increased contribution due to a sugar conformation c...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.19.4879

    authors: Brahms S,Brahmachari SK,Angelier N,Brahms JG

    update_date:1981-10-10 00:00:00

  • The primary structure of E. coli RNA polymerase, Nucleotide sequence of the rpoC gene and amino acid sequence of the beta'-subunit.

    abstract::The primary structure of the E. coli rpoC gene (5321 base pairs) coding the beta'-subunit of RNA polymerase as well as its adjacent segment have been determined. The structure analysis of the peptides obtained by cleavage of the protein with cyanogen bromide and trypsin has confirmed the amino acid sequence of the bet...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.13.4035

    authors: Ovchinnikov YuA,Monastyrskaya GS,Gubanov VV,Guryev SO,Salomatina IS,Shuvaeva TM,Lipkin VM,Sverdlov ED

    update_date:1982-07-10 00:00:00

  • Regulation of adenovirus alternative RNA splicing at the level of commitment complex formation.

    abstract::The adenovirus late region 1 (L1) represents an example of an alternatively spliced gene where one 5' splice site is spliced to two alternative 3' splice sites, to produce two mRNAs; the 52,55K and IIIa mRNAs, respectively. Accumulation of the L1 mRNAs is temporally regulated during the infectious cycle. Thus, the pro...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.3.332

    authors: Kreivi JP,Akusjärvi G

    update_date:1994-02-11 00:00:00

  • AntigenDB: an immunoinformatics database of pathogen antigens.

    abstract::The continuing threat of infectious disease and future pandemics, coupled to the continuous increase of drug-resistant pathogens, makes the discovery of new and better vaccines imperative. For effective vaccine development, antigen discovery and validation is a prerequisite. The compilation of information concerning p...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp830

    authors: Ansari HR,Flower DR,Raghava GP

    update_date:2010-01-01 00:00:00

  • PlantLoc: an accurate web server for predicting plant protein subcellular localization by substantiality motif.

    abstract::Knowledge of subcellular localizations (SCLs) of plant proteins relates to their functions and aids in understanding the regulation of biological processes at the cellular level. We present PlantLoc, a highly accurate and fast webserver for predicting the multi-label SCLs of plant proteins. The PlantLoc server has two...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt428

    authors: Tang S,Li T,Cong P,Xiong W,Wang Z,Sun J

    update_date:2013-07-01 00:00:00

  • The PATRIC Bioinformatics Resource Center: expanding data and analysis capabilities.

    abstract::The PathoSystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center funded by the National Institute of Allergy and Infectious Diseases (https://www.patricbrc.org). PATRIC supports bioinformatic analyses of all bacteria with a special emphasis on pathogens, offering a rich comparative...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz943

    authors: Davis JJ,Wattam AR,Aziz RK,Brettin T,Butler R,Butler RM,Chlenski P,Conrad N,Dickerman A,Dietrich EM,Gabbard JL,Gerdes S,Guard A,Kenyon RW,Machi D,Mao C,Murphy-Olson D,Nguyen M,Nordberg EK,Olsen GJ,Olson RD,Overb

    update_date:2020-01-08 00:00:00

  • Hierarchy and positive/negative interplays of the hepatocyte nuclear factors HNF-1, -3 and -4 in the liver-specific enhancer for the human alpha-1-microglobulin/bikunin precursor.

    abstract::Alpha-1-microglobulin and bikunin are two plasma glycoproteins encoded by an alpha-1-microglobulin/bikunin precursor (AMBP) gene. The strict liver-specific expression of the AMBP gene is controlled by a potent enhancer made of six clustered boxes numbered 1-6 that have been reported to be proven or potential binding s...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.3.395

    authors: Rouet P,Raguenez G,Tronche F,Mfou'ou V,Salier JP

    update_date:1995-02-11 00:00:00

  • Database for mobile group II introns.

    abstract::Group II introns are self-splicing RNAs and retroelements found in bacteria and lower eukaryotic organelles. During the past several years, they have been uncovered in surprising numbers in bacteria due to the genome sequencing projects; however, most of the newly sequenced introns are not correctly identified. We hav...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg049

    authors: Dai L,Toor N,Olson R,Keeping A,Zimmerly S

    update_date:2003-01-01 00:00:00

  • TRANSFAC, TRRD and COMPEL: towards a federated database system on transcriptional regulation.

    abstract::Three databases that provide data on transcriptional regulation are described. TRANSFAC is a database on transcription factors and their DNA binding sites. TRRD (Transcription Regulatory Region Database) collects information about complete regulatory regions, their regulation properties and architecture. COMPEL compri...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/25.1.265

    authors: Wingender E,Kel AE,Kel OV,Karas H,Heinemeyer T,Dietze P,Knüppel R,Romaschenko AG,Kolchanov NA

    update_date:1997-01-01 00:00:00

  • The active site residue Valine 867 in human telomerase reverse transcriptase influences nucleotide incorporation and fidelity.

    abstract::Human telomerase reverse transcriptase (hTERT), the catalytic subunit of human telomerase, contains conserved motifs common to retroviral reverse transcriptases and telomerases. Within the C motif of hTERT is the Leu866-Val867-Asp868-Asp869 tetrapeptide that includes a catalytically essential aspartate dyad. Site-dire...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm002

    authors: Drosopoulos WC,Prasad VR

    update_date:2007-01-01 00:00:00

  • Genome sequence comparison of Col and Ler lines reveals the dynamic nature of Arabidopsis chromosomes.

    abstract::Large differences in plant genome sizes are mainly due to numerous events of insertions or deletions (indels). The balance between these events determines the evolutionary direction of genome changes. To address the question of what phenomena trigger these alterations, we compared the genomic sequences of two Arabidop...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp183

    authors: Ziolkowski PA,Koczyk G,Galganski L,Sadowski J

    update_date:2009-06-01 00:00:00

  • H-DNA and Z-DNA in the mouse c-Ki-ras promoter.

    abstract::The mouse c-Ki-ras protooncogene promoter contains a homopurine-homopyrimidine domain that exhibits S1 nuclease sensitivity in vitro. We have studied the structure of this DNA region in a supercoiled state using a number of chemical probes for non-B DNA conformations including diethyl pyrocarbonate, osmium tetroxide, ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/19.23.6527

    authors: Pestov DG,Dayn A,Siyanova EYu,George DL,Mirkin SM

    update_date:1991-12-11 00:00:00

  • Analysis of erythroid nuclear proteins binding to the promoter and enhancer elements of the chicken histone H5 gene.

    abstract::The chicken erythroid proteins binding to the histone H5 5' promoter and 3' erythroid-specific enhancer regions were identified. In DNase I footprinting and gel mobility shift experiments with immature adult erythrocyte nuclear extracts, we have demonstrated the binding of proteins to the GC-box, a high affinity Sp1 b...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/20.23.6385

    authors: Sun JM,Penner CG,Davie JR

    update_date:1992-12-11 00:00:00

  • PRP28, a 'DEAD-box' protein, is required for the first step of mRNA splicing in vitro.

    abstract::We previously reported the isolation of PRP28, a gene in Saccharomyces cerevisiae whose activity is required for the first step of nuclear mRNA splicing in vivo. Sequence analysis revealed that PRP28 is included in the 'DEAD-box' gene family, members of which are thought to function as ATP-dependent RNA helicases. Gen...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.15.3187

    authors: Strauss EJ,Guthrie C

    update_date:1994-08-11 00:00:00

  • A thermodynamic overview of naturally occurring intramolecular DNA quadruplexes.

    abstract::Loop length and its composition are important for the structural and functional versatility of quadruplexes. To date studies on the loops have mainly concerned model sequences compared with naturally occurring quadruplex sequences which have diverse loop lengths and compositions. Herein, we have characterized 36 quadr...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn543

    authors: Kumar N,Maiti S

    update_date:2008-10-01 00:00:00

  • Specific hydrolysis of methionyl-tRNA Met f catalyzed by a purified peptide.

    abstract::A peptide initiation factor purified from rat liver and promoting the binding of initiator tRNA and model initiators to 40S and 80S ribosomes at an acid pH liberates methionine and N-acetylmethionine from tRNA Met f at neutral reaction. Phenylalanyl-tRNA, N-acetylphenylalanyl-tRNA and methionyl-tRNA Met m are not hydro...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/2.11.2119

    authors: Hradec J

    update_date:1975-11-01 00:00:00

  • A test of the model that RNA polymerase III transcription is regulated by selective induction of the 110 kDa subunit of TFIIIC.

    abstract::TFIIIC is an RNA polymerase (pol) III-specific DNA-binding factor that is required for transcription of tRNA and 5S rRNA genes. Active human TFIIIC consists of five subunits. However, an inactive form has also been isolated that lacks one of the five subunits, called TFIIIC110. A model was proposed in which pol III tra...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkl432

    authors: Innes F,Ramsbottom B,White RJ

    update_date:2006-07-05 00:00:00

  • Determinants of redox sensitivity in RsrA, a zinc-containing anti-sigma factor for regulating thiol oxidative stress response.

    abstract::Various environmental oxidative stresses are sensed by redox-sensitive regulators through cysteine thiol oxidation or modification. A few zinc-containing anti-sigma (ZAS) factors in actinomycetes have been reported to respond sensitively to thiol oxidation, among which RsrA from Streptomyces coelicolor is best charact...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr477

    authors: Jung YG,Cho YB,Kim MS,Yoo JS,Hong SH,Roe JH

    update_date:2011-09-01 00:00:00

  • Modification of the aminopyridine unit of 2'-deoxyaminopyridinyl-pseudocytidine allowing triplex formation at CG interruptions in homopurine sequences.

    abstract::The antigene strategy based on site-specific recognition of duplex DNA by triplex DNA formation has been exploited in a wide range of biological activities. However, specific triplex formation is mostly restricted to homo-purine strands within the target duplex DNA, due to the destabilizing effect of CG and TA inversi...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky704

    authors: Wang L,Taniguchi Y,Okamura H,Sasaki S

    update_date:2018-09-28 00:00:00

  • DNA unwinding and protein displacement by superfamily 1 and superfamily 2 helicases.

    abstract::DNA helicases are required for virtually every aspect of DNA metabolism, including replication, repair, recombination and transcription. A comprehensive description of these essential biochemical processes requires detailed understanding of helicase mechanisms. These enzymes are ubiquitous, having been identified in v...

    journal_title:Nucleic acids research

    pub_type: Journal Article, Review

    doi:10.1093/nar/gkl501

    authors: Mackintosh SG,Raney KD

    update_date:2006-01-01 00:00:00

  • BLM unfolds G-quadruplexes in different structural environments through different mechanisms.

    abstract::Mutations in the RecQ DNA helicase gene BLM give rise to Bloom's syndrome, which is a rare autosomal recessive disorder characterized by genetic instability and cancer predisposition. BLM helicase is highly active in binding and unwinding G-quadruplexes (G4s), which are physiological targets for BLM, as revealed by ge...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv361

    authors: Wu WQ,Hou XM,Li M,Dou SX,Xi XG

    update_date:2015-05-19 00:00:00

  • A second look at cellular mRNA sequences said to function as internal ribosome entry sites.

    abstract::This review takes a second look at a set of mRNAs that purportedly employ an alternative mechanism of initiation when cap-dependent translation is reduced during mitosis or stress conditions. A closer look is necessary because evidence cited in support of the internal initiation hypothesis is often flawed. When putati...

    journal_title:Nucleic acids research

    pub_type: Journal Article, Review

    doi:10.1093/nar/gki958

    authors: Kozak M

    update_date:2005-11-28 00:00:00

  • Different dynamics in nuclear entry of subunits of the repair/transcription factor TFIIH.

    abstract::We report here the different ways in which four subunits of the basal transcription/repair factor TFIIH (XPB, XPD, p62 and p44) and the damage recognition XPC repair protein can enter the nucleus. We examined their nuclear localization by transiently expressing the gene products tagged with the enhanced green fluoresc...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/29.7.1574

    authors: Santagati F,Botta E,Stefanini M,Pedrini AM

    update_date:2001-04-01 00:00:00

  • The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection.

    abstract::With the development of High-Throughput Sequencing (HTS) thousands of human genomes have now been sequenced. Whenever different studies analyze the same genome they usually agree on the amount of single-nucleotide polymorphisms, but differ dramatically on the number of insertion and deletion variants (indels). Further...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv677

    authors: Jiang Y,Turinsky AL,Brudno M

    update_date:2015-09-03 00:00:00

  • Solution structure and stability of the DNA undecamer duplexes containing oxanine mismatch.

    abstract::Solution structures of DNA duplexes containing oxanine (Oxa, O) opposite a cytosine (O:C duplex) and opposite a thymine (O:T duplex) have been solved by the combined use of (1)H NMR and restrained molecular dynamics calculation. One mismatch pair was introduced into the center of the 11-mer duplex of [d(GTGACO(6)CACTG...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr872

    authors: Pack SP,Morimoto H,Makino K,Tajima K,Kanaori K

    update_date:2012-02-01 00:00:00

  • NOPdb: Nucleolar Proteome Database.

    abstract::The Nucleolar Proteome Database (NOPdb) archives data on >700 proteins that were identified by multiple mass spectrometry (MS) analyses from highly purified preparations of human nucleoli, the most prominent nuclear organelle. Each protein entry is annotated with information about its corresponding gene, its domain st...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkj004

    authors: Leung AK,Trinkle-Mulcahy L,Lam YW,Andersen JS,Mann M,Lamond AI

    update_date:2006-01-01 00:00:00

  • Effects of codon usage on gene expression are promoter context dependent.

    abstract::Codon usage bias is a universal feature of all genomes. Although codon usage has been shown to regulate mRNA and protein levels by influencing mRNA decay and transcription in eukaryotes, little or no genome-wide correlations between codon usage and mRNA levels are detected in mammalian cells, raising doubt on the sign...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa1253

    authors: Yang Q,Lyu X,Zhao F,Liu Y

    update_date:2021-01-25 00:00:00

  • Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.

    abstract::A new method which predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses the linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of ol...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.24.5156

    authors: Solovyev VV,Salamov AA,Lawrence CB

    update_date:1994-12-11 00:00:00

  • DIAL: a web server for the pairwise alignment of two RNA three-dimensional structures using nucleotide, dihedral angle and base-pairing similarities.

    abstract::DIAL (dihedral alignment) is a web server that provides public access to a new dynamic programming algorithm for pairwise 3D structural alignment of RNA. DIAL achieves quadratic time by performing an alignment that accounts for (i) pseudo-dihedral and/or dihedral angle similarity, (ii) nucleotide sequence similarity a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm334

    authors: Ferrè F,Ponty Y,Lorenz WA,Clote P

    update_date:2007-07-01 00:00:00