Abstract:
Protein classification by machine learning algorithms is now widely used in the structural and functional annotation of proteins. The Protein Classification Benchmark collection (http://hydra.icgeb.trieste.it/benchmark) was created to provide standard datasets on which the performance of machine learning methods can be compared. It is primarily meant for method developers and users interested in comparing methods under standardized conditions. The collection contains datasets of sequences and structures, and each set is subdivided into positive/negative and training/test sets in several ways. There are 6405 classification tasks in total: 3297 on protein sequences, 3095 on protein structures and 10 on protein coding regions in DNA. Typical tasks include the classification of structural domains in the SCOP and CATH databases based on their sequences or structures, as well as various functional and taxonomic classification problems. In the case of hierarchical classification schemes, the classification tasks can be defined at various levels of the hierarchy (such as classes, folds, superfamilies, etc.). For each dataset, distance matrices are available that contain all-vs-all comparisons of the data, based on various sequence or structure comparison methods, as well as a set of classification performance measures computed with various classifier algorithms.
journal_name: Nucleic Acids Res
journal_title: Nucleic acids research
authors: Sonego P, Pacurar M, Dhir S, Kertész-Farkas A, Kocsor A, Gáspári Z, Leunissen JA, Pongor S
doi: 10.1093/nar/gkl812
subject: Has Abstract
pub_date: 2007-01-01 00:00:00
pages: D232-6
issue: Database issue
eissn: 0305-1048
issn: 1362-4962
pii: gkl812
journal_volume: 35
pub_type: Journal article
abstract::Nuclear extracts prepared from growth hormone-secreting (GC) and prolactin-secreting (235-1) rat anterior pituitary cell lines were compared for their ability to bind to the DNA sequences conferring tissue-specificity to the expression of the rat growth hormone (rGH) gene promoter. Cell-specific differences in the int...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/18.17.5235
update_date:1990-09-11 00:00:00
abstract::ProtoNet is an automatic hierarchical classification of the protein sequence space. In 2004, the ProtoNet (version 4.0) presents the analysis of over one million proteins merged from SwissProt and TrEMBL databases. In addition to rich visualization and analysis tools to navigate the clustering hierarchy, we incorporat...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gki007
update_date:2005-01-01 00:00:00
abstract::Neph et al. (2012) (Circuitry and dynamics of human transcription factor regulatory networks. Cell, 150: 1274-1286) reported the transcription factor (TF) regulatory networks of 41 human cell types using the DNaseI footprinting technique. This provides a valuable resource for uncovering regulation principles in differ...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gku923
update_date:2014-11-10 00:00:00
abstract::The TcSNP database (http://snps.tcruzi.org) integrates information on genetic variation (polymorphisms and mutations) for different stocks, strains and isolates of Trypanosoma cruzi, the causative agent of Chagas disease. The database incorporates sequences (genes from the T. cruzi reference genome, mRNAs, ESTs and ge...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkn874
update_date:2009-01-01 00:00:00
abstract::Segments of SV40 DNA having homologous overlapping termini recombine to produce viable genomes in monkey cells. Frequencies of recombination on either side of a deletion marker are non-random; replication and palindromes do not appear to be essential. Since recombination involves host enzymes, a suitable system has be...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/8.12.2725
update_date:1980-06-25 00:00:00
abstract::Impaired DNA damage repair, especially deficient transcription-coupled nucleotide excision repair, leads to segmental progeroid syndromes in human patients as well as in rodent models. Furthermore, DNA double-strand break signalling has been pinpointed as a key inducer of cellular senescence. Several recent findings s...
journal_title:Nucleic acids research
pub_type: Journal article, Review
doi:10.1093/nar/gkm1065
update_date:2007-01-01 00:00:00
abstract::Riboswitch RNAs fold into complex tertiary structures upon binding to their cognate ligand. Ligand recognition is accomplished by key residues in the binding pocket. In addition, it often crucially depends on the stability of peripheral structural elements. The ligand-bound complex of the guanine-sensing riboswitch fr...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkr664
update_date:2011-12-01 00:00:00
abstract::Identification of components essential to chromosome structure and behaviour remains a vibrant area of study. We have previously shown that invadolysin is essential in Drosophila, with roles in cell division and cell migration. Mitotic chromosomes are hypercondensed in length, but display an aberrant fuzzy appearance....
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkv211
update_date:2015-04-20 00:00:00
abstract::The H19 lncRNA has been implicated in development and growth control and is associated with human genetic disorders and cancer. Acting as a molecular sponge, H19 inhibits microRNA (miRNA) let-7. Here we report that H19 is significantly decreased in muscle of human subjects with type-2 diabetes and insulin resistant ro...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gku1160
update_date:2014-12-16 00:00:00
abstract::RNA polymerase (RNAP) is a major target of gene regulation. Thermus thermophilus bacteriophage P23-45 encodes two RNAP binding proteins, gp39 and gp76, which shut off host gene transcription while allowing orderly transcription of phage genes. We previously reported the structure of the T. thermophilus RNAP•σA holoenz...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkx1162
update_date:2018-01-09 00:00:00
abstract::Nucleic acids possess the unique property of being enzymatically amplifiable, and have therefore been a popular choice for the combinatorial selection of functional sequences, such as aptamers or ribozymes. However, amplification typically requires known sequence segments that serve as primer binding sites, which can ...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gks004
update_date:2012-05-01 00:00:00
abstract::Transcription driven by the proviral promoter of the Human T-cell Leukemia Virus type I (HTLV-I) is tightly regulated by the Tax1 transactivator. This viral protein potently induces the enhancer activity of a 21 bp motif repeated three times in the promoter. We have previously shown that this induction results from th...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/21.17.3935
update_date:1993-08-25 00:00:00
abstract::The structural simplicity and ability to capture serial correlations make Markov models a popular modeling choice in several genomic analyses, such as identification of motifs, genes and regulatory elements. A critical, yet relatively unexplored, issue is the determination of the order of the Markov model. Most biolog...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gks1285
update_date:2013-02-01 00:00:00
abstract::Two of the tRNA's found in rabbit reticulocytes are substrates for a post-transcriptional modification leading to the incorporation of guanine into the polynucleotide chain. The major guanylated tRNA was previously identified as tRNA (His). In the present report we show that the minor guanylated tRNA is tRNA (Asn), an...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/3.10.2521
update_date:1976-10-01 00:00:00
abstract::We describe an algorithm for gene identification in DNA sequences derived from shotgun sequencing of microbial communities. Accurate ab initio gene prediction in a short nucleotide sequence of anonymous origin is hampered by uncertainty in model parameters. While several machine learning approaches could be proposed t...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkq275
update_date:2010-07-01 00:00:00
abstract::The RNA of a deleted strain (lacking Src gene) of an avian sarcoma virus (ASV) was examined by a newly developed immunoelectron microscopic procedure which uses anti-nucleotide antibodies as probes. After denaturation of the RNA and reaction with a high affinity, highly specific anti-7-methylguanosine-5'-phosphate (an...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/8.19.4485
update_date:1980-10-10 00:00:00
abstract::Protein domains are subunits of proteins that recur throughout the protein world. There are many definitions attempting to capture the essence of a protein domain, and several systems that identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatic...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkl850
update_date:2007-01-01 00:00:00
abstract::Distamycin binds the minor groove of duplex DNA at AT-rich regions and has been a valuable probe of protein interactions with double-stranded DNA. We find that distamycin can also inhibit protein interactions with G-quadruplex (G4) DNA, a stable four-stranded structure in which the repeating unit is a G-quartet. Using...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkg392
update_date:2003-06-01 00:00:00
abstract::This paper examines theoretically the effects that restraints on the tertiary structure of a superhelical DNA domain exert on the energetics of linking and the onset of conformational transitions. The most important tertiary constraint arises from the nucleosomal winding of genomic DNA in vivo. Conformational transiti...
journal_title:Nucleic acids research
pub_type: Journal article, Review
doi:10.1093/nar/15.23.9985
update_date:1987-12-10 00:00:00
abstract::A better understanding of transcriptional and post-transcriptional regulation of gene expression in bacteria relies on studying their transcriptome. RNA sequencing methods are used not only to assess RNA abundance but also the exact boundaries of primary and processed transcripts. Here, we developed a method, called i...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkw1316
update_date:2017-03-17 00:00:00
abstract::We have isolated the mouse MyoD1 gene flanked by its promoter region by screening a genomic library with synthetic oligonucleotides. The structural gene is interrupted by two G + C rich introns. Transfection of the cloned gene inserted into an expression vector converts fibroblasts to myoblasts. Sequence analysis of a...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/19.23.6433
update_date:1991-12-11 00:00:00
abstract::Several restriction endonuclease fragments isolated from highly repetitive satellite DNA of the chromatin eliminating nematode Ascaris lumbricoides var. suum have been cloned. Each type of restriction fragment corresponds to a different variant of the same related ancestral sequence. These variants differ by small del...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/10.23.7493
update_date:1982-12-11 00:00:00
abstract::Synthesis of (p)ppRNA-DNA chains by purified HeLa cell DNA primase-DNA polymerase alpha (pol alpha-primase) was compared with those synthesized by a multiprotein form of DNA polymerase alpha (pol alpha 2) using unique single-stranded DNA templates containing the origin of replication for simian virus 40 (SV40) DNA. Th...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/14.18.7305
update_date:1986-09-25 00:00:00
abstract::An attempt has been made to correlate differential scanning calorimetry melting profiles of 5S rRNAs from lupin seeds (L.s.) and wheat germ (W.g.) with their structure. It is suggested that the observed differences in thermal unfolding are due to differences in RNA nucleotide sequence and as a consequence in higher or...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/16.2.685
update_date:1988-01-25 00:00:00
abstract::For over 10 years, Binding MOAD (Mother of All Databases; http://www.BindingMOAD.org) has been one of the largest resources for high-quality protein-ligand complexes and associated binding affinity data. Binding MOAD has grown at the rate of 1994 complexes per year, on average. Currently, it contains 23,269 complexes ...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gku1088
update_date:2015-01-01 00:00:00
abstract::Despite the extensive use of Saccharomyces cerevisiae as a platform for synthetic biology, strain engineering remains slow and laborious. Here, we employ CRISPR/Cas9 technology to build a cloning-free toolkit that addresses commonly encountered obstacles in metabolic engineering, including chromosomal integration locu...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkw1023
update_date:2017-01-09 00:00:00
abstract::The Bcl-2 protein has an anti-apoptotic effect in neuronal and other cell types. We show for the first time that the Bcl-2 promoter is activated by the neuronal survival factor nerve growth factor (NGF) and that this effect is dependent on a region of the promoter from -1472 to -1414. This activation requires the Rap-...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/27.10.2086
update_date:1999-05-15 00:00:00
abstract::This article describes recent developments of Europe PMC (http://europepmc.org), the leading database for life science literature. Formerly known as UKPMC, the service was rebranded in November 2012 as Europe PMC to reflect the scope of the funding agencies that support it. Several new developments have enriched Europ...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gku1061
update_date:2015-01-01 00:00:00
abstract::A new integrated image analysis package with quantitative quality control schemes is described for cDNA microarray technology. The package employs an iterative algorithm that utilizes both intensity characteristics and spatial information of the spots on a microarray image for signal-background segmentation and define...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/29.15.e75
update_date:2001-08-01 00:00:00
abstract::Accurate maps and DNA sequences for human subtelomere regions, along with detailed knowledge of subtelomere variation and long-range telomere-terminal haplotypes in individuals, are critical for understanding telomere function and its roles in human biology. Here, we use a highly automated whole genome mapping technol...
journal_title:Nucleic acids research
pub_type: Journal article
doi:10.1093/nar/gkx017
update_date:2017-05-19 00:00:00