Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes.

Abstract:

Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity.

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Zhang S,Xu M,Li S,Su Z

doi

10.1093/nar/gkp248

subject

Has Abstract

pub_date

2009-06-01 00:00:00

pages

e72

issue

10

eissn

1362-4962

issn

0305-1048

pii

gkp248

journal_volume

37

pub_type

Journal Article
  • 2'-Deoxynucleoside 5'-triphosphates modified at alpha-, beta- and gamma-phosphates as substrates for DNA polymerases.

    abstract::Replacement of alpha-, beta- and gamma-phosphate groups in 2'-deoxynucleoside 5'-triphosphates (dNTP) with phosphonate groups yields a new set of dNTP mimics with potential biological and therapeutic applications. Here, we describe the synthesis of 15 new dNTPs modified at alpha-, beta- and gamma-phosphates containing...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/26.3.778

    authors: Alexandrova LA,Skoblov AY,Jasko MV,Victorova LS,Krayevsky AA

    update_date:1998-02-01 00:00:00

  • Positive and negative selection using the tetA-sacB cassette: recombineering and P1 transduction in Escherichia coli.

    abstract::The two-step process of selection and counter-selection is a standard way to enable genetic modification and engineering of bacterial genomes using homologous recombination methods. The tetA and sacB genes are contained in a DNA cassette and confer a novel dual counter-selection system. Expression of tetA confers bact...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt1075

    authors: Li XT,Thomason LC,Sawitzke JA,Costantino N,Court DL

    update_date:2013-12-01 00:00:00

  • jpHMM at GOBICS: a web server to detect genomic recombinations in HIV-1.

    abstract::Detecting recombinations in the genome sequence of human immunodeficiency virus (HIV-1) is crucial for epidemiological studies and for vaccine development. Herein, we present a web server for subtyping and localization of phylogenetic breakpoints in HIV-1. Our software is based on a jumping profile Hidden Markov Model...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkl255

    authors: Zhang M,Schultz AK,Calef C,Kuiken C,Leitner T,Korber B,Morgenstern B,Stanke M

    update_date:2006-07-01 00:00:00

  • High accuracy operon prediction method based on STRING database scores.

    abstract::We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq254

    authors: Taboada B,Verde C,Merino E

    update_date:2010-07-01 00:00:00

  • A new procedure for purifying histone pairs H2A + H2B and H3 + H4 from chromatin using hydroxylapatite.

    abstract::A method to purify histone groups H2A+H2B and H3+H4 using dissociation with NaCl and hydroxylapatite chromatography is presented. The procedure is simple, involves mild solvents, and provides milligram quantities of histones of high purity. The histone pairs prepared by this method can regenerate chromatin-like charac...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/6.2.689

    authors: Simon RH,Felsenfeld G

    update_date:1979-02-01 00:00:00

  • Dual transcriptional-translational cascade permits cellular level tuneable expression control.

    abstract::The ability to induce gene expression in a small molecule dependent manner has led to many applications in target discovery, functional elucidation and bio-production. To date these applications have relied on a limited set of protein-based control mechanisms operating at the level of transcription initiation. The dis...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv912

    authors: Morra R,Shankar J,Robinson CJ,Halliwell S,Butler L,Upton M,Hay S,Micklefield J,Dixon N

    update_date:2016-02-18 00:00:00

  • Structural and functional insights into DNA-end processing by the archaeal HerA helicase-NurA nuclease complex.

    abstract::Helicase-nuclease systems dedicated to DNA end resection in preparation for homologous recombination (HR) are present in all kingdoms of life. In thermophilic archaea, the HerA helicase and NurA nuclease cooperate with the highly conserved Mre11 and Rad50 proteins during HR-dependent DNA repair. Here we show that HerA...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr1157

    authors: Blackwood JK,Rzechorzek NJ,Abrams AS,Maman JD,Pellegrini L,Robinson NP

    update_date:2012-04-01 00:00:00

  • The Tudor protein Veneno assembles the ping-pong amplification complex that produces viral piRNAs in Aedes mosquitoes.

    abstract::PIWI-interacting RNAs (piRNAs) comprise a class of small RNAs best known for suppressing transposable elements in germline tissues. The vector mosquito Aedes aegypti encodes seven PIWI genes, four of which are somatically expressed. This somatic piRNA pathway generates piRNAs from viral RNA during infection with cytop...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky1266

    authors: Joosten J,Miesen P,Taşköprü E,Pennings B,Jansen PWTC,Huynen MA,Vermeulen M,Van Rij RP

    update_date:2019-03-18 00:00:00

  • Specific interactions of distamycin with G-quadruplex DNA.

    abstract::Distamycin binds the minor groove of duplex DNA at AT-rich regions and has been a valuable probe of protein interactions with double-stranded DNA. We find that distamycin can also inhibit protein interactions with G-quadruplex (G4) DNA, a stable four-stranded structure in which the repeating unit is a G-quartet. Using...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg392

    authors: Cocco MJ,Hanakahi LA,Huber MD,Maizels N

    update_date:2003-06-01 00:00:00

  • Modulation of thyroglobulin messenger RNA level by thyrotropin in cultured thyroid cells.

    abstract::To examine the influence of thyrotropin (TSH) on the thyroglobulin (Tgb) mRNA content, the latter was evaluated in the cytoplasm of hog thyroid cells cultured in the absence (control cells) or presence of TSH. The Tgb mRNA levels were determined by, (i) kinetics of hybridization to sheep Tgb cDNA, (ii) capacity of cod...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/6.10.3353

    authors: Chebath J,Chabaud O,Mauchamp J

    update_date:1979-07-25 00:00:00

  • The discriminator bases G73 in human tRNA(Ser) and A73 in tRNA(Leu) have significantly different roles in the recognition of aminoacyl-tRNA synthetases.

    abstract::The recognition of human tRNA(Leu) or tRNA(Ser) by cognate aminoacyl- tRNA synthetases has distinct requirements. Only one base change (A73-->G) in tRNA(Leu) is required to generate an efficient serine acceptor in vitro, whereas several changes in three structural domains (the acceptor stem, DHU loop and long extra ar...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/24.3.405

    authors: Breitschopf K,Gross HJ

    update_date:1996-02-01 00:00:00

  • The optimal binding sequence of the Hox11 protein contains a predicted recognition core motif.

    abstract::HOX11 is a homeobox-containing oncogene of specific T-cell leukemias. We determined the DNA binding specificity of the Hox11 protein by using a novel technique of random oligonucleotide selection developed in this study. The optimal Hox11 binding sequence, GGCGGTAAGTGG, contained a core TAAGTG motif that is consistent...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.11.1928

    authors: Tang S,Breitman ML

    update_date:1995-06-11 00:00:00

  • Unique organization of the human BCR gene promoter.

    abstract::The promoter of the human BCR gene, regulating the transcription of the chimeric BCR/ABL mRNA in leukemia, has been isolated and characterized. A region of 1.1 kb immediately 5' to the transcription start site was analyzed in detail by sequencing, DNase 1 footprinting, gel retardation and functional studies. These exp...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.23.7119

    authors: Zhu QS,Heisterkamp N,Groffen J

    update_date:1990-12-11 00:00:00

  • The conserved 7SK snRNA gene localizes to human chromosome 6 by homolog exclusion probing of somatic cell hybrid RNA.

    abstract::Many small RNAs contribute essential activities to eukaryotic cells. In mammalian genomes dispersed repetitive sequences which exhibit homology to small RNAs often exist as pseudogenes which can complicate identification, localization, and analysis of the authentic gene. We mapped a productive human 7SK small nuclear ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.5.722

    authors: Driscoll CT,Darlington GJ,Maraia RJ

    update_date:1994-03-11 00:00:00

  • INTERSPIA: a web application for exploring the dynamics of protein-protein interactions among multiple species.

    abstract::Proteins perform biological functions through cascading interactions with each other by forming protein complexes. As a result, interactions among proteins, called protein-protein interactions (PPIs) are not completely free from selection constraint during evolution. Therefore, the identification and analysis of PPI c...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky378

    authors: Kwon D,Lee D,Kim J,Lee J,Sim M,Kim J

    update_date:2018-07-02 00:00:00

  • The nucleotide sequence at the 3'-end of Neurospora crassa 25S-rRNA and the location of a 5.8S-rRNA binding site.

    abstract::The sequence of 110 nucleotides adjacent to the 3'-end of Neurospora crassa 25S-rRNA has been derived by chemical sequencing methods. Sequences present between 40 and 85 nucleotides of the 3'-end were found to complement sequences at the 3'- and 5'-ends of 5.8S-rRNA. Interaction was shown to occur between 5.8S-rRNA an...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/9.5.1111

    authors: Kelly JM,Cox RA

    update_date:1981-03-11 00:00:00

  • COVID19 Drug Repository: text-mining the literature in search of putative COVID19 therapeutics.

    abstract::The recent outbreak of COVID-19 has generated an enormous amount of Big Data. To date, the COVID-19 Open Research Dataset (CORD-19), lists ∼130,000 articles from the WHO COVID-19 database, PubMed Central, medRxiv, and bioRxiv, as collected by Semantic Scholar. According to LitCovid (11 August 2020), ∼40,300 COVID19-re...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa969

    authors: Tworowski D,Gorohovski A,Mukherjee S,Carmi G,Levy E,Detroja R,Mukherjee SB,Frenkel-Morgenstern M

    update_date:2021-01-08 00:00:00

  • From general base to general acid catalysis in a sodium-specific DNAzyme by a guanine-to-adenine mutation.

    abstract::Recently, a few Na+-specific RNA-cleaving DNAzymes were reported, where nucleobases are likely to play critical roles in catalysis. The NaA43 and NaH1 DNAzymes share the same 16-nt Na+-binding motif, but differ in one or two nucleotides in a small catalytic loop. Nevertheless, they display an opposite pH-dependency, i...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz578

    authors: Ma L,Kartik S,Liu B,Liu J

    update_date:2019-09-05 00:00:00

  • The PathoYeastract database: an information system for the analysis of gene and genomic transcription regulation in pathogenic yeasts.

    abstract::We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata Up...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkw817

    authors: Monteiro PT,Pais P,Costa C,Manna S,Sá-Correia I,Teixeira MC

    update_date:2017-01-04 00:00:00

  • RNA-binding strategies common to cold-shock domain- and RNA recognition motif-containing proteins.

    abstract::Numerous RNA-binding proteins have modular structures, comprising one or several copies of a selective RNA-binding domain generally coupled to an auxiliary domain that binds RNA non-specifically. We have built and compared homology-based models of the cold-shock domain (CSD) of the Xenopus protein, FRGY2, and of the t...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/29.11.2223

    authors: Manival X,Ghisolfi-Nieto L,Joseph G,Bouvet P,Erard M

    update_date:2001-06-01 00:00:00

  • NAIMA: target amplification strategy allowing quantitative on-chip detection of GMOs.

    abstract::We have developed a novel multiplex quantitative DNA-based target amplification method suitable for sensitive, specific and quantitative detection on microarray. This new method named NASBA Implemented Microarray Analysis (NAIMA) was applied to GMO detection in food and feed, but its application can be extended to all...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn524

    authors: Morisset D,Dobnik D,Hamels S,Zel J,Gruden K

    update_date:2008-10-01 00:00:00

  • Generation of single-chain LAGLIDADG homing endonucleases from native homodimeric precursor proteins.

    abstract::Homing endonucleases (HEs) cut long DNA target sites with high specificity to initiate and target the lateral transfer of mobile introns or inteins. This high site specificity of HEs makes them attractive reagents for gene targeting to promote DNA modification or repair. We have generated several hundred catalytically...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp004

    authors: Li H,Pellenz S,Ulge U,Stoddard BL,Monnat RJ Jr

    update_date:2009-04-01 00:00:00

  • The integrase family of tyrosine recombinases: evolution of a conserved active site domain.

    abstract::The integrases are a diverse family of tyrosine recombinases which rearrange DNA duplexes by means of conservative site-specific recombination reactions. Members of this family, of which the well-studied lambda Int protein is the prototype, were previously found to share four strongly conserved residues, including an ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/25.18.3605

    authors: Esposito D,Scocca JJ

    update_date:1997-09-15 00:00:00

  • DBTGR: a database of tunicate promoters and their regulatory elements.

    abstract::The high similarity of tunicates and vertebrates during their development coupled with the transparency of tunicate larvae, their well-studied cell lineages and the availability of simple and efficient transgenesis methods makes of this subphylum an ideal system for the investigation of vertebrate physiological and de...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkj064

    authors: Sierro N,Kusakabe T,Park KJ,Yamashita R,Kinoshita K,Nakai K

    update_date:2006-01-01 00:00:00

  • The human M creatine kinase gene enhancer contains multiple functional interacting domains.

    abstract::Cis-elements (-933 to -641) upstream of the human M creatine kinase gene cap site contain an enhancer that confers developmental and tissue-specific expression to the chloramphenicol acetyltransferase gene in C2C12 myogenic cells transfected in culture. Division of the enhancer at -770 into a 5' fragment that includes...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/20.9.2313

    authors: Trask RV,Koster JC,Ritchie ME,Billadello JJ

    update_date:1992-05-11 00:00:00

  • Spectral clustering of protein sequences.

    abstract::An important problem in genomics is automatically clustering homologous proteins when only sequence information is available. Most methods for clustering proteins are local, and are based on simply thresholding a measure related to sequence distance. We first show how locality limits the performance of such methods by...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkj515

    authors: Paccanaro A,Casbon JA,Saqi MA

    update_date:2006-03-17 00:00:00

  • Rational design of point mutation-selective antisense DNA targeted to codon 12 of Ha-ras mRNA in human cells.

    abstract::Antisense oligodeoxynucleotides targeted to Ha-ras mRNA have been designed to discriminate between the codon 12-mutated oncogene and the normal proto-oncogene. An in vitro assay using two different sources of RNase H (rabbit reticulocyte lysates and nuclear extract from HeLa cells) was used to characterize oligonucleo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.17.3411

    authors: Duroux I,Godard G,Boidot-Forget M,Schwab G,Hélène C,Saison-Behmoaras T

    update_date:1995-09-11 00:00:00

  • MHCPred: A server for quantitative prediction of peptide-MHC binding.

    abstract::Accurate T-cell epitope prediction is a principal objective of computational vaccinology. As a service to the immunology and vaccinology communities at large, we have implemented, as a server on the World Wide Web, a partial least squares-based multivariate statistical approach to the quantitative prediction of peptid...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg510

    authors: Guan P,Doytchinova IA,Zygouri C,Flower DR

    update_date:2003-07-01 00:00:00

  • Primer specific and mispair extension analysis (PSMEA) as a simple approach to fast genotyping.

    abstract::A simple method, primer specific and mispair extension analysis (PSMEA) with pfu DNA polymerase was developed for genotyping. PSMEA is based on the unique properties of 3'-->5' exonuclease proofreading activity. In the presence of an incomplete set of dNTPs, pfu was found to be extremely discriminative in nucleotide i...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/26.21.5013

    authors: Hu YW,Balaskas E,Kessler G,Issid C,Scully LJ,Murphy DG,Rinfret A,Giulivi A,Scalia V,Gill P

    update_date:1998-11-01 00:00:00

  • Purification and properties of the Eco57I restriction endonuclease and methylase--prototypes of a new class (type IV).

    abstract::The Eco57I restriction endonuclease and methylase were purified to homogeneity from the E.coli RR1 strain carrying the eco57IRM genes on a recombinant plasmid. The molecular weight of the denaturated methylase is 63 kDa. The restriction endonuclease exists in a monomeric form with an apparent molecular weight of 104-1...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/20.22.6043

    authors: Janulaitis A,Petrusyte M,Maneliene Z,Klimasauskas S,Butkus V

    update_date:1992-11-25 00:00:00