CSI-Tree: a regression tree approach for modeling binding properties of DNA-binding molecules based on cognate site identification (CSI) data.

Abstract:

The identification and characterization of binding sites of DNA-binding molecules, including transcription factors (TFs), is a critical problem at the interface of chemistry, biology and molecular medicine. The Cognate Site Identification (CSI) array is a high-throughput microarray platform for measuring comprehensive recognition profiles of DNA-binding molecules. This technique produces datasets that are useful not only for identifying binding sites of previously uncharacterized TFs but also for elucidating dependencies, both local and nonlocal, between the nucleotides at different positions of the recognition sites. We have developed a regression tree technique, CSI-Tree, for exploring the spectrum of binding sites of DNA-binding molecules. Our approach constructs regression trees utilizing the CSI data of unaligned sequences. The resulting model partitions the binding spectrum into homogeneous regions of position specific nucleotide effects. Each homogeneous partition is then summarized by a position weight matrix (PWM). Hence, the final outcome is a binding intensity rank-ordered collection of PWMs each of which spans a different region in the binding spectrum. Nodes of the regression tree depict the critical position/nucleotide combinations. We analyze the CSI data of the eukaryotic TF Nkx-2.5 and two engineered small molecule DNA ligands and obtain unique insights into their binding properties. The CSI tree for Nkx-2.5 reveals an interaction between two positions of the binding profile and elucidates how different nucleotide combinations at these two positions lead to different binding affinities. The CSI trees for the engineered DNA ligands exhibit a common preference for the dinucleotide AA in the first two positions, which is consistent with preference for a narrow and relatively flat minor groove. We carry out a reanalysis of these data with a mixture of PWMs approach. This approach is an advancement over the simple PWM model and accommodates position dependencies based on only sequence data. Our analysis indicates that the dependencies revealed by the CSI-Tree are challenging to discover without the actual binding intensities. Moreover, such a mixture model is highly sensitive to the number and length of the sequences analyzed. In contrast, CSI-Tree provides interpretable and concise summaries of the complete recognition profiles of DNA-binding molecules by utilizing binding affinities.
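
The pipeline described in the abstract (split sequences on position/nucleotide combinations, summarize each partition by a PWM, rank the partitions by binding intensity) can be illustrated with a toy sketch. This is not the authors' CSI-Tree implementation: it assumes pre-aligned, equal-length sequences (the paper works with unaligned sequences), uses a plain greedy squared-error split, and runs on made-up intensities. All function names and the data are hypothetical.

```python
from collections import Counter

NUCS = "ACGT"

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(seqs, ys):
    """Best binary split of the form seq[pos] == nuc, as (gain, pos, nuc)."""
    base, best = sse(ys), None
    for pos in range(len(seqs[0])):
        for nuc in NUCS:
            yes = [y for s, y in zip(seqs, ys) if s[pos] == nuc]
            no = [y for s, y in zip(seqs, ys) if s[pos] != nuc]
            if not yes or not no:
                continue  # split does not separate anything
            gain = base - sse(yes) - sse(no)
            if best is None or gain > best[0]:
                best = (gain, pos, nuc)
    return best

def grow(seqs, ys, depth=0, max_depth=2, min_gain=1e-9):
    """Greedy regression tree; leaves keep their sequences and mean intensity."""
    split = best_split(seqs, ys) if depth < max_depth else None
    if split is None or split[0] <= min_gain:
        return {"leaf": True, "mean": sum(ys) / len(ys), "seqs": seqs}
    _, pos, nuc = split
    yes = [(s, y) for s, y in zip(seqs, ys) if s[pos] == nuc]
    no = [(s, y) for s, y in zip(seqs, ys) if s[pos] != nuc]
    return {
        "leaf": False, "pos": pos, "nuc": nuc,
        "yes": grow([s for s, _ in yes], [y for _, y in yes], depth + 1, max_depth, min_gain),
        "no": grow([s for s, _ in no], [y for _, y in no], depth + 1, max_depth, min_gain),
    }

def leaves(tree):
    """Collect all leaf partitions of the tree."""
    if tree["leaf"]:
        return [tree]
    return leaves(tree["yes"]) + leaves(tree["no"])

def pwm(seqs):
    """Per-position nucleotide frequencies of one partition (no pseudocounts)."""
    out = []
    for pos in range(len(seqs[0])):
        counts = Counter(s[pos] for s in seqs)
        out.append({n: counts[n] / len(seqs) for n in NUCS})
    return out

# Hypothetical toy data: only sequences starting with AA bind strongly.
seqs = ["AAT", "AAC", "ACT", "CAT", "CCG", "GCT"]
intensities = [10.0, 10.0, 1.0, 1.0, 1.0, 1.0]

tree = grow(seqs, intensities)
# Rank partitions by mean intensity, strongest binders first.
ranked = sorted(leaves(tree), key=lambda leaf: leaf["mean"], reverse=True)
top_pwm = pwm(ranked[0]["seqs"])
```

In this toy case the tree splits first on position 0 and then on position 1, so the top-ranked partition corresponds to the AA dinucleotide, which mirrors the kind of position interaction the abstract describes. A real analysis would also need a stopping or pruning rule chosen by cross-validation, which is omitted here.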

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Keleş S,Warren CL,Carlson CD,Ansari AZ

doi

10.1093/nar/gkn057

subject

Has Abstract

pub_date

2008-06-01 00:00:00

pages

3171-84

issue

10

eissn

1362-4962

issn

0305-1048

pii

gkn057

journal_volume

36

pub_type

Journal Article
  • Interaction of N-acetyl-phenylalanyl-tRNAPhe with 70S ribosomes of Escherichia coli.

    abstract::The interaction of N-Acetyl-Phe-tRNAPhe with 70S ribosomes is a reversible process in the absence as well as in the presence of messenger. The equilibrium binding constants of these interactions were measured at different magnesium concentrations and temperatures and thermodynamic quantities computed. The entha...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/5.10.3871

    authors: Odinzov VB,Kirillov SV

    update_date:1978-10-01 00:00:00

  • SUVH1, a Su(var)3-9 family member, promotes the expression of genes targeted by DNA methylation.

    abstract::Transposable elements are found throughout the genomes of all organisms. Repressive marks such as DNA methylation and histone H3 lysine 9 (H3K9) methylation silence these elements and maintain genome integrity. However, how silencing mechanisms are themselves regulated to avoid the silencing of genes remains unclear. ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv958

    authors: Li S,Liu L,Li S,Gao L,Zhao Y,Kim YJ,Chen X

    update_date:2016-01-29 00:00:00

  • EVEREST: a collection of evolutionary conserved protein domains.

    abstract::Protein domains are subunits of proteins that recur throughout the protein world. There are many definitions attempting to capture the essence of a protein domain, and several systems that identify protein domains and classify them into families. EVEREST, recently described in Portugaly et al. (2006) BMC Bioinformatic...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkl850

    authors: Portugaly E,Linial N,Linial M

    update_date:2007-01-01 00:00:00

  • The primary structure of E. coli RNA polymerase, Nucleotide sequence of the rpoC gene and amino acid sequence of the beta'-subunit.

    abstract::The primary structure of the E. coli rpoC gene (5321 base pairs) coding the beta'-subunit of RNA polymerase as well as its adjacent segment have been determined. The structure analysis of the peptides obtained by cleavage of the protein with cyanogen bromide and trypsin has confirmed the amino acid sequence of the bet...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.13.4035

    authors: Ovchinnikov YuA,Monastyrskaya GS,Gubanov VV,Guryev SO,Salomatina IS,Shuvaeva TM,Lipkin VM,Sverdlov ED

    update_date:1982-07-10 00:00:00

  • Selection for mutations in the PR promoter of bacteriophage lambda.

    abstract::Insertion of DNA containing PR, the early rightward promoter of bacteriophage lambda, is lethal to M13-derived vectors when the promoter directs transcription (using the '+' strand as template) toward the M13 origin of replication (ori). Lethality can be relieved by mutation of PR, repression of the promoter by the la...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.20.5961

    authors: Brown S,Ferm J,Woody S,Gussin G

    update_date:1990-10-25 00:00:00

  • YY1 associates with the macrosatellite DXZ4 on the inactive X chromosome and binds with CTCF to a hypomethylated form in some male carcinomas.

    abstract::DXZ4 is an X-linked macrosatellite composed of 12-100 tandemly arranged 3-kb repeat units. In females, it adopts opposite chromatin arrangements at the two alleles in response to X-chromosome inactivation. In males and on the active X chromosome, it is packaged into heterochromatin, but on the inactive X chromosome (X...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr964

    authors: Moseley SC,Rizkallah R,Tremblay DC,Anderson BR,Hurt MM,Chadwick BP

    update_date:2012-02-01 00:00:00

  • Reactivity and selectivity in light-induced free radical reactions of 2-propanol with purine and pyrimidine mononucleotides and dinucleoside monophosphates.

    abstract::Photoalkylation reactions with 2-propanol, initiated with di-tert-butyl peroxide, of a variety of purine and pyrimidine mononucleotides and dinucleoside monophosphates lead to the substitution of an alpha-hydroxyisopropyl group for the H-8 atom of adenosine and the addition of the alcohol across the 5,6-double bond of...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/3.7.1715

    authors: Havron A,Sperling J,Elad D

    update_date:1976-07-01 00:00:00

  • Semisynthesis of site-specifically succinylated histone reveals that succinylation regulates nucleosome unwrapping rate and DNA accessibility.

    abstract::Posttranslational modifications (PTMs) of histones represent a crucial regulatory mechanism of nucleosome and chromatin dynamics in various DNA-based cellular processes, such as replication, transcription and DNA damage repair. Lysine succinylation (Ksucc) is a newly identified histone PTM, but its regulation and f...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa663

    authors: Jing Y,Ding D,Tian G,Kwan KCJ,Liu Z,Ishibashi T,Li XD

    update_date:2020-09-25 00:00:00

  • A variable tandem repeat locus mapped to chromosome band 10q26 is amplified and rearranged in leukocyte DNAs of two cancer patients.

    abstract::A highly polymorphic locus associated with the variable tandem repetition of a 35 bp consensus sequence was mapped to chromosome 10, band q26. Examination of leukocyte DNA from a cancer patient revealed the twenty-fold amplification of one allelic fragment of this locus, while the other allelic fragment demonstrated a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/14.20.7929

    authors: Colb M,Yang-Feng T,Francke U,Mermer B,Parkinson DR,Krontiris TG

    update_date:1986-10-24 00:00:00

  • LINKER: a web server to generate peptide sequences with extended conformation.

    abstract::LINKER was developed as an online server to assist biomedical researchers to design linker sequences for constructing functional fusion proteins. The program automatically generates a set of peptide sequences that are known to adopt extended conformations as determined by X-ray crystallography and NMR. In addition to ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkh422

    authors: Xue F,Gu Z,Feng JA

    update_date:2004-07-01 00:00:00

  • An insulator embedded in the chicken α-globin locus regulates chromatin domain configuration and differential gene expression.

    abstract::Genome organization into transcriptionally active domains denotes one of the first levels of gene expression regulation. Although the chromatin domain concept is generally accepted, only little is known on how domain organization impacts the regulation of differential gene expression. Insulators might hold answers to ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkq740

    authors: Furlan-Magaril M,Rebollar E,Guerrero G,Fernández A,Moltó E,González-Buendía E,Cantero M,Montoliu L,Recillas-Targa F

    update_date:2011-01-01 00:00:00

  • Somatotroph- and lactotroph-specific interactions with the homeobox protein binding sites in the rat growth hormone gene promoter.

    abstract::Nuclear extracts prepared from growth hormone-secreting (GC) and prolactin-secreting (235-1) rat anterior pituitary cell lines were compared for their ability to bind to the DNA sequences conferring tissue-specificity to the expression of the rat growth hormone (rGH) gene promoter. Cell-specific differences in the int...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/18.17.5235

    authors: Schaufele F,West BL,Reudelhuber T

    update_date:1990-09-11 00:00:00

  • Adenine-guanine base pairing ribosomal RNA.

    abstract::Analyses of secondary structures proposed for ribosomal RNA's show that, of the different kinds of base pairs directly adjoining the ends of postulated double-helical regions, only A-G with A at the 5' end significantly exceeds the number expected for a random base distribution. An A(syn)-G(trans) hydrogen-bonded base...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.8.2701

    authors: Traub W,Sussman JL

    update_date:1982-04-24 00:00:00

  • MHCPEP--a database of MHC-binding peptides: update 1995.

    abstract::MHCPEP is a curated database comprising over 6000 peptide sequences known to bind MHC molecules. Entries are compiled from published reports as well as from direct submissions of experimental data. Each entry contains peptide sequence, MHC specificity and when available, experimental method, observed activity, binding...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/24.1.242

    authors: Brusic V,Rudy G,Kyne AP,Harrison LC

    update_date:1996-01-01 00:00:00

  • mRNA transfection of a novel TAL effector nuclease (TALEN) facilitates efficient knockout of HIV co-receptor CCR5.

    abstract::Homozygosity for a natural deletion variant of the HIV-coreceptor molecule CCR5, CCR5Δ32, confers resistance toward HIV infection. Allogeneic stem cell transplantation from a CCR5Δ32-homozygous donor has resulted in the first cure from HIV ('Berlin patient'). Based thereon, genetic disruption of CCR5 using designer nu...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv469

    authors: Mock U,Machowicz R,Hauber I,Horn S,Abramowski P,Berdien B,Hauber J,Fehse B

    update_date:2015-06-23 00:00:00

  • Highly conserved elements discovered in vertebrates are present in non-syntenic loci of tunicates, act as enhancers and can be transcribed during development.

    abstract::Co-option of cis-regulatory modules has been suggested as a mechanism for the evolution of expression sites during development. However, the extent and mechanisms involved in mobilization of cis-regulatory modules remains elusive. To trace the history of non-coding elements, which may represent candidate ancestral cis...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt030

    authors: Sanges R,Hadzhiev Y,Gueroult-Bellone M,Roure A,Ferg M,Meola N,Amore G,Basu S,Brown ER,De Simone M,Petrera F,Licastro D,Strähle U,Banfi S,Lemaire P,Birney E,Müller F,Stupka E

    update_date:2013-04-01 00:00:00

  • Grad-seq shines light on unrecognized RNA and protein complexes in the model bacterium Escherichia coli.

    abstract::Stable protein complexes, including those formed with RNA, are major building blocks of every living cell. Escherichia coli has been the leading bacterial organism with respect to global protein-protein networks. Yet, there has been no global census of RNA/protein complexes in this model species of microbiology. Here,...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa676

    authors: Hör J,Di Giorgio S,Gerovac M,Venturini E,Förstner KU,Vogel J

    update_date:2020-09-18 00:00:00

  • Studies on deoxyribonucleases from Saccharomyces cerevisiae. Characterization of two endonuclease activities with a preference for double-stranded DNA.

    abstract::Two new endonuclease activities, endonuclease B and endonuclease C, obtained from yeast nuclear preparations have been separated and partially characterized. Endonuclease B has a primary requirement for Mn2+ which cannot be replaced by Mg2+ or Ca2+, and makes single-strand scissions in double-stranded DNA. Endonculeas...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/2.7.1023

    authors: Pinon R,Leney E

    update_date:1975-07-01 00:00:00

  • Padlock oligonucleotides as a tool for labeling superhelical DNA.

    abstract::Labeling of a covalently closed circular double-stranded DNA was achieved using a so-called 'padlock oligonucleotide'. The oligonucleotide was targeted to a sequence which is present in the replication origin of phage f1 and thus in numerous commonly used plasmids. After winding around the double-stranded target DNA s...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/30.3.e12

    authors: Roulon T,Coulaud D,Delain E,Le Cam E,Hélène C,Escudé C

    update_date:2002-02-01 00:00:00

  • Rapid synthesis of oligodeoxyribonucleotides. IV. Improved solid phase synthesis of oligodeoxyribonucleotides through phosphotriester intermediates.

    abstract::A phosphotriester solid phase method on a polyamide support has been used to prepare oligodeoxyribonucleotides up to 12 units long. Compared to solid phase phosphodiester synthesis the new methodology is quicker, more flexible and gives 10-60-fold better overall yields. ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/8.5.1081

    authors: Gait MJ,Singh M,Sheppard RC,Edge MD,Greene AR,Heathcliffe GR,Atkinson TC,Newton CR,Markham AF

    update_date:1980-03-11 00:00:00

  • The FALC-Loop web server for protein loop modeling.

    abstract::The FALC-Loop web server provides an online interface for protein loop modeling by employing an ab initio loop modeling method called FALC (fragment assembly and analytical loop closure). The server may be used to construct loop regions in homology modeling, to refine unreliable loop regions in experimental structures...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr352

    authors: Ko J,Lee D,Park H,Coutsias EA,Lee J,Seok C

    update_date:2011-07-01 00:00:00

  • Construction and functional analyses of a comprehensive sigma54 site-directed mutant library using alanine-cysteine mutagenesis.

    abstract::The sigma(54) factor associates with core RNA polymerase (RNAP) to form a holoenzyme that is unable to initiate transcription unless acted on by an activator protein. sigma(54) is closely involved in many steps of activator-dependent transcription, such as core RNAP binding, promoter recognition, activator interaction...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkp419

    authors: Xiao Y,Wigneshweraraj SR,Weinzierl R,Wang YP,Buck M

    update_date:2009-07-01 00:00:00

  • Interaction of the resolving enzyme YDC2 with the four-way DNA junction.

    abstract::Holliday junctions (four-way DNA junctions), formed during homologous recombination, are bound and resolved by junction-specific endonucleases to yield recombinant duplex DNA products. The junction-resolving enzymes are a structurally diverse class of proteins that nevertheless have many properties in common; in parti...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/26.24.5609

    authors: White MF,Lilley DM

    update_date:1998-12-15 00:00:00

  • Discrete regions of sequence homology between cloned rodent VL30 genetic elements and AKV-related MuLV provirus genomes.

    abstract::Southern blot analyses using reduced stringency hybridization conditions have been employed to search for sequence homologies between rodent VL30 genes and murine leukemia virus (MuLV) proviruses. These constitute two classes of transposon-like elements previously believed to be genetically unrelated. Our results demo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/11.2.305

    authors: Giri CP,Hodgson CP,Elder PK,Courtney MG,Getz MJ

    update_date:1983-01-25 00:00:00

  • Monitoring of chromatin organization in live cells by FRIC. Effects of the inner nuclear membrane protein Samp1.

    abstract::In most cells, transcriptionally inactive heterochromatin is preferentially localized in the nuclear periphery and transcriptionally active euchromatin is localized in the nuclear interior. Different cell types display characteristic chromatin distribution patterns, which change dramatically during cell differentiatio...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz123

    authors: Bergqvist C,Niss F,Figueroa RA,Beckman M,Maksel D,Jafferali MH,Kulyté A,Ström AL,Hallberg E

    update_date:2019-05-21 00:00:00

  • Replication of tomato golden mosaic virus DNA B in transgenic plants expressing open reading frames (ORFs) of DNA A: requirement of ORF AL2 for production of single-stranded DNA.

    abstract::Tomato golden mosaic geminivirus has a genome of two single-stranded (ss) DNA components, A and B. An almost identical 'common' region in DNA A and DNA B is thought to contain sequence elements controlling replication and transcription. Hence investigation of sequences important for DNA replication by in vitro mutagen...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/17.24.10213

    authors: Hayes RJ,Buck KW

    update_date:1989-12-25 00:00:00

  • AptaBlocks: Designing RNA complexes and accelerating RNA-based drug delivery systems.

    abstract::RNA-based therapeutics, i.e. the utilization of synthetic RNA molecules to alter cellular functions, have the potential to address targets which are currently out of scope for traditional drug design pipelines. This potential however hinges on the ability to selectively deliver and internalize therapeutic RNAs into ce...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky577

    authors: Wang Y,Hoinka J,Liang Y,Adamus T,Swiderski P,Przytycka TM

    update_date:2018-09-19 00:00:00

  • Antisense-induced ribosomal frameshifting.

    abstract::Programmed ribosomal frameshifting provides a mechanism to decode information located in two overlapping reading frames by diverting a proportion of translating ribosomes into a second open reading frame (ORF). The result is the production of two proteins: the product of standard translation from ORF1 and an ORF1-ORF2...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkl531

    authors: Henderson CM,Anderson CB,Howard MT

    update_date:2006-01-01 00:00:00

  • Fine structure mapping of an avian tumor virus RNA by immunoelectron microscopy.

    abstract::The RNA of a deleted strain (lacking Src gene) of an avian sarcoma virus (ASV) was examined by a newly developed immunoelectron microscopic procedure which uses anti-nucleotide antibodies as probes. After denaturation of the RNA and reaction with a high affinity, highly specific anti-7-methylguanosine-5'-phosphate (an...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/8.19.4485

    authors: Castleman H,Meredith RD,Erlanger BF

    update_date:1980-10-10 00:00:00

  • Programmable gene regulation for metabolic engineering using decoy transcription factor binding sites.

    abstract::Transcription factor decoy binding sites are short DNA sequences that can titrate a transcription factor away from its natural binding site, therefore regulating gene expression. In this study, we harness synthetic transcription factor decoy systems to regulate gene expression for metabolic pathways in Escherichia col...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkaa1234

    authors: Wang T,Tague N,Whelan SA,Dunlop MJ

    update_date:2021-01-25 00:00:00