Synthetic promoter design in Escherichia coli based on a deep generative network.

Abstract:

Promoter design remains one of the most important considerations in metabolic engineering and synthetic biology applications. Theoretically, there are 4^50 possible sequences for a 50-nt promoter, of which naturally occurring promoters make up only a small subset. To explore the vast number of potential sequences, we report a novel AI-based framework for de novo promoter design in Escherichia coli. The model, which was guided by sequence features learned from natural promoters, could capture interactions between nucleotides at different positions and design novel synthetic promoters in silico. We combined a deep generative model that guides the search for artificial sequences with a predictive model to preselect the most promising promoters. The AI-designed promoters were optimized based on the promoter activity in E. coli and the predictive model. After two rounds of optimization, up to 70.8% of the AI-designed promoters were experimentally demonstrated to be functional, and few of them shared significant sequence similarity with the E. coli genome. Our work provided an end-to-end approach to the de novo design of novel promoter elements, indicating the potential to apply deep learning methods to de novo genetic element design.
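
As a rough illustration of the scale mentioned in the abstract, the number of possible sequences for a promoter of a given length can be computed directly. This is a minimal sketch and not part of the published work; the function name `sequence_space_size` is hypothetical and chosen only for this example.

```python
def sequence_space_size(length: int, alphabet_size: int = 4) -> int:
    """Count distinct sequences of `length` over an alphabet (A, C, G, T by default)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


# A 50-nt promoter drawn from four nucleotides
n = sequence_space_size(50)
print(f"{n:.3e}")
```

For a 50-nt promoter this gives 4^50, which equals 2^100, or roughly 1.27 × 10^30 sequences.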

journal_name

Nucleic Acids Res

journal_title

Nucleic acids research

authors

Wang Y,Wang H,Wei L,Li S,Liu L,Wang X

doi

10.1093/nar/gkaa325

subject

Has Abstract

pub_date

2020-07-09 00:00:00

pages

6403-6412

issue

12

eissn

1362-4962

issn

0305-1048

pii

5837049

journal_volume

48

pub_type

Journal Article
  • In vitro DNA dependent synthesis of globin RNA sequences from erythroleukemic cell chromatin.

    abstract::Murine erythroleukemic cells in culture accumulate cytoplasmic globin mRNA during differentiation induced by dimethyl sulfoxide (DMSO). Chromatin was prepared from DMSO induced erythroleukemic cells that were transcribing globin RNA in order to determine whether in vitro synthesis of globin RNA sequences was possible...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/6.1.275

    authors: Reff ME,Davidson RL

    update_date:1979-01-01 00:00:00

  • Dealing with the genetic load in bacterial synthetic biology circuits: convergences with the Ohm's law.

    abstract::Synthetic biology seeks to envision living cells as a matter of engineering. However, increasing evidence suggests that the genetic load imposed by the incorporation of synthetic devices in a living organism introduces a sort of unpredictability in the design process. As a result, individual part characterization is n...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkv1280

    authors: Carbonell-Ballestero M,Garcia-Ramallo E,Montañez R,Rodriguez-Caso C,Macía J

    update_date:2016-01-08 00:00:00

  • DNA polymerase δ stalls on telomeric lagging strand templates independently from G-quadruplex formation.

    abstract::Previous evidence indicates that telomeres resemble common fragile sites and present a challenge for DNA replication. The precise impediments to replication fork progression at telomeric TTAGGG repeats are unknown, but are proposed to include G-quadruplexes (G4) on the G-rich strand. Here we examined DNA synthesis and...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt813

    authors: Lormand JD,Buncher N,Murphy CT,Kaur P,Lee MY,Burgers P,Wang H,Kunkel TA,Opresko PL

    update_date:2013-12-01 00:00:00

  • ADGO 2.0: interpreting microarray data and list of genes using composite annotations.

    abstract::ADGO 2.0 is a web-based tool that provides composite interpretations for microarray data comparing two sample groups as well as lists of genes from diverse sources of biological information. Some other tools also incorporate composite annotations solely for interpreting lists of genes but usually provide highly redund...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkr392

    authors: Chi SM,Kim J,Kim SY,Nam D

    update_date:2011-07-01 00:00:00

  • Crystal structure of trioxacarcin A covalently bound to DNA.

    abstract::We report a crystal structure that shows an antibiotic that extracts a nucleobase from a DNA molecule 'caught in the act' after forming a covalent bond but before departing with the base. The structure of trioxacarcin A covalently bound to double-stranded d(AACCGGTT) was determined to 1.78 Å resolution by MAD phasing ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn245

    authors: Pfoh R,Laatsch H,Sheldrick GM

    update_date:2008-06-01 00:00:00

  • Structure of the yeast Pml1 splicing factor and its integration into the RES complex.

    abstract::The RES complex was previously identified in yeast as a splicing factor affecting nuclear pre-mRNA retention. This complex was shown to contain three subunits, namely Snu17, Bud13 and Pml1, but its mode of action remains ill-defined. To obtain insights into its function, we have performed a structural investigation of...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn894

    authors: Brooks MA,Dziembowski A,Quevillon-Cheruel S,Henriot V,Faux C,van Tilbeurgh H,Séraphin B

    update_date:2009-01-01 00:00:00

  • Cloning the gyrA gene of Bacillus subtilis.

    abstract::We have isolated an eight kilobase fragment of Bacillus subtilis DNA by specific integration and excision of a plasmid containing a sequence adjacent to ribosomal operon rrn O. The genetic locus of the cloned fragment was verified by linkage of the integrated vector to nearby genetic markers using both transduction an...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/12.15.6307

    authors: Lampe MF,Bott KF

    update_date:1984-08-10 00:00:00

  • Content discovery and retrieval services at the European Nucleotide Archive.

    abstract::The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its disc...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku1129

    authors: Silvester N,Alako B,Amid C,Cerdeño-Tárraga A,Cleland I,Gibson R,Goodgame N,Ten Hoopen P,Kay S,Leinonen R,Li W,Liu X,Lopez R,Pakseresht N,Pallreddy S,Plaister S,Radhakrishnan R,Rossello M,Senf A,Smirnov D,Toribio A

    update_date:2015-01-01 00:00:00

  • Polynucleotide:adenosine glycosidase activity of ribosome-inactivating proteins: effect on DNA, RNA and poly(A).

    abstract::Ribosome-inactivating proteins (RIP) are a family of plant enzymes for which a unique activity was determined: rRNA N-glycosidase at a specific universally conserved position, A4324 in the case of rat ribosomes. Recently we have shown that the RIP from Saponaria officinalis have a much wider substrate specificity: they ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/25.3.518

    authors: Barbieri L,Valbonesi P,Bonora E,Gorini P,Bolognesi A,Stirpe F

    update_date:1997-02-01 00:00:00

  • Interaction of dimeric intercalating dyes with single-stranded DNA.

    abstract::The unsymmetrical cyanine dye thiazole orange homodimer (TOTO) binds to single-stranded DNA (ssDNA, M13mp18 ssDNA) to form a fluorescent complex that is stable under the standard conditions of electrophoresis. The stability of this complex is indistinguishable from that of the corresponding complex of TOTO with double...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/23.7.1215

    authors: Rye HS,Glazer AN

    update_date:1995-04-11 00:00:00

  • Analysis of the Escherichia coli genome. V. DNA sequence of the region from 76.0 to 81.5 minutes.

    abstract::The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map. This brings the total of contiguous sequence from the E.coli genome project to 725.1 kb (76.0 to 92.8 minutes). We found 191 putative coding genes (ORFs) of which 72 genes w...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.13.2576

    authors: Sofia HJ,Burland V,Daniels DL,Plunkett G 3rd,Blattner FR

    update_date:1994-07-11 00:00:00

  • Unexpected A-form formation of 4'-thioDNA in solution, revealed by NMR, and the implications as to the mechanism of nuclease resistance.

    abstract::Fully modified 4'-thioDNA, an oligonucleotide only comprising 2'-deoxy-4'-thionucleosides, exhibited resistance to an endonuclease, in addition to preferable hybridization with RNA. Therefore, 4'-thioDNA is promising for application as a functional oligonucleotide. Fully modified 4'-thioDNA was found to behave like an...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkn011

    authors: Matsugami A,Ohyama T,Inada M,Inoue N,Minakawa N,Matsuda A,Katahira M

    update_date:2008-04-01 00:00:00

  • Redundancy of primary RNA-binding functions of the bacterial transcription terminator Rho.

    abstract::The bacterial transcription terminator, Rho, terminates transcription at half of the operons. According to the classical model derived from in vitro assays on a few terminators, Rho is recruited to the transcription elongation complex (EC) by recognizing specific sites (rut) on the nascent RNA. Here, we explored the m...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku690

    authors: Shashni R,Qayyum MZ,Vishalini V,Dey D,Sen R

    update_date:2014-09-01 00:00:00

  • A stereospecifically 18O-labelled deoxydinucleoside phosphate block for incorporation into an oligonucleotide.

    abstract::Fully protected diastereoisomers of deoxyguanylyl (3'→5') deoxyadenosine stereospecifically labelled on phosphorus with oxygen-18 have been synthesized by oxidation of phosphite triester intermediates in the presence of 18O-labelled water. The diastereoisomers have been chromatographically separated and their...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/11.20.7087

    authors: Potter BV,Eckstein F,Uznański B

    update_date:1983-10-25 00:00:00

  • Intensity-based analysis of two-colour microarrays enables efficient and flexible hybridization designs.

    abstract::In two-colour microarrays, the ratio of signal intensities of two co-hybridized samples is used as a relative measure of gene expression. Ratio-based analysis becomes complicated and inefficient in multi-class comparisons. We therefore investigated the validity of an intensity-based analysis procedure. To this end, tw...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gnh038

    authors: 't Hoen PA,Turk R,Boer JM,Sterrenburg E,de Menezes RX,van Ommen GJ,den Dunnen JT

    update_date:2004-02-24 00:00:00

  • Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames.

    abstract::A new method which predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses the linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of ol...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/22.24.5156

    authors: Solovyev VV,Salamov AA,Lawrence CB

    update_date:1994-12-11 00:00:00

  • Analysis of protein sequence and interaction data for candidate disease gene prediction.

    abstract::Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimen...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkl707

    authors: George RA,Liu JY,Feng LL,Bryson-Richardson RJ,Fatkin D,Wouters MA

    update_date:2006-01-01 00:00:00

  • Frequent oligonucleotide motifs in genomes of three streptococci.

    abstract::Complete genomes of three closely related Gram-positive bacteria Streptococcus pyogenes, Streptococcus pneumoniae and Lactococcus lactis are analyzed for abundances of short DNA sequence motifs (frequent words). The character and extent of frequent words are strikingly different among these genomes. The frequent words...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkf534

    authors: Mrázek J,Gaynon LH,Karlin S

    update_date:2002-10-01 00:00:00

  • Interactions between avian myeloblastosis reverse transcriptase and tRNATrp. Mapping of complexed tRNA with chemicals and nucleases.

    abstract::The interactions between beef tRNATrp with avian myeloblastosis reverse transcriptase have been studied by statistical chemical modifications of phosphate (ethylnitrosourea) and cytidine (dimethyl sulfate) residues, as well as by digestion of complexed tRNA by Cobra venom nuclease and Neurospora crassa endonuclease. R...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/12.5.2259

    authors: Garret M,Romby P,Giegé R,Litvak S

    update_date:1984-03-12 00:00:00

  • A novel CRISPR-engineered prostate cancer cell line defines the AR-V transcriptome and identifies PARP inhibitor sensitivities.

    abstract::Resistance to androgen receptor (AR)-targeted therapies in prostate cancer (PC) is a major clinical problem. A key mechanism of treatment resistance in advanced PC is the generation of alternatively spliced forms of the AR termed AR variants (AR-Vs) that are refractory to targeted agents and drive tumour progression. ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkz286

    authors: Kounatidou E,Nakjang S,McCracken SRC,Dehm SM,Robson CN,Jones D,Gaughan L

    update_date:2019-06-20 00:00:00

  • The COG database: new developments in phylogenetic classification of proteins from complete genomes.

    abstract::The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www....

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/29.1.22

    authors: Tatusov RL,Natale DA,Garkavtsev IV,Tatusova TA,Shankavaram UT,Rao BS,Kiryutin B,Galperin MY,Fedorova ND,Koonin EV

    update_date:2001-01-01 00:00:00

  • Distinct roles of Pcf11 zinc-binding domains in pre-mRNA 3'-end processing.

    abstract::New transcripts generated by RNA polymerase II (RNAPII) are generally processed in order to form mature mRNAs. Two key processing steps include a precise cleavage within the 3' end of the pre-mRNA, and the subsequent polymerization of adenosines to produce the poly(A) tail. In yeast, these two functions are performed ...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkx674

    authors: Guéguéniat J,Dupin AF,Stojko J,Beaurepaire L,Cianférani S,Mackereth CD,Minvielle-Sébastia L,Fribourg S

    update_date:2017-09-29 00:00:00

  • DNA modification by sulfur: analysis of the sequence recognition specificity surrounding the modification sites.

    abstract::The Dnd (DNA degradation) phenotype, reflecting a novel DNA modification by sulfur in Streptomyces lividans 1326, was strongly aggravated when one (dndB) of the five genes (dndABCDE) controlling it was mutated. Electrophoretic banding patterns of a plasmid (pHZ209), reflecting DNA degradation, displayed a clear change...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkm176

    authors: Liang J,Wang Z,He X,Li J,Zhou X,Deng Z

    update_date:2007-01-01 00:00:00

  • Molecular determinants of the DprA-RecA interaction for nucleation on ssDNA.

    abstract::Natural transformation is a major mechanism of horizontal gene transfer in bacteria that depends on DNA recombination. RecA is central to the homologous recombination pathway, catalyzing DNA strand invasion and homology search. DprA was shown to be a key binding partner of RecA acting as a specific mediator for its lo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gku349

    authors: Lisboa J,Andreani J,Sanchez D,Boudes M,Collinet B,Liger D,van Tilbeurgh H,Guérois R,Quevillon-Cheruel S

    update_date:2014-06-01 00:00:00

  • Characterization of the interaction of lambda exonuclease with the ends of DNA.

    abstract::Lambda exonuclease processively degrades one strand of double-stranded DNA (dsDNA) in the 5'-3' direction. To understand the mechanism through which this enzyme generates high processivity we are analyzing the first step in the reaction, namely the interaction of lambda exonuclease with the ends of substrate DNA. Endo...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/27.15.3057

    authors: Mitsis PG,Kwagh JG

    update_date:1999-08-01 00:00:00

  • Unexpected correlations between gene expression and codon usage bias from microarray data for the whole Escherichia coli K-12 genome.

    abstract::Escherichia coli has long been regarded as a model organism in the study of codon usage bias (CUB). However, most studies in this organism regarding this topic have been computational or, when experimental, restricted to small datasets; particularly poor attention has been given to genes with low CUB. In this work, co...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkg897

    authors: dos Reis M,Wernisch L,Savva R

    update_date:2003-12-01 00:00:00

  • The structure of the RbBP5 β-propeller domain reveals a surface with potential nucleic acid binding sites.

    abstract::The multi-protein complex WRAD, formed by WDR5, RbBP5, Ash2L and Dpy30, binds to the MLL SET domain to stabilize the catalytically active conformation required for histone H3K4 methylation. In addition, the WRAD complex contributes to the targeting of the activated complex to specific sites on chromatin. RbBP5 is cent...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gky199

    authors: Mittal A,Hobor F,Zhang Y,Martin SR,Gamblin SJ,Ramos A,Wilson JR

    update_date:2018-04-20 00:00:00

  • An intermediate in the phage lambda site-specific recombination reaction is revealed by phosphorothioate substitution in DNA.

    abstract::It has been proposed that phage lambda site-specific recombination proceeds via two independent strand exchanges: the first exchange forming a Holliday-structure which is then converted into complete recombinant products by the second strand exchange. If this hypothesis is correct, one should be able to trap the putat...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/16.14.6839

    authors: Kitts PA,Nash HA

    update_date:1988-07-25 00:00:00

  • Metabolism of dITP in HeLa cell extracts, incorporation into DNA by isolated nuclei and release of hypoxanthine from DNA by a hypoxanthine-DNA glycosylase activity.

    abstract::dITP may be generated from dATP by a slow, nonenzymatic hydrolysis. While [3H]dITP was degraded rapidly to [3H]deoxyinosine by HeLa cell nuclear extracts, no net degradation of [3H]dITP was observed in the presence of physiological concentrations of ATP, apparently because the extract contained deoxynucleoside diphosp...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/10.12.3693

    authors: Myrnes B,Guddal PH,Krokan H

    update_date:1982-06-25 00:00:00

  • The effects of polymorphisms on human gene targeting.

    abstract::DNA mismatches that occur between vector homology arms and chromosomal target sequences reduce gene targeting frequencies in several species; however, this has not been reported in human cells. Here we demonstrate that even a single mismatched base pair can significantly decrease human gene targeting frequencies. In a...

    journal_title:Nucleic acids research

    pub_type: Journal Article

    doi:10.1093/nar/gkt1303

    authors: Deyle DR,Li LB,Ren G,Russell DW

    update_date:2014-03-01 00:00:00