Abstract:
Promoter design remains one of the most important considerations in metabolic engineering and synthetic biology applications. Theoretically, there are 4^50 possible sequences for a 50-nt promoter, of which naturally occurring promoters make up only a small subset. To explore this vast space of potential sequences, we report a novel AI-based framework for de novo promoter design in Escherichia coli. The model, which was guided by sequence features learned from natural promoters, could capture interactions between nucleotides at different positions and design novel synthetic promoters in silico. We combined a deep generative model that guides the search for artificial sequences with a predictive model that preselects the most promising promoters. The AI-designed promoters were optimized based on promoter activity in E. coli and on the predictive model. After two rounds of optimization, up to 70.8% of the AI-designed promoters were experimentally demonstrated to be functional, and few of them shared significant sequence similarity with the E. coli genome. Our work provides an end-to-end approach to the de novo design of novel promoter elements, indicating the potential of deep learning methods for de novo genetic element design.
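The size of the sequence space quoted in the abstract follows from four possible nucleotides at each of 50 positions. A minimal sketch of that arithmetic is given below; the function name `sequence_space` and the constant `ALPHABET` are illustrative and do not come from the paper.

```python
# Illustrative arithmetic only: counts all possible sequences of a given
# length over the DNA alphabet. Names here are hypothetical, not the authors'.
ALPHABET = "ACGT"


def sequence_space(length: int, alphabet: str = ALPHABET) -> int:
    """Return the number of distinct sequences of `length` over `alphabet`."""
    return len(alphabet) ** length


n = sequence_space(50)
print(n)           # exact integer count for a 50-nt promoter
print(f"{n:.2e}")  # the same count in scientific notation
```

The count is roughly 1.27 x 10^30, which is why exhaustive screening is impractical and a generative model is used to propose candidates.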
journal_name: Nucleic Acids Res
journal_title: Nucleic acids research
authors: Wang Y, Wang H, Wei L, Li S, Liu L, Wang X
doi: 10.1093/nar/gkaa325
subject: Has Abstract
pub_date: 2020-07-09 00:00:00
pages: 6403-6412
issue: 12
eissn: 0305-1048
issn: 1362-4962
pii: 5837049
journal_volume: 48
pub_type: Journal Article
abstract::Murine erythroleukemic cells in culture accumulate cytoplasmic globin mRNA during differentiation induced by dimethyl sulfoxide (DMSO)1. Chromatin was prepared from DMSO induced erythroleukemic cells that were transcribing globin RNA in order to determine whether in vitro synthesis of globin RNA sequences was possible...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/6.1.275
update_date:1979-01-01 00:00:00
abstract::Synthetic biology seeks to envision living cells as a matter of engineering. However, increasing evidence suggests that the genetic load imposed by the incorporation of synthetic devices in a living organism introduces a sort of unpredictability in the design process. As a result, individual part characterization is n...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkv1280
update_date:2016-01-08 00:00:00
abstract::Previous evidence indicates that telomeres resemble common fragile sites and present a challenge for DNA replication. The precise impediments to replication fork progression at telomeric TTAGGG repeats are unknown, but are proposed to include G-quadruplexes (G4) on the G-rich strand. Here we examined DNA synthesis and...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkt813
update_date:2013-12-01 00:00:00
abstract::ADGO 2.0 is a web-based tool that provides composite interpretations for microarray data comparing two sample groups as well as lists of genes from diverse sources of biological information. Some other tools also incorporate composite annotations solely for interpreting lists of genes but usually provide highly redund...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkr392
update_date:2011-07-01 00:00:00
abstract::We report a crystal structure that shows an antibiotic that extracts a nucleobase from a DNA molecule 'caught in the act' after forming a covalent bond but before departing with the base. The structure of trioxacarcin A covalently bound to double-stranded d(AACCGGTT) was determined to 1.78 Å resolution by MAD phasing ...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkn245
update_date:2008-06-01 00:00:00
abstract::The RES complex was previously identified in yeast as a splicing factor affecting nuclear pre-mRNA retention. This complex was shown to contain three subunits, namely Snu17, Bud13 and Pml1, but its mode of action remains ill-defined. To obtain insights into its function, we have performed a structural investigation of...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkn894
update_date:2009-01-01 00:00:00
abstract::We have isolated an eight kilobase fragment of Bacillus subtilis DNA by specific integration and excision of a plasmid containing a sequence adjacent to ribosomal operon rrn O. The genetic locus of the cloned fragment was verified by linkage of the integrated vector to nearby genetic markers using both transduction an...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/12.15.6307
update_date:1984-08-10 00:00:00
abstract::The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its disc...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gku1129
update_date:2015-01-01 00:00:00
abstract::Ribosome-inactivating proteins (RIP) are a family of plant enzymes for which a unique activity was determined: rRNA N-glycosidase at a specific universally conserved position, A4324 in the case of rat ribosomes. Recently we have shown that the RIP from Saponaria officinalis have a much wider substrate specificity: they ...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/25.3.518
update_date:1997-02-01 00:00:00
abstract::The unsymmetrical cyanine dye thiazole orange homodimer (TOTO) binds to single-stranded DNA (ssDNA, M13mp18 ssDNA) to form a fluorescent complex that is stable under the standard conditions of electrophoresis. The stability of this complex is indistinguishable from that of the corresponding complex of TOTO with double...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/23.7.1215
update_date:1995-04-11 00:00:00
abstract::The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map. This brings the total of contiguous sequence from the E.coli genome project to 725.1 kb (76.0 to 92.8 minutes). We found 191 putative coding genes (ORFs) of which 72 genes w...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/22.13.2576
update_date:1994-07-11 00:00:00
abstract::Fully modified 4'-thioDNA, an oligonucleotide only comprising 2'-deoxy-4'-thionucleosides, exhibited resistance to an endonuclease, in addition to preferable hybridization with RNA. Therefore, 4'-thioDNA is promising for application as a functional oligonucleotide. Fully modified 4'-thioDNA was found to behave like an...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkn011
update_date:2008-04-01 00:00:00
abstract::The bacterial transcription terminator, Rho, terminates transcription at half of the operons. According to the classical model derived from in vitro assays on a few terminators, Rho is recruited to the transcription elongation complex (EC) by recognizing specific sites (rut) on the nascent RNA. Here, we explored the m...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gku690
update_date:2014-09-01 00:00:00
abstract::Fully protected diastereoisomers of deoxyguanylyl (3'→5') deoxyadenosine stereospecifically labelled on phosphorus with oxygen-18 have been synthesized by oxidation of phosphite triester intermediates in the presence of 18O-labelled water. The diastereoisomers have been chromatographically separated and their...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/11.20.7087
update_date:1983-10-25 00:00:00
abstract::In two-colour microarrays, the ratio of signal intensities of two co-hybridized samples is used as a relative measure of gene expression. Ratio-based analysis becomes complicated and inefficient in multi-class comparisons. We therefore investigated the validity of an intensity-based analysis procedure. To this end, tw...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gnh038
update_date:2004-02-24 00:00:00
abstract::A new method which predicts internal exon sequences in human DNA has been developed. The method is based on a splice site prediction algorithm that uses the linear discriminant function to combine information about significant triplet frequencies of various functional parts of splice site regions and preferences of ol...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/22.24.5156
update_date:1994-12-11 00:00:00
abstract::Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which make experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimen...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkl707
update_date:2006-01-01 00:00:00
abstract::Complete genomes of three closely related Gram-positive bacteria Streptococcus pyogenes, Streptococcus pneumoniae and Lactococcus lactis are analyzed for abundances of short DNA sequence motifs (frequent words). The character and extent of frequent words are strikingly different among these genomes. The frequent words...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkf534
update_date:2002-10-01 00:00:00
abstract::The interactions between beef tRNATrp with avian myeloblastosis reverse transcriptase have been studied by statistical chemical modifications of phosphate (ethylnitrosourea) and cytidine (dimethyl sulfate) residues, as well as by digestion of complexed tRNA by Cobra venom nuclease and Neurospora crassa endonuclease. R...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/12.5.2259
update_date:1984-03-12 00:00:00
abstract::Resistance to androgen receptor (AR)-targeted therapies in prostate cancer (PC) is a major clinical problem. A key mechanism of treatment resistance in advanced PC is the generation of alternatively spliced forms of the AR termed AR variants (AR-Vs) that are refractory to targeted agents and drive tumour progression. ...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkz286
update_date:2019-06-20 00:00:00
abstract::The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt on a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www....
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/29.1.22
update_date:2001-01-01 00:00:00
abstract::New transcripts generated by RNA polymerase II (RNAPII) are generally processed in order to form mature mRNAs. Two key processing steps include a precise cleavage within the 3' end of the pre-mRNA, and the subsequent polymerization of adenosines to produce the poly(A) tail. In yeast, these two functions are performed ...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkx674
update_date:2017-09-29 00:00:00
abstract::The Dnd (DNA degradation) phenotype, reflecting a novel DNA modification by sulfur in Streptomyces lividans 1326, was strongly aggravated when one (dndB) of the five genes (dndABCDE) controlling it was mutated. Electrophoretic banding patterns of a plasmid (pHZ209), reflecting DNA degradation, displayed a clear change...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkm176
update_date:2007-01-01 00:00:00
abstract::Natural transformation is a major mechanism of horizontal gene transfer in bacteria that depends on DNA recombination. RecA is central to the homologous recombination pathway, catalyzing DNA strand invasion and homology search. DprA was shown to be a key binding partner of RecA acting as a specific mediator for its lo...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gku349
update_date:2014-06-01 00:00:00
abstract::Lambda exonuclease processively degrades one strand of double-stranded DNA (dsDNA) in the 5'-3' direction. To understand the mechanism through which this enzyme generates high processivity we are analyzing the first step in the reaction, namely the interaction of lambda exonuclease with the ends of substrate DNA. Endo...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/27.15.3057
update_date:1999-08-01 00:00:00
abstract::Escherichia coli has long been regarded as a model organism in the study of codon usage bias (CUB). However, most studies in this organism regarding this topic have been computational or, when experimental, restricted to small datasets; particularly poor attention has been given to genes with low CUB. In this work, co...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkg897
update_date:2003-12-01 00:00:00
abstract::The multi-protein complex WRAD, formed by WDR5, RbBP5, Ash2L and Dpy30, binds to the MLL SET domain to stabilize the catalytically active conformation required for histone H3K4 methylation. In addition, the WRAD complex contributes to the targeting of the activated complex to specific sites on chromatin. RbBP5 is cent...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gky199
update_date:2018-04-20 00:00:00
abstract::It has been proposed that phage lambda site-specific recombination proceeds via two independent strand exchanges: the first exchange forming a Holliday-structure which is then converted into complete recombinant products by the second strand exchange. If this hypothesis is correct, one should be able to trap the putat...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/16.14.6839
update_date:1988-07-25 00:00:00
abstract::dITP may be generated from dATP by a slow, nonenzymatic hydrolysis. While [3H]dITP was degraded rapidly to [3H]deoxyinosine by HeLa cell nuclear extracts, no net degradation of [3H]dITP was observed in the presence of physiological concentrations of ATP, apparently because the extract contained deoxynucleoside diphosp...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/10.12.3693
update_date:1982-06-25 00:00:00
abstract::DNA mismatches that occur between vector homology arms and chromosomal target sequences reduce gene targeting frequencies in several species; however, this has not been reported in human cells. Here we demonstrate that even a single mismatched base pair can significantly decrease human gene targeting frequencies. In a...
journal_title:Nucleic acids research
pub_type: Journal Article
doi:10.1093/nar/gkt1303
update_date:2014-03-01 00:00:00