Abstract:
The representation and discovery of transcription factor (TF) sequence binding specificities are critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consists of a set of aligned k-mers that are overrepresented at TF binding sites, and a new method called KMAC for de novo discovery of KSMs. We find that KSMs more accurately predict in vivo binding sites than position weight matrix (PWM) models and other more complex motif models across a large set of ChIP-seq experiments. Furthermore, KSMs outperform PWMs and more complex motif models in predicting in vitro binding sites. KMAC also identifies correct motifs in more experiments than five state-of-the-art motif discovery methods. In addition, KSM-derived features outperform both PWM and deep learning model derived sequence features in predicting differential regulatory activities of expression quantitative trait loci (eQTL) alleles. Finally, we have applied KMAC to 1600 ENCODE TF ChIP-seq data sets and created a public resource of KSM and PWM motifs. We expect that the KSM representation and KMAC method will be valuable in characterizing TF binding specificities and in interpreting the effects of noncoding genetic variations.
journal_name: Genome Res
journal_title: Genome research
authors: Guo Y, Tian K, Zeng H, Guo X, Gifford DK
doi: 10.1101/gr.226852.117
subject: Has Abstract
pub_date: 2018-06-01 00:00:00
pages: 891-900
issue: 6
eissn: 1549-5469
issn: 1088-9051
pii: gr.226852.117
journal_volume: 28
pub_type: Journal article
Related literature
Genome Research literature collection
abstract::Cellular senescence is a mechanism that virtually irreversibly suppresses the proliferative capacity of cells in response to various stress signals. This includes the expression of activated oncogenes, which causes Oncogene-Induced Senescence (OIS). A body of evidence points to the involvement in OIS of chromatin reor...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.225763.117
update_date: 2017-10-01 00:00:00
abstract::We report the construction of a dense linkage map of the rat genome integrating 767 simple sequence length polymorphism markers, combined over three crosses with high rates of polymorphism. F2 populations from WKY x S (n = 159), BN x S (n = 91), and BN x GK (n = 139) were selected and genotyped for combinations of mic...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.7.5.434
update_date: 1997-05-01 00:00:00
abstract::Large-scale sequencing of human and model organism genomes will have a profound impact on our ability to use sequence data base searching to predict the biochemical functions of sequences of interest. Despite the great value of more sequences in the data bases, a huge increase in data base size will also have adverse ...
journal_title:Genome research
pub_type: Journal article, Review
doi:10.1101/gr.6.8.653
update_date: 1996-08-01 00:00:00
abstract::Identifying genes in the genomic context is central to a cell's ability to interpret the genome. Yet, in general, the signals used to define eukaryotic genes are poorly described. Here, we derived simple classifiers that identify where transcription will initiate and terminate using nucleic acid sequence features dete...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.164327.113
update_date: 2014-01-01 00:00:00
abstract::A small set of core transcription factors (TFs) dominates control of the gene expression program in embryonic stem cells and other well-studied cellular models. These core TFs collectively regulate their own gene expression, thus forming an interconnected auto-regulatory loop that can be considered the core transcript...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.197590.115
update_date: 2016-03-01 00:00:00
abstract::We have established a landmark framework map over 20-25 Mb of the long arm of the human X chromosome using yeast artificial chromosome (YAC) clones. The map has approximately one landmark per 45 kb of DNA and stretches from DXS7531 in proximal Xq23 to DXS895 in proximal Xq26, connecting to published framework maps on ...
journal_title:Genome research
pub_type: Journal article
doi:
update_date: 1999-08-01 00:00:00
abstract::DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.183749.114
update_date: 2015-06-01 00:00:00
abstract::We report on the development of a methylation analysis workflow for optical detection of fluorescent methylation profiles along chromosomal DNA molecules. In combination with Bionano Genomics genome mapping technology, these profiles provide a hybrid genetic/epigenetic genome-wide map composed of DNA molecules spannin...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.240739.118
update_date: 2019-04-01 00:00:00
abstract::In this study we quantify the features of meiotic recombination on the long arm of human chromosome 21. We constructed a 67.3-centimorgan (cM) high-resolution, comprehensive, and accurate genetic linkage map of chromosome 21q using 187 highly polymorphic markers covering almost the entire long arm; 46 loci, consistin...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.138100
update_date: 2000-09-01 00:00:00
abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.136549.111
update_date: 2012-10-01 00:00:00
abstract::Advanced prostate cancer can progress to systemic metastatic tumors, which are generally androgen insensitive and ultimately lethal. Here, we report a comprehensive genomic survey for somatic events in systemic metastatic prostate tumors using both high-resolution copy number analysis and targeted mutational survey of...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.107961.110
update_date: 2011-01-01 00:00:00
abstract::Coevolution maintains interactions between phenotypic traits through the process of reciprocal natural selection. Detecting molecular coevolution can expose functional interactions between molecules in the cell, generating insights into biological processes, pathways, and the networks of interactions important for cel...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.092452.109
update_date: 2009-10-01 00:00:00
abstract::Histone modifications are now well-established mediators of transcriptional programs that distinguish cell states. However, the kinetics of histone modification and their role in mediating rapid, signal-responsive gene expression changes has been little studied on a genome-wide scale. Vascular endothelial growth facto...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.149674.112
update_date: 2013-06-01 00:00:00
abstract::Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.3176505
update_date: 2005-04-01 00:00:00
abstract::The centromere is the structural unit responsible for the faithful segregation of chromosomes. Although regulation of centromeric function by epigenetic factors has been well-studied, the contributions of the underlying DNA sequences have been much less well defined, and existing methodologies for studying centromere ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.219709.116
update_date: 2017-12-01 00:00:00
abstract::DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexit...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.313703
update_date: 2003-02-01 00:00:00
abstract::Forty-three yeast artificial chromosomes (YACs) from the X chromosome have been overlapped across the 4-Mb Xq21.3 region, which is homologous to a segment in Yp11.1. The region is formatted to 60-kb resolution with 57 STSs and is merged at its edges with contigs specific for X. This allows a direct comparison of marke...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.7.4.307
update_date: 1997-04-01 00:00:00
abstract::When transcription is to the right of the promoter, the "top," mRNA-synonymous strand of DNA tends to be purine-rich. When transcription is to the left of the promoter, the top, mRNA-template strand tends to be pyrimidine-rich. This transcription-direction rule suggests that there has been an evolutionary selection pr...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.10.2.228
update_date: 2000-02-01 00:00:00
abstract::The regulation of gene expression is mediated at the transcriptional level by enhancer regions that are bound by sequence-specific transcription factors (TFs). Recent studies have shown that the in vivo binding sites of single TFs differ between developmental or cellular contexts. How this context-specific binding is ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.132811.111
update_date: 2012-10-01 00:00:00
abstract::We have cloned the human gene encoding the transcription factor T. T protein is vital for the formation of posterior mesoderm and axial development in all vertebrates. Brachyury mutant mice, which lack T protein, die in utero with abnormal notochord, posterior somites, and allantois. We have identified human T genomic...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6.3.226
update_date: 1996-03-01 00:00:00
abstract::A total of 202 genes were cytogenetically mapped to goat chromosomes, multiplying by five the total number of regional gene localizations in domestic ruminants (255). This map encompasses 249 and 173 common anchor loci regularly spaced along human and murine chromosomes, respectively, which makes it possible to perfor...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.8.9.901
update_date: 1998-09-01 00:00:00
abstract::Hammerhead ribozymes previously were found in satellite RNAs from plant viroids and in repetitive DNA from certain species of newts and schistosomes. To determine if this catalytic RNA motif has a wider distribution, we decided to scrutinize the GenBank database for RNAs that contain hammerhead or hammerhead-like moti...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.10.7.1011
update_date: 2000-07-01 00:00:00
abstract::Random spontaneous genome rearrangements are difficult to detect in vivo, especially in postmitotic tissues. Using a lacZ-plasmid reporter mouse model, we have previously presented evidence for the accumulation of large genome rearrangements in various tissues, including postmitotic tissues, during aging. These rearra...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.125502
update_date: 2002-11-01 00:00:00
abstract::The main objectives of the study reported here were to construct a molecular map of wild emmer wheat, Triticum dicoccoides, to characterize the marker-related anatomy of the genome, and to evaluate segregation and recombination patterns upon crossing T. dicoccoides with its domesticated descendant Triticum durum (cult...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.150300
update_date: 2000-10-01 00:00:00
abstract::Despite claims that the mammalian Y Chromosome is on a path to extinction, comparative sequence analysis of primate Y Chromosomes has shown the decay of the ancestral single-copy genes has all but ceased in this eutherian lineage. The suite of single-copy Y-linked genes is highly conserved among the majority of euther...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.237586.118
update_date: 2018-12-01 00:00:00
abstract::Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.236075.118
update_date: 2019-02-01 00:00:00
abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications, from answering questions about human evolution to locating regions in the human genome containing di...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.115360.110
update_date: 2011-07-01 00:00:00
abstract::Large terminal fragments of human chromosomes 2p, 6p, 8q, 12q, and 18q were cloned using yeast artificial chromosomes (YACs). RecA-assisted restriction endonuclease (RARE) cleavage analysis of genomic DNA samples from 11 unrelated individuals using YAC-derived probes confirmed the telomeric localizations of the half-Y...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.5.3.225
update_date: 1995-10-01 00:00:00
abstract::Oncoviral infection is responsible for 12%-15% of cancer in humans. Convergent evidence from epidemiology, pathology, and oncology suggests that new viral etiologies for cancers remain to be discovered. Oncoviral profiles can be obtained from cancer genome sequencing data; however, widespread viral sequence contaminat...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.242529.118
update_date: 2019-05-01 00:00:00
abstract::While copy number variation (CNV) is an active area of research, de novo mutation rates within human populations are not well characterized. By focusing on large (>100 kbp) events, we estimate the rate of de novo CNV formation in humans by analyzing 4394 transmissions from human pedigrees with and without neurocogniti...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.107680.110
update_date: 2010-11-01 00:00:00