Abstract:
A rigorous analysis of the Merck-sponsored EST data with respect to known gene sequences increases the utility of the data set and helps refine methods for building a gene index. A highly curated human transcript data base was used as a reference data set of known genes. A detailed analysis of EST sequences derived from known genes was performed to assess the accuracy of EST sequence annotation. The EST data was screened to remove low-quality and low-complexity sequences. A set of high-quality ESTs similar to the transcript data base was identified using BLAST; this subset of ESTs was compared with the set of known genes using the Smith-Waterman algorithm. Error rates of several types were assessed based on a flexible match criterion defining sequence identity. The rate of lane-tracking errors is very low, approximately 0.5%. Insert size data is accurate within approximately 20%. Reversed clone and internal priming error rates are approximately 5% and 2.5%, respectively, contributing to the incorrect identification of reads as 3' ends of genes. Follow-up investigation reveals that a significant number of clones, miscategorized as reversed, represent overlapping genes on the opposite strand of entries in the transcript data base. Relevance of these results to the creation of a high-quality index to the human genome capable of supporting diverse genomic investigations is discussed.
journal_name: Genome Res
journal_title: Genome research
authors: Aaronson JS, Eckman B, Blevins RA, Borkowski JA, Myerson J, Imran S, Elliston KO
doi: 10.1101/gr.6.9.829
subject: Has Abstract
pub_date: 1996-09-01 00:00:00
pages: 829-45
issue: 9
eissn: 1088-9051
issn: 1549-5469
journal_volume: 6
pub_type: Journal article
Related literature
Genome Research literature collection
abstract::We have identified three new families of insulin homologs in Caenorhabditis elegans. In two of these families, concerted mutations suggest that an additional disulfide bond links B and A domains, and that the A-domain internal disulfide bond is substituted by a hydrophobic interaction. Homology modeling remarkably con...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.8.4.348
update_date: 1998-04-01 00:00:00
abstract::Translocations are a common class of chromosomal aberrations and can cause disease by physically disrupting genes or altering their regulatory environment. Some translocations, apparently balanced at the microscopic level, include deletions, duplications, insertions, or inversions at the molecular level. Traditionally...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.122986.111
update_date: 2011-10-01 00:00:00
abstract::The accurate mapping of clones derived from genomic regions containing complex arrangements of repeated elements presents special problems for DNA sequencers. Recent advances in the automation of optical mapping have enabled us to map a set of 16 BAC clones derived from the DAZ locus of the human Y chromosome long arm...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.112100
update_date: 2000-09-01 00:00:00
abstract::The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been id...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.2473704
update_date: 2004-10-01 00:00:00
abstract::Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, an...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.2438004
update_date: 2005-01-01 00:00:00
abstract::The incorporation and creation of modified nucleobases in DNA have profound effects on genome function. We describe methods for mapping positions and local content of modified DNA nucleobases in genomic DNA. We combined in vitro nucleobase excision with massively parallel DNA sequencing (Excision-seq) to determine the...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.174052.114
update_date: 2014-09-01 00:00:00
abstract::To effectively analyze the increasing amounts of available genomic data, improved comparative analytical tools that are accessible to and applicable by a broad scientific community are essential. We built the "2-n-way" software suite to provide a fundamental and innovative processing framework for revealing and compar...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.262261.120
update_date: 2020-10-01 00:00:00
abstract::Comparative functional genomics studies the evolution of biological processes by analyzing functional data, such as gene expression profiles, across species. A major challenge is to compare profiles collected in a complex phylogeny. Here, we present Arboretum, a novel scalable computational algorithm that integrates e...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.146233.112
update_date: 2013-06-01 00:00:00
abstract::Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments. A new tool, BLAT, is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences. B...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.229202
update_date: 2002-04-01 00:00:00
abstract::Transposable elements (TEs) are an integral part of the host transcriptome. TE-containing noncoding RNAs (ncRNAs) show considerable tissue specificity and play important roles during development, including stem cell maintenance and cell differentiation. Recent advances in single-cell RNA-seq (scRNA-seq) revolutionized...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.265173.120
update_date: 2021-01-01 00:00:00
abstract::Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular prot...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.082214.108
update_date: 2009-06-01 00:00:00
abstract::Interactions mediated by cell surface receptors initiate important instructive signaling cues but can be difficult to detect in biochemical assays because they are often highly transient and membrane-embedded receptors are difficult to solubilize in their native conformation. Here, we address these biochemical challen...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.231183.117
update_date: 2018-09-01 00:00:00
abstract::Analysis procedures are needed to extract useful information from the large amount of gene expression data that is becoming available. This work describes a set of analytical tools and their application to yeast cell cycle data. The components of our approach are (1) a similarity measure that reduces the number of fal...
journal_title: Genome research
pub_type: Journal article, Review
doi: 10.1101/gr.9.11.1106
update_date: 1999-11-01 00:00:00
abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications-from answering questions about human evolution to locating regions in the human genome containing di...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.115360.110
update_date: 2011-07-01 00:00:00
abstract::Levels of diversity vary across the human genome. This variation is caused by two forces: differences in mutation rates and the differential impact of natural selection. Pertinent to the question of the relative importance of these two forces is the observation that both diversity within species and interspecies diver...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.3461105
update_date: 2005-09-01 00:00:00
abstract::Microsatellites--tandem repeats of short DNA motifs--are abundant in the human genome and have high mutation rates. While microsatellite instability is implicated in numerous genetic diseases, the molecular processes involved in their emergence and disappearance are still not well understood. Microsatellites are hypot...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.122937.111
update_date: 2011-12-01 00:00:00
abstract::Transcriptional networks have been shown to evolve very rapidly, prompting questions as to how such changes arise and are tolerated. Recent comparisons of transcriptional networks across species have implicated variations in the cis-acting DNA sequences near genes as the main cause of divergence. What is less clear is...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.111765.110
update_date: 2010-12-01 00:00:00
abstract::Many CpG islands have tissue-dependent and differentially methylated regions (T-DMRs) in normal cells and tissues. To elucidate how DNA methyltransferases (Dnmts) participate in methylation of the genomic components, we investigated the genome-wide DNA methylation pattern of the T-DMRs with Dnmt1-, Dnmt3a-, and/or Dnm...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.2431504
update_date: 2004-09-01 00:00:00
abstract::Microsatellites are abundant in vertebrate genomes, but their sequence representation and length distributions vary greatly within each family of repeats (e.g., tetranucleotides). Biophysical studies of 82 synthetic single-stranded oligonucleotides comprising all tetra- and trinucleotide repeats revealed an inverse co...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.078303.108
update_date: 2008-10-01 00:00:00
abstract::Much of the available human genomic sequence data exist in a fragmentary draft state following the completion of the initial high-volume sequencing performed by the International Human Genome Sequencing Consortium (IHGSC) and Celera Genomics (CG). We compared six draft genome assemblies over a region of chromosome 4p ...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.207902
update_date: 2002-03-01 00:00:00
abstract::LSH, a member of the SNF2 family of chromatin remodeling ATPases encoded by the Hells gene, is essential for normal levels of DNA methylation in the mammalian genome. While the role of LSH in the methylation of repetitive DNA sequences is well characterized, its contribution to the regulation of DNA methylation and th...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.108498.110
update_date: 2011-01-01 00:00:00
abstract::DNA microarrays produced by deposition (or 'spotting') of a single long oligonucleotide probe for each gene may be an attractive alternative to other types of arrays. We produced spotted oligonucleotide arrays using two large collections of approximately 70-mer probes, and used these arrays to analyze gene expression i...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.1048803
update_date: 2003-07-01 00:00:00
abstract::The maternal and paternal copies of the genome are both required for mammalian development, and this is primarily due to imprinted genes, those that are monoallelically expressed based on parent-of-origin. Typically, this pattern of expression is regulated by differentially methylated regions (DMRs) that are establish...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.196139.115
update_date: 2016-06-01 00:00:00
abstract::The next-generation sequencing technology coupled with the growing number of genome sequences opens the opportunity to redesign genotyping strategies for more effective genetic mapping and genome analysis. We have developed a high-throughput method for genotyping recombinant populations utilizing whole-genome resequen...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.089516.108
update_date: 2009-06-01 00:00:00
abstract::Intra-tumor heterogeneity poses substantial challenges for cancer treatment. A tumor's composition can be deduced by reconstructing its mutational history. Central to current approaches is the infinite sites assumption that every genomic position can only mutate once over the lifetime of a tumor. The validity of this ...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.220707.117
update_date: 2017-11-01 00:00:00
abstract::Gene order in prokaryotes is conserved to a much lesser extent than protein sequences. Only several operons, primarily those that code for physically interacting proteins, are conserved in all or most of the bacterial and archaeal genomes. Nevertheless, even the limited conservation of operon organization that is obse...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.gr-1619r
update_date: 2001-03-01 00:00:00
abstract::Whole-genome sequencing using massively parallel sequencing technologies enables accurate detection of somatic rearrangements in cancer. Pinpointing large numbers of rearrangement breakpoints to base-pair resolution allows analysis of rearrangement microhomology and genomic location for every sample. Here we analyze 9...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.141382.112
update_date: 2013-02-01 00:00:00
abstract::We present a database of copy number variations (CNVs) detected in 2026 disease-free individuals, using high-density, SNP-based oligonucleotide microarrays. This large cohort, comprised mainly of Caucasians (65.2%) and African-Americans (34.2%), was analyzed for CNVs in a single study using a uniform array platform an...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.083501.108
update_date: 2009-09-01 00:00:00
abstract::We report on the development of a methylation analysis workflow for optical detection of fluorescent methylation profiles along chromosomal DNA molecules. In combination with Bionano Genomics genome mapping technology, these profiles provide a hybrid genetic/epigenetic genome-wide map composed of DNA molecules spannin...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.240739.118
update_date: 2019-04-01 00:00:00
abstract::Nutrient availability profoundly influences gene expression. Many animal genes encode multiple transcript isoforms, yet the effect of nutrient availability on transcript isoform expression has not been studied in genome-wide fashion. When Caenorhabditis elegans larvae hatch without food, they arrest development in the...
journal_title: Genome research
pub_type: Journal article
doi: 10.1101/gr.133587.111
update_date: 2012-10-01 00:00:00