Abstract:
Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We present a methodology for the automated annotation of genes and their alternatively spliced mRNA transcripts based on existing cDNA and protein sequence evidence from the same species or projected from a related species using syntenic mapping information. At the core of the method is the splice graph, a compact representation of a gene, its exons, introns, and alternatively spliced isoforms. The putative transcripts are enumerated from the graph and assigned confidence scores based on the strength of sequence evidence, and a subset of the high-scoring candidates is selected and promoted into the annotation. The method is highly selective, eliminating the unlikely candidates while retaining 98% of the high-quality mRNA evidence in well-formed transcripts, and produces annotation that is measurably more accurate than some evidence-based gene sets. The process is fast, accurate, and fully automated, and combines the traditionally distinct gene annotation and alternative splicing detection processes in a comprehensive and systematic way, thus considerably aiding in the ensuing manual curation efforts.
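The splice-graph idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the exon representation as `(start, end)` tuples, and the scoring rule (the weakest junction support along a path) are all assumptions made for illustration, and the abstract does not specify how confidence scores are actually computed.

```python
from collections import defaultdict

def build_splice_graph(evidence):
    """Build a toy splice graph from aligned evidence.

    evidence: list of exon chains, each a list of (start, end) tuples in
    increasing genomic order (hypothetical input format).
    Returns the set of exons and a dict mapping each observed
    exon-to-exon junction to the number of chains supporting it.
    """
    exons = set()
    edges = defaultdict(int)
    for chain in evidence:
        exons.update(chain)
        for a, b in zip(chain, chain[1:]):
            edges[(a, b)] += 1
    return exons, dict(edges)

def enumerate_transcripts(exons, edges):
    """Enumerate every path from an exon with no incoming junction
    to an exon with no outgoing junction."""
    successors = defaultdict(list)
    has_predecessor = set()
    for a, b in edges:
        successors[a].append(b)
        has_predecessor.add(b)

    paths = []

    def walk(path):
        nexts = sorted(successors.get(path[-1], []))
        if not nexts:
            paths.append(tuple(path))
            return
        for nxt in nexts:
            walk(path + [nxt])

    for source in sorted(e for e in exons if e not in has_predecessor):
        walk([source])
    return paths

def score(path, edges):
    """Assumed scoring rule: support of the weakest junction on the path.
    Single-exon paths have no junctions and score 0 here."""
    if len(path) < 2:
        return 0
    return min(edges[(a, b)] for a, b in zip(path, path[1:]))

def select(paths, edges, min_score):
    """Keep only candidates whose score reaches the threshold."""
    return [p for p in paths if score(p, edges) >= min_score]
```

With three evidence chains over exons A, B, and C, two of which include B and one of which skips it, the graph yields two candidate transcripts (A-B-C and A-C), and a threshold of 2 keeps only the better-supported one.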
journal_name:Genome Res
journal_title:Genome research
authors:Florea L, Di Francesco V, Miller J, Turner R, Yao A, Harris M, Walenz B, Mobarry C, Merkulov GV, Charlab R, Dew I, Deng Z, Istrail S, Li P, Sutton G
doi:10.1101/gr.2889405
subject:Has Abstract
pub_date:2005-01-01 00:00:00
pages:54-66
issue:1
eissn:1088-9051
issn:1549-5469
pii:15/1/54
journal_volume:15
pub_type: Journal Article
Related literature
GENOME RESEARCH literature collection
abstract::Live-cell imaging allows detailed dynamic cellular phenotyping for cell biology and, in combination with small molecule or drug libraries, for high-content screening. Fully automated analysis of live cell movies has been hampered by the lack of computational approaches that allow tracking and recognition of individual...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.092494.109
update_date:2009-11-01 00:00:00
abstract::X inactivation equalizes the dosage of gene expression between the sexes, but some genes escape silencing and are thus expressed from both alleles in females. To survey X inactivation and escape in mouse, we performed RNA sequencing in Mus musculus x Mus spretus cells with complete skewing of X inactivation, relying o...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.103200.109
update_date:2010-05-01 00:00:00
abstract::Developments in microarray and high-throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.139055.112
update_date:2012-12-01 00:00:00
abstract::Here we describe a high-throughput screen to isolate transcripts with spatially restricted patterns of expression in early embryos. Our approach utilizes robotic automation for rapid analysis of sequence-selected cDNAs in a whole-mount in situ hybridization assay. We determined the spatial distribution of a random col...
journal_title:Genome research
pub_type: Letter
doi:10.1101/gr.84402
update_date:2002-07-01 00:00:00
abstract::PipMaker (http://bio.cse.psu.edu) is a World-Wide Web site for comparing two long DNA sequences to identify conserved segments and for producing informative, high-resolution displays of the resulting alignments. One display is a percent identity plot (pip), which shows both the position in one sequence and the degree ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.10.4.577
update_date:2000-04-01 00:00:00
abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.142521.112
update_date:2013-04-01 00:00:00
abstract::In addition to mediating sister chromatid cohesion during the cell cycle, the cohesin complex associates with CTCF and with active gene regulatory elements to form long-range interactions between its binding sites. Genome-wide chromosome conformation capture had shown that cohesin's main role in interphase genome orga...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.184986.114
update_date:2015-04-01 00:00:00
abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.172702
update_date:2002-06-01 00:00:00
abstract::Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of po...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.146084.112
update_date:2013-05-01 00:00:00
abstract::Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have charac...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.151472.112
update_date:2013-06-01 00:00:00
abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.6661308
update_date:2008-02-01 00:00:00
abstract::Evolutionary constraints on gene regulatory elements are poorly understood: Little is known about how the strength of transcription factor binding correlates with DNA sequence conservation, and whether transcription factor binding sites can evolve rapidly while retaining their function. Here we use the model of the NF...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.6490707
update_date:2007-09-01 00:00:00
abstract::The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been id...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.2473704
update_date:2004-10-01 00:00:00
abstract::The identification and interpretation of the regulatory signals within the human genome remain among the greatest goals and most difficult challenges in genome analysis. The ability to predict the temporal and spatial control of transcription is likely to require a combination of methods to address the contribution of...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.180601
update_date:2001-09-01 00:00:00
abstract::The higher-order structural organization and dynamics of the chromosomes play a central role in gene regulation. To explore this structure-function relationship, it is necessary to directly visualize genomic elements in living cells. Genome imaging based on the CRISPR system is a powerful approach but has limited appl...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.260018.119
update_date:2020-09-01 00:00:00
abstract::Mammalian X and Y Chromosomes evolved from an ordinary autosomal pair. Genetic decay of the Y led to X Chromosome inactivation (XCI) in females, but some Y-linked genes were retained during the course of sex chromosome evolution, and many X-linked genes did not become subject to XCI. We reconstructed gene-by-gene dosa...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.230433.117
update_date:2018-04-01 00:00:00
abstract::C2H2 zinc finger proteins represent the largest and most enigmatic class of human transcription factors. Their C2H2-ZF arrays are highly variable, indicating that most will have unique DNA binding motifs. However, most of the binding motifs have not been directly determined. In addition, little is known about whether ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.209643.116
update_date:2016-12-01 00:00:00
abstract::One approach to understanding the process of speciation is to characterize the genetic architecture of post-zygotic isolation. As gene regulation requires interactions between loci, negative epistatic interactions between divergent regulatory elements might underlie hybrid incompatibilities and contribute to reproduct...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.195743.115
update_date:2016-04-01 00:00:00
abstract::We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different tran...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.173801
update_date:2001-05-01 00:00:00
abstract::The study of genetic variability within natural populations of pathogens may provide insight into their evolution and pathogenesis. We used a Mycobacterium tuberculosis high-density oligonucleotide microarray to detect small-scale genomic deletions among 19 clinically and epidemiologically well-characterized isolates ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.166401
update_date:2001-04-01 00:00:00
abstract::Integration of DNA viruses into the human genome plays an important role in various types of tumors, including hepatitis B virus (HBV)-related hepatocellular carcinoma. However, the molecular details and clinical impact of HBV integration on either human or HBV epigenomes are unknown. Here, we show that methylation of...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.175240.114
update_date:2015-03-01 00:00:00
abstract::Sex chromosomes differentiated from different ancestral autosomes in various vertebrate lineages. Here, we trace the functional evolution of the XY Chromosomes of the green anole lizard (Anolis carolinensis), on the basis of extensive high-throughput genome, transcriptome and histone modification sequencing data and r...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.223727.117
update_date:2017-12-01 00:00:00
abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.136549.111
update_date:2012-10-01 00:00:00
abstract::The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consist...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.226852.117
update_date:2018-06-01 00:00:00
abstract::The remarkable olfactory power of insect species is thought to be generated by a combinatorial action of two large protein families, G protein-coupled olfactory receptors (ORs) and odorant binding proteins (OBPs). In olfactory sensilla, OBPs deliver hydrophobic airborne molecules to ORs, but their expression in nonolf...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5075706
update_date:2006-11-01 00:00:00
abstract::Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each indi...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.246934.118
update_date:2020-02-01 00:00:00
abstract::We have developed the CADLIVE (Computer-Aided Design of LIVing systEms) Simulator that provided a rule-based automatic way to convert biochemical network maps into dynamic models, which enables simulating their dynamics without going through all of the reactions down to the details of exact kinetic parameters. The sim...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.3463705
update_date:2005-04-01 00:00:00
abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.215186.116
update_date:2017-07-01 00:00:00
abstract::Changes in gene order between the genomes of two related yeast species, Saccharomyces cerevisiae and Saccharomyces bayanus var. uvarum were studied. From the dataset of a previous low coverage sequencing of the S. bayanus var. uvarum genome, 35 different synteny breakpoints between neighboring genes and two cases of l...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.212701
update_date:2001-12-01 00:00:00
abstract::Meiotic recombination, including crossovers (COs) and gene conversions (GCs), impacts natural variation and is an important evolutionary force. COs increase genetic diversity by redistributing existing variation, whereas GCs can alter allelic frequency. Here, we sequenced Arabidopsis Landsberg erecta (Ler) and two set...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.127522.111
update_date:2012-03-01 00:00:00