Gene and alternative splicing annotation with AIR.

Abstract:

Designing effective and accurate tools for identifying the functional and structural elements in a genome remains at the frontier of genome annotation owing to incompleteness and inaccuracy of the data, limitations in the computational models, and shifting paradigms in genomics, such as alternative splicing. We present a methodology for the automated annotation of genes and their alternatively spliced mRNA transcripts based on existing cDNA and protein sequence evidence from the same species or projected from a related species using syntenic mapping information. At the core of the method is the splice graph, a compact representation of a gene, its exons, introns, and alternatively spliced isoforms. The putative transcripts are enumerated from the graph and assigned confidence scores based on the strength of sequence evidence, and a subset of the high-scoring candidates are selected and promoted into the annotation. The method is highly selective, eliminating the unlikely candidates while retaining 98% of the high-quality mRNA evidence in well-formed transcripts, and produces annotation that is measurably more accurate than some evidence-based gene sets. The process is fast, accurate, and fully automated, and combines the traditionally distinct gene annotation and alternative splicing detection processes in a comprehensive and systematic way, thus considerably aiding in the ensuing manual curation efforts.
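The splice-graph idea in the abstract can be illustrated with a minimal sketch. This is not the AIR implementation: the function names, the exon-tuple representation, and the weakest-link scoring rule (a candidate scores as the support of its least-supported intron) are all assumptions made here for illustration, and single-exon evidence is ignored for brevity.

```python
from collections import defaultdict


def build_splice_graph(transcripts):
    """Build a toy splice graph from evidence transcripts.

    Each transcript is a list of (start, end) exon tuples in genomic order.
    Nodes are exons; an edge (a, b) stands for an intron joining exon a to
    exon b, weighted by the number of evidence transcripts supporting it.
    """
    edges = defaultdict(int)
    for exons in transcripts:
        for a, b in zip(exons, exons[1:]):
            edges[(a, b)] += 1
    return edges


def enumerate_transcripts(edges):
    """Enumerate every source-to-sink path (candidate transcript)."""
    successors = defaultdict(list)
    nodes, has_predecessor = set(), set()
    for a, b in edges:
        successors[a].append(b)
        nodes.update((a, b))
        has_predecessor.add(b)

    paths = []

    def walk(path):
        following = successors.get(path[-1])
        if not following:          # sink exon: the candidate is complete
            paths.append(path)
            return
        for exon in sorted(following):
            walk(path + [exon])

    for source in sorted(nodes - has_predecessor):
        walk([source])
    return paths


def score_transcript(path, edges):
    """Hypothetical score: support of the weakest intron on the path."""
    return min(edges[(a, b)] for a, b in zip(path, path[1:]))


def select_transcripts(edges, min_support):
    """Keep only candidates whose score reaches the chosen threshold."""
    return [p for p in enumerate_transcripts(edges)
            if score_transcript(p, edges) >= min_support]


# Two evidence sequences include the middle exon; one skips it.
evidence = [
    [(100, 200), (300, 400), (500, 600)],
    [(100, 200), (500, 600)],
    [(100, 200), (300, 400), (500, 600)],
]
graph = build_splice_graph(evidence)
candidates = enumerate_transcripts(graph)
selected = select_transcripts(graph, min_support=2)
```

In this toy input the graph yields two candidates, the full three-exon isoform and the exon-skipping isoform, and a threshold of 2 promotes only the former. A real scoring scheme would weigh evidence quality as well as counts, as the abstract indicates.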

journal_name

Genome Res

journal_title

Genome research

authors

Florea L,Di Francesco V,Miller J,Turner R,Yao A,Harris M,Walenz B,Mobarry C,Merkulov GV,Charlab R,Dew I,Deng Z,Istrail S,Li P,Sutton G

doi

10.1101/gr.2889405

subject

Has Abstract

pub_date

2005-01-01 00:00:00

pages

54-66

issue

1

eissn

1549-5469

issn

1088-9051

pii

15/1/54

journal_volume

15

pub_type

Journal Article
  • Automatic analysis of dividing cells in live cell movies to detect mitotic delays and correlate phenotypes in time.

    abstract::Live-cell imaging allows detailed dynamic cellular phenotyping for cell biology and, in combination with small molecule or drug libraries, for high-content screening. Fully automated analysis of live cell movies has been hampered by the lack of computational approaches that allow tracking and recognition of individual...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.092494.109

    authors: Harder N,Mora-Bermúdez F,Godinez WJ,Wünsche A,Eils R,Ellenberg J,Rohr K

    update_date:2009-11-01 00:00:00

  • Global survey of escape from X inactivation by RNA-sequencing in mouse.

    abstract::X inactivation equalizes the dosage of gene expression between the sexes, but some genes escape silencing and are thus expressed from both alleles in females. To survey X inactivation and escape in mouse, we performed RNA sequencing in Mus musculus x Mus spretus cells with complete skewing of X inactivation, relying o...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.103200.109

    authors: Yang F,Babak T,Shendure J,Disteche CM

    update_date:2010-05-01 00:00:00

  • Copy-number-aware differential analysis of quantitative DNA sequencing data.

    abstract::Developments in microarray and high-throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.139055.112

    authors: Robinson MD,Strbenac D,Stirzaker C,Statham AL,Song J,Speed TP,Clark SJ

    update_date:2012-12-01 00:00:00

  • Profiling patterned transcripts in Drosophila embryos.

    abstract::Here we describe a high-throughput screen to isolate transcripts with spatially restricted patterns of expression in early embryos. Our approach utilizes robotic automation for rapid analysis of sequence-selected cDNAs in a whole-mount in situ hybridization assay. We determined the spatial distribution of a random col...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.84402

    authors: Simin K,Scuderi A,Reamey J,Dunn D,Weiss R,Metherall JE,Letsou A

    update_date:2002-07-01 00:00:00

  • PipMaker--a web server for aligning two genomic DNA sequences.

    abstract::PipMaker (http://bio.cse.psu.edu) is a World-Wide Web site for comparing two long DNA sequences to identify conserved segments and for producing informative, high-resolution displays of the resulting alignments. One display is a percent identity plot (pip), which shows both the position in one sequence and the degree ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.4.577

    authors: Schwartz S,Zhang Z,Frazer KA,Smit A,Riemer C,Bouck J,Gibbs R,Hardison R,Miller W

    update_date:2000-04-01 00:00:00

  • A cross-platform analysis of 14,177 expression quantitative trait loci derived from lymphoblastoid cell lines.

    abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142521.112

    authors: Liang L,Morar N,Dixon AL,Lathrop GM,Abecasis GR,Moffatt MF,Cookson WO

    update_date:2013-04-01 00:00:00

  • Spatial enhancer clustering and regulation of enhancer-proximal genes by cohesin.

    abstract::In addition to mediating sister chromatid cohesion during the cell cycle, the cohesin complex associates with CTCF and with active gene regulatory elements to form long-range interactions between its binding sites. Genome-wide chromosome conformation capture had shown that cohesin's main role in interphase genome orga...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.184986.114

    authors: Ing-Simmons E,Seitan VC,Faure AJ,Flicek P,Carroll T,Dekker J,Fisher AG,Lenhard B,Merkenschlager M

    update_date:2015-04-01 00:00:00

  • Fourfold faster rate of genome rearrangement in nematodes than in Drosophila.

    abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.172702

    authors: Coghlan A,Wolfe KH

    update_date:2002-06-01 00:00:00

  • An integrative variant analysis pipeline for accurate genotype/haplotype inference in population NGS data.

    abstract::Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of po...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.146084.112

    authors: Wang Y,Lu J,Yu J,Gibbs RA,Yu F

    update_date:2013-05-01 00:00:00

  • Global analysis of Drosophila Cys₂-His₂ zinc finger proteins reveals a multitude of novel recognition motifs and binding determinants.

    abstract::Cys2-His2 zinc finger proteins (ZFPs) are the largest group of transcription factors in higher metazoans. A complete characterization of these ZFPs and their associated target sequences is pivotal to fully annotate transcriptional regulatory networks in metazoan genomes. As a first step in this process, we have charac...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.151472.112

    authors: Enuameh MS,Asriyan Y,Richards A,Christensen RG,Hall VL,Kazemian M,Zhu C,Pham H,Cheng Q,Blatti C,Brasefield JA,Basciotta MD,Ou J,McNulty JC,Zhu LJ,Celniker SE,Sinha S,Stormo GD,Brodsky MH,Wolfe SA

    update_date:2013-06-01 00:00:00

  • Alternative approach to a heavy weight problem.

    abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6661308

    authors: Goren A,Kim E,Amit M,Bochner R,Lev-Maor G,Ahituv N,Ast G

    update_date:2008-02-01 00:00:00

  • Functional conservation of Rel binding sites in drosophilid genomes.

    abstract::Evolutionary constraints on gene regulatory elements are poorly understood: Little is known about how the strength of transcription factor binding correlates with DNA sequence conservation, and whether transcription factor binding sites can evolve rapidly while retaining their function. Here we use the model of the NF...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6490707

    authors: Copley RR,Totrov M,Linnell J,Field S,Ragoussis J,Udalova IA

    update_date:2007-09-01 00:00:00

  • Systematic recovery and analysis of full-ORF human cDNA clones.

    abstract::The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been id...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2473704

    authors: Baross A,Butterfield YS,Coughlin SM,Zeng T,Griffith M,Griffith OL,Petrescu AS,Smailus DE,Khattra J,McDonald HL,McKay SJ,Moksa M,Holt RA,Marra MA

    update_date:2004-10-01 00:00:00

  • A predictive model for regulatory sequences directing liver-specific transcription.

    abstract::The identification and interpretation of the regulatory signals within the human genome remain among the greatest goals and most difficult challenges in genome analysis. The ability to predict the temporal and spatial control of transcription is likely to require a combination of methods to address the contribution of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.180601

    authors: Krivan W,Wasserman WW

    update_date:2001-09-01 00:00:00

  • Background-suppressed live visualization of genomic loci with an improved CRISPR system based on a split fluorophore.

    abstract::The higher-order structural organization and dynamics of the chromosomes play a central role in gene regulation. To explore this structure-function relationship, it is necessary to directly visualize genomic elements in living cells. Genome imaging based on the CRISPR system is a powerful approach but has limited appl...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.260018.119

    authors: Chaudhary N,Nho SH,Cho H,Gantumur N,Ra JS,Myung K,Kim H

    update_date:2020-09-01 00:00:00

  • Conserved microRNA targeting reveals preexisting gene dosage sensitivities that shaped amniote sex chromosome evolution.

    abstract::Mammalian X and Y Chromosomes evolved from an ordinary autosomal pair. Genetic decay of the Y led to X Chromosome inactivation (XCI) in females, but some Y-linked genes were retained during the course of sex chromosome evolution, and many X-linked genes did not become subject to XCI. We reconstructed gene-by-gene dosa...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.230433.117

    authors: Naqvi S,Bellott DW,Lin KS,Page DC

    update_date:2018-04-01 00:00:00

  • Multiparameter functional diversity of human C2H2 zinc finger proteins.

    abstract::C2H2 zinc finger proteins represent the largest and most enigmatic class of human transcription factors. Their C2H2-ZF arrays are highly variable, indicating that most will have unique DNA binding motifs. However, most of the binding motifs have not been directly determined. In addition, little is known about whether ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.209643.116

    authors: Schmitges FW,Radovani E,Najafabadi HS,Barazandeh M,Campitelli LF,Yin Y,Jolma A,Zhong G,Guo H,Kanagalingam T,Dai WF,Taipale J,Emili A,Greenblatt JF,Hughes TR

    update_date:2016-12-01 00:00:00

  • Gene regulation and speciation in house mice.

    abstract::One approach to understanding the process of speciation is to characterize the genetic architecture of post-zygotic isolation. As gene regulation requires interactions between loci, negative epistatic interactions between divergent regulatory elements might underlie hybrid incompatibilities and contribute to reproduct...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.195743.115

    authors: Mack KL,Campbell P,Nachman MW

    update_date:2016-04-01 00:00:00

  • From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies.

    abstract::We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different tran...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.173801

    authors: Benos PV,Gatt MK,Murphy L,Harris D,Barrell B,Ferraz C,Vidal S,Brun C,Demaille J,Cadieu E,Dreano S,Gloux S,Lelaure V,Mottier S,Galibert F,Borkova D,Miñana B,Kafatos FC,Bolshakov S,Sidén-Kiamos I,Papagiannakis G,S

    update_date:2001-05-01 00:00:00

  • Comparing genomes within the species Mycobacterium tuberculosis.

    abstract::The study of genetic variability within natural populations of pathogens may provide insight into their evolution and pathogenesis. We used a Mycobacterium tuberculosis high-density oligonucleotide microarray to detect small-scale genomic deletions among 19 clinically and epidemiologically well-characterized isolates ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.166401

    authors: Kato-Maeda M,Rhee JT,Gingeras TR,Salamon H,Drenkow J,Smittipat N,Small PM

    update_date:2001-04-01 00:00:00

  • DNA methylation at hepatitis B viral integrants is associated with methylation at flanking human genomic sequences.

    abstract::Integration of DNA viruses into the human genome plays an important role in various types of tumors, including hepatitis B virus (HBV)-related hepatocellular carcinoma. However, the molecular details and clinical impact of HBV integration on either human or HBV epigenomes are unknown. Here, we show that methylation of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.175240.114

    authors: Watanabe Y,Yamamoto H,Oikawa R,Toyota M,Yamamoto M,Kokudo N,Tanaka S,Arii S,Yotsuyanagi H,Koike K,Itoh F

    update_date:2015-03-01 00:00:00

  • Convergent origination of a Drosophila-like dosage compensation mechanism in a reptile lineage.

    abstract::Sex chromosomes differentiated from different ancestral autosomes in various vertebrate lineages. Here, we trace the functional evolution of the XY Chromosomes of the green anole lizard (Anolis carolinensis), on the basis of extensive high-throughput genome, transcriptome and histone modification sequencing data and r...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.223727.117

    authors: Marin R,Cortez D,Lamanna F,Pradeepa MM,Leushkin E,Julien P,Liechti A,Halbert J,Brüning T,Mössinger K,Trefzer T,Conrad C,Kerver HN,Wade J,Tschopp P,Kaessmann H

    update_date:2017-12-01 00:00:00

  • Natural genetic variation in yeast longevity.

    abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.136549.111

    authors: Stumpferl SW,Brand SE,Jiang JC,Korona B,Tiwari A,Dai J,Seo JG,Jazwinski SM

    update_date:2012-10-01 00:00:00

  • A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction.

    abstract::The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consist...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.226852.117

    authors: Guo Y,Tian K,Zeng H,Guo X,Gifford DK

    update_date:2018-06-01 00:00:00

  • Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera).

    abstract::The remarkable olfactory power of insect species is thought to be generated by a combinatorial action of two large protein families, G protein-coupled olfactory receptors (ORs) and odorant binding proteins (OBPs). In olfactory sensilla, OBPs deliver hydrophobic airborne molecules to ORs, but their expression in nonolf...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5075706

    authors: Forêt S,Maleszka R

    update_date:2006-11-01 00:00:00

  • Ancestry-agnostic estimation of DNA sample contamination from sequence reads.

    abstract::Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each indi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.246934.118

    authors: Zhang F,Flickinger M,Taliun SAG,InPSYght Psychiatric Genetics Consortium.,Abecasis GR,Scott LJ,McCaroll SA,Pato CN,Boehnke M,Kang HM

    update_date:2020-02-01 00:00:00

  • CADLIVE dynamic simulator: direct link of biochemical networks to dynamic models.

    abstract::We have developed the CADLIVE (Computer-Aided Design of LIVing systEms) Simulator that provided a rule-based automatic way to convert biochemical network maps into dynamic models, which enables simulating their dynamics without going through all of the reactions down to the details of exact kinetic parameters. The sim...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3463705

    authors: Kurata H,Masaki K,Sumida Y,Iwasaki R

    update_date:2005-04-01 00:00:00

  • Nonrandom domain organization of the Arabidopsis genome at the nuclear periphery.

    abstract::The nuclear space is not a homogeneous biochemical environment. Many studies have demonstrated that the transcriptional activity of a gene is linked to its positioning within the nuclear space. Following the discovery of lamin-associated domains (LADs), which are transcriptionally repressed chromatin regions, the nonr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215186.116

    authors: Bi X,Cheng YJ,Hu B,Ma X,Wu R,Wang JW,Liu C

    update_date:2017-07-01 00:00:00

  • Evolution of gene order in the genomes of two related yeast species.

    abstract::Changes in gene order between the genomes of two related yeast species, Saccharomyces cerevisiae and Saccharomyces bayanus var. uvarum were studied. From the dataset of a previous low coverage sequencing of the S. bayanus var. uvarum genome, 35 different synteny breakpoints between neighboring genes and two cases of l...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.212701

    authors: Fischer G,Neuvéglise C,Durrens P,Gaillardin C,Dujon B

    update_date:2001-12-01 00:00:00

  • Analysis of Arabidopsis genome-wide variations before and after meiosis and meiotic recombination by resequencing Landsberg erecta and all four products of a single meiosis.

    abstract::Meiotic recombination, including crossovers (COs) and gene conversions (GCs), impacts natural variation and is an important evolutionary force. COs increase genetic diversity by redistributing existing variation, whereas GCs can alter allelic frequency. Here, we sequenced Arabidopsis Landsberg erecta (Ler) and two set...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.127522.111

    authors: Lu P,Han X,Qi J,Yang J,Wijeratne AJ,Li T,Ma H

    update_date:2012-03-01 00:00:00