GeneID in Drosophila.

Abstract:

GeneID is a program to predict genes in anonymous genomic sequences, designed with a hierarchical structure. In the first step, splice sites and start and stop codons are predicted and scored along the sequence using position weight matrices (PWMs). In the second step, exons are built from the sites. Exons are scored as the sum of the scores of the defining sites, plus the log-likelihood ratio of a Markov model for coding DNA. In the last step, the gene structure is assembled from the set of predicted exons, maximizing the sum of the scores of the assembled exons. In this paper we describe how the PWMs for sites and the Markov model of coding DNA were derived for Drosophila melanogaster. We also compare other models of coding DNA with the Markov model. Finally, we present and discuss the results obtained when GeneID is used to predict genes in the Adh region. These results show that the accuracy of GeneID predictions is currently comparable to that of other existing tools, but that GeneID is likely to be more efficient in terms of speed and memory usage.
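
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the GeneID implementation: the function names, the dictionary-based model layout, and the simple quadratic-time chaining are assumptions made for illustration, and the sketch ignores reading-frame compatibility and exon types, which a real gene finder must enforce.

```python
import math


def pwm_score(site, pwm):
    """Step 1: score a candidate site as the sum of per-position weights.

    pwm is a list of dicts, one per position, mapping base -> log-odds weight
    (hypothetical layout, assumed for this sketch).
    """
    return sum(pwm[i][base] for i, base in enumerate(site))


def markov_llr(seq, coding, background, order=1):
    """Log-likelihood ratio of seq under a coding vs. a background Markov chain.

    Each model maps (context, next_base) -> probability, where context is the
    preceding `order` bases (hypothetical layout, assumed for this sketch).
    """
    llr = 0.0
    for i in range(order, len(seq)):
        key = (seq[i - order:i], seq[i])
        llr += math.log(coding[key] / background[key])
    return llr


def exon_score(left_site_score, right_site_score, coding_llr):
    """Step 2: exon score = scores of the two defining sites + coding LLR."""
    return left_site_score + right_site_score + coding_llr


def assemble(exons):
    """Step 3: choose non-overlapping exons maximizing the summed score.

    exons is a list of (start, end, score) tuples. Returns (total, chain).
    Simple O(n^2) dynamic programming over exons sorted by end position.
    """
    exons = sorted(exons, key=lambda e: e[1])
    best = []  # best[i] = (total score, chain) of the best chain ending at exon i
    for i, (start, _end, score) in enumerate(exons):
        total, chain = score, [exons[i]]
        for j in range(i):
            # exon j may precede exon i only if it ends before exon i starts
            if exons[j][1] < start and best[j][0] + score > total:
                total, chain = best[j][0] + score, best[j][1] + [exons[i]]
        best.append((total, chain))
    return max(best, key=lambda b: b[0]) if best else (0.0, [])


if __name__ == "__main__":
    # Toy candidate exons with made-up coordinates and scores.
    candidates = [(0, 10, 2.0), (5, 15, 3.5), (12, 20, 1.0), (16, 30, 2.5)]
    total, chain = assemble(candidates)
    print(total, chain)
```

Because each exon score is already a sum of log-scale terms, the assembly step reduces to picking a compatible chain of exons with maximal total score, which is why a dynamic program over exons sorted by position is enough in this simplified setting.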

journal_name

Genome Res

journal_title

Genome research

authors

Parra G,Blanco E,Guigó R

doi

10.1101/gr.10.4.511

subject

Has Abstract

pub_date

2000-04-01 00:00:00

pages

511-5

issue

4

eissn

1549-5469

issn

1088-9051

journal_volume

10

pub_type

Journal Article
  • Detecting copy number variation with mated short reads.

    abstract::The development of high-throughput sequencing (HTS) technologies has opened the door to novel methods for detecting copy number variants (CNVs) in the human genome. While in the past CNVs have been detected based on array CGH data, recent studies have shown that depth-of-coverage information from HTS technologies can ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.106344.110

    authors: Medvedev P,Fiume M,Dzamba M,Smith T,Brudno M

    update_date:2010-11-01 00:00:00

  • Sensitive mapping of recombination hotspots using sequencing-based detection of ssDNA.

    abstract::Meiotic DNA double-stranded breaks (DSBs) initiate genetic recombination in discrete areas of the genome called recombination hotspots. DSBs can be directly mapped using chromatin immunoprecipitation followed by sequencing (ChIP-seq). Nevertheless, the genome-wide mapping of recombination hotspots in mammals is still ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.130583.111

    authors: Khil PP,Smagulova F,Brick KM,Camerini-Otero RD,Petukhova GV

    update_date:2012-05-01 00:00:00

  • BLAT--the BLAST-like alignment tool.

    abstract::Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments. A new tool, BLAT, is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences. B...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.229202

    authors: Kent WJ

    update_date:2002-04-01 00:00:00

  • Characterization of the RNA content of chromatin.

    abstract::Noncoding RNA (ncRNA) constitutes a significant portion of the mammalian transcriptome. Emerging evidence suggests that it regulates gene expression in cis or trans by modulating the chromatin structure. To uncover the functional role of ncRNA in chromatin organization, we deep sequenced chromatin-associated RNAs (CAR...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.103473.109

    authors: Mondal T,Rasmussen M,Pandey GK,Isaksson A,Kanduri C

    update_date:2010-07-01 00:00:00

  • Impact of genomics on research in the rat.

    abstract::The need to translate genes to function has positioned the rat as an invaluable animal model for genomic research. The significant increase in genomic resources in recent years has had an immediate functional application in the rat. Many of the resources for translational research are already in place and are ready to...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:10.1101/gr.3744005

    authors: Lazar J,Moreno C,Jacob HJ,Kwitek AE

    update_date:2005-12-01 00:00:00

  • Inference of population genetic parameters in metagenomics: a clean look at messy data.

    abstract::Metagenomic projects generate short, overlapping fragments of DNA sequence, each deriving from a different individual. We report a new method for inferring the scaled mutation rate, theta = 2N_e u, and the scaled exponential growth rate, R = N_e r, from the site-frequency spectrum of these data while accounting for sequen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5431206

    authors: Johnson PL,Slatkin M

    update_date:2006-10-01 00:00:00

  • Nutritional control of mRNA isoform expression during developmental arrest and recovery in C. elegans.

    abstract::Nutrient availability profoundly influences gene expression. Many animal genes encode multiple transcript isoforms, yet the effect of nutrient availability on transcript isoform expression has not been studied in genome-wide fashion. When Caenorhabditis elegans larvae hatch without food, they arrest development in the...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.133587.111

    authors: Maxwell CS,Antoshechkin I,Kurhanewicz N,Belsky JA,Baugh LR

    update_date:2012-10-01 00:00:00

  • Identification of protein features encoded by alternative exons using Exon Ontology.

    abstract::Transcriptomic genome-wide analyses demonstrate massive variation of alternative splicing in many physiological and pathological situations. One major challenge is now to establish the biological contribution of alternative splicing variation in physiological- or pathological-associated cellular phenotypes. Toward thi...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.212696.116

    authors: Tranchevent LC,Aubé F,Dulaurier L,Benoit-Pilven C,Rey A,Poret A,Chautard E,Mortada H,Desmet FO,Chakrama FZ,Moreno-Garcia MA,Goillot E,Janczarski S,Mortreux F,Bourgeois CF,Auboeuf D

    update_date:2017-06-01 00:00:00

  • Polymorphic centromere locations in the pathogenic yeast Candida parapsilosis.

    abstract::Centromeres pose an evolutionary paradox: strongly conserved in function but rapidly changing in sequence and structure. However, in the absence of damage, centromere locations are usually conserved within a species. We report here that isolates of the pathogenic yeast species Candida parapsilosis show within-species ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.257816.119

    authors: Ola M,O'Brien CE,Coughlan AY,Ma Q,Donovan PD,Wolfe KH,Butler G

    update_date:2020-05-01 00:00:00

  • Uncovering cis-regulatory sequence requirements for context-specific transcription factor binding.

    abstract::The regulation of gene expression is mediated at the transcriptional level by enhancer regions that are bound by sequence-specific transcription factors (TFs). Recent studies have shown that the in vivo binding sites of single TFs differ between developmental or cellular contexts. How this context-specific binding is ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.132811.111

    authors: Yáñez-Cuna JO,Dinh HQ,Kvon EZ,Shlyueva D,Stark A

    update_date:2012-10-01 00:00:00

  • BRAFV600E remodels the melanocyte transcriptome and induces BANCR to regulate melanoma cell migration.

    abstract::Aberrations of protein-coding genes are a focus of cancer genomics; however, the impact of oncogenes on expression of the ~50% of transcripts without protein-coding potential, including long noncoding RNAs (lncRNAs), has been largely uncharacterized. Activating mutations in the BRAF oncogene are present in >70% of mel...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.140061.112

    authors: Flockhart RJ,Webster DE,Qu K,Mascarenhas N,Kovalski J,Kretz M,Khavari PA

    update_date:2012-06-01 00:00:00

  • A simplified procedure for developing multiplex PCRs.

    abstract::We have developed a simplified method for multiplex PCR based on the use of chimeric primers. Each primer contains a 3' region complementary to sequence-specific recognition sites and a 5' region made up of an unrelated 20-nucleotide sequence. Identical reaction conditions, cycling times, and annealing temperatures ha...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5.5.488

    authors: Shuber AP,Grondin VJ,Klinger KW

    update_date:1995-12-01 00:00:00

  • Transcriptional alterations in glioma result primarily from DNA methylation-independent mechanisms.

    abstract::In cancer cells, aberrant DNA methylation is commonly associated with transcriptional alterations, including silencing of tumor suppressor genes. However, multiple epigenetic mechanisms, including polycomb repressive marks, contribute to gene deregulation in cancer. To dissect the relative contribution of DNA methylat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.249219.119

    authors: Court F,Le Boiteux E,Fogli A,Müller-Barthélémy M,Vaurs-Barrière C,Chautard E,Pereira B,Biau J,Kemeny JL,Khalil T,Karayan-Tapon L,Verrelle P,Arnaud P

    update_date:2019-10-01 00:00:00

  • YY1 and CTCF orchestrate a 3D chromatin looping switch during early neural lineage commitment.

    abstract::CTCF is an architectural protein with a critical role in connecting higher-order chromatin folding in pluripotent stem cells. Recent reports have suggested that CTCF binding is more dynamic during development than previously appreciated. Here, we set out to understand the extent to which shifts in genome-wide CTCF occ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.215160.116

    authors: Beagan JA,Duong MT,Titus KR,Zhou L,Cao Z,Ma J,Lachanski CV,Gillis DR,Phillips-Cremins JE

    update_date:2017-07-01 00:00:00

  • The mRNA-bound proteome of the early fly embryo.

    abstract::Early embryogenesis is characterized by the maternal to zygotic transition (MZT), in which maternally deposited messenger RNAs are degraded while zygotic transcription begins. Before the MZT, post-transcriptional gene regulation by RNA-binding proteins (RBPs) is the dominant force in embryo patterning. We used two mRN...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.200386.115

    authors: Wessels HH,Imami K,Baltz AG,Kolinski M,Beldovskaya A,Selbach M,Small S,Ohler U,Landthaler M

    update_date:2016-07-01 00:00:00

  • Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties.

    abstract::Microsatellites are abundant in vertebrate genomes, but their sequence representation and length distributions vary greatly within each family of repeats (e.g., tetranucleotides). Biophysical studies of 82 synthetic single-stranded oligonucleotides comprising all tetra- and trinucleotide repeats revealed an inverse co...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.078303.108

    authors: Bacolla A,Larson JE,Collins JR,Li J,Milosavljevic A,Stenson PD,Cooper DN,Wells RD

    update_date:2008-10-01 00:00:00

  • An MCMC algorithm for haplotype assembly from whole-genome sequence data.

    abstract::In comparison to genotypes, knowledge about haplotypes (the combination of alleles present on a single chromosome) is much more useful for whole-genome association studies and for making inferences about human evolutionary history. Haplotypes are typically inferred from population genotype data using computational met...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.077065.108

    authors: Bansal V,Halpern AL,Axelrod N,Bafna V

    update_date:2008-08-01 00:00:00

  • Theories and applications for sequencing randomly selected clones.

    abstract::Theory is developed for the process of sequencing randomly selected large-insert clones. Genome size, library depth, clone size, and clone distribution are considered relevant properties and perfect overlap detection for contig assembly is assumed. Genome-specific and nonrandom effects are neglected. Order of magnitud...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.gr-1339r

    authors: Wendl MC,Marra MA,Hillier LW,Chinwalla AT,Wilson RK,Waterston RH

    update_date:2001-02-01 00:00:00

  • Exploring expression data: identification and analysis of coexpressed genes.

    abstract::Analysis procedures are needed to extract useful information from the large amount of gene expression data that is becoming available. This work describes a set of analytical tools and their application to yeast cell cycle data. The components of our approach are (1) a similarity measure that reduces the number of fal...

    journal_title:Genome research

    pub_type: Journal Article, Review

    doi:10.1101/gr.9.11.1106

    authors: Heyer LJ,Kruglyak S,Yooseph S

    update_date:1999-11-01 00:00:00

  • The effect of translocation-induced nuclear reorganization on gene expression.

    abstract::Translocations are known to affect the expression of genes at the breakpoints and, in the case of unbalanced translocations, alter the gene copy number. However, a comprehensive understanding of the functional impact of this class of variation is lacking. Here, we have studied the effect of balanced chromosomal rearra...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.103622.109

    authors: Harewood L,Schütz F,Boyle S,Perry P,Delorenzi M,Bickmore WA,Reymond A

    update_date:2010-05-01 00:00:00

  • Closing the gaps on human chromosome 19 revealed genes with a high density of repetitive tandemly arrayed elements.

    abstract::The reported human genome sequence includes about 400 gaps of unknown sequence that were not found in the bacterial artificial chromosome (BAC) and cosmid libraries used for sequencing of the genome. These missing sequences correspond to approximately 1% of euchromatic regions of the human genome. Gap filling is a lab...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1929904

    authors: Leem SH,Kouprina N,Grimwood J,Kim JH,Mullokandov M,Yoon YH,Chae JY,Morgan J,Lucas S,Richardson P,Detter C,Glavina T,Rubin E,Barrett JC,Larionov V

    update_date:2004-02-01 00:00:00

  • Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer.

    abstract::Loss of heterozygosity (LOH) and copy number alteration (CNA) feature prominently in the somatic genomic landscape of tumors. As such, karyotypic aberrations in cancer genomes have been studied extensively to discover novel oncogenes and tumor-suppressor genes. Advances in sequencing technology have enabled the cost-e...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.137570.112

    authors: Ha G,Roth A,Lai D,Bashashati A,Ding J,Goya R,Giuliany R,Rosner J,Oloumi A,Shumansky K,Chin SF,Turashvili G,Hirst M,Caldas C,Marra MA,Aparicio S,Shah SP

    update_date:2012-10-01 00:00:00

  • DNA enrichment by allele-specific hybridization (DEASH): a novel method for haplotyping and for detecting low-frequency base substitutional variants and recombinant DNA molecules.

    abstract::Detecting rare sequence variants in genomic DNA is central to the analysis of de novo mutation and recombination events and the detection of rare pathological mutations in mixed cell populations. Current PCR techniques suffer from noise that limits detection to variants present at a frequency of at least 10^-4 to 10^-5...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1214603

    authors: Jeffreys AJ,May CA

    update_date:2003-10-01 00:00:00

  • Fourfold faster rate of genome rearrangement in nematodes than in Drosophila.

    abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.172702

    authors: Coghlan A,Wolfe KH

    update_date:2002-06-01 00:00:00

  • Annotation transfer for genomics: measuring functional divergence in multi-domain proteins.

    abstract::Annotation transfer is a principal process in genome annotation. It involves "transferring" structural and functional annotation to uncharacterized open reading frames (ORFs) in a newly completed genome from experimentally characterized proteins similar in sequence. To prevent errors in genome annotation, it is import...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.183801

    authors: Hegyi H,Gerstein M

    update_date:2001-10-01 00:00:00

  • Copy number variation at the breakpoint region of isochromosome 17q.

    abstract::Isochromosome 17q, or i(17q), is one of the most frequent nonrandom changes occurring in human neoplasia. Most of the i(17q) breakpoints cluster within an approximately 240-kb interval located in the Smith-Magenis syndrome common deletion region in 17p11.2. The breakpoint cluster region is characterized by a complex ar...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.080697.108

    authors: Carvalho CM,Lupski JR

    update_date:2008-11-01 00:00:00

  • A platform for curated products from novel open reading frames prompts reinterpretation of disease variants.

    abstract::Recent evidence from proteomics and deep massively parallel sequencing studies has revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.263202.120

    authors: Neville MDC,Kohze R,Erady C,Meena N,Hayden M,Cooper DN,Mort M,Prabakaran S

    update_date:2021-01-19 00:00:00

  • The extensive and condition-dependent nature of epistasis among whole-genome duplicates in yeast.

    abstract::Since complete redundancy between extant duplicates (paralogs) is evolutionarily unfavorable, some degree of functional congruency is eventually lost. However, in budding yeast, experimental evidence collected for duplicated metabolic enzymes and in global physical interaction surveys had suggested widespread function...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.076174.108

    authors: Musso G,Costanzo M,Huangfu M,Smith AM,Paw J,San Luis BJ,Boone C,Giaever G,Nislow C,Emili A,Zhang Z

    update_date:2008-07-01 00:00:00

  • Dynamic changes in replication timing and gene expression during lineage specification of human pluripotent stem cells.

    abstract::Duplication of the genome in mammalian cells occurs in a defined temporal order referred to as its replication-timing (RT) program. RT changes dynamically during development, regulated in units of 400-800 kb referred to as replication domains (RDs). Changes in RT are generally coordinated with transcriptional competen...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.187989.114

    authors: Rivera-Mulia JC,Buckley Q,Sasaki T,Zimmerman J,Didier RA,Nazor K,Loring JF,Lian Z,Weissman S,Robins AJ,Schulz TC,Menendez L,Kulik MJ,Dalton S,Gabr H,Kahveci T,Gilbert DM

    update_date:2015-08-01 00:00:00

  • Reprogramming of the human intestinal epigenome by surgical tissue transposition.

    abstract::Extracellular cues play critical roles in the establishment of the epigenome during development and may also contribute to epigenetic perturbations found in disease states. The direct role of the local tissue environment on the post-development human epigenome, however, remains unclear due to limitations in studies of...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.166439.113

    authors: Lay FD,Triche TJ Jr,Tsai YC,Su SF,Martin SE,Daneshmand S,Skinner EC,Liang G,Chihara Y,Jones PA

    update_date:2014-04-01 00:00:00