Perspectives: sequence data base searching in the era of large-scale genomic sequencing.

Abstract:

Large-scale sequencing of human and model organism genomes will have a profound impact on our ability to use sequence data base searching to predict the biochemical functions of sequences of interest. Despite the great value of more sequences in the data bases, a huge increase in data base size will also have adverse effects on data base searches. Upcoming problems will include (1) greatly increased search times, (2) an increase in background noise of high-scoring but biologically irrelevant matches, (3) inaccurate coding region prediction, leading to problems in protein data base searching, and (4) limited first-pass sequence annotation, making it difficult to determine the biological relevance of data base hits. Improved data base annotation tools and construction of smaller data bases of representative and highly-annotated sequences for first-pass analyses will be essential to deal with the impending flood of new genomic sequence.

journal_name

Genome Res

journal_title

Genome research

authors

Smith RF

doi

10.1101/gr.6.8.653

subject

Has Abstract

pub_date

1996-08-01 00:00:00

pages

653-60

issue

8

eissn

1549-5469

issn

1088-9051

journal_volume

6

pub_type

Journal Article, Review
  • Computational comparison of human genomic sequence assemblies for a region of chromosome 4.

    abstract::Much of the available human genomic sequence data exist in a fragmentary draft state following the completion of the initial high-volume sequencing performed by the International Human Genome Sequencing Consortium (IHGSC) and Celera Genomics (CG). We compared six draft genome assemblies over a region of chromosome 4p ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.207902

    authors: Semple CA,Morris SW,Porteous DJ,Evans KL

    update_date:2002-03-01 00:00:00

  • Bacterial genomes as new gene homes: the genealogy of ORFans in E. coli.

    abstract::Differences in gene repertoire among bacterial genomes are usually ascribed to gene loss or to lateral gene transfer from unrelated cellular organisms. However, most bacteria contain large numbers of ORFans, that is, annotated genes that are restricted to a particular genome and that possess no known homologs. The uni...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.2231904

    authors: Daubin V,Ochman H

    update_date:2004-06-01 00:00:00

  • Genome-wide mapping of human DNA-replication origins: levels of transcription at ORC1 sites regulate origin selection and replication timing.

    abstract::We report the genome-wide mapping of ORC1 binding sites in mammals, by chromatin immunoprecipitation and parallel sequencing (ChIP-seq). ORC1 binding sites in HeLa cells were validated as active DNA replication origins (ORIs) using Repli-seq, a method that allows identification of ORI-containing regions by parallel se...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142331.112

    authors: Dellino GI,Cittaro D,Piccioni R,Luzi L,Banfi S,Segalla S,Cesaroni M,Mendoza-Maldonado R,Giacca M,Pelicci PG

    update_date:2013-01-01 00:00:00

  • Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography.

    abstract::Y chromosome haplotypes are particularly useful in deciphering human evolutionary history because they accentuate the effects of drift, migration, and range expansion. Significant acceleration of Y biallelic marker discovery and subsequent typing involving heteroduplex detection has been achieved by implementing an in...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.7.10.996

    authors: Underhill PA,Jin L,Lin AA,Mehdi SQ,Jenkins T,Vollrath D,Davis RW,Cavalli-Sforza LL,Oefner PJ

    update_date:1997-10-01 00:00:00

  • A bioinformatics-based strategy identifies c-Myc and Cdc25A as candidates for the Apmt mammary tumor latency modifiers.

    abstract::The epistatically interacting modifier loci (Apmt1 and Apmt2) accelerate the polyoma Middle-T (PyVT)-induced mammary tumor. To identify potential candidate genes loci, a combined bioinformatics and genomics strategy was used. On the basis of the assumption that the loci were functioning in the same or intersecting pat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.210502

    authors: Cozma D,Lukes L,Rouse J,Qiu TH,Liu ET,Hunter KW

    update_date:2002-06-01 00:00:00

  • CRISPR RNAs trigger innate immune responses in human cells.

    abstract::Here, we report that CRISPR guide RNAs (gRNAs) with a 5'-triphosphate group (5'-ppp gRNAs) produced via in vitro transcription trigger RNA-sensing innate immune responses in human and murine cells, leading to cytotoxicity. 5'-ppp gRNAs in the cytosol are recognized by DDX58, which in turn activates type I interferon r...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231936.117

    authors: Kim S,Koo T,Jee HG,Cho HY,Lee G,Lim DG,Shin HS,Kim JS

    update_date:2018-02-22 00:00:00

  • The distribution of variation in regulatory gene segments, as present in MHC class II promoters.

    abstract::Diversity in the antigen-binding receptors of the immune system has long been a primary interest of biologists. Recently it has been suggested that polymorphism in regulatory (noncoding) gene segments is of substantial importance as well. Here, we survey the level of variation in MHC class II gene promoters in man and...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.8.2.124

    authors: Cowell LG,Kepler TB,Janitz M,Lauster R,Mitchison NA

    update_date:1998-02-01 00:00:00

  • Relationship between histone modifications and transcription factor binding is protein family specific.

    abstract::The very small fraction of putative binding sites (BSs) that are occupied by transcription factors (TFs) in vivo can be highly variable across different cell types. This observation has been partly attributed to changes in chromatin accessibility and histone modification (HM) patterns surrounding BSs. Previous studies...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.220079.116

    authors: Xin B,Rohs R

    update_date:2018-01-11 00:00:00

  • Analysis of Arabidopsis genome-wide variations before and after meiosis and meiotic recombination by resequencing Landsberg erecta and all four products of a single meiosis.

    abstract::Meiotic recombination, including crossovers (COs) and gene conversions (GCs), impacts natural variation and is an important evolutionary force. COs increase genetic diversity by redistributing existing variation, whereas GCs can alter allelic frequency. Here, we sequenced Arabidopsis Landsberg erecta (Ler) and two set...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.127522.111

    authors: Lu P,Han X,Qi J,Yang J,Wijeratne AJ,Li T,Ma H

    update_date:2012-03-01 00:00:00

  • Capture of a functionally active methyl-CpG binding domain by an arthropod retrotransposon family.

    abstract::The repressive capacity of cytosine DNA methylation is mediated by recruitment of silencing complexes by methyl-CpG binding domain (MBD) proteins. Despite MBD proteins being associated with silencing, we discovered that a family of arthropod Copia retrotransposons have incorporated a host-derived MBD. We functionally ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.243774.118

    authors: de Mendoza A,Pflueger J,Lister R

    update_date:2019-08-01 00:00:00

  • Systematic interrogation of human promoters.

    abstract::Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.236075.118

    authors: Weingarten-Gabbay S,Nir R,Lubliner S,Sharon E,Kalma Y,Weinberger A,Segal E

    update_date:2019-02-01 00:00:00

  • Time course regulatory analysis based on paired expression and chromatin accessibility data.

    abstract::A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility da...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.257063.119

    authors: Duren Z,Chen X,Xin J,Wang Y,Wong WH

    update_date:2020-04-01 00:00:00

  • A platform for curated products from novel open reading frames prompts reinterpretation of disease variants.

    abstract::Recent evidence from proteomics and deep massively parallel sequencing studies have revealed that eukaryotic genomes contain substantial numbers of as-yet-uncharacterized open reading frames (ORFs). We define these uncharacterized ORFs as novel ORFs (nORFs). nORFs in humans are mostly under 100 codons and are found in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.263202.120

    authors: Neville MDC,Kohze R,Erady C,Meena N,Hayden M,Cooper DN,Mort M,Prabakaran S

    update_date:2021-01-19 00:00:00

  • BLAT--the BLAST-like alignment tool.

    abstract::Analyzing vertebrate genomes requires rapid mRNA/DNA and cross-species protein alignments. A new tool, BLAT, is more accurate and 500 times faster than popular existing tools for mRNA/DNA alignments and 50 times faster for protein alignments at sensitivity settings typically used when comparing vertebrate sequences. B...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.229202

    authors: Kent WJ

    update_date:2002-04-01 00:00:00

  • Analysis of the floral transcriptome uncovers new regulators of organ determination and gene families related to flower organ differentiation in Gerbera hybrida (Asteraceae).

    abstract::Development of composite inflorescences in the plant family Asteraceae has features that cannot be studied in the traditional model plants for flower development. In Gerbera hybrida, inflorescences are composed of morphologically different types of flowers tightly packed into a flower head (capitulum). Individual flor...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.3043705

    authors: Laitinen RA,Immanen J,Auvinen P,Rudd S,Alatalo E,Paulin L,Ainasoja M,Kotilainen M,Koskela S,Teeri TH,Elomaa P

    update_date:2005-04-01 00:00:00

  • The portability of tagSNPs across populations: a worldwide survey.

    abstract::In the search for common genetic variants that contribute to prevalent human diseases, patterns of linkage disequilibrium (LD) among linked markers should be considered when selecting SNPs. Genotyping efficiency can be increased by choosing tagging SNPs (tagSNPs) in LD with other SNPs. However, it remains to be seen w...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4138406

    authors: González-Neira A,Ke X,Lao O,Calafell F,Navarro A,Comas D,Cann H,Bumpstead S,Ghori J,Hunt S,Deloukas P,Dunham I,Cardon LR,Bertranpetit J

    update_date:2006-03-01 00:00:00

  • Comparing genomes within the species Mycobacterium tuberculosis.

    abstract::The study of genetic variability within natural populations of pathogens may provide insight into their evolution and pathogenesis. We used a Mycobacterium tuberculosis high-density oligonucleotide microarray to detect small-scale genomic deletions among 19 clinically and epidemiologically well-characterized isolates ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.166401

    authors: Kato-Maeda M,Rhee JT,Gingeras TR,Salamon H,Drenkow J,Smittipat N,Small PM

    update_date:2001-04-01 00:00:00

  • Noncoding origins of anthropoid traits and a new null model of transposon functionalization.

    abstract::Little is known about novel genetic elements that drove the emergence of anthropoid primates. We exploited the sequencing of the marmoset genome to identify 23,849 anthropoid-specific constrained (ASC) regions and confirmed their robust functional signatures. Of the ASC base pairs, 99.7% were noncoding, suggesting tha...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.168963.113

    authors: del Rosario RC,Rayan NA,Prabhakar S

    update_date:2014-09-01 00:00:00

  • DNA methylation profiling in human B cells reveals immune regulatory elements and epigenetic plasticity at Alu elements during B-cell activation.

    abstract::Memory is a hallmark of adaptive immunity, wherein lymphocytes mount a superior response to a previously encountered antigen. It has been speculated that epigenetic alterations in memory lymphocytes contribute to their functional distinction from their naive counterparts. However, the nature and extent of epigenetic a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.155473.113

    authors: Lai AY,Mav D,Shah R,Grimm SA,Phadke D,Hatzi K,Melnick A,Geigerman C,Sobol SE,Jaye DL,Wade PA

    update_date:2013-12-01 00:00:00

  • Genomic organization of the sex-determining and adjacent regions of the sex chromosomes of medaka.

    abstract::Sequencing of the human Y chromosome has uncovered the peculiarities of the genomic organization of a heterogametic sex chromosome of old evolutionary age, and has led to many insights into the evolutionary changes that occurred during its long history. We have studied the genomic organization of the medaka fish Y chr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.5016106

    authors: Kondo M,Hornung U,Nanda I,Imai S,Sasaki T,Shimizu A,Asakawa S,Hori H,Schmid M,Shimizu N,Schartl M

    update_date:2006-07-01 00:00:00

  • Immune signatures correlate with L1 retrotransposition in gastrointestinal cancers.

    abstract::Long interspersed nuclear element-1 (LINE-1 or L1) retrotransposons are normally suppressed in somatic tissues mainly due to DNA methylation and antiviral defense. However, the mechanism to suppress L1s may be disrupted in cancers, thus allowing L1s to act as insertional mutagens and cause genomic rearrangement and in...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231837.117

    authors: Jung H,Choi JK,Lee EA

    update_date:2018-08-01 00:00:00

  • The identification and functional annotation of RNA structures conserved in vertebrates.

    abstract::Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than seq...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.208652.116

    authors: Seemann SE,Mirza AH,Hansen C,Bang-Berthelsen CH,Garde C,Christensen-Dalsgaard M,Torarinsson E,Yao Z,Workman CT,Pociot F,Nielsen H,Tommerup N,Ruzzo WL,Gorodkin J

    update_date:2017-08-01 00:00:00

  • Efficient identification of Y chromosome sequences in the human and Drosophila genomes.

    abstract::Notwithstanding their biological importance, Y chromosomes remain poorly known in most species. A major obstacle to their study is the identification of Y chromosome sequences; due to its high content of repetitive DNA, in most genome projects, the Y chromosome sequence is fragmented into a large number of small, unma...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.156034.113

    authors: Carvalho AB,Clark AG

    update_date:2013-11-01 00:00:00

  • Evolutionary dynamics of segmental duplications from human Y-chromosomal euchromatin/heterochromatin transition regions.

    abstract::Human chromosomal regions enriched in segmental duplications are subject to extensive genomic reorganization. Such regions are particularly informative for illuminating the evolutionary history of a given chromosome. We have analyzed 866 kb of Y-chromosomal non-palindromic segmental duplications delineating four euchr...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.076711.108

    authors: Kirsch S,Münch C,Jiang Z,Cheng Z,Chen L,Batz C,Eichler EE,Schempp W

    update_date:2008-07-01 00:00:00

  • Annotation transfer for genomics: measuring functional divergence in multi-domain proteins.

    abstract::Annotation transfer is a principal process in genome annotation. It involves "transferring" structural and functional annotation to uncharacterized open reading frames (ORFs) in a newly completed genome from experimentally characterized proteins similar in sequence. To prevent errors in genome annotation, it is import...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.183801

    authors: Hegyi H,Gerstein M

    update_date:2001-10-01 00:00:00

  • Nutritional control of mRNA isoform expression during developmental arrest and recovery in C. elegans.

    abstract::Nutrient availability profoundly influences gene expression. Many animal genes encode multiple transcript isoforms, yet the effect of nutrient availability on transcript isoform expression has not been studied in genome-wide fashion. When Caenorhabditis elegans larvae hatch without food, they arrest development in the...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.133587.111

    authors: Maxwell CS,Antoshechkin I,Kurhanewicz N,Belsky JA,Baugh LR

    update_date:2012-10-01 00:00:00

  • Novel susceptibility locus for mouse hepatomas: evidence for a conserved tumor suppressor gene.

    abstract::We have identified previously a putative tumor suppressor gene (TSG) locus at human chromosome (hchr) 7q31 showing that it is altered in a variety of human epithelial tumors. To determine whether this TSG is conserved in mice, we studied loss of heterozygosity (LOH) in chemically induced mouse liver adenomas. The LOH ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.11.1070

    authors: Zenklusen JC,Rodriguez LV,LaCava M,Wang Z,Goldstein LS,Conti CJ

    update_date:1996-11-01 00:00:00

  • A nuclear matrix attachment site in the 4q35 locus has an enhancer-blocking activity in vivo: implications for the facio-scapulo-humeral dystrophy.

    abstract::Facio-scapulo-humeral dystrophy (FSHD), a muscular hereditary disease with a prevalence of 1 in 20,000, is caused by a partial deletion of a subtelomeric repeat array on chromosome 4q. Earlier, we demonstrated the existence in the vicinity of the D4Z4 repeat of a nuclear matrix attachment site, FR-MAR, efficient in no...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6620908

    authors: Petrov A,Allinne J,Pirozhkova I,Laoudj D,Lipinski M,Vassetzky YS

    update_date:2008-01-01 00:00:00

  • Connecting sequence and biology in the laboratory mouse.

    abstract::The Mouse Genome Sequencing Consortium and the RIKEN Genome Exploration Research group have generated large sets of sequence data representing the mouse genome and transcriptome, respectively. These data provide a valuable foundation for genomic research. The challenges for the informatics community are how to integrat...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.991003

    authors: Baldarelli RM,Hill DP,Blake JA,Adachi J,Furuno M,Bradt D,Corbani LE,Cousins S,Frazer KS,Qi D,Yang L,Ramachandran S,Reed D,Zhu Y,Kasukawa T,Ringwald M,King BL,Maltais LJ,McKenzie LM,Schriml LM,Maglott D,Church DM

    update_date:2003-06-01 00:00:00

  • A scalable high-throughput chemical synthesizer.

    abstract::A machine that employs a novel reagent delivery technique for biomolecular synthesis has been developed. This machine separates the addressing of individual synthesis sites from the actual process of reagent delivery by using masks placed over the sites. Because of this separation, this machine is both cost-effective ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.359002

    authors: Livesay EA,Liu YH,Luebke KJ,Irick J,Belosludtsev Y,Rayner S,Balog R,Johnston SA

    update_date:2002-12-01 00:00:00