Abstract:
Gene order in prokaryotes is conserved to a much lesser extent than protein sequences. Only several operons, primarily those that code for physically interacting proteins, are conserved in all or most of the bacterial and archaeal genomes. Nevertheless, even the limited conservation of operon organization that is observed can provide valuable evolutionary and functional clues through multiple genome comparisons. A program for constructing gapped local alignments of conserved gene strings in two genomes was developed. The statistical significance of the local alignments was assessed using Monte Carlo simulations. Sets of local alignments were generated for all pairs of completely sequenced bacterial and archaeal genomes, and for each genome a template-anchored multiple alignment was constructed. In most pairwise genome comparisons, <10% of the genes in each genome belonged to conserved gene strings. When closely related pairs of species (i.e., two mycoplasmas) are excluded, the total coverage of genomes by conserved gene strings ranged from <5% for the cyanobacterium Synechocystis sp. to 24% for the minimal genome of Mycoplasma genitalium, and 23% in Thermotoga maritima. The coverage of the archaeal genomes was only slightly lower than that of bacterial genomes. The majority of the conserved gene strings are known operons, with the ribosomal superoperon being the top-scoring string in most genome comparisons. However, in some of the bacterial-archaeal pairs, the superoperon is rearranged to the extent that other operons, primarily those subject to horizontal transfer, show the greatest level of conservation, such as the archaeal-type H+-ATPase operon or ABC-type transport cassettes. The level of gene order conservation among prokaryotic genomes was compared to the cooccurrence of genomes in clusters of orthologous genes (COGs) and to the conservation of protein sequences themselves. Only limited correlation was observed between these evolutionary variables.
Gene order conservation shows a much lower variance than the cooccurrence of genomes in COGs, which indicates that intragenome homogenization via recombination occurs in evolution much faster than intergenome homogenization via horizontal gene transfer and lineage-specific gene loss. The potential of using template-anchored multiple-genome alignments for predicting functions of uncharacterized genes was quantitatively assessed. Functions were predicted or significantly clarified for approximately 90 COGs (approximately 4% of the total of 2414 analyzed COGs). The most significant predictions were obtained for the poorly characterized archaeal genomes; these include a previously uncharacterized restriction-modification system, a nuclease-helicase combination implicated in DNA repair, and the probable archaeal counterpart of the eukaryotic exosome. Multiple genome alignments are a resource for studies on operon rearrangement and disruption, which is central to our understanding of the evolution of prokaryotic genomes. Because of the rapid evolution of the gene order, the potential of genome alignment for prediction of gene functions is limited, but nevertheless, such predictions significantly complement the results obtained through protein sequence and structure analysis.
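The abstract mentions gapped local alignment of gene strings and Monte Carlo significance testing but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' program: it assumes each genome is a list of gene-family labels (for example COG identifiers), uses a Smith-Waterman-style recurrence with a linear gap penalty, and estimates significance by shuffling one genome. The function names, scoring values, and example labels are all hypothetical.

```python
import random


def local_align_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two gene strings.

    a, b: sequences of gene-family labels (assumed representation).
    match, mismatch, gap: illustrative scores, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def monte_carlo_p_value(a, b, trials=200, seed=0):
    """Estimate how often a shuffled gene order scores at least as well."""
    rng = random.Random(seed)
    observed = local_align_score(a, b)
    shuffled = list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if local_align_score(a, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (trials + 1)


# Hypothetical gene strings: g1..g5 form a conserved string, disrupted in b.
genome_a = ["g1", "g2", "g3", "g4", "g5", "x1", "x2"]
genome_b = ["y1", "g1", "g2", "y2", "g4", "g5", "y3"]
observed, p_value = monte_carlo_p_value(genome_a, genome_b)
```

A real analysis would also need to handle gene orientation, paralogs, and circular chromosomes, none of which this sketch attempts.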
journal_name: Genome Res
journal_title: Genome research
authors: Wolf YI, Rogozin IB, Kondrashov AS, Koonin EV
doi: 10.1101/gr.gr-1619r
subject: Has Abstract
pub_date: 2001-03-01 00:00:00
pages: 356-72
issue: 3
eissn: 1088-9051
issn: 1549-5469
journal_volume: 11
pub_type: Journal Article
Related literature
GENOME RESEARCH literature collection
abstract::By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by m...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.4039406
updated:2006-01-01 00:00:00
abstract::Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing represents a powerful approach for the discovery of the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.100040.109
updated:2010-04-01 00:00:00
abstract::We have cloned the human gene encoding the transcription factor T. T protein is vital for the formation of posterior mesoderm and axial development in all vertebrates. Brachyury mutant mice, which lack T protein, die in utero with abnormal notochord, posterior somites, and allantois. We have identified human T genomic...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.6.3.226
updated:1996-03-01 00:00:00
abstract::We have developed a new tool to visualize expression data on metabolic pathways and to evaluate which metabolic pathways are most affected by transcriptional changes in whole-genome expression experiments. Using the Fisher Exact Test, the method scores biochemical pathways according to the probability that as many or ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.226602
updated:2002-07-01 00:00:00
abstract::Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data al...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.213405.116
updated:2017-05-01 00:00:00
abstract::The human genome is estimated to contain 23,000 to 33,000 retropseudogenes. To study the properties of genes giving rise to these retroelements, we compared the structure and expression of genes with or without known retropseudogenes. Four main features have emerged from the analysis of 181 genes associated to retrops...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.10.5.672
updated:2000-05-01 00:00:00
abstract::An open question in bacterial genomics is the role that adaptive evolution of the core genome plays in diversification and adaptation of bacterial species, and how this might differ between groups of bacteria occupying different environmental circumstances. The genus Campylobacter encompasses several important human a...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.089250.108
updated:2009-07-01 00:00:00
abstract::We generated high-resolution maps of histone H3 lysine 9/14 acetylation (H3ac), histone H4 lysine 5/8/12/16 acetylation (H4ac), and histone H3 at lysine 4 mono-, di-, and trimethylation (H3K4me1, H3K4me2, H3K4me3, respectively) across the ENCODE regions. Studying each modification in five human cell lines including th...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5704207
updated:2007-06-01 00:00:00
abstract::Retrotransposons have proliferated extensively in eukaryotic lineages; the genomes of many animals and plants comprise 50% or more retrotransposon sequences by weight. There are several persuasive arguments that the enzymatic lynchpin of retrotransposon replication, reverse transcriptase (RT), is an ancient enzyme. Mo...
journal_title:Genome research
pub_type: Journal Article, Review
doi:10.1101/gr.1392003
updated:2003-09-01 00:00:00
abstract::Forty-three yeast artificial chromosomes (YACs) from the X chromosome have been overlapped across the 4-Mb Xq21.3 region, which is homologous to a segment in Yp11.1. The region is formatted to 60-kb resolution with 57 STSs and is merged at its edges with contigs specific for X. This allows a direct comparison of marke...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.7.4.307
updated:1997-04-01 00:00:00
abstract::Accurate gene tree-species tree reconciliation is fundamental to inferring the evolutionary history of a gene family. However, although it has long been appreciated that population-related effects such as incomplete lineage sorting (ILS) can dramatically affect the gene tree, many of the most popular reconciliation me...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.161968.113
updated:2014-03-01 00:00:00
abstract::Mammalian genomes are partitioned into domains that replicate in a defined temporal order. These domains can replicate at similar times in all cell types (constitutive) or at cell type-specific times (developmental). Genome-wide chromatin conformation capture (Hi-C) has revealed sub-megabase topologically associating ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.183699.114
updated:2015-08-01 00:00:00
abstract::The study of genetic variability within natural populations of pathogens may provide insight into their evolution and pathogenesis. We used a Mycobacterium tuberculosis high-density oligonucleotide microarray to detect small-scale genomic deletions among 19 clinically and epidemiologically well-characterized isolates ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.166401
updated:2001-04-01 00:00:00
abstract::Size sexual dimorphism occurs in almost all mammals. In Portuguese Water Dogs, much of the difference in skeletal size between females and males is due to the interaction between a Quantitative Trait Locus (QTL) on the X-chromosome and a QTL linked to Insulin-like Growth Factor 1 (IGF-1) on the CFA 15 autosome. In fem...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.3712705
updated:2005-12-01 00:00:00
abstract::In this study we quantify the features of meiotic recombination on the long arm of human chromosome 21. We constructed a 67.3-centimorgan (cM) high-resolution, comprehensive, and accurate genetic linkage map of chromosome 21q using 187 highly polymorphic markers covering almost the entire long arm; 46 loci, consistin...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.138100
updated:2000-09-01 00:00:00
abstract::Trace Recalling is a novel method for deconvoluting double traces that result from simultaneously sequencing two DNA templates. Trace Recalling identifies up to two bases at each position of such a trace. The resulting ambiguity sequence is aligned to the genome, identifying one template sequence. A second template se...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5661407
updated:2007-02-01 00:00:00
abstract::Dictyostelium discoideum (DD), an extensively studied model organism for cell and developmental biology, belongs to the most derived group 4 of social amoebas, a clade of altruistic multicellular organisms. To understand genome evolution over long time periods and the genetic basis of social evolution, we sequenced th...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.121137.111
updated:2011-11-01 00:00:00
abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications-from answering questions about human evolution to locating regions in the human genome containing di...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.115360.110
updated:2011-07-01 00:00:00
abstract::Whole-genome sequencing using massively parallel sequencing technologies enables accurate detection of somatic rearrangements in cancer. Pinpointing large numbers of rearrangement breakpoints to base-pair resolution allows analysis of rearrangement microhomology and genomic location for every sample. Here we analyze 9...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.141382.112
updated:2013-02-01 00:00:00
abstract::MicroRNAs (miRNAs) are major post-transcriptional regulators of gene expression, yet their origins and functional evolution in mammals remain little understood due to the lack of appropriate comparative data. Using RNA sequencing, we have generated extensive and comparable miRNA data for five organs in six species tha...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.140269.112
updated:2013-01-01 00:00:00
abstract::Integrating the genotype with epigenetic marks holds the promise of better understanding the biology that underlies the complex interactions of inherited and environmental components that define the developmental origins of a range of disorders. The quality of the in utero environment significantly influences health o...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.171439.113
updated:2014-07-01 00:00:00
abstract::We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which a...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.188839.114
updated:2016-01-01 00:00:00
abstract::Repetitive DNA is a significant component of eukaryotic genomes. We have developed a strategy to efficiently and accurately sequence repetitive DNA in the nematode Caenorhabditis elegans using integrated artificial transposons and automated fluorescent sequencing. Mapping and assembly tools represent important compone...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.7.5.551
updated:1997-05-01 00:00:00
abstract::The recently identified mouse obese (ob) gene apparently encodes a secreted protein that may function in the signaling pathway of adipose tissue. Mutations in the mouse ob gene are associated with the early development of gross obesity. A detailed knowledge concerning the RNA expression pattern and precise genomic loc...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.5.1.5
updated:1995-08-01 00:00:00
abstract::Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.236075.118
updated:2019-02-01 00:00:00
abstract::The vomeronasal system of mice is thought to be specialized in the detection of pheromones. Two multigene families have been identified that encode proteins with seven putative transmembrane domains and that are expressed selectively in subsets of neurons of the vomeronasal organ. The products of these vomeronasal rec...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.10.12.1958
updated:2000-12-01 00:00:00
abstract::Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering ...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.3176505
updated:2005-04-01 00:00:00
abstract::The mammalian cell nucleus contains numerous discrete suborganelles named nuclear bodies. While recruitment of specific genomic regions into these large ribonucleoprotein (RNP) complexes critically contributes to higher-order functional chromatin organization, such regions remain ill-defined. We have developed the hig...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.237073.118
updated:2018-11-01 00:00:00
abstract::We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3' end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.115469.110
updated:2011-07-01 00:00:00
abstract::We compared the genome of the nematode Caenorhabditis elegans to 13% of that of Caenorhabditis briggsae, identifying 252 conserved segments along their chromosomes. We detected 517 chromosomal rearrangements, with the ratio of translocations to inversions to transpositions being approximately 1:1:2. We estimate that t...
journal_title:Genome research
pub_type: Journal Article
doi:10.1101/gr.172702
updated:2002-06-01 00:00:00