Abstract:
Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual at an early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify 10% contaminated samples at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to the genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or TRACE, while simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.
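The abstract describes the likelihood framework only at a high level. The following is a minimal sketch of one way such a model could look, not the authors' implementation. It assumes biallelic sites, Hardy-Weinberg genotype priors, a fixed per-base error rate, and an individual-specific allele frequency that is linear in the principal component coordinates. All function names, parameters, and the grid search are hypothetical choices made for illustration.

```python
import math


def individual_af(mean_af, pc_coords, loadings, eps=1e-3):
    """Assumed form: allele frequency as a linear function of PC coordinates,
    clipped to (eps, 1 - eps) so that genotype priors stay positive."""
    f = mean_af + sum(x * v for x, v in zip(pc_coords, loadings))
    return min(max(f, eps), 1.0 - eps)


def hwe_priors(f):
    """Hardy-Weinberg priors for genotypes with 0, 1, 2 alternate alleles."""
    return [(1.0 - f) ** 2, 2.0 * f * (1.0 - f), f ** 2]


def site_log_likelihood(n_ref, n_alt, f_intended, f_contam, alpha, err=0.01):
    """Log-likelihood of the read counts at one site, summing over the unknown
    genotypes of the intended sample (g1) and the contaminating sample (g2)."""
    p1 = hwe_priors(f_intended)
    p2 = hwe_priors(f_contam)
    total = 0.0
    for g1 in range(3):
        for g2 in range(3):
            # Expected alternate-allele fraction in the DNA mixture.
            q = (1.0 - alpha) * g1 / 2.0 + alpha * g2 / 2.0
            # Probability that a read shows the alternate allele, with base errors.
            p_alt = q * (1.0 - err) + (1.0 - q) * err
            total += p1[g1] * p2[g2] * p_alt ** n_alt * (1.0 - p_alt) ** n_ref
    return math.log(total)


def estimate_alpha(sites, grid_size=101, max_alpha=0.5):
    """Grid-search estimate of the contamination rate alpha.
    Each site is a tuple (n_ref, n_alt, f_intended, f_contam)."""
    best_alpha, best_ll = 0.0, -math.inf
    for i in range(grid_size):
        alpha = max_alpha * i / (grid_size - 1)
        ll = sum(site_log_likelihood(r, a, f1, f2, alpha) for r, a, f1, f2 in sites)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha
```

In this sketch the ancestry of each sample enters only through `f_intended` and `f_contam`. A joint estimator in the spirit of the abstract would also optimize the PC coordinates passed to `individual_af` for both samples instead of treating the frequencies as known.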
journal_name: Genome Res
journal_title: Genome research
authors: Zhang F, Flickinger M, Taliun SAG, InPSYght Psychiatric Genetics Consortium, Abecasis GR, Scott LJ, McCaroll SA, Pato CN, Boehnke M, Kang HM
doi: 10.1101/gr.246934.118
subject: Has Abstract
pub_date: 2020-02-01 00:00:00
pages: 185-194
issue: 2
eissn: 1088-9051
issn: 1549-5469
pii: gr.246934.118
journal_volume: 30
pub_type: Journal article
Related literature
GENOME RESEARCH literature collection
abstract::We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal change...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.828403
Updated: 2003-01-01 00:00:00
abstract::The rate of transcription elongation plays an important role in the timing of expression of full-length transcripts as well as in the regulation of alternative splicing. In this study, we coupled Bru-seq technology with 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) to estimate the elongation rates of over 2000 ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.171405.113
Updated: 2014-06-01 00:00:00
abstract::Here, we report that CRISPR guide RNAs (gRNAs) with a 5'-triphosphate group (5'-ppp gRNAs) produced via in vitro transcription trigger RNA-sensing innate immune responses in human and murine cells, leading to cytotoxicity. 5'-ppp gRNAs in the cytosol are recognized by DDX58, which in turn activates type I interferon r...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.231936.117
Updated: 2018-02-22 00:00:00
abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.142521.112
Updated: 2013-04-01 00:00:00
abstract::Chromosomal aberrations have been thought to be random events. However, recent findings introduce a new paradigm in which certain DNA segments have the potential to adopt unusual conformations that lead to genomic instability and nonrandom chromosomal rearrangement. One of the best-studied examples is the palindromic ...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.079244.108
Updated: 2009-02-01 00:00:00
abstract::An open question in bacterial genomics is the role that adaptive evolution of the core genome plays in diversification and adaptation of bacterial species, and how this might differ between groups of bacteria occupying different environmental circumstances. The genus Campylobacter encompasses several important human a...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.089250.108
Updated: 2009-07-01 00:00:00
abstract::The sequence of the first plant genome was completed and published at the end of 2000. This spawned a series of large-scale projects aimed at discovering the functions of the 25,000+ genes identified in Arabidopsis thaliana (Arabidopsis). This review summarizes progress made in the past five years and speculates about...
journal_title:Genome research
pub_type: Journal article, Review
doi:10.1101/gr.3723405
Updated: 2005-12-01 00:00:00
abstract::The mammalian cell nucleus contains numerous discrete suborganelles named nuclear bodies. While recruitment of specific genomic regions into these large ribonucleoprotein (RNP) complexes critically contributes to higher-order functional chromatin organization, such regions remain ill-defined. We have developed the hig...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.237073.118
Updated: 2018-11-01 00:00:00
abstract::We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors,...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.094482.109
Updated: 2009-10-01 00:00:00
abstract::RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolki...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.207902.116
Updated: 2016-10-01 00:00:00
abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.136549.111
Updated: 2012-10-01 00:00:00
abstract::We report on the development of a methylation analysis workflow for optical detection of fluorescent methylation profiles along chromosomal DNA molecules. In combination with Bionano Genomics genome mapping technology, these profiles provide a hybrid genetic/epigenetic genome-wide map composed of DNA molecules spannin...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.240739.118
Updated: 2019-04-01 00:00:00
abstract::Hammerhead ribozymes previously were found in satellite RNAs from plant viroids and in repetitive DNA from certain species of newts and schistosomes. To determine if this catalytic RNA motif has a wider distribution, we decided to scrutinize the GenBank database for RNAs that contain hammerhead or hammerhead-like moti...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.10.7.1011
Updated: 2000-07-01 00:00:00
abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.904303
Updated: 2003-05-01 00:00:00
abstract::Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs)...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.107995.110
Updated: 2010-11-01 00:00:00
abstract::The capacity of the honey bee to produce three phenotypically distinct organisms (two female castes; queens and sterile workers, and haploid male drones) from one genotype represents one of the most remarkable examples of developmental plasticity in any phylum. The queen-worker morphological and reproductive divide is...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.236497.118
Updated: 2018-10-01 00:00:00
abstract::Although much is known about genetic variation in human and African great ape (chimpanzee, bonobo, and gorilla) genomes, substantially less is known about variation in gene-expression profiles within and among these species. This information is necessary for defining transcriptional regulatory networks that contribute...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.1289803
Updated: 2003-07-01 00:00:00
abstract::Comparative genomics provides a general methodology for discovering functional DNA elements and understanding their evolution. The availability of many related genomes enables more powerful analyses, but requires rigorous phylogenetic methods to resolve orthologous genes and regions. Here, we use 12 recently sequenced...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.7105007
Updated: 2007-12-01 00:00:00
abstract::An important aspect of understanding a biological pathway is to delineate the transcriptional regulatory mechanisms of the genes involved. Two important tasks are often encountered when studying transcription regulation, i.e., (1) the identification of common transcriptional regulators of a set of coexpressed genes; (...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.4303406
Updated: 2006-03-01 00:00:00
abstract::Increasing evidence suggests that interactions between regulatory genomic elements play an important role in regulating gene expression. We generated a genome-wide interaction map of regulatory elements in human cells (ENCODE tier 1 cells, K562, GM12878) using Chromatin Interaction Analysis by Paired-End Tag sequencin...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.176586.114
Updated: 2014-12-01 00:00:00
abstract::Gene expression can be regulated at multiple levels, but it is not known if and how there is broad coordination between regulation at the transcriptional and post-transcriptional levels. Transcription factors and chromatin regulate gene expression transcriptionally, whereas microRNAs (miRNAs) are small regulatory RNAs...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.238311.118
Updated: 2019-02-01 00:00:00
abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6661308
Updated: 2008-02-01 00:00:00
abstract::In positional cloning the initial assignment of a gene to a specific chromosomal locus is followed by physical mapping of the critical region. The construction of a high-resolution physical map still involves considerable effort. However, new high-resolution fluorescence in situ hybridization (FISH) techniques have fa...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6.10.1002
Updated: 1996-10-01 00:00:00
abstract::Here we describe a high-throughput screen to isolate transcripts with spatially restricted patterns of expression in early embryos. Our approach utilizes robotic automation for rapid analysis of sequence-selected cDNAs in a whole-mount in situ hybridization assay. We determined the spatial distribution of a random col...
journal_title:Genome research
pub_type: Letter
doi:10.1101/gr.84402
Updated: 2002-07-01 00:00:00
abstract::The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Cel...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.771603
Updated: 2003-09-01 00:00:00
abstract::Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy u...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.137406.112
Updated: 2012-12-01 00:00:00
abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications, from answering questions about human evolution to locating regions in the human genome containing di...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.115360.110
Updated: 2011-07-01 00:00:00
abstract::DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexit...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.313703
Updated: 2003-02-01 00:00:00
abstract::Most new genes arise by duplication of existing gene structures, after which relaxed selection on the new copy frequently leads to mutational inactivation of the duplicate; only rarely will a new gene with modified function emerge. Here we describe a unique mechanism of gene creation, whereby new combinations of funct...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.6252107
Updated: 2007-08-01 00:00:00
abstract::Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular prot...
journal_title:Genome research
pub_type: Journal article
doi:10.1101/gr.082214.108
Updated: 2009-06-01 00:00:00