Ancestry-agnostic estimation of DNA sample contamination from sequence reads.

Abstract:

Detecting and estimating DNA sample contamination are important steps to ensure high-quality genotype calls and reliable downstream analysis. Existing methods rely on population allele frequency information for accurate estimation of contamination rates. Correctly specifying population allele frequencies for each individual at an early stage of sequence analysis is impractical or even impossible for large-scale sequencing centers that simultaneously process samples from multiple studies across diverse populations. On the other hand, incorrectly specified allele frequencies may result in substantial bias in estimated contamination rates. For example, we observed that existing methods often fail to identify samples with 10% contamination at a typical 3% contamination exclusion threshold when genetic ancestry is misspecified. Such incomplete screening of contaminated samples substantially inflates the estimated rate of genotyping errors even in deeply sequenced genomes and exomes. We propose a robust statistical method that accurately estimates DNA contamination and is agnostic to the genetic ancestry of the intended or contaminating sample. Our method integrates the estimation of genetic ancestry and DNA contamination in a unified likelihood framework by leveraging individual-specific allele frequencies projected from reference genotypes onto principal component coordinates. Our method can also be used for estimating genetic ancestries, similar to LASER or TRACE, but simultaneously accounting for potential contamination. We demonstrate that our method robustly estimates contamination rates and genetic ancestries across populations and contamination scenarios. We further demonstrate that, in the presence of contamination, genetic ancestry inference can be substantially biased with existing methods that ignore contamination, while our method corrects for such biases.
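
The likelihood idea in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation. It assumes a simplified two-sample mixture model at biallelic sites, with Hardy-Weinberg genotype priors, a fixed sequencing error rate, and a single alternate-allele frequency `f` per site supplied by the caller. The method described above would instead derive individual-specific allele frequencies from principal component coordinates and estimate them jointly with the contamination rate. The function names, the error rate of 0.01, the grid search, and the example read counts are all assumptions made for illustration.

```python
import math

ERROR_RATE = 0.01  # assumed per-base sequencing error rate (illustrative only)


def genotype_log_priors(f):
    """Hardy-Weinberg log-priors for 0, 1, 2 alt alleles; requires 0 < f < 1."""
    return [2 * math.log(1 - f), math.log(2 * f * (1 - f)), 2 * math.log(f)]


def site_log_likelihood(alt, depth, f, alpha, err=ERROR_RATE):
    """Log-likelihood of `alt` alt reads out of `depth` at one site.

    Marginalizes over the unknown genotypes of the intended sample (g1)
    and the contaminating sample (g2); alpha is the contamination rate.
    The binomial coefficient is omitted because it does not depend on alpha.
    """
    priors = genotype_log_priors(f)
    terms = []
    for g1 in range(3):
        for g2 in range(3):
            # Expected alt-read fraction in the mixture before sequencing error.
            p_alt = (1 - alpha) * g1 / 2 + alpha * g2 / 2
            # Fraction of reads observed as alt once errors are allowed.
            p_obs = p_alt * (1 - err) + (1 - p_alt) * err
            terms.append(priors[g1] + priors[g2]
                         + alt * math.log(p_obs)
                         + (depth - alt) * math.log(1 - p_obs))
    # Log-sum-exp keeps the sum stable at high read depth.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def estimate_contamination(sites, max_alpha=0.5, steps=100):
    """Grid-search maximum-likelihood estimate of alpha.

    `sites` is a list of (alt_reads, depth, alt_allele_frequency) tuples.
    """
    grid = [max_alpha * i / steps for i in range(steps + 1)]

    def total(alpha):
        return sum(site_log_likelihood(a, d, f, alpha) for a, d, f in sites)

    return max(grid, key=total)


# Hypothetical read counts constructed to match a 10% contaminated sample.
example_sites = [(10, 1000, 0.3), (59, 1000, 0.3), (108, 1000, 0.3),
                 (451, 1000, 0.3), (500, 1000, 0.3), (549, 1000, 0.3),
                 (892, 1000, 0.3), (941, 1000, 0.3), (990, 1000, 0.3)]
print(estimate_contamination(example_sites))
```

In this toy model the allele frequency enters only through the genotype priors, which is why a misspecified `f` can bias the estimate of `alpha`, the failure mode the abstract describes.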

journal_name

Genome Res

journal_title

Genome research

authors

Zhang F,Flickinger M,Taliun SAG,InPSYght Psychiatric Genetics Consortium,Abecasis GR,Scott LJ,McCaroll SA,Pato CN,Boehnke M,Kang HM

doi

10.1101/gr.246934.118

pub_date

2020-02-01 00:00:00

pages

185-194

issue

2

eissn

1549-5469

issn

1088-9051

pii

gr.246934.118

journal_volume

30

pub_type

Journal Article

  • Whole-genome sequence assembly for mammalian genomes: Arachne 2.

    abstract::We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal change...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.828403

    authors: Jaffe DB,Butler J,Gnerre S,Mauceli E,Lindblad-Toh K,Mesirov JP,Zody MC,Lander ES

    update_date:2003-01-01 00:00:00

  • Rate of elongation by RNA polymerase II is associated with specific gene features and epigenetic modifications.

    abstract::The rate of transcription elongation plays an important role in the timing of expression of full-length transcripts as well as in the regulation of alternative splicing. In this study, we coupled Bru-seq technology with 5,6-dichlorobenzimidazole 1-β-D-ribofuranoside (DRB) to estimate the elongation rates of over 2000 ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.171405.113

    authors: Veloso A,Kirkconnell KS,Magnuson B,Biewen B,Paulsen MT,Wilson TE,Ljungman M

    update_date:2014-06-01 00:00:00

  • CRISPR RNAs trigger innate immune responses in human cells.

    abstract::Here, we report that CRISPR guide RNAs (gRNAs) with a 5'-triphosphate group (5'-ppp gRNAs) produced via in vitro transcription trigger RNA-sensing innate immune responses in human and murine cells, leading to cytotoxicity. 5'-ppp gRNAs in the cytosol are recognized by DDX58, which in turn activates type I interferon r...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.231936.117

    authors: Kim S,Koo T,Jee HG,Cho HY,Lee G,Lim DG,Shin HS,Kim JS

    update_date:2018-02-22 00:00:00

  • A cross-platform analysis of 14,177 expression quantitative trait loci derived from lymphoblastoid cell lines.

    abstract::Gene expression levels can be an important link between DNA variation and phenotypic manifestations. Our previous map of global gene expression, based on ~400K single nucleotide polymorphisms (SNPs) and 50K transcripts in 400 sib pairs from the MRCA family panel, has been widely used to interpret the results of genome...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.142521.112

    authors: Liang L,Morar N,Dixon AL,Lathrop GM,Abecasis GR,Moffatt MF,Cookson WO

    update_date:2013-04-01 00:00:00

  • Chromosomal instability mediated by non-B DNA: cruciform conformation and not DNA sequence is responsible for recurrent translocation in humans.

    abstract::Chromosomal aberrations have been thought to be random events. However, recent findings introduce a new paradigm in which certain DNA segments have the potential to adopt unusual conformations that lead to genomic instability and nonrandom chromosomal rearrangement. One of the best-studied examples is the palindromic ...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.079244.108

    authors: Inagaki H,Ohye T,Kogo H,Kato T,Bolor H,Taniguchi M,Shaikh TH,Emanuel BS,Kurahashi H

    update_date:2009-02-01 00:00:00

  • Pervasive, genome-wide positive selection leading to functional divergence in the bacterial genus Campylobacter.

    abstract::An open question in bacterial genomics is the role that adaptive evolution of the core genome plays in diversification and adaptation of bacterial species, and how this might differ between groups of bacteria occupying different environmental circumstances. The genus Campylobacter encompasses several important human a...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.089250.108

    authors: Lefébure T,Stanhope MJ

    update_date:2009-07-01 00:00:00

  • The Arabidopsis genome: a foundation for plant research.

    abstract::The sequence of the first plant genome was completed and published at the end of 2000. This spawned a series of large-scale projects aimed at discovering the functions of the 25,000+ genes identified in Arabidopsis thaliana (Arabidopsis). This review summarizes progress made in the past five years and speculates about...

    journal_title:Genome research

    pub_type: Journal Article,Review

    doi:10.1101/gr.3723405

    authors: Bevan M,Walsh S

    update_date:2005-12-01 00:00:00

  • High-salt-recovered sequences are associated with the active chromosomal compartment and with large ribonucleoprotein complexes including nuclear bodies.

    abstract::The mammalian cell nucleus contains numerous discrete suborganelles named nuclear bodies. While recruitment of specific genomic regions into these large ribonucleoprotein (RNP) complexes critically contributes to higher-order functional chromatin organization, such regions remain ill-defined. We have developed the hig...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.237073.118

    authors: Baudement MO,Cournac A,Court F,Seveno M,Parrinello H,Reynes C,Sabatier R,Bouschet T,Yi Z,Sallis S,Tancelin M,Rebouissou C,Cathala G,Lesne A,Mozziconacci J,Journot L,Forné T

    update_date:2018-11-01 00:00:00

  • Next-generation tag sequencing for cancer gene expression profiling.

    abstract::We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors,...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.094482.109

    authors: Morrissy AS,Morin RD,Delaney A,Zeng T,McDonald H,Jones S,Zhao Y,Hirst M,Marra MA

    update_date:2009-10-01 00:00:00

  • End Sequence Analysis Toolkit (ESAT) expands the extractable information from single-cell RNA-seq data.

    abstract::RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolki...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.207902.116

    authors: Derr A,Yang C,Zilionis R,Sergushichev A,Blodgett DM,Redick S,Bortell R,Luban J,Harlan DM,Kadener S,Greiner DL,Klein A,Artyomov MN,Garber M

    update_date:2016-10-01 00:00:00

  • Natural genetic variation in yeast longevity.

    abstract::The genetics of aging in the yeast Saccharomyces cerevisiae has involved the manipulation of individual genes in laboratory strains. We have instituted a quantitative genetic analysis of the yeast replicative lifespan by sampling the natural genetic variation in a wild yeast isolate. Haploid segregants from a cross be...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.136549.111

    authors: Stumpferl SW,Brand SE,Jiang JC,Korona B,Tiwari A,Dai J,Seo JG,Jazwinski SM

    update_date:2012-10-01 00:00:00

  • Long-read single-molecule maps of the functional methylome.

    abstract::We report on the development of a methylation analysis workflow for optical detection of fluorescent methylation profiles along chromosomal DNA molecules. In combination with Bionano Genomics genome mapping technology, these profiles provide a hybrid genetic/epigenetic genome-wide map composed of DNA molecules spannin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.240739.118

    authors: Sharim H,Grunwald A,Gabrieli T,Michaeli Y,Margalit S,Torchinsky D,Arielly R,Nifker G,Juhasz M,Gularek F,Almalvez M,Dufault B,Chandra SS,Liu A,Bhattacharya S,Chen YW,Vilain E,Wagner KR,Pevsner J,Reifenberger J,Lam

    update_date:2019-04-01 00:00:00

  • Distribution of hammerhead and hammerhead-like RNA motifs through the GenBank.

    abstract::Hammerhead ribozymes previously were found in satellite RNAs from plant viroids and in repetitive DNA from certain species of newts and schistosomes. To determine if this catalytic RNA motif has a wider distribution, we decided to scrutinize the GenBank database for RNAs that contain hammerhead or hammerhead-like moti...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.10.7.1011

    authors: Ferbeyre G,Bourdeau V,Pageau M,Miramontes P,Cedergren R

    update_date:2000-07-01 00:00:00

  • Software for automated analysis of DNA fingerprinting gels.

    abstract::Here we describe software tools for the automated detection of DNA restriction fragments resolved on agarose fingerprinting gels. We present a mathematical model for the location and shape of the restriction fragments as a function of fragment size, with model parameters determined empirically from "marker" lanes cont...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.904303

    authors: Fuhrmann DR,Krzywinski MI,Chiu R,Saeedi P,Schein JE,Bosdet IE,Chinwalla A,Hillier LW,Waterston RH,McPherson JD,Jones SJ,Marra MA

    update_date:2003-05-01 00:00:00

  • Next-generation sequencing identifies the natural killer cell microRNA transcriptome.

    abstract::Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs)...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.107995.110

    authors: Fehniger TA,Wylie T,Germino E,Leong JW,Magrini VJ,Koul S,Keppel CR,Schneider SE,Koboldt DC,Sullivan RP,Heinz ME,Crosby SD,Nagarajan R,Ramsingh G,Link DC,Ley TJ,Mardis ER

    update_date:2010-11-01 00:00:00

  • Phenotypically distinct female castes in honey bees are defined by alternative chromatin states during larval development.

    abstract::The capacity of the honey bee to produce three phenotypically distinct organisms (two female castes; queens and sterile workers, and haploid male drones) from one genotype represents one of the most remarkable examples of developmental plasticity in any phylum. The queen-worker morphological and reproductive divide is...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.236497.118

    authors: Wojciechowski M,Lowe R,Maleszka J,Conn D,Maleszka R,Hurd PJ

    update_date:2018-10-01 00:00:00

  • Comparative analysis of gene-expression patterns in human and African great ape cultured fibroblasts.

    abstract::Although much is known about genetic variation in human and African great ape (chimpanzee, bonobo, and gorilla) genomes, substantially less is known about variation in gene-expression profiles within and among these species. This information is necessary for defining transcriptional regulatory networks that contribute...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.1289803

    authors: Karaman MW,Houck ML,Chemnick LG,Nagpal S,Chawannakul D,Sudano D,Pike BL,Ho VV,Ryder OA,Hacia JG

    update_date:2003-07-01 00:00:00

  • Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes.

    abstract::Comparative genomics provides a general methodology for discovering functional DNA elements and understanding their evolution. The availability of many related genomes enables more powerful analyses, but requires rigorous phylogenetic methods to resolve orthologous genes and regions. Here, we use 12 recently sequenced...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.7105007

    authors: Rasmussen MD,Kellis M

    update_date:2007-12-01 00:00:00

  • A systematic model to predict transcriptional regulatory mechanisms based on overrepresentation of transcription factor binding profiles.

    abstract::An important aspect of understanding a biological pathway is to delineate the transcriptional regulatory mechanisms of the genes involved. Two important tasks are often encountered when studying transcription regulation, i.e., (1) the identification of common transcriptional regulators of a set of coexpressed genes; (...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.4303406

    authors: Chang LW,Nagarajan R,Magee JA,Milbrandt J,Stormo GD

    update_date:2006-03-01 00:00:00

  • Genome-wide map of regulatory interactions in the human genome.

    abstract::Increasing evidence suggests that interactions between regulatory genomic elements play an important role in regulating gene expression. We generated a genome-wide interaction map of regulatory elements in human cells (ENCODE tier 1 cells, K562, GM12878) using Chromatin Interaction Analysis by Paired-End Tag sequencin...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.176586.114

    authors: Heidari N,Phanstiel DH,He C,Grubert F,Jahanbani F,Kasowski M,Zhang MQ,Snyder MP

    update_date:2014-12-01 00:00:00

  • MicroRNAs reinforce repression of PRC2 transcriptional targets independently and through a feed-forward regulatory network.

    abstract::Gene expression can be regulated at multiple levels, but it is not known if and how there is broad coordination between regulation at the transcriptional and post-transcriptional levels. Transcription factors and chromatin regulate gene expression transcriptionally, whereas microRNAs (miRNAs) are small regulatory RNAs...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.238311.118

    authors: Shivram H,Le SV,Iyer VR

    update_date:2019-02-01 00:00:00

  • Alternative approach to a heavy weight problem.

    abstract::Obesity is reaching epidemic proportions in developed countries and represents a significant risk factor for hypertension, heart disease, diabetes, and dyslipidemia. Splicing mutations constitute at least 14% of disease-causing mutations, thus implicating polymorphisms that affect splicing as likely candidates for dis...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6661308

    authors: Goren A,Kim E,Amit M,Bochner R,Lev-Maor G,Ahituv N,Ast G

    update_date:2008-02-01 00:00:00

  • Utilization of FISH in positional cloning: an example on 13q22.

    abstract::In positional cloning the initial assignment of a gene to a specific chromosomal locus is followed by physical mapping of the critical region. The construction of a high-resolution physical map still involves considerable effort. However, new high-resolution fluorescence in situ hybridization (FISH) techniques have fa...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6.10.1002

    authors: Laan M,Isosomppi J,Klockars T,Peltonen L,Palotie A

    update_date:1996-10-01 00:00:00

  • Profiling patterned transcripts in Drosophila embryos.

    abstract::Here we describe a high-throughput screen to isolate transcripts with spatially restricted patterns of expression in early embryos. Our approach utilizes robotic automation for rapid analysis of sequence-selected cDNAs in a whole-mount in situ hybridization assay. We determined the spatial distribution of a random col...

    journal_title:Genome research

    pub_type: Letter

    doi:10.1101/gr.84402

    authors: Simin K,Scuderi A,Reamey J,Dunn D,Weiss R,Metherall JE,Letsou A

    update_date:2002-07-01 00:00:00

  • Assessment of genome-wide protein function classification for Drosophila melanogaster.

    abstract::The functional classification of genes on a genome-wide scale is now in its infancy, and we make a first attempt to assess existing methods and identify sources of error. To this end, we compared two independent efforts for associating proteins with functions, one implemented by FlyBase and the other by PANTHER at Cel...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.771603

    authors: Mi H,Vandergriff J,Campbell M,Narechania A,Majoros W,Lewis S,Thomas PD,Ashburner M

    update_date:2003-09-01 00:00:00

  • Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster.

    abstract::Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy u...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.137406.112

    authors: He B,Caudy A,Parsons L,Rosebrock A,Pane A,Raj S,Wieschaus E

    update_date:2012-12-01 00:00:00

  • A method for detecting IBD regions simultaneously in multiple individuals--with applications to disease genetics.

    abstract::All individuals in a finite population are related if traced back long enough and will, therefore, share regions of their genomes identical by descent (IBD). Detection of such regions has several important applications, from answering questions about human evolution to locating regions in the human genome containing di...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.115360.110

    authors: Moltke I,Albrechtsen A,Hansen TV,Nielsen FC,Nielsen R

    update_date:2011-07-01 00:00:00

  • A complexity reduction algorithm for analysis and annotation of large genomic sequences.

    abstract::DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexit...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.313703

    authors: Chuang TJ,Lin WC,Lee HC,Wang CW,Hsiao KL,Wang ZH,Shieh D,Lin SC,Ch'ang LY

    update_date:2003-02-01 00:00:00

  • A novel testis ubiquitin-binding protein gene arose by exon shuffling in hominoids.

    abstract::Most new genes arise by duplication of existing gene structures, after which relaxed selection on the new copy frequently leads to mutational inactivation of the duplicate; only rarely will a new gene with modified function emerge. Here we describe a unique mechanism of gene creation, whereby new combinations of funct...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.6252107

    authors: Babushok DV,Ohshima K,Ostertag EM,Chen X,Wang Y,Mandal PK,Okada N,Abrams CS,Kazazian HH Jr

    update_date:2007-08-01 00:00:00

  • Exploring the human genome with functional maps.

    abstract::Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular prot...

    journal_title:Genome research

    pub_type: Journal Article

    doi:10.1101/gr.082214.108

    authors: Huttenhower C,Haley EM,Hibbs MA,Dumeaux V,Barrett DR,Coller HA,Troyanskaya OG

    update_date:2009-06-01 00:00:00