Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery.

Abstract:

BACKGROUND: The current bovine genomic reference sequence was assembled from a Hereford cow. The resulting linear assembly lacks diversity because it does not contain allelic variation, a drawback of linear references that causes reference allele bias. High nucleotide diversity and the separation of individuals into hundreds of breeds make cattle ideally suited to investigate the optimal composition of variation-aware references.

RESULTS: We augment the bovine linear reference sequence (ARS-UCD1.2) with variants filtered for allele frequency in dairy (Brown Swiss, Holstein) and dual-purpose (Fleckvieh, Original Braunvieh) cattle breeds to construct either breed-specific or pan-genome reference graphs using the vg toolkit. We find that read mapping is more accurate against variation-aware references than against linear references when pre-selected variants are used to construct the genome graphs. Graphs that contain random variants do not improve read mapping over the linear reference sequence. Breed-specific augmented graphs and pan-genome graphs yield nearly the same improvement in mapping accuracy over the linear reference. We construct a whole-genome graph that contains the Hereford-based reference sequence and 14 million alleles that have an alternate allele frequency greater than 0.03 in the Brown Swiss cattle breed. Our novel variation-aware reference facilitates accurate read mapping and unbiased sequence variant genotyping for SNPs and indels.

CONCLUSIONS: We develop the first variation-aware reference graph for an agricultural animal (https://doi.org/10.5281/zenodo.3759712). Our novel reference structure improves sequence read mapping and variant genotyping over the linear reference. Our work is a first step towards the transition from linear to variation-aware reference structures in species with high genetic diversity and many sub-populations.
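The allele-frequency filter mentioned in the results (alternate allele frequency greater than 0.03) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the abstract does not describe. It assumes VCF-formatted text lines whose INFO column carries a standard `AF=` entry, and the function names `alt_allele_frequency` and `filter_by_frequency` are hypothetical. For multi-allelic records only the first listed frequency is considered, which is a simplification.

```python
from typing import Iterable, Iterator, Optional


def alt_allele_frequency(info: str) -> Optional[float]:
    """Return the first AF value in a VCF INFO string, or None if AF is absent."""
    for field in info.split(";"):
        if field.startswith("AF="):
            # Multi-allelic sites list several comma-separated values;
            # this sketch only looks at the first one (an assumption).
            return float(field[3:].split(",")[0])
    return None


def filter_by_frequency(lines: Iterable[str], min_af: float = 0.03) -> Iterator[str]:
    """Yield header lines unchanged and variant lines whose AF exceeds min_af."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the VCF header intact
            continue
        columns = line.rstrip("\n").split("\t")
        af = alt_allele_frequency(columns[7])  # column 8 of a VCF line is INFO
        if af is not None and af > min_af:
            yield line


if __name__ == "__main__":
    example = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "1\t100\t.\tA\tG\t.\tPASS\tAF=0.25\n",
        "1\t200\t.\tC\tT\t.\tPASS\tAF=0.01\n",
    ]
    for kept in filter_by_frequency(example):
        print(kept, end="")
```

The filtered variant set would then be passed, together with the linear reference, to a graph construction tool such as the vg toolkit named in the abstract.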

journal_name

Genome Biol

journal_title

Genome biology

authors

Crysnanto D,Pausch H

doi

10.1186/s13059-020-02105-0

pub_date

2020-07-27 00:00:00

pages

184

issue

1

eissn

1474-760X

issn

1474-7596

pii

10.1186/s13059-020-02105-0

journal_volume

21

pub_type

Journal Article
  • A genome-wide screen for modifiers of transgene variegation identifies genes with critical roles in development.

    abstract:BACKGROUND:Some years ago we established an N-ethyl-N-nitrosourea screen for modifiers of transgene variegation in the mouse and a preliminary description of the first six mutant lines, named MommeD1-D6, has been published. We have reported the underlying genes in three cases: MommeD1 is a mutation in SMC hinge domain ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2008-9-12-r182

    authors: Ashe A,Morgan DK,Whitelaw NC,Bruxner TJ,Vickaryous NK,Cox LL,Butterfield NC,Wicking C,Blewitt ME,Wilkins SJ,Anderson GJ,Cox TC,Whitelaw E

    update_date:2008-01-01 00:00:00

  • Pharmacogenomic analysis of patient-derived tumor cells in gynecologic cancers.

    abstract:BACKGROUND:Gynecologic malignancy is one of the leading causes of mortality in female adults worldwide. Comprehensive genomic analysis has revealed a list of molecular aberrations that are essential to tumorigenesis, progression, and metastasis of gynecologic tumors. However, targeting such alterations has frequently l...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1848-3

    authors: Sa JK,Hwang JR,Cho YJ,Ryu JY,Choi JJ,Jeong SY,Kim J,Kim MS,Paik ES,Lee YY,Choi CH,Kim TJ,Kim BG,Bae DS,Lee Y,Her NG,Shin YJ,Cho HJ,Kim JY,Seo YJ,Koo H,Oh JW,Lee T,Kim HS,Song SY,Bae JS,Park WY,Han HD

    update_date:2019-11-26 00:00:00

  • Butterfly gene flow goes berserk.

    abstract:A new study shows that genomic introgression between two Heliconius butterfly species is not solely confined to color pattern loci. ...

    journal_title:Genome biology

    pub_type: Comment, Journal Article

    doi:10.1186/s13059-016-0898-z

    authors: Ffrench-Constant RH

    update_date:2016-02-27 00:00:00

  • A Drosophila protein-interaction map centered on cell-cycle regulators.

    abstract:BACKGROUND:Maps depicting binary interactions between proteins can be powerful starting points for understanding biological systems. A proven technology for generating such maps is high-throughput yeast two-hybrid screening. In the most extensive screen to date, a Gal4-based two-hybrid system was used recently to detec...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2004-5-12-r96

    authors: Stanyon CA,Liu G,Mangiola BA,Patel N,Giot L,Kuang B,Zhang H,Zhong J,Finley RL Jr

    update_date:2004-01-01 00:00:00

  • MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations.

    abstract:We develop a metagenomic data analysis pipeline, MicroPro, that takes into account all reads from known and unknown microbial organisms and associates viruses with complex diseases. We utilize MicroPro to analyze four metagenomic datasets relating to colorectal cancer, type 2 diabetes, and liver cirrhosis and show tha...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1773-5

    authors: Zhu Z,Ren J,Michail S,Sun F

    update_date:2019-08-06 00:00:00

  • The bread wheat epigenomic map reveals distinct chromatin architectural and evolutionary features of functional genetic elements.

    abstract:BACKGROUND:Bread wheat is an allohexaploid species with a 16-Gb genome that has large intergenic regions, which presents a big challenge for pinpointing regulatory elements and further revealing the transcriptional regulatory mechanisms. Chromatin profiling to characterize the combinatorial patterns of chromatin signat...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1746-8

    authors: Li Z,Wang M,Lin K,Xie Y,Guo J,Ye L,Zhuang Y,Teng W,Ran X,Tong Y,Xue Y,Zhang W,Zhang Y

    update_date:2019-07-15 00:00:00

  • Roles of piRNAs in transposon and pseudogene regulation of germline mRNAs and lncRNAs.

    abstract:PIWI proteins, a subfamily of PAZ/PIWI Domain family RNA-binding proteins, are best known for their function in silencing transposons and germline development by partnering with small noncoding RNAs called PIWI-interacting RNAs (piRNAs). However, recent studies have revealed multifaceted roles of the PIWI-piRNA pathwa...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/s13059-020-02221-x

    authors: Wang C,Lin H

    update_date:2021-01-08 00:00:00

  • The evolution of relapse of adult T cell acute lymphoblastic leukemia.

    abstract:BACKGROUND:Adult T cell acute lymphoblastic leukemia (T-ALL) is a rare disease that affects less than 10 individuals in one million. It has been less studied than its cognate pediatric malignancy, which is more prevalent. A higher percentage of the adult patients relapse, compared to children. It is thus essential to s...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-02192-z

    authors: Sentís I,Gonzalez S,Genescà E,García-Hernández V,Muiños F,Gonzalez C,López-Arribillaga E,Gonzalez J,Fernandez-Ibarrondo L,Mularoni L,Espinosa L,Bellosillo B,Ribera JM,Bigas A,Gonzalez-Perez A,Lopez-Bigas N

    update_date:2020-11-23 00:00:00

  • Minimal genome-wide human CRISPR-Cas9 library.

    abstract:CRISPR guide RNA libraries have been iteratively improved to provide increasingly efficient reagents, although their large size is a barrier for many applications. We design an optimised minimal genome-wide human CRISPR-Cas9 library (MinLibCas9) by mining existing large-scale gene loss-of-function datasets, resulting ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-021-02268-4

    authors: Gonçalves E,Thomas M,Behan FM,Picco G,Pacini C,Allen F,Vinceti A,Sharma M,Jackson DA,Price S,Beaver CM,Dovey O,Parry-Smith D,Iorio F,Parts L,Yusa K,Garnett MJ

    update_date:2021-01-21 00:00:00

  • Exome-chip meta-analysis identifies novel loci associated with cardiac conduction, including ADAMTS6.

    abstract:BACKGROUND:Genome-wide association studies conducted on QRS duration, an electrocardiographic measurement associated with heart failure and sudden cardiac death, have led to novel biological insights into cardiac function. However, the variants identified fall predominantly in non-coding regions and their underlying me...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-018-1457-6

    authors: Prins BP,Mead TJ,Brody JA,Sveinbjornsson G,Ntalla I,Bihlmeyer NA,van den Berg M,Bork-Jensen J,Cappellani S,Van Duijvenboden S,Klena NT,Gabriel GC,Liu X,Gulec C,Grarup N,Haessler J,Hall LM,Iorio A,Isaacs A,Li-Gao R,

    update_date:2018-07-17 00:00:00

  • A Keystone for ncRNA.

    abstract:A report on the Keystone symposium 'Non-coding RNAs' held at Snowbird, Utah, USA, 31 March to 5 April 2012. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2012-13-5-315

    authors: Hacisuleyman E,Cabili MN,Rinn JL

    update_date:2012-05-25 00:00:00

  • Assessing taxonomic metagenome profilers with OPAL.

    abstract:The explosive growth in taxonomic metagenome profiling methods over the past years has created a need for systematic comparisons using relevant performance criteria. The Open-community Profiling Assessment tooL (OPAL) implements commonly used performance metrics, including those of the first challenge of the initiativ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1646-y

    authors: Meyer F,Bremges A,Belmann P,Janssen S,McHardy AC,Koslicki D

    update_date:2019-03-04 00:00:00

  • Wrangling for microRNAs provokes much crosstalk.

    abstract:Levels of transcripts sharing microRNA response elements are co-regulated. These RNA-RNA interactions imply that combinations of microRNAs modulate cell-specific transcript networks. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-11-132

    authors: Marques AC,Tan J,Ponting CP

    update_date:2011-11-21 00:00:00

  • Wheat chromatin architecture is organized in genome territories and transcription factories.

    abstract:BACKGROUND:Polyploidy is ubiquitous in eukaryotic plant and fungal lineages, and it leads to the co-existence of several copies of similar or related genomes in one nucleus. In plants, polyploidy is considered a major factor in successful domestication. However, polyploidy challenges chromosome folding architecture in ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-01998-1

    authors: Concia L,Veluchamy A,Ramirez-Prado JS,Martin-Ramirez A,Huang Y,Perez M,Domenichini S,Rodriguez Granados NY,Kim S,Blein T,Duncan S,Pichot C,Manza-Mianza D,Juery C,Paux E,Moore G,Hirt H,Bergounioux C,Crespi M,Mahfouz

    update_date:2020-04-29 00:00:00

  • Molecular orchestration of the hepatic circadian symphony.

    abstract:The circadian clock determines the rhythmic expression of many different genes throughout a 24-hour period. A recent study investigating the circadian regulation of liver proteins reveals multiple levels of regulation, including transcriptional, post-transcriptional and post-translational mechanisms. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2006-7-9-234

    authors: Albrecht U

    update_date:2006-01-01 00:00:00

  • Studying the microbiology of the indoor environment.

    abstract:The majority of people in the developed world spend more than 90% of their lives indoors. Here, we examine our understanding of the bacteria that co-inhabit our artificial world and how they might influence human health. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2013-14-2-202

    authors: Kelley ST,Gilbert JA

    update_date:2013-02-28 00:00:00

  • Demystifying "drop-outs" in single-cell UMI data.

    abstract:Many existing pipelines for scRNA-seq data apply pre-processing steps such as normalization or imputation to account for excessive zeros or "drop-outs." Here, we extensively analyze diverse UMI data sets to show that clustering should be the foremost step of the workflow. We observe that most drop-outs disappear once ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-020-02096-y

    authors: Kim TH,Zhou X,Chen M

    update_date:2020-08-06 00:00:00

  • Copy number variation goes clinical.

    abstract:A report of the First Golden Helix Symposium 'Copy Number Variation (CNV) and Genomic Alterations in Health and Disease', Athens, Greece, 28-29 November 2008. ...

    journal_title:Genome biology

    pub_type:

    doi:10.1186/gb-2009-10-1-301

    authors: Le Caignec C,Redon R

    update_date:2009-01-01 00:00:00

  • Having a BLAST with bioinformatics (and avoiding BLASTphemy).

    abstract:Searching for similarities between biological sequences is the principal means by which bioinformatics contributes to our understanding of biology. Of the various informatics tools developed to accomplish this task, the most widely used is BLAST, the basic local alignment search tool. This article discusses the princi...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2001-2-10-reviews2002

    authors: Pertsemlidis A,Fondon JW 3rd

    update_date:2001-01-01 00:00:00

  • ReMixT: clone-specific genomic structure estimation in cancer.

    abstract:Somatic evolution of malignant cells produces tumors composed of multiple clonal populations, distinguished in part by rearrangements and copy number changes affecting chromosomal segments. Whole genome sequencing mixes the signals of sampled populations, diluting the signals of clone-specific aberrations, and complic...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-017-1267-2

    authors: McPherson AW,Roth A,Ha G,Chauve C,Steif A,de Souza CPE,Eirew P,Bouchard-Côté A,Aparicio S,Sahinalp SC,Shah SP

    update_date:2017-07-27 00:00:00

  • Comparative genome sequence analysis underscores mycoparasitism as the ancestral life style of Trichoderma.

    abstract:BACKGROUND:Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably, the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma. RESULTS...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2011-12-4-r40

    authors: Kubicek CP,Herrera-Estrella A,Seidl-Seiboth V,Martinez DA,Druzhinina IS,Thon M,Zeilinger S,Casas-Flores S,Horwitz BA,Mukherjee PK,Mukherjee M,Kredics L,Alcaraz LD,Aerts A,Antal Z,Atanasova L,Cervantes-Badillo MG,Challac

    update_date:2011-01-01 00:00:00

  • Genome-wide analysis of the maternal-to-zygotic transition in Drosophila primordial germ cells.

    abstract:BACKGROUND:During the maternal-to-zygotic transition (MZT) vast changes in the embryonic transcriptome are produced by a combination of two processes: elimination of maternally provided mRNAs and synthesis of new transcripts from the zygotic genome. Previous genome-wide analyses of the MZT have been restricted to whole...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2012-13-2-r11

    authors: Siddiqui NU,Li X,Luo H,Karaiskakis A,Hou H,Kislinger T,Westwood JT,Morris Q,Lipshitz HD

    update_date:2012-02-20 00:00:00

  • iBsu1103: a new genome-scale metabolic model of Bacillus subtilis based on SEED annotations.

    abstract:BACKGROUND:Bacillus subtilis is an organism of interest because of its extensive industrial applications, its similarity to pathogenic organisms, and its role as the model organism for Gram-positive, sporulating bacteria. In this work, we introduce a new genome-scale metabolic model of B. subtilis 168 called iBsu1103. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2009-10-6-r69

    authors: Henry CS,Zinner JF,Cohoon MP,Stevens RL

    update_date:2009-01-01 00:00:00

  • Genome-wide identification and functional analysis of Apobec-1-mediated C-to-U RNA editing in mouse small intestine and liver.

    abstract:BACKGROUND:RNA editing encompasses a post-transcriptional process in which the genomically templated sequence is enzymatically altered and introduces a modified base into the edited transcript. Mammalian C-to-U RNA editing represents a distinct subtype of base modification, whose prototype is intestinal apolipoprotein ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2014-15-6-r79

    authors: Blanc V,Park E,Schaefer S,Miller M,Lin Y,Kennedy S,Billing AM,Ben Hamidane H,Graumann J,Mortazavi A,Nadeau JH,Davidson NO

    update_date:2014-06-19 00:00:00

  • SAGE profiling of the forelimb and hindlimb.

    abstract:A recent study has used serial analysis of gene expression to compare mouse forelimb and hindlimb gene-expression profiles. The method successfully identified known regulators of limb identity and has generated a candidate set of differentially expressed genes that may regulate limb identity. ...

    journal_title:Genome biology

    pub_type: Journal Article, Review

    doi:10.1186/gb-2002-3-3-reviews1007

    authors: Logan M

    update_date:2002-01-01 00:00:00

  • High-resolution genomic analysis: the tumor-immune interface comes into focus.

    abstract:A genomic analysis of heterogeneous colorectal tumor samples has uncovered interactions between immunophenotype and various aspects of tumor biology, with implications for informing the choice of immunotherapies for specific patients and guiding the design of personalized neoantigen-based vaccines. ...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-015-0631-3

    authors: Havel JJ,Chan TA

    update_date:2015-03-31 00:00:00

  • Retrozymes are a unique family of non-autonomous retrotransposons with hammerhead ribozymes that propagate in plants through circular RNAs.

    abstract:BACKGROUND:Catalytic RNAs, or ribozymes, are regarded as fossils of a prebiotic RNA world that have remained in the genomes of modern organisms. The simplest ribozymes are the small self-cleaving RNAs, like the hammerhead ribozyme, which have been historically considered biological oddities restricted to some RNA patho...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-016-1002-4

    authors: Cervera A,Urbina D,de la Peña M

    update_date:2016-06-23 00:00:00

  • Comprehensive miRNA sequence analysis reveals survival differences in diffuse large B-cell lymphoma patients.

    abstract:BACKGROUND:Diffuse large B-cell lymphoma (DLBCL) is an aggressive disease, with 30% to 40% of patients failing to be cured with available primary therapy. microRNAs (miRNAs) are RNA molecules that attenuate expression of their mRNA targets. To characterize the DLBCL miRNome, we sequenced miRNAs from 92 DLBCL and 15 ben...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-014-0568-y

    authors: Lim EL,Trinh DL,Scott DW,Chu A,Krzywinski M,Zhao Y,Robertson AG,Mungall AJ,Schein J,Boyle M,Mottok A,Ennishi D,Johnson NA,Steidl C,Connors JM,Morin RD,Gascoyne RD,Marra MA

    update_date:2015-01-29 00:00:00

  • Comparative and functional genomics reveals genetic diversity and determinants of host specificity among reference strains and a large collection of Chinese isolates of the phytopathogen Xanthomonas campestris pv. campestris.

    abstract:BACKGROUND:Xanthomonas campestris pathovar campestris (Xcc) is the causal agent of black rot disease of crucifers worldwide. The molecular genetic diversity and host specificity of Xcc are poorly understood. RESULTS:We constructed a microarray based on the complete genome sequence of Xcc strain 8004 and investigated t...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/gb-2007-8-10-r218

    authors: He YQ,Zhang L,Jiang BL,Zhang ZC,Xu RQ,Tang DJ,Qin J,Jiang W,Zhang X,Liao J,Cao JR,Zhang SS,Wei ML,Liang XX,Lu GT,Feng JX,Chen B,Cheng J,Tang JL

    update_date:2007-01-01 00:00:00

  • Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing.

    abstract:BACKGROUND:Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single a...

    journal_title:Genome biology

    pub_type: Journal Article

    doi:10.1186/s13059-019-1720-5

    authors: Kosugi S,Momozawa Y,Liu X,Terao C,Kubo M,Kamatani Y

    update_date:2019-06-03 00:00:00