InvBFM: finding genomic inversions from high-throughput sequence data based on feature mining.

Abstract:

BACKGROUND: Genomic inversion is one type of structural variation (SV) and is known to play an important biological role. Calling inversions from high-throughput sequence data is an established problem in sequence data analysis. Inversions are particularly difficult to detect because they are surrounded by duplications or other types of SVs in the inversion areas. Existing inversion detection tools are mainly based on three approaches: paired-end reads, split-mapped reads, and assembly. However, existing tools suffer from unsatisfactory precision or sensitivity (e.g., only 50-60% sensitivity), and this needs to be improved.

RESULTS: In this paper, we present a new inversion calling method called InvBFM, which calls inversions based on feature mining. InvBFM first gathers the results of existing inversion detection tools as inversion candidates. It then extracts features from the candidate inversions. Finally, it calls the true inversions with a trained support vector machine (SVM) classifier.

CONCLUSIONS: Our results on real sequence data from the 1000 Genomes Project show that, by combining feature mining and a machine learning model, InvBFM outperforms existing tools. InvBFM is written in Python and Shell and is available for download at https://github.com/wzj1234/InvBFM.
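The first two steps described in the abstract (pooling calls from existing tools into candidates, then turning each candidate into a feature vector) can be illustrated with a minimal sketch. This is not the InvBFM implementation: the tool names, the 50% reciprocal-overlap cutoff, and the toy features (inversion length plus one flag per supporting tool) are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One pooled inversion candidate (hypothetical structure)."""
    chrom: str
    start: int
    end: int
    tools: set = field(default_factory=set)


def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def merge_candidates(calls_by_tool, min_overlap=0.5):
    """Pool (chrom, start, end) calls from several tools into candidates.

    A call is folded into an existing candidate when both lie on the same
    chromosome and their reciprocal overlap reaches min_overlap
    (an assumed cutoff, not taken from the paper).
    """
    candidates = []
    for tool, calls in sorted(calls_by_tool.items()):
        for chrom, start, end in calls:
            for cand in candidates:
                same_chrom = cand.chrom == chrom
                if same_chrom and reciprocal_overlap(
                    cand.start, cand.end, start, end
                ) >= min_overlap:
                    cand.tools.add(tool)
                    break
            else:
                # No existing candidate matched: open a new one.
                candidates.append(Candidate(chrom, start, end, {tool}))
    return candidates


def feature_vector(cand, all_tools):
    """Toy features: inversion length, then one 0/1 flag per tool."""
    return [cand.end - cand.start] + [int(t in cand.tools) for t in all_tools]


# Hypothetical calls from two placeholder tools.
calls = {
    "tool_a": [("chr1", 1000, 2000), ("chr2", 500, 900)],
    "tool_b": [("chr1", 1100, 2050)],
}
cands = merge_candidates(calls)
features = [feature_vector(c, ["tool_a", "tool_b"]) for c in cands]
```

Feature vectors of this kind, paired with labels from a validated call set, are the sort of input an SVM classifier would be trained on in the final step; the features actually used by InvBFM are described in the paper itself rather than in this abstract.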

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Wu Z,Wu Y,Gao J

doi

10.1186/s12864-020-6585-1

pub_date

2020-03-05 00:00:00

pages

173

issue

Suppl 1

issn

1471-2164

pii

10.1186/s12864-020-6585-1

journal_volume

21

pub_type

Journal Article
  • Tissue-specific transcriptomics and proteomics of a filarial nematode and its Wolbachia endosymbiont.

    abstract:BACKGROUND:Filarial nematodes cause debilitating human diseases. While treatable, recent evidence suggests drug resistance is developing, necessitating the development of novel targets and new treatment options. Although transcriptomic and proteomic studies around the nematode life cycle have greatly enhanced our knowl...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2083-2

    authors: Luck AN,Anderson KG,McClung CM,VerBerkmoes NC,Foster JM,Michalski ML,Slatko BE

    updated:2015-11-11 00:00:00

  • Genomic sequence, organization and characteristics of a new nucleopolyhedrovirus isolated from Clanis bilineata larva.

    abstract:BACKGROUND:Baculoviruses are well known for their potential as biological agents for controlling agricultural and forest pests. They are also widely used as expression vectors in molecular cloning studies. The genome sequences of 48 baculoviruses are currently available in NCBI databases. As the number of sequenced vir...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-91

    authors: Zhu SY,Yi JP,Shen WD,Wang LQ,He HG,Wang Y,Li B,Wang WB

    updated:2009-02-25 00:00:00

  • RNA profiles of rat olfactory epithelia: individual and age related variations.

    abstract:BACKGROUND:Mammalian genomes contain a large number (approximately 1000) of olfactory receptor (OR) genes, many of which (20 to 50%) are pseudogenes. OR gene transcription is not restricted to the olfactory epithelium, but is found in numerous tissues. Using microarray hybridization and RTqPCR, we analyzed the mRNA pro...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-572

    authors: Rimbault M,Robin S,Vaysse A,Galibert F

    updated:2009-12-02 00:00:00

  • Transversions have larger regulatory effects than transitions.

    abstract:BACKGROUND:Transversions (Tv's) are more likely to alter the amino acid sequence of proteins than transitions (Ts's), and local deviations in the Ts:Tv ratio are indicative of evolutionary selection on genes. Whether the two different types of mutations have different effects in non-protein-coding sequences remains unk...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3785-4

    authors: Guo C,McDowell IC,Nodzenski M,Scholtens DM,Allen AS,Lowe WL,Reddy TE

    updated:2017-05-19 00:00:00

  • Mosquito transcriptome changes and filarial worm resistance in Armigeres subalbatus.

    abstract:BACKGROUND:Armigeres subalbatus is a natural vector of the filarial worm Brugia pahangi, but it rapidly and proficiently kills Brugia malayi microfilariae by melanotic encapsulation. Because B. malayi and B. pahangi are morphologically and biologically similar, the Armigeres-Brugia system serves as a valuable model for...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-463

    authors: Aliota MT,Fuchs JF,Mayhew GF,Chen CC,Christensen BM

    updated:2007-12-18 00:00:00

  • Enrichment of Triticum aestivum gene annotations using ortholog cliques and gene ontologies in other plants.

    abstract:BACKGROUND:While the gargantuan multi-nation effort of sequencing T. aestivum gets close to completion, the annotation process for the vast number of wheat genes and proteins is in its infancy. Previous experimental studies carried out on model plant organisms such as A. thaliana and O. sativa provide a plethora of gen...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1496-2

    authors: Tulpan D,Leger S,Tchagang A,Pan Y

    updated:2015-04-15 00:00:00

  • Transcriptomic analysis of Chinese bayberry (Myrica rubra) fruit development and ripening using RNA-Seq.

    abstract:BACKGROUND:Chinese bayberry (Myrica rubra Sieb. and Zucc.) is an important subtropical fruit crop and an ideal species for fruit quality research due to the rapid and substantial changes that occur during development and ripening, including changes in fruit color and taste. However, research at the molecular level is l...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-19

    authors: Feng C,Chen M,Xu CJ,Bai L,Yin XR,Li X,Allan AC,Ferguson IB,Chen KS

    updated:2012-01-13 00:00:00

  • Wheat EST resources for functional genomics of abiotic stress.

    abstract:BACKGROUND:Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a la...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-149

    authors: Houde M,Belcaid M,Ouellet F,Danyluk J,Monroy AF,Dryanova A,Gulick P,Bergeron A,Laroche A,Links MG,MacCarthy L,Crosby WL,Sarhan F

    updated:2006-06-13 00:00:00

  • Polygenic and sex specific architecture for two maturation traits in farmed Atlantic salmon.

    abstract:BACKGROUND:A key developmental transformation in the life of all vertebrates is the transition to sexual maturity, whereby individuals are capable of reproducing for the first time. In the farming of Atlantic salmon, early maturation prior to harvest size has serious negative production impacts. RESULTS:We report geno...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5525-4

    authors: Mohamed AR,Verbyla KL,Al-Mamun HA,McWilliam S,Evans B,King H,Kube P,Kijas JW

    updated:2019-02-15 00:00:00

  • Exploring the genetics of trotting racing ability in horses using a unique Nordic horse model.

    abstract:BACKGROUND:Horses have been strongly selected for speed, strength, and endurance-exercise traits since the onset of domestication. As a result, highly specialized horse breeds have developed with many modern horse breeds often representing closed populations with high phenotypic and genetic uniformity. However, a great...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5484-9

    authors: Velie BD,Lillie M,Fegraeus KJ,Rosengren MK,Solé M,Wiklund M,Ihler CF,Strand E,Lindgren G

    updated:2019-02-04 00:00:00

  • Profile and functional analysis of small RNAs derived from Aspergillus fumigatus infected with double-stranded RNA mycoviruses.

    abstract:BACKGROUND:Mycoviruses are viruses that naturally infect and replicate in fungi. Aspergillus fumigatus, an opportunistic pathogen causing fungal lung diseases in humans and animals, was recently shown to harbour several different types of mycoviruses. A well-characterised defence against virus infection is RNA silencin...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3773-8

    authors: Özkan S,Mohorianu I,Xu P,Dalmay T,Coutts RHA

    updated:2017-05-30 00:00:00

  • Genome-wide study on genetic diversity and phylogeny of five species in the genus Cervus.

    abstract:BACKGROUND:Previous investigations of phylogeny in Cervus recovered many clades without whole genomic support. METHODS:In this study, the genetic diversity and phylogeny of 5 species (21 subspecies/populations from C. unicolor, C. albirostris, C. nippon, C. elaphus and C. eldii) in the genus Cervus were analyzed using...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5785-z

    authors: Hu P,Shao Y,Xu J,Wang T,Li Y,Liu H,Rong M,Su W,Chen B,Cui S,Cui X,Yang F,Tamate H,Xing X

    updated:2019-05-17 00:00:00

  • Gene expression patterns that predict sensitivity to epidermal growth factor receptor tyrosine kinase inhibitors in lung cancer cell lines and human lung tumors.

    abstract:BACKGROUND:Increased focus surrounds identifying patients with advanced non-small cell lung cancer (NSCLC) who will benefit from treatment with epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (TKI). EGFR mutation, gene copy number, coexpression of ErbB proteins and ligands, and epithelial to mesenchy...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-289

    authors: Balko JM,Potti A,Saunders C,Stromberg A,Haura EB,Black EP

    updated:2006-11-10 00:00:00

  • Comparative genomics of the Bifidobacterium breve taxon.

    abstract:BACKGROUND:Bifidobacteria are commonly found as part of the microbiota of the gastrointestinal tract (GIT) of a broad range of hosts, where their presence is positively correlated with the host's health status. In this study, we assessed the genomes of thirteen representatives of Bifidobacterium breve, which is not onl...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-170

    authors: Bottacini F,O'Connell Motherway M,Kuczynski J,O'Connell KJ,Serafini F,Duranti S,Milani C,Turroni F,Lugli GA,Zomer A,Zhurina D,Riedel C,Ventura M,van Sinderen D

    updated:2014-03-01 00:00:00

  • Systemic treatment of xenografts with vaccinia virus GLV-1h68 reveals the immunologic facet of oncolytic therapy.

    abstract:BACKGROUND:GLV-1h68 is an attenuated recombinant vaccinia virus (VACV) that selectively colonizes established human xenografts inducing their complete regression. RESULTS:Here, we explored xenograft/VACV/host interactions in vivo adopting organism-specific expression arrays and tumor cell/VACV in vitro comparing VACV ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-301

    authors: Worschech A,Chen N,Yu YA,Zhang Q,Pos Z,Weibel S,Raab V,Sabatino M,Monaco A,Liu H,Monsurró V,Buller RM,Stroncek DF,Wang E,Szalay AA,Marincola FM

    updated:2009-07-07 00:00:00

  • Transcriptomic response to heat stress among ecologically divergent populations of redband trout.

    abstract:BACKGROUND:As ectothermic organisms have evolved to differing aquatic climates, the molecular basis of thermal adaptation is a key area of research. In this study, we tested for differential transcriptional response of ecologically divergent populations of redband trout (Oncorhynchus mykiss gairdneri) that have evolved...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1246-5

    authors: Narum SR,Campbell NR

    updated:2015-02-21 00:00:00

  • Comparative genome analysis of jujube witches'-broom Phytoplasma, an obligate pathogen that causes jujube witches'-broom disease.

    abstract:BACKGROUND:JWB phytoplasma is a kind of insect-transmitted and uncultivable bacterial plant pathogen causing a destructive Jujube disease. To date, no genome information about JWB phytoplasma has been published, which hindered its characterization at genomic level. To understand its pathogenicity and ecology, the geno...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5075-1

    authors: Wang J,Song L,Jiao Q,Yang S,Gao R,Lu X,Zhou G

    updated:2018-09-19 00:00:00

  • HumCFS: a database of fragile sites in human chromosomes.

    abstract:BACKGROUND:Fragile sites are the chromosomal regions that are susceptible to breakage, and their frequency varies among the human population. Based on the frequency of fragile site induction, they are categorized as common and rare fragile sites. Common fragile sites are sensitive to replication stress and often rearra...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5330-5

    authors: Kumar R,Nagpal G,Kumar V,Usmani SS,Agrawal P,Raghava GPS

    updated:2019-04-18 00:00:00

  • Identification of a strawberry flavor gene candidate using an integrated genetic-genomic-analytical chemistry approach.

    abstract:BACKGROUND:There is interest in improving the flavor of commercial strawberry (Fragaria × ananassa) varieties. Fruit flavor is shaped by combinations of sugars, acids and volatile compounds. Many efforts seek to use genomics-based strategies to identify genes controlling flavor, and then designing durable molecular mar...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-217

    authors: Chambers AH,Pillet J,Plotto A,Bai J,Whitaker VM,Folta KM

    updated:2014-04-17 00:00:00

  • Overlapping genes in the human and mouse genomes.

    abstract:BACKGROUND:Increasing evidence suggests that overlapping genes are much more common in eukaryotic genomes than previously thought. In this study we identified and characterized the overlapping genes in a set of 13,484 pairs of human-mouse orthologous genes. RESULTS:About 10% of the genes under study are overlapping ge...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-169

    authors: Sanna CR,Li WH,Zhang L

    updated:2008-04-14 00:00:00

  • Short-term genome evolution of Listeria monocytogenes in a non-controlled environment.

    abstract:BACKGROUND:While increasing data on bacterial evolution in controlled environments are available, our understanding of bacterial genome evolution in natural environments is limited. We thus performed full genome analyses on four Listeria monocytogenes, including human and food isolates from both a 1988 case of sporadic...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-539

    authors: Orsi RH,Borowsky ML,Lauer P,Young SK,Nusbaum C,Galagan JE,Birren BW,Ivy RA,Sun Q,Graves LM,Swaminathan B,Wiedmann M

    updated:2008-11-13 00:00:00

  • Comparison of gene expression of Paramecium bursaria with and without Chlorella variabilis symbionts.

    abstract:BACKGROUND:The ciliate Paramecium bursaria harbors several hundred cells of the green-alga Chlorella sp. in their cytoplasm. Irrespective of the mutual relation between P. bursaria and the symbiotic algae, both cells retain the ability to grow without the partner. They can easily reestablish endosymbiosis when put in c...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-183

    authors: Kodama Y,Suzuki H,Dohra H,Sugii M,Kitazume T,Yamaguchi K,Shigenobu S,Fujishima M

    updated:2014-03-10 00:00:00

  • De novo assembly of highly diverse viral populations.

    abstract:BACKGROUND:Extensive genetic diversity in viral populations within infected hosts and the divergence of variants from existing reference genomes impede the analysis of deep viral sequencing data. A de novo population consensus assembly is valuable both as a single linear representation of the population and as a backbo...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-475

    authors: Yang X,Charlebois P,Gnerre S,Coole MG,Lennon NJ,Levin JZ,Qu J,Ryan EM,Zody MC,Henn MR

    updated:2012-09-13 00:00:00

  • Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    abstract:BACKGROUND:Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearra...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-561

    authors: Jo YD,Choi Y,Kim DH,Kim BD,Kang BC

    updated:2014-07-04 00:00:00

  • Genome-wide analysis of the TPX2 family proteins in Eucalyptus grandis.

    abstract:BACKGROUND:The Xklp2 (TPX2) proteins belong to the microtubule-associated (MAP) family of proteins. All members of the family contain the conserved TPX2 motif, which can interact with microtubules, regulate microtubule dynamics or assist with different microtubule functions, for example, maintenance of cell morphology ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-3303-0

    authors: Du P,Kumar M,Yao Y,Xie Q,Wang J,Zhang B,Gan S,Wang Y,Wu AM

    updated:2016-11-24 00:00:00

  • Origin of a novel protein-coding gene family with similar signal sequence in Schistosoma japonicum.

    abstract:BACKGROUND:Evolution of novel protein-coding genes is the bedrock of adaptive evolution. Recently, we identified six protein-coding genes with similar signal sequence from Schistosoma japonicum egg stage mRNA using signal sequence trap (SST). To find the mechanism underlying the origination of these genes with similar ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-260

    authors: Mbanefo EC,Chuanxin Y,Kikuchi M,Shuaibu MN,Boamah D,Kirinoki M,Hayashi N,Chigusa Y,Osada Y,Hamano S,Hirayama K

    updated:2012-06-20 00:00:00

  • High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols.

    abstract:BACKGROUND:Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insecticide resistance. RE...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-320

    authors: Wilding CS,Weetman D,Steen K,Donnelly MJ

    updated:2009-07-16 00:00:00

  • Dynamic, mating-induced gene expression changes in female head and brain tissues of Drosophila melanogaster.

    abstract:BACKGROUND:Drosophila melanogaster females show changes in behavior and physiology after mating that are thought to maximize the number of progeny resulting from the most recent copulation. Sperm and seminal fluid proteins induce post-mating changes in females, however, very little is known about the resulting gene exp...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-541

    authors: Dalton JE,Kacheria TS,Knott SR,Lebo MS,Nishitani A,Sanders LE,Stirling EJ,Winbush A,Arbeitman MN

    updated:2010-10-06 00:00:00

  • Haplotype analysis of sucrose synthase gene family in three Saccharum species.

    abstract:BACKGROUND:Sugarcane is an economically important crop contributing about 80% and 40% to the world sugar and ethanol production, respectively. The complicated genetics consequential to its complex polyploid genome, however, have impeded efforts to improve sugar yield and related important agronomic traits. Modern sugar...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-314

    authors: Zhang J,Arro J,Chen Y,Ming R

    updated:2013-05-10 00:00:00

  • Mixed evolutionary origins of endogenous biomass-depolymerizing enzymes in animals.

    abstract:BACKGROUND:Animals are thought to achieve lignocellulose digestion via symbiotic associations with gut microbes; this view leads to significant focus on bacteria and fungi for lignocellulolytic systems. The presence of biomass conversion systems hardwired into animal genomes has not yet been unequivocally demonstrated....

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4861-0

    authors: Chang WH,Lai AG

    updated:2018-06-20 00:00:00