MITGARD: an automated pipeline for mitochondrial genome assembly in eukaryotic species using RNA-seq data.

Abstract:

MOTIVATION: Over the past decade, the field of next-generation sequencing (NGS) has seen dramatic advances in methods and a decrease in costs. Consequently, a large expansion of data has been generated by NGS, most of which have originated from RNA-sequencing (RNA-seq) experiments. Because mitochondrial genes are expressed in most eukaryotic cells, mitochondrial mRNA sequences are usually co-sequenced within the target transcriptome, generating data that are commonly underused or discarded. Here, we present MITGARD, an automated pipeline that reliably recovers the mitochondrial genome from RNA-seq data from various sources. The pipeline identifies mitochondrial sequence reads based on a phylogenetically related reference, assembles them into contigs, and extracts a complete mtDNA for the target species.

RESULTS: We demonstrate that MITGARD can reconstruct the mitochondrial genomes of several species throughout the tree of life. We noticed that MITGARD can recover the mitogenomes in different sequencing schemes and even in a scenario of low-sequencing depth. Moreover, we showed that the use of references from congeneric species diverging up to 30 million years ago (MYA) from the target species is sufficient to recover the entire mitogenome, whereas the use of species diverging between 30 and 60 MYA allows the recovery of most mitochondrial genes. Additionally, we provide a case study with original data in which we estimate a phylogenetic tree of snakes from the genus Bothrops, further demonstrating that MITGARD is suitable for use on biodiversity projects. MITGARD is then a valuable tool to obtain high-quality information for studies focusing on the phylogenetic and evolutionary aspects of eukaryotes and provides data for easily identifying a sample using barcoding, and to check for cross-contamination using third-party tools.

journal_name

Brief Bioinform

authors

Nachtigall PG,Grazziotin FG,Junqueira-de-Azevedo ILM

doi

10.1093/bib/bbaa429

subject

Has Abstract

pub_date

2021-01-30 00:00:00

eissn

1477-4054

issn

1467-5463

pii

6123950

pub_type

Journal Article
  • The digital revolution in phenotyping.

    abstract::Phenotypes have gained increased notoriety in the clinical and biological domain owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges th...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv083

    authors: Oellrich A,Collier N,Groza T,Rebholz-Schuhmann D,Shah N,Bodenreider O,Boland MR,Georgiev I,Liu H,Livingston K,Luna A,Mallon AM,Manda P,Robinson PN,Rustici G,Simon M,Wang L,Winnenburg R,Dumontier M

    update_date:2016-09-01 00:00:00

  • RNA-mediated translation regulation in viral genomes: computational advances in the recognition of sequences and structures.

    abstract::RNA structures are widely distributed across all life forms. The global conformation of these structures is defined by a variety of constituent structural units such as helices, hairpin loops, kissing-loop motifs and pseudoknots, which often behave in a modular way. Their ubiquitous distribution is associated with a v...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz054

    authors: Gupta A,Bansal M

    update_date:2020-07-15 00:00:00

  • AlzRiskMR database: an online database for the impact of exposure factors on Alzheimer's disease.

    abstract::In view of great difficulties in the pathogenesis analysis of Alzheimer's disease (AD) presently, profiling the modifiable risk factors is crucial for early detection and intervention of AD. However, the causal associations among them have yet to be identified, and the effective integration and application of these da...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa213

    authors: Wang Z,Meng L,Liu H,Shen L,Ji HF

    update_date:2020-09-21 00:00:00

  • Structural database resources for biological macromolecules.

    abstract::This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often including ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbw049

    authors: Abriata LA

    update_date:2017-07-01 00:00:00

  • Comparison of haplotype-based tests for detecting gene-environment interactions with rare variants.

    abstract::Dissecting the genetic mechanism underlying a complex disease hinges on discovering gene-environment interactions (GXE). However, detecting GXE is a challenging problem especially when the genetic variants under study are rare. Haplotype-based tests have several advantages over the so-called collapsing tests for detec...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz031

    authors: Papachristou C,Biswas S

    update_date:2020-05-21 00:00:00

  • Improving structure-based virtual screening performance via learning from scoring function components.

    abstract::Scoring functions (SFs) based on complex machine learning (ML) algorithms have gradually emerged as a promising alternative to overcome the weaknesses of classical SFs. However, extensive efforts have been devoted to the development of SFs based on new protein-ligand interaction representations and advanced alternativ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa094

    authors: Xiong GL,Ye WL,Shen C,Lu AP,Hou TJ,Cao DS

    update_date:2020-06-04 00:00:00

  • Data warehousing in molecular biology.

    abstract::In the business and healthcare sectors data warehousing has provided effective solutions for information usage and knowledge discovery from databases. However, data warehousing applications in the biological research and development (R&D) sector are lagging far behind. The fuzziness and complexity of biological data r...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/1.2.190

    authors: Schönbach C,Kowalski-Saunders P,Brusic V

    update_date:2000-05-01 00:00:00

  • The dilemma of choosing the ideal permutation strategy while estimating statistical significance of genome-wide enrichment.

    abstract::Integrative analyses of genomic, epigenomic and transcriptomic features for human and various model organisms have revealed that many such features are nonrandomly distributed in the genome. Significant enrichment (or depletion) of genomic features is anticipated to be biologically important. Detection of genomic regi...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbt053

    authors: De S,Pedersen BS,Kechris K

    update_date:2014-11-01 00:00:00

  • Protein structure prediction in genomics.

    abstract::As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in th...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article,Review

    doi:10.1093/bib/2.2.111

    authors: Jones DT

    update_date:2001-05-01 00:00:00

  • Systematic review of computational methods for identifying miRNA-mediated RNA-RNA crosstalk.

    abstract::Posttranscriptional crosstalk and communication between RNAs yield large regulatory competing endogenous RNA (ceRNA) networks via shared microRNAs (miRNAs), as well as miRNA synergistic networks. The ceRNA crosstalk represents a novel layer of gene regulation that controls both physiological and pathological processes...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx137

    authors: Li Y,Jin X,Wang Z,Li L,Chen H,Lin X,Yi S,Zhang Y,Xu J

    update_date:2019-07-19 00:00:00

  • A solid quality-control analysis of AB SOLiD short-read sequencing data.

    abstract::Next generation sequencers have greatly improved our ability to mine polymorphisms and mutations out of entire (or portions of) genomes. The reliability of their outputs, though, showed to be very related to the sequencing chemistry and to deeply affect the quality of the downstream analyses. We focus here on the two-...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbs048

    authors: Castellana S,Romani M,Valente EM,Mazza T

    update_date:2013-11-01 00:00:00

  • HVIDB: a comprehensive database for human-virus protein-protein interactions.

    abstract::While leading to millions of people's deaths every year the treatment of viral infectious diseases remains a huge public health challenge. Therefore, an in-depth understanding of human-virus protein-protein interactions (PPIs) as the molecular interface between a virus and its host cell is of paramount importance to ob...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa425

    authors: Yang X,Lian X,Fu C,Wuchty S,Yang S,Zhang Z

    update_date:2021-01-30 00:00:00

  • Shaping the nebulous enhancer in the era of high-throughput assays and genome editing.

    abstract::Since the 1st discovery of transcriptional enhancers in 1981, their textbook definition has remained largely unchanged in the past 37 years. With the emergence of high-throughput assays and genome editing, which are switching the paradigm from bottom-up discovery and testing of individual enhancers to top-down profili...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz030

    authors: Ho EY,Cao Q,Gu M,Chan RW,Wu Q,Gerstein M,Yip KY

    update_date:2020-05-21 00:00:00

  • Development of biomarker classifiers from high-dimensional data.

    abstract::Recent development of high-throughput technology has accelerated interest in the development of molecular biomarker classifiers for safety assessment, disease diagnostics and prognostics, and prediction of response for patient assignment. This article reviews and evaluates some important aspects and key issues in the ...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article,Review

    doi:10.1093/bib/bbp016

    authors: Baek S,Tsai CA,Chen JJ

    update_date:2009-09-01 00:00:00

  • Optimizing drug development in oncology by clinical trial simulation: Why and how?

    abstract::In therapeutic research, the safety and efficacy of pharmaceutical products are necessarily tested on humans via clinical trials after an extensive and expensive preclinical development period. Methodologies such as computer modeling and clinical trial simulation (CTS) might represent a valuable option to reduce anima...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article,Review

    doi:10.1093/bib/bbx055

    authors: Gal J,Milano G,Ferrero JM,Saâda-Bouzid E,Viotti J,Chabaud S,Gougis P,Le Tourneau C,Schiappa R,Paquet A,Chamorey E

    update_date:2018-11-27 00:00:00

  • Bioinformatics education--perspectives and challenges out of Africa.

    abstract::The discipline of bioinformatics has developed rapidly since the complete sequencing of the first genomes in the 1990s. The development of many high-throughput techniques during the last decades has ensured that bioinformatics has grown into a discipline that overlaps with, and is required for, the modern practice of ...

    journal_title:Briefings in bioinformatics

    pub_type: Historical Article,Journal Article

    doi:10.1093/bib/bbu022

    authors: Tastan Bishop Ö,Adebiyi EF,Alzohairy AM,Everett D,Ghedira K,Ghouila A,Kumuthini J,Mulder NJ,Panji S,Patterton HG,H3ABioNet Consortium.,H3Africa Consortium.

    update_date:2015-03-01 00:00:00

  • Computational prediction of species-specific yeast DNA replication origin via iterative feature representation.

    abstract::Deoxyribonucleic acid replication is one of the most crucial tasks taking place in the cell, and it has to be precisely regulated. This process is initiated in the replication origins (ORIs), and thus it is essential to identify such sites for a deeper understanding of the cellular processes and functions related to t...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa304

    authors: Manavalan B,Basith S,Shin TH,Lee G

    update_date:2020-11-25 00:00:00

  • Towards scaling elementary flux mode computation.

    abstract::While elementary flux mode (EFM) analysis is now recognized as a cornerstone computational technique for cellular pathway analysis and engineering, EFM application to genome-scale models remains computationally prohibitive. This article provides a review of aspects of EFM computation that elucidates bottlenecks in sca...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz094

    authors: Ullah E,Yosafshahi M,Hassoun S

    update_date:2020-12-01 00:00:00

  • A proteogenomic approach to understand splice isoform functions through sequence and expression-based computational modeling.

    abstract::The products of multi-exon genes are a mixture of alternatively spliced isoforms, from which the translated proteins can have similar, different or even opposing functions. It is therefore essential to differentiate and annotate functions for individual isoforms. Computational approaches provide an efficient complemen...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbv109

    authors: Li HD,Omenn GS,Guan Y

    update_date:2016-11-01 00:00:00

  • Extended application of genomic selection to screen multiomics data for prognostic signatures of prostate cancer.

    abstract::Prognostic tests using expression profiles of several dozen genes help provide treatment choices for prostate cancer (PCa). However, these tests require improvement to meet the clinical need for resolving overtreatment, which continues to be a pervasive problem in PCa management. Genomic selection (GS) methodology, wh...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa197

    authors: Li R,Wang S,Cui Y,Qu H,Chater JM,Zhang L,Wei J,Wang M,Xu Y,Yu L,Lu J,Feng Y,Zhou R,Huang Y,Ma R,Zhu J,Zhong W,Jia Z

    update_date:2020-09-08 00:00:00

  • CyanoPATH: a knowledgebase of genome-scale functional repertoire for toxic cyanobacterial blooms.

    abstract::CyanoPATH is a database that curates and analyzes the common genomic functional repertoire for cyanobacteria harmful algal blooms (CyanoHABs) in eutrophic waters. Based on the literature of empirical studies and genome/protein databases, it summarizes four types of information: common biological functions (pathways) d...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa375

    authors: Du W,Li G,Ho N,Jenkins L,Hockaday D,Tan J,Cao H

    update_date:2020-12-16 00:00:00

  • Advanced bioinformatics methods for practical applications in proteomics.

    abstract::Mass spectrometry (MS)-based proteomics has undergone rapid advancements in recent years, creating challenging problems for bioinformatics. We focus on four aspects where bioinformatics plays a crucial role (and proteomics is needed for clinical application): peptide-spectra matching (PSM) based on the new data-indepe...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbx128

    authors: Goh WWB,Wong L

    update_date:2019-01-18 00:00:00

  • TRCirc: a resource for transcriptional regulation information of circRNAs.

    abstract::In recent years, high-throughput genomic technologies like chromatin immunoprecipitation sequencing (ChIp-seq) and transcriptome sequencing (RNA-seq) have been becoming both more refined and less expensive, making them more accessible. Many circular RNAs (circRNAs) that originate from back-spliced exons have been iden...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bby083

    authors: Tang Z,Li X,Zhao J,Qian F,Feng C,Li Y,Zhang J,Jiang Y,Yang Y,Wang Q,Li C

    update_date:2019-11-27 00:00:00

  • Bioinformatics resources for SARS-CoV-2 discovery and surveillance.

    abstract::In early January 2020, the novel coronavirus (SARS-CoV-2) responsible for a pneumonia outbreak in Wuhan, China, was identified using next-generation sequencing (NGS) and readily available bioinformatics pipelines. In addition to virus discovery, these NGS technologies and bioinformatics resources are currently being e...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa386

    authors: Hu T,Li J,Zhou H,Li C,Holmes EC,Shi W

    update_date:2021-01-08 00:00:00

  • Identifying drug-target interactions based on graph convolutional network and deep neural network.

    abstract::Identification of new drug-target interactions (DTIs) is an important but a time-consuming and costly step in drug discovery. In recent years, to mitigate these drawbacks, researchers have sought to identify DTIs using computational approaches. However, most existing methods construct drug networks and target networks...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa044

    authors: Zhao T,Hu Y,Valsdottir LR,Zang T,Peng J

    update_date:2020-05-04 00:00:00

  • Data-driven rational biosynthesis design: from molecules to cell factories.

    abstract::A proliferation of chemical, reaction and enzyme databases, new computational methods and software tools for data-driven rational biosynthesis design have emerged in recent years. With the coming of the era of big data, particularly in the bio-medical field, data-driven rational biosynthesis design could potentially b...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbz065

    authors: Chen F,Yuan L,Ding S,Tian Y,Hu QN

    update_date:2020-07-15 00:00:00

  • Comparative study of computational methods to detect the correlated reaction sets in biochemical networks.

    abstract::Correlated reaction sets (Co-Sets) are mathematically defined modules in biochemical reaction networks which facilitate the study of biological processes by decomposing complex reaction networks into conceptually simple units. According to the degree of association, Co-Sets can be classified into three types: perfect,...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbp068

    authors: Xi Y,Chen YP,Qian C,Wang F

    update_date:2011-03-01 00:00:00

  • Exploration of cellular reaction systems.

    abstract::We discuss and review different ways to map cellular components and their temporal interaction with other such components to different non-spatially explicit mathematical models. The essential choices made in the literature are between discrete and continuous state spaces, between rule and event-based state updates an...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article,Review

    doi:10.1093/bib/bbp062

    authors: Kirkilionis M

    update_date:2010-01-01 00:00:00

  • Exploring the function of genetic variants in the non-coding genomic regions: approaches for identifying human regulatory variants affecting gene expression.

    abstract::Understanding the genetic basis of human traits/diseases and the underlying mechanisms of how these traits/diseases are affected by genetic variations is critical for public health. Current genome-wide functional genomics data uncovered a large number of functional elements in the noncoding regions of human genome, pr...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article,Review

    doi:10.1093/bib/bbu018

    authors: Li MJ,Yan B,Sham PC,Wang J

    update_date:2015-05-01 00:00:00

  • MloDisDB: a manually curated database of the relations between membraneless organelles and diseases.

    abstract::Cells are compartmentalized by numerous membrane-bounded organelles and membraneless organelles (MLOs) to ensure temporal and spatial regulation of various biological processes. A number of MLOs, such as nucleoli, nuclear speckles and stress granules, exist as liquid droplets within the cells and arise from the conden...

    journal_title:Briefings in bioinformatics

    pub_type: Journal Article

    doi:10.1093/bib/bbaa271

    authors: Hou C,Xie H,Fu Y,Ma Y,Li T

    update_date:2020-10-30 00:00:00