Towards barcode markers in Fungi: an intron map of Ascomycota mitochondria.

Abstract:

BACKGROUND: A standardized, cost-effective molecular identification system is urgently needed for Fungi because of their wide impact on human quality of life. In particular, the use of mitochondrial DNA as a source of species markers has been taken into account. Unfortunately, mobile introns are present in almost all fungal mitochondrial genes, which seriously complicates both PCR and bioinformatic surveys. The aim of this work is to assess the incidence of this phenomenon in Ascomycota while testing a new bioinformatic tool for extracting and managing sequence database annotations, in order to identify mitochondrial gene regions that lack introns and to propose them as species markers. METHODS: The general trend towards frequent occurrence of introns in fungal mitochondrial genomes was confirmed in Ascomycota by an extensive bioinformatic analysis of all entries available in public nucleotide sequence databases for 11 mitochondrial protein-coding genes and 2 mitochondrial ribosomal RNA (rRNA) genes of this phylum. A new query approach was developed to retrieve the intron information contained in these entries effectively. RESULTS: When the new query-based approach was compared with a BLAST-based procedure for building a faithful Ascomycota mitochondrial intron map, the query-based method was clearly the more accurate. Within this map, despite the pervasiveness of introns, it is possible to distinguish specific regions in several genes, including the full NADH dehydrogenase subunit 6 (ND6) gene, that could be considered barcode candidates for Ascomycota because they contain few introns and are longer than 400 bp, comparable to the lower end of the length range of barcodes successfully used in animals.
CONCLUSION: The new query system described here responds to the pressing need for much better bioinformatics support for the DNA Barcode Initiative. The large-scale investigation of Ascomycota mitochondrial introns performed with this tool makes it possible to exclude intron-rich sequences from the search for barcode candidates, and could be the first step towards a mitochondrial barcoding strategy for these organisms similar to the standard approach employed in metazoans.
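The idea of scanning an intron map for barcode candidates can be illustrated with a minimal sketch. It is not the authors' query tool. It assumes that intron insertion sites have already been collected as coordinates on the intron-less coding sequence of a gene, and it simply reports the stretches between sites that reach the 400 bp threshold mentioned in the abstract. The function name, the coordinate convention, and the example positions are all hypothetical.

```python
from typing import List, Tuple


def intron_free_regions(
    gene_length: int,
    insertion_sites: List[int],
    min_length: int = 400,
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals that contain no intron
    insertion site and span at least `min_length` bp.

    Coordinates refer to the spliced (intron-less) gene sequence; this
    convention is an assumption made for the illustration.
    """
    # Keep only distinct sites that fall strictly inside the gene.
    sites = sorted({s for s in insertion_sites if 0 < s < gene_length})
    # Gene start, every insertion site, and gene end delimit the candidates.
    bounds = [0] + sites + [gene_length]
    regions = []
    for start, end in zip(bounds, bounds[1:]):
        if end - start >= min_length:
            regions.append((start, end))
    return regions


# Hypothetical gene of 1600 bp with four known insertion sites.
candidates = intron_free_regions(1600, [250, 700, 720, 1300])
print(candidates)
```

A gene with no recorded insertion sites is returned whole when it is long enough, which corresponds to the case the abstract describes for the full ND6 gene.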

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Santamaria M,Vicario S,Pappadà G,Scioscia G,Scazzocchio C,Saccone C

doi

10.1186/1471-2105-10-S6-S15

subject

Has Abstract

pub_date

2009-06-16 00:00:00

pages

S15

issn

1471-2105

pii

1471-2105-10-S6-S15

journal_volume

10 Suppl 6

pub_type

Journal Article
  • BIOSMILE: a semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features.

    abstract:BACKGROUND:Bioinformatics tools for automatic processing of biomedical literature are invaluable for both the design and interpretation of large-scale experiments. Many information extraction (IE) systems that incorporate natural language processing (NLP) techniques have thus been developed for use in the biomedical fi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-325

    authors: Tsai RT,Chou WC,Su YS,Lin YC,Sung CL,Dai HJ,Yeh IT,Ku W,Sung TY,Hsu WL

    update_date:2007-09-01 00:00:00

  • Googling DNA sequences on the World Wide Web.

    abstract:BACKGROUND:New web-based technologies provide an excellent opportunity for sharing and accessing information and using web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bio...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S14-S4

    authors: Hajibabaei M,Singer GA

    update_date:2009-11-10 00:00:00

  • DECA: scalable XHMM exome copy-number variant calling with ADAM and Apache Spark.

    abstract:BACKGROUND:XHMM is a widely used tool for copy-number variant (CNV) discovery from whole exome sequencing data but can require hours to days to run for large cohorts. A more scalable implementation would reduce the need for specialized computational resources and enable increased exploration of the configuration parame...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3108-7

    authors: Linderman MD,Chia D,Wallace F,Nothaft FA

    update_date:2019-10-11 00:00:00

  • Finding motif pairs in the interactions between heterogeneous proteins via bootstrapping and boosting.

    abstract:BACKGROUND:Supervised learning and many stochastic methods for predicting protein-protein interactions require both negative and positive interactions in the training data set. Unlike positive interactions, negative interactions cannot be readily obtained from interaction data, so these must be generated. In protein-pr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-10-S1-S57

    authors: Kim J,Huang DS,Han K

    update_date:2009-01-30 00:00:00

  • Analysis of Bovine Viral Diarrhea Viruses-infected monocytes: identification of cytopathic and non-cytopathic biotype differences.

    abstract:BACKGROUND:Bovine Viral Diarrhea Virus (BVDV) infection is widespread in cattle worldwide, causing important economic losses. Pathogenesis of the disease caused by BVDV is complex, as each BVDV strain has two biotypes: non-cytopathic (ncp) and cytopathic (cp). BVDV can cause a persistent latent infection and immune sup...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S6-S9

    authors: Ammari M,McCarthy FM,Nanduri B,Pinchuk LM

    update_date:2010-10-07 00:00:00

  • A multiobjective approach to the genetic code adaptability problem.

    abstract:BACKGROUND:The organization of the canonical code has intrigued researchers since it was first described. If we consider all codes mapping the 64 codons into 20 amino acids and one stop codon, there are more than 1.51×10^84 possible genetic codes. The main question related to the organization of the genetic code is why ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0480-9

    authors: de Oliveira LL,de Oliveira PS,Tinós R

    update_date:2015-02-19 00:00:00

  • Estimating the individualized HIV-1 genetic barrier to resistance using a nelfinavir fitness landscape.

    abstract:BACKGROUND:Failure on Highly Active Anti-Retroviral Treatment is often accompanied with development of antiviral resistance to one or more drugs included in the treatment. In general, the virus is more likely to develop resistance to drugs with a lower genetic barrier. Previously, we developed a method to reverse engin...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-409

    authors: Theys K,Deforche K,Beheydt G,Moreau Y,van Laethem K,Lemey P,Camacho RJ,Rhee SY,Shafer RW,Van Wijngaerden E,Vandamme AM

    update_date:2010-08-03 00:00:00

  • Statistics for approximate gene clusters.

    abstract:BACKGROUND:Genes occurring co-localized in multiple genomes can be strong indicators for either functional constraints on the genome organization or remnant ancestral gene order. The computational detection of these patterns, which are usually referred to as gene clusters, has become increasingly sensitive over the pas...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S15-S14

    authors: Jahn K,Winter S,Stoye J,Böcker S

    update_date:2013-01-01 00:00:00

  • ILP-based maximum likelihood genome scaffolding.

    abstract:BACKGROUND:Interest in de novo genome assembly has been renewed in the past decade due to rapid advances in high-throughput sequencing (HTS) technologies which generate relatively short reads resulting in highly fragmented assemblies consisting of contigs. Additional long-range linkage information is typically used to ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-S9-S9

    authors: Lindsay J,Salooti H,Măndoiu I,Zelikovsky A

    update_date:2014-01-01 00:00:00

  • An automated framework for understanding structural variations in the binding grooves of MHC class II molecules.

    abstract:BACKGROUND:MHC/HLA class II molecules are important components of the immune system and play a critical role in processes such as phagocytosis. Understanding peptide recognition properties of the hundreds of MHC class II alleles is essential to appreciate determinants of antigenicity and ultimately to predict epitopes....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-S1-S55

    authors: Yeturu K,Utriainen T,Kemp GJ,Chandra N

    update_date:2010-01-18 00:00:00

  • Predicting human splicing branchpoints by combining sequence-derived features and multi-label learning methods.

    abstract:BACKGROUND:Alternative splicing is the critical process in a single gene coding, which removes introns and joins exons, and splicing branchpoints are indicators for the alternative splicing. Wet experiments have identified a great number of human splicing branchpoints, but many branchpoints are still unknown. In order ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1875-6

    authors: Zhang W,Zhu X,Fu Y,Tsuji J,Weng Z

    update_date:2017-12-01 00:00:00

  • BiPOm: a rule-based ontology to represent and infer molecule knowledge from a biological process-centered viewpoint.

    abstract:BACKGROUND:Managing and organizing biological knowledge remains a major challenge, due to the complexity of living systems. Recently, systemic representations have been promising in tackling such a challenge at the whole-cell scale. In such representations, the cell is considered as a system composed of interlocked sub...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03637-9

    authors: Henry V,Saïs F,Inizan O,Marchadier E,Dibie J,Goelzer A,Fromion V

    update_date:2020-07-23 00:00:00

  • SAlign-a structure aware method for global PPI network alignment.

    abstract:BACKGROUND:High throughput experiments have generated a significantly large amount of protein interaction data, which is being used to study protein networks. Studying complete protein networks can reveal more insight about healthy/disease states than studying proteins in isolation. Similarly, a comparative study of pr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03827-5

    authors: Ayub U,Haider I,Naveed H

    update_date:2020-11-04 00:00:00

  • Multi-omic analysis of signalling factors in inflammatory comorbidities.

    abstract:BACKGROUND:Inflammation is a core element of many different, systemic and chronic diseases that usually involve an important autoimmune component. The clinical phase of inflammatory diseases is often the culmination of a long series of pathologic events that started years before. The systemic characteristics and relate...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2413-x

    authors: Xiao H,Bartoszek K,Lio' P

    update_date:2018-11-30 00:00:00

  • Extracting predictors for lung adenocarcinoma based on Granger causality test and stepwise character selection.

    abstract:BACKGROUND:Lung adenocarcinoma is the most common type of lung cancer, with high mortality worldwide. Its occurrence and development were thoroughly studied by high-throughput expression microarray, which produced abundant data on gene expression, DNA methylation, and miRNA quantification. However, the hub genes, which...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2739-z

    authors: Fan X,Wang Y,Tang XQ

    update_date:2019-05-01 00:00:00

  • Class prediction for high-dimensional class-imbalanced data.

    abstract:BACKGROUND:The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number o...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-523

    authors: Blagus R,Lusa L

    update_date:2010-10-20 00:00:00

  • An efficient visualization tool for the analysis of protein mutation matrices.

    abstract:BACKGROUND:It is useful to develop a tool that would effectively describe protein mutation matrices specifically geared towards the identification of mutations that produce either wanted or unwanted effects, such as an increase or decrease in affinity, or a predisposition towards misfolding. Here, we describe a tool wh...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-218

    authors: David MP,Lapid CM,Daria VR

    update_date:2008-04-28 00:00:00

  • IDconverter and IDClight: conversion and annotation of gene and protein IDs.

    abstract:BACKGROUND:Researchers involved in the annotation of large numbers of gene, clone or protein identifiers are usually required to perform a one-by-one conversion for each identifier. When the field of research is one such as microarray experiments, this number may be around 30,000. RESULTS:To help researchers map acces...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-9

    authors: Alibés A,Yankilevich P,Cañada A,Díaz-Uriarte R

    update_date:2007-01-10 00:00:00

  • Enhanced CellClassifier: a multi-class classification tool for microscopy images.

    abstract:BACKGROUND:Light microscopy is of central importance in cell biology. The recent introduction of automated high content screening has expanded this technology towards automation of experiments and performing large scale perturbation assays. Nevertheless, evaluation of microscopy data continues to be a bottleneck in man...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-30

    authors: Misselwitz B,Strittmatter G,Periaswamy B,Schlumberger MC,Rout S,Horvath P,Kozak K,Hardt WD

    update_date:2010-01-14 00:00:00

  • ChemEx: information extraction system for chemical data curation.

    abstract:BACKGROUND:Manual chemical data curation from publications is error-prone and time-consuming, and makes it hard to keep data sets up to date. Automatic information extraction can be used as a tool to reduce these problems. Since chemical structures are usually described in images, information extraction needs to combine structure ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-S17-S9

    authors: Tharatipyakul A,Numnark S,Wichadakul D,Ingsriswang S

    update_date:2012-01-01 00:00:00

  • Leveraging TCGA gene expression data to build predictive models for cancer drug response.

    abstract:BACKGROUND:Machine learning has been utilized to predict cancer drug response from multi-omics data generated from sensitivities of cancer cell lines to different therapeutic compounds. Here, we build machine learning models using gene expression data from patients' primary tumor tissues to predict whether a patient wi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03690-4

    authors: Clayton EA,Pujol TA,McDonald JF,Qiu P

    update_date:2020-09-30 00:00:00

  • Standard machine learning approaches outperform deep representation learning on phenotype prediction from transcriptomics data.

    abstract:BACKGROUND:The ability to confidently predict health outcomes from gene expression would catalyze a revolution in molecular diagnostics. Yet, the goal of developing actionable, robust, and reproducible predictive signatures of phenotypes such as clinical outcome has not been attained in almost any disease area. Here, w...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-3427-8

    authors: Smith AM,Walsh JR,Long J,Davis CB,Henstock P,Hodge MR,Maciejewski M,Mu XJ,Ra S,Zhao S,Ziemek D,Fisher CK

    update_date:2020-03-20 00:00:00

  • Predicting anatomic therapeutic chemical classification codes using tiered learning.

    abstract:BACKGROUND:The low success rate and high cost of drug discovery requires the development of new paradigms to identify molecules of therapeutic value. The Anatomical Therapeutic Chemical (ATC) Code System is a World Health Organization (WHO) proposed classification that assigns multi-level codes to compounds based on th...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1660-6

    authors: Olson T,Singh R

    update_date:2017-06-07 00:00:00

  • A theorem proving approach for automatically synthesizing visualizations of flow cytometry data.

    abstract:BACKGROUND:Polychromatic flow cytometry is a popular technique that has wide usage in the medical sciences, especially for studying phenotypic properties of cells. The high-dimensionality of data generated by flow cytometry usually makes it difficult to visualize. The naive solution of simply plotting two-dimensional g...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-017-1662-4

    authors: Raj S,Hussain F,Husein Z,Torosdagli N,Turgut D,Deo N,Pattanaik S,Chang CJ,Jha SK

    update_date:2017-06-07 00:00:00

  • WellInverter: a web application for the analysis of fluorescent reporter gene data.

    abstract:BACKGROUND:Fluorescent reporter genes have become widely used for monitoring gene expression in living cells. When a microbial strain carrying a reporter gene is grown in a microplate reader, the fluorescence and the absorbance (optical density) of the culture can be automatically measured every few minutes in a highly...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2920-4

    authors: Martin Y,Page M,Blanchet C,de Jong H

    update_date:2019-06-11 00:00:00

  • Missing genes in the annotation of prokaryotic genomes.

    abstract:BACKGROUND:Protein-coding gene detection in prokaryotic genomes is considered a much simpler problem than in intron-containing eukaryotic genomes. However there have been reports that prokaryotic gene finder programs have problems with small genes (either over-predicting or under-predicting). Therefore the question ari...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-131

    authors: Warren AS,Archuleta J,Feng WC,Setubal JC

    update_date:2010-03-15 00:00:00

  • Relation extraction between bacteria and biotopes from biomedical texts with attention mechanisms and domain-specific contextual representations.

    abstract:BACKGROUND:The Bacteria Biotope (BB) task is a biomedical relation extraction (RE) that aims to study the interaction between bacteria and their locations. This task is considered to pertain to fundamental knowledge in applied microbiology. Some previous investigations conducted the study by applying feature-based mode...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3217-3

    authors: Jettakul A,Wichadakul D,Vateekul P

    update_date:2019-12-03 00:00:00

  • Bison: bisulfite alignment on nodes of a cluster.

    abstract:BACKGROUND:DNA methylation changes are associated with a wide array of biological processes. Bisulfite conversion of DNA followed by high-throughput sequencing is increasingly being used to assess genome-wide methylation at single-base resolution. The relative slowness of most commonly used aligners for processing such...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-15-337

    authors: Ryan DP,Ehninger D

    update_date:2014-10-18 00:00:00

  • The PowerAtlas: a power and sample size atlas for microarray experimental design and research.

    abstract:BACKGROUND:Microarrays permit biologists to simultaneously measure the mRNA abundance of thousands of genes. An important issue facing investigators planning microarray experiments is how to estimate the sample size required for good statistical power. What is the projected sample size or number of replicate chips need...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-84

    authors: Page GP,Edwards JW,Gadbury GL,Yelisetti P,Wang J,Trivedi P,Allison DB

    update_date:2006-02-22 00:00:00

  • SDA: a semi-parametric differential abundance analysis method for metabolomics and proteomics data.

    abstract:BACKGROUND:Identifying differentially abundant features between different experimental groups is a common goal for many metabolomics and proteomics studies. However, analyzing data from mass spectrometry (MS) is difficult because the data may not be normally distributed and there is often a large fraction of zero value...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3067-z

    authors: Li Y,Fan TWM,Lane AN,Kang WY,Arnold SM,Stromberg AJ,Wang C,Chen L

    update_date:2019-10-17 00:00:00