How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis.

Abstract:

BACKGROUND: Prediction methods are increasingly used in the biosciences to forecast diverse features and characteristics. Binary two-state classifiers are the most common applications, and they are usually based on machine learning approaches. For the end user it is often difficult to evaluate the true performance and applicability of computational tools, as this requires some knowledge of computer science and statistics.

RESULTS: Instructions are given on how to interpret and compare method evaluation results. Systematic analysis of method performance requires established benchmark datasets that contain cases with known outcomes, together with suitable evaluation measures. The criteria for benchmark datasets are discussed along with their implementation in VariBench, a benchmark database for variations. No single measure alone can describe all aspects of method performance. Predictions of the effects of genetic variation at the DNA, RNA and protein levels are important because information about variants can be produced much faster than their disease relevance can be experimentally verified. Numerous prediction tools have therefore been developed; however, systematic analyses of their performance and comparisons between them have only just started to emerge.

CONCLUSIONS: The end users of prediction tools should be able to understand how evaluation is done and how to interpret the results. Six main performance evaluation measures are introduced: sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the Matthews correlation coefficient. Together with receiver operating characteristic (ROC) analysis, they give a good picture of method performance and allow objective, quantitative comparison of methods. A checklist of items to look at is provided. Comparisons are presented of methods for predicting missense variant tolerance, protein stability changes due to amino acid substitutions, and the effects of variations on mRNA splicing.
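The six measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not code from the paper; the function name `binary_measures` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_measures(tp, tn, fp, fn):
    """Compute six common measures from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives. Standard definitions are assumed here.
    """
    def ratio(num, den):
        # A zero denominator leaves the measure undefined; report NaN.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for illustration only; not data from the paper.
m = binary_measures(tp=80, tn=70, fp=30, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting all six values side by side reflects the abstract's point that no single measure describes every aspect of performance: accuracy alone, for example, can look high on an imbalanced dataset even when sensitivity or specificity is poor.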

journal_name

BMC Genomics

journal_title

BMC genomics

authors

Vihinen M

doi

10.1186/1471-2164-13-S4-S2

subject

Has Abstract

pub_date

2012-06-18 00:00:00

pages

S2

issn

1471-2164

pii

1471-2164-13-S4-S2

journal_volume

13 Suppl 4

pub_type

Journal Article
  • Comparing copy-number profiles under multi-copy amplifications and deletions.

    abstract:BACKGROUND:During cancer progression, malignant cells accumulate somatic mutations that can lead to genetic aberrations. In particular, evolutionary events akin to segmental duplications or deletions can alter the copy-number profile (CNP) of a set of genes in a genome. Our aim is to compute the evolutionary distance b...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6611-3

    authors: Cordonnier G,Lafond M

    update_date:2020-04-16 00:00:00

  • Synteny mapping between common bean and soybean reveals extensive blocks of shared loci.

    abstract:BACKGROUND:Understanding syntenic relationship between two species is critical to assessing the potential for comparative genomic analysis. Common bean (Phaseolus vulgaris L.) and soybean (Glycine max L.), the two most important members of the Phaseoleae legumes, appear to have a diploid and polyploidy recent past, re...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-11-184

    authors: McClean PE,Mamidi S,McConnell M,Chikara S,Lee R

    update_date:2010-03-18 00:00:00

  • Gene expression profiling in the Cynomolgus macaque Macaca fascicularis shows variation within the normal birth range.

    abstract:BACKGROUND:Although an adverse early-life environment has been linked to an increased risk of developing the metabolic syndrome, the molecular mechanisms underlying altered disease susceptibility as well as their relevance to humans are largely unknown. Importantly, emerging evidence suggests that these effects operate...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-509

    authors: Emerald BS,Chng K,Masuda S,Sloboda DM,Vickers MH,Kambadur R,Gluckman PD

    update_date:2011-10-16 00:00:00

  • Construction of relatedness matrices using genotyping-by-sequencing data.

    abstract:BACKGROUND:Genotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals for a large number of single nucleotide polymorphisms (SNPs). Costs can be lowered by reducing the mean sequencing depth, but this results in genotype calls of lower quality. A common analys...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2252-3

    authors: Dodds KG,McEwan JC,Brauning R,Anderson RM,van Stijn TC,Kristjánsson T,Clarke SM

    update_date:2015-12-09 00:00:00

  • Genome-wide survey of two-component signal transduction systems in the plant growth-promoting bacterium Azospirillum.

    abstract:BACKGROUND:Two-component systems (TCS) play critical roles in sensing and responding to environmental cues. Azospirillum is a plant growth-promoting rhizobacterium living in the rhizosphere of many important crops. Despite numerous studies about its plant beneficial properties, little is known about how the bacterium s...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1962-x

    authors: Borland S,Oudart A,Prigent-Combaret C,Brochier-Armanet C,Wisniewski-Dyé F

    update_date:2015-10-22 00:00:00

  • Integrated proteomic and metabolomic analysis to study the effects of spaceflight on Candida albicans.

    abstract:BACKGROUND:Candida albicans is an opportunistic pathogenic yeast, which could become pathogenic in various stressful environmental factors including the spaceflight environment. In this study, we aim to explore the phenotypic changes and possible mechanisms of C. albicans after exposure to spaceflight conditions. RESU...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6476-5

    authors: Wang J,Liu Y,Zhao G,Gao J,Liu J,Wu X,Xu C,Li Y

    update_date:2020-01-17 00:00:00

  • Comparative evolutionary genomics of the HADH2 gene encoding Abeta-binding alcohol dehydrogenase/17beta-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10).

    abstract:BACKGROUND:The Abeta-binding alcohol dehydrogenase/17beta-hydroxysteroid dehydrogenase type 10 (ABAD/HSD10) is an enzyme involved in pivotal metabolic processes and in the mitochondrial dysfunction seen in the Alzheimer's disease. Here we use comparative genomic analyses to study the evolution of the HADH2 gene encodin...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-202

    authors: Marques AT,Antunes A,Fernandes PA,Ramos MJ

    update_date:2006-08-09 00:00:00

  • Comparative analysis of the acute response of the trout, O. mykiss, head kidney to in vivo challenge with virulent and attenuated infectious hematopoietic necrosis virus and LPS-induced inflammation.

    abstract:BACKGROUND:The response of the trout, O. mykiss, head kidney to bacterial lipopolysaccharide (LPS) or active and attenuated infectious hematopoietic necrosis virus (IHNV and attINHV respectively) intraperitoneal challenge, 24 and 72 hours post-injection, was investigated using a salmonid-specific cDNA microarray. RESU...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-141

    authors: MacKenzie S,Balasch JC,Novoa B,Ribas L,Roher N,Krasnov A,Figueras A

    update_date:2008-03-26 00:00:00

  • An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome.

    abstract:BACKGROUND:Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform off...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-670

    authors: Ferrarini M,Moretto M,Ward JA,Šurbanovski N,Stevanović V,Giongo L,Viola R,Cavalieri D,Velasco R,Cestaro A,Sargent DJ

    update_date:2013-10-01 00:00:00

  • KSP: an integrated method for predicting catalyzing kinases of phosphorylation sites in proteins.

    abstract:BACKGROUND:Protein phosphorylation by kinases plays crucial roles in various biological processes including signal transduction and tumorigenesis, thus a better understanding of protein phosphorylation events in cells is fundamental for studying protein functions and designing drugs to treat diseases caused by the malf...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06895-2

    authors: Ma H,Li G,Su Z

    update_date:2020-08-04 00:00:00

  • Gene expression profiles at different stages for formation of pearl sac and pearl in the pearl oyster Pinctada fucata.

    abstract:BACKGROUND:The most critical step in the pearl formation during aquaculture is issued to the proliferation and differentiation of outer epithelial cells of mantle graft into pearl sac. This pearl sac secretes various matrix proteins to produce pearls by a complex physiological process which has not been well-understood...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5579-3

    authors: Mariom,Take S,Igarashi Y,Yoshitake K,Asakawa S,Maeyama K,Nagai K,Watabe S,Kinoshita S

    update_date:2019-03-25 00:00:00

  • Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

    abstract:BACKGROUND:Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.)...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-211

    authors: Straub SC,Fishbein M,Livshultz T,Foster Z,Parks M,Weitemier K,Cronn RC,Liston A

    update_date:2011-05-04 00:00:00

  • Genome-wide host responses against infectious laryngotracheitis virus vaccine infection in chicken embryo lung cells.

    abstract:BACKGROUND:Infectious laryngotracheitis virus (ILTV; gallid herpesvirus 1) infection causes high mortality and huge economic losses in the poultry industry. To protect chickens against ILTV infection, chicken-embryo origin (CEO) and tissue-culture origin (TCO) vaccines have been used. However, the transmission of vacci...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-143

    authors: Lee J,Bottje WG,Kong BW

    update_date:2012-04-24 00:00:00

  • Anomaly detection in gene expression via stochastic models of gene regulatory networks.

    abstract:BACKGROUND:The steady-state behaviour of gene regulatory networks (GRNs) can provide crucial evidence for detecting disease-causing genes. However, monitoring the dynamics of GRNs is particularly difficult because biological data only reflects a snapshot of the dynamical behaviour of the living organism. Also most GRN ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-S3-S26

    authors: Kim H,Gelenbe E

    update_date:2009-12-03 00:00:00

  • Genomes of Helicobacter pylori from native Peruvians suggest admixture of ancestral and modern lineages and reveal a western type cag-pathogenicity island.

    abstract:BACKGROUND:Helicobacter pylori is presumed to be co-evolved with its human host and is a highly diverse gastric pathogen at genetic levels. Ancient origins of H. pylori in the New World are still debatable. It is not clear how different waves of human migrations in South America contributed to the evolution of strain d...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-191

    authors: Devi SM,Ahmed I,Khan AA,Rahman SA,Alvi A,Sechi LA,Ahmed N

    update_date:2006-07-27 00:00:00

  • Genomic meta-analysis of growth factor and integrin pathways in chronic kidney transplant injury.

    abstract:BACKGROUND:Chronic Allograft Nephropathy (CAN) is a clinical entity of progressive kidney transplant injury. The defining histology is tubular atrophy with interstitial fibrosis (IFTA). Using a meta-analysis of microarrays from 84 kidney transplant biopsies, we revealed growth factor and integrin adhesion molecule path...

    journal_title:BMC genomics

    pub_type: Journal Article, Meta-Analysis

    doi:10.1186/1471-2164-14-275

    authors: Dosanjh A,Robison E,Mondala T,Head SR,Salomon DR,Kurian SM

    update_date:2013-04-23 00:00:00

  • Exploring the genetics of trotting racing ability in horses using a unique Nordic horse model.

    abstract:BACKGROUND:Horses have been strongly selected for speed, strength, and endurance-exercise traits since the onset of domestication. As a result, highly specialized horse breeds have developed with many modern horse breeds often representing closed populations with high phenotypic and genetic uniformity. However, a great...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-5484-9

    authors: Velie BD,Lillie M,Fegraeus KJ,Rosengren MK,Solé M,Wiklund M,Ihler CF,Strand E,Lindgren G

    update_date:2019-02-04 00:00:00

  • Microarray-based ultra-high resolution discovery of genomic deletion mutations.

    abstract:BACKGROUND:Oligonucleotide microarray-based comparative genomic hybridization (CGH) offers an attractive possible route for the rapid and cost-effective genome-wide discovery of deletion mutations. CGH typically involves comparison of the hybridization intensities of genomic DNA samples with microarray chip representat...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-224

    authors: Belfield EJ,Brown C,Gan X,Jiang C,Baban D,Mithani A,Mott R,Ragoussis J,Harberd NP

    update_date:2014-03-22 00:00:00

  • Gene expression patterns that predict sensitivity to epidermal growth factor receptor tyrosine kinase inhibitors in lung cancer cell lines and human lung tumors.

    abstract:BACKGROUND:Increased focus surrounds identifying patients with advanced non-small cell lung cancer (NSCLC) who will benefit from treatment with epidermal growth factor receptor (EGFR) tyrosine kinase inhibitors (TKI). EGFR mutation, gene copy number, coexpression of ErbB proteins and ligands, and epithelial to mesenchy...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-7-289

    authors: Balko JM,Potti A,Saunders C,Stromberg A,Haura EB,Black EP

    update_date:2006-11-10 00:00:00

  • Comparative genomic analysis reveals occurrence of genetic recombination in virulent Cryptosporidium hominis subtypes and telomeric gene duplications in Cryptosporidium parvum.

    abstract:BACKGROUND:Cryptosporidium hominis is a dominant species for human cryptosporidiosis. Within the species, IbA10G2 is the most virulent subtype responsible for all C. hominis-associated outbreaks in Europe and Australia, and is a dominant outbreak subtype in the United States. In recent years IaA28R4 is becoming a major ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1517-1

    authors: Guo Y,Tang K,Rowe LA,Li N,Roellig DM,Knipe K,Frace M,Yang C,Feng Y,Xiao L

    update_date:2015-04-18 00:00:00

  • Identification and transcriptomic profiling of genes involved in increasing sugar content during salt stress in sweet sorghum leaves.

    abstract:BACKGROUND:Sweet sorghum is an annual C4 crop considered to be one of the most promising bio-energy crops due to its high sugar content in stem, yet it is poorly understood how this plant increases its sugar content in response to salt stress. In response to high NaCl, many of its major processes, such as photosynthesi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1760-5

    authors: Sui N,Yang Z,Liu M,Wang B

    update_date:2015-07-19 00:00:00

  • Unlocking the bovine genome.

    abstract:The draft genome sequence of cattle (Bos taurus) has now been analyzed by the Bovine Genome Sequencing and Analysis Consortium and the Bovine HapMap Consortium, which together represent an extensive collaboration involving more than 300 scientists from 25 different countries. ...

    journal_title:BMC genomics

    pub_type: Editorial

    doi:10.1186/1471-2164-10-193

    authors: Tellam RL,Lemay DG,Van Tassell CP,Lewin HA,Worley KC,Elsik CG

    update_date:2009-04-24 00:00:00

  • Urinary proteomic and non-prefractionation quantitative phosphoproteomic analysis during pregnancy and non-pregnancy.

    abstract:BACKGROUND:Progress in the fields of protein separation and identification technologies has accelerated research into biofluids proteomics for protein biomarker discovery. Urine has become an ideal and rich source of biomarkers in clinical proteomics. Here we performed a proteomic analysis of urine samples from pregnan...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-777

    authors: Zheng J,Liu L,Wang J,Jin Q

    update_date:2013-11-11 00:00:00

  • Multiple genetic loci define Ca++ utilization by bloodstream malaria parasites.

    abstract:BACKGROUND:Bloodstream malaria parasites require Ca++ for their development, but the sites and mechanisms of Ca++ utilization are not well understood. We hypothesized that there may be differences in Ca++ uptake or utilization by genetically distinct lines of P. falciparum. These differences, if identified, may provide...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5418-y

    authors: Apolis L,Olivas J,Srinivasan P,Kushwaha AK,Desai SA

    update_date:2019-01-16 00:00:00

  • Transcriptomic time-series analysis of early development in olive from germinated embryos to juvenile tree.

    abstract:BACKGROUND:Despite its relevance, almost no studies account for the genetic control in the early stages of tree development, i.e. from germination on. This study seeks to make a quite complete transcriptome for olive development and to elucidate the dynamic regulation of the transcriptomic response during the early-juv...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-5232-6

    authors: Jiménez-Ruiz J,de la O Leyva-Pérez M,Vidoy-Mercado I,Barceló A,Luque F

    update_date:2018-11-19 00:00:00

  • Transcriptome analysis of human brain tissue identifies reduced expression of complement complex C1Q Genes in Rett syndrome.

    abstract:BACKGROUND:MECP2, the gene mutated in the majority of Rett syndrome cases, is a transcriptional regulator that can activate or repress transcription. Although the transcription regulatory function of MECP2 has been known for over a decade, it remains unclear how transcriptional dysregulation leads to the neurodevelopme...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-016-2746-7

    authors: Lin P,Nicholls L,Assareh H,Fang Z,Amos TG,Edwards RJ,Assareh AA,Voineagu I

    update_date:2016-06-06 00:00:00

  • Positively correlated miRNA-mRNA regulatory networks in mouse frontal cortex during early stages of alcohol dependence.

    abstract:BACKGROUND:Although the study of gene regulation via the action of specific microRNAs (miRNAs) has experienced a boom in recent years, the analysis of genome-wide interaction networks among miRNAs and respective targeted mRNAs has lagged behind. MicroRNAs simultaneously target many transcripts and fine-tune the express...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-725

    authors: Nunez YO,Truitt JM,Gorini G,Ponomareva ON,Blednov YA,Harris RA,Mayfield RD

    update_date:2013-10-22 00:00:00

  • Methods for high-throughput MethylCap-Seq data analysis.

    abstract:BACKGROUND:Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measu...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S6-S14

    authors: Rodriguez BA,Frankhouser D,Murphy M,Trimarchi M,Tam HH,Curfman J,Huang R,Chan MW,Lai HC,Parikh D,Ball B,Schwind S,Blum W,Marcucci G,Yan P,Bundschuh R

    update_date:2012-01-01 00:00:00

  • Ion channel profiling of the Lymnaea stagnalis ganglia via transcriptome analysis.

    abstract:BACKGROUND:The pond snail Lymnaea stagnalis (L. stagnalis) has been widely used as a model organism in neurobiology, ecotoxicology, and parasitology due to the relative simplicity of its central nervous system (CNS). However, its usefulness is restricted by a limited availability of transcriptome data. While sequence i...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-07287-2

    authors: Dong N,Bandura J,Zhang Z,Wang Y,Labadie K,Noel B,Davison A,Koene JM,Sun HS,Coutellec MA,Feng ZP

    update_date:2021-01-06 00:00:00

  • Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    abstract:BACKGROUND:Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearra...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-561

    authors: Jo YD,Choi Y,Kim DH,Kim BD,Kang BC

    update_date:2014-07-04 00:00:00