Comparing the performance of biomedical clustering methods.

Abstract:

Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
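The evaluation loop described in the abstract (run a method under many parameter sets, score each result with a cluster validity index, keep the best) can be illustrated with a minimal sketch. This is not ClustEval's implementation: the toy 1-D method `gap_clustering`, the parameter name `max_gap`, the sample points and the use of the mean silhouette width as the validity index are all assumptions chosen for illustration.

```python
def gap_clustering(points, max_gap):
    """Hypothetical toy method: sort 1-D points and start a new
    cluster whenever two neighbours are more than max_gap apart."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, cur in zip(order, order[1:]):
        if points[cur] - points[prev] > max_gap:
            current += 1
        labels[cur] = current
    return labels


def silhouette(points, labels):
    """Mean silhouette width using absolute difference as distance.
    Returns -1.0 when the clustering has fewer than two clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(abs(p - q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def sweep(points, gaps):
    """Score every parameter value and return (best_score, best_gap)."""
    scored = [(silhouette(points, gap_clustering(points, g)), g) for g in gaps]
    return max(scored, key=lambda pair: pair[0])


# Illustrative data: three visually separated groups on a line.
points = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1, 9.0, 9.2]
best_score, best_gap = sweep(points, [0.05, 0.5, 3.0, 5.0])
print(best_gap, round(best_score, 3))
```

In this sketch a very small `max_gap` splits every point into its own cluster and a very large one merges everything, so only the intermediate values score well. The same pattern, repeated over many methods, data sets and indices, is what makes a systematic comparison of the kind described above computationally heavy.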

journal_name

Nat Methods

journal_title

Nature methods

authors

Wiwie C,Baumbach J,Röttger R

doi

10.1038/nmeth.3583

subject

Has Abstract

pub_date

2015-11-01 00:00:00

pages

1033-8

issue

11

eissn

1548-7105

issn

1548-7091

pii

nmeth.3583

journal_volume

12

pub_type

Journal Article
  • A versatile tool for conditional gene expression and knockdown.

    abstract::Drug-inducible systems allowing the control of gene expression in mammalian cells are invaluable tools for genetic research, and could also fulfill essential roles in gene- and cell-based therapy. Currently available systems, however, often have limited in vivo functionality because of leakiness, insufficient levels o...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth846

    authors: Szulc J,Wiznerowicz M,Sauvain MO,Trono D,Aebischer P

    update_date:2006-02-01 00:00:00

  • Robust statistical modeling improves sensitivity of high-throughput RNA structure probing experiments.

    abstract::Structure probing coupled with high-throughput sequencing could revolutionize our understanding of the role of RNA structure in regulation of gene expression. Despite recent technological advances, intrinsic noise and high sequence coverage requirements greatly limit the applicability of these techniques. Here we desc...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.4068

    authors: Selega A,Sirocchi C,Iosub I,Granneman S,Sanguinetti G

    update_date:2017-01-01 00:00:00

  • Nm-seq maps 2'-O-methylation sites in human mRNA with base precision.

    abstract::The ribose of RNA nucleotides can be 2'-O-methylated (Nm). Despite advances in high-throughput detection, the inert chemical nature of Nm still limits sensitivity and precludes mapping in mRNA. We leveraged the differential reactivity of 2'-O-methylated and 2'-hydroxylated nucleosides to periodate oxidation to develop...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.4294

    authors: Dai Q,Moshitch-Moshkovitz S,Han D,Kol N,Amariglio N,Rechavi G,Dominissini D,He C

    update_date:2017-07-01 00:00:00

  • Efficient and quantitative high-throughput tRNA sequencing.

    abstract::Despite its biological importance, tRNA has not been adequately sequenced by standard methods because of its abundant post-transcriptional modifications and stable structure, which interfere with cDNA synthesis. We achieved efficient and quantitative tRNA sequencing in HEK293T cells by using engineered demethylases to...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.3478

    authors: Zheng G,Qin Y,Clark WC,Dai Q,Yi C,He C,Lambowitz AM,Pan T

    update_date:2015-09-01 00:00:00

  • Evaluating measures of association for single-cell transcriptomics.

    abstract::Single-cell transcriptomics provides an opportunity to characterize cell-type-specific transcriptional networks, intercellular signaling pathways and cellular diversity with unprecedented resolution by profiling thousands of cells in a single experiment. However, owing to the unique statistical properties of scRNA-seq...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/s41592-019-0372-4

    authors: Skinnider MA,Squair JW,Foster LJ

    update_date:2019-05-01 00:00:00

  • Three-dimensional nanoscopy of whole cells and tissues with in situ point spread function retrieval.

    abstract::Single-molecule localization microscopy is a powerful tool for visualizing subcellular structures, interactions and protein functions in biological research. However, inhomogeneous refractive indices inside cells and tissues distort the fluorescent signal emitted from single-molecule probes, which rapidly degrades res...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/s41592-020-0816-x

    authors: Xu F,Ma D,MacPherson KP,Liu S,Bu Y,Wang Y,Tang Y,Bi C,Kwok T,Chubykin AA,Yin P,Calve S,Landreth GE,Huang F

    update_date:2020-05-01 00:00:00

  • Atomic-resolution structures from fragmented protein crystals with the cryoEM method MicroED.

    abstract::Traditionally, crystallographic analysis of macromolecules has depended on large, well-ordered crystals, which often require significant effort to obtain. Even sizable crystals sometimes suffer from pathologies that render them inappropriate for high-resolution structure determination. Here we show that fragmentation ...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.4178

    authors: de la Cruz MJ,Hattne J,Shi D,Seidler P,Rodriguez J,Reyes FE,Sawaya MR,Cascio D,Weiss SC,Kim SK,Hinck CS,Hinck AP,Calero G,Eisenberg D,Gonen T

    update_date:2017-02-13 00:00:00

  • Nanoscale resolution in GFP-based microscopy.

    abstract::We report attainment of subdiffraction resolution using stimulated emission depletion (STED) microscopy with GFP-labeled samples. The approximately 70 nm lateral resolution attained in this study is demonstrated by imaging GFP-labeled viruses and the endoplasmic reticulum (ER) of a mammalian cell. Our results mark the...

    journal_title:Nature methods

    pub_type: Journal Article, Review

    doi:10.1038/nmeth922

    authors: Willig KI,Kellner RR,Medda R,Hein B,Jakobs S,Hell SW

    update_date:2006-09-01 00:00:00

  • Imaging cellular ultrastructures using expansion microscopy (U-ExM).

    abstract::Determining the structure and composition of macromolecular assemblies is a major challenge in biology. Here we describe ultrastructure expansion microscopy (U-ExM), an extension of expansion microscopy that allows the visualization of preserved ultrastructures by optical microscopy. This method allows for near-native...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/s41592-018-0238-1

    authors: Gambarotto D,Zwettler FU,Le Guennec M,Schmidt-Cernohorska M,Fortun D,Borgers S,Heine J,Schloetel JG,Reuss M,Unser M,Boyden ES,Sauer M,Hamel V,Guichard P

    update_date:2019-01-01 00:00:00

  • More options for gene editing.

    abstract::Engineering precise genetic changes in a genome is a powerful way to study gene function, and several recent papers describe new applications of gene-editing tools. Working with researchers at Sangamo BioSciences, Howard Hughes Medical Institute investigator Barbara Meyer and her colleagues at the University of Californ...

    journal_title:Nature methods

    pub_type: Comment, Journal Article

    doi:10.1038/nmeth.1683

    authors: Baker M

    update_date:2011-09-01 00:00:00

  • Varying label density allows artifact-free analysis of membrane-protein nanoclusters.

    abstract::We present a method to robustly discriminate clustered from randomly distributed molecules detected with techniques based on single-molecule localization microscopy, such as PALM and STORM. The approach is based on deliberate variation of labeling density, such as titration of fluorescent antibody, combined with quant...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.3897

    authors: Baumgart F,Arnold AM,Leskovar K,Staszek K,Fölser M,Weghuber J,Stockinger H,Schütz GJ

    update_date:2016-08-01 00:00:00

  • Purification and enrichment of specific chromatin loci.

    abstract::Understanding how chromatin is regulated is essential to fully grasp genome biology, and establishing the locus-specific protein composition is a major step toward this goal. Here we explain why the isolation and analysis of a specific chromatin segment are technically challenging, independently of the method. We then...

    journal_title:Nature methods

    pub_type: Journal Article, Review

    doi:10.1038/s41592-020-0765-4

    authors: Gauchier M,van Mierlo G,Vermeulen M,Déjardin J

    update_date:2020-04-01 00:00:00

  • Quantitative analysis of gene expression in a single cell by qPCR.

    abstract::We developed a quantitative PCR method featuring a reusable single-cell cDNA library immobilized on beads for measuring the expression of multiple genes in a single cell. We used this method to analyze multiple cDNA targets (from several copies to several hundred thousand copies) with an experimental error of 15.9% or...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.1338

    authors: Taniguchi K,Kajiyama T,Kambara H

    update_date:2009-07-01 00:00:00

  • Independence and reproducibility across microarray platforms.

    abstract::Microarrays have been widely used for the analysis of gene expression, but the issue of reproducibility across platforms has yet to be fully resolved. To address this apparent problem, we compared gene expression between two microarray platforms: the short oligonucleotide Affymetrix Mouse Genome 430 2.0 GeneChip and a...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth757

    authors: Larkin JE,Frank BC,Gavras H,Sultana R,Quackenbush J

    update_date:2005-05-01 00:00:00

  • Visualizing a one-way protein encounter complex by ultrafast single-molecule mixing.

    abstract::We combined rapid microfluidic mixing with single-molecule fluorescence resonance energy transfer to study the folding kinetics of the intrinsically disordered human protein α-synuclein. The time-resolution of 0.2 ms revealed initial collapse of the unfolded protein induced by binding with lipid mimics and subsequent ...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.1568

    authors: Gambin Y,VanDelinder V,Ferreon AC,Lemke EA,Groisman A,Deniz AA

    update_date:2011-03-01 00:00:00

  • Conditional genome engineering in Toxoplasma gondii uncovers alternative invasion mechanisms.

    abstract::We established a conditional site-specific recombination system based on dimerizable Cre recombinase-mediated recombination in the apicomplexan parasite Toxoplasma gondii. Using a new single-vector strategy that allows ligand-dependent, efficient removal of a gene of interest, we generated three knockouts of apicomple...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.2301

    authors: Andenmatten N,Egarter S,Jackson AJ,Jullien N,Herman JP,Meissner M

    update_date:2013-02-01 00:00:00

  • A guide to designing germline-dependent epigenetic inheritance experiments in mammals.

    abstract::Recent work has demonstrated that environmental factors experienced by parents can affect their offspring across multiple generations, and that such transgenerational transmission can depend on the germline. Causal evidence for the involvement of germ cells is rare, however, and the underlying molecular mechanisms rem...

    journal_title:Nature methods

    pub_type: Journal Article, Review

    doi:10.1038/nmeth.4181

    authors: Bohacek J,Mansuy IM

    update_date:2017-02-28 00:00:00

  • Functional ultrasound imaging of the brain.

    abstract::We present functional ultrasound (fUS), a method for imaging transient changes in blood volume in the whole brain at better spatiotemporal resolution than with other functional brain imaging modalities. fUS uses plane-wave illumination at high frame rate and can measure blood volumes in smaller vessels than previous u...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.1641

    authors: Macé E,Montaldo G,Cohen I,Baulac M,Fink M,Tanter M

    update_date:2011-07-03 00:00:00

  • Rational design of true monomeric and bright photoactivatable fluorescent proteins.

    abstract::Monomeric (m)Eos2 is an engineered photoactivatable fluorescent protein widely used for super-resolution microscopy. We show that mEos2 forms oligomers at high concentrations and forms aggregates when labeling membrane proteins, limiting its application as a fusion partner. We solved the crystal structure of tetrameri...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.2021

    authors: Zhang M,Chang H,Zhang Y,Yu J,Wu L,Ji W,Chen J,Liu B,Lu J,Liu Y,Zhang J,Xu P,Xu T

    update_date:2012-05-13 00:00:00

  • Spheroid-based engineering of a human vasculature in mice.

    abstract::The complexity of the angiogenic cascade limits cellular approaches to studying angiogenic endothelial cells (ECs). In turn, in vivo assays do not allow the analysis of the distinct cellular behavior of ECs during angiogenesis. Here we show that ECs can be grafted as spheroids into a matrix to give rise to a complex t...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.1198

    authors: Alajati A,Laib AM,Weber H,Boos AM,Bartol A,Ikenberg K,Korff T,Zentgraf H,Obodozie C,Graeser R,Christian S,Finkenzeller G,Stark GB,Héroult M,Augustin HG

    update_date:2008-05-01 00:00:00

  • High-throughput genetic interaction mapping in the fission yeast Schizosaccharomyces pombe.

    abstract::Epistasis analysis, which reports on the extent to which the function of one gene depends on the presence of a second, is a powerful tool for studying the functional organization of the cell. Systematic genome-wide studies of epistasis, however, have been limited, with the majority of data being collected in the buddi...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth1098

    authors: Roguev A,Wiren M,Weissman JS,Krogan NJ

    update_date:2007-10-01 00:00:00

  • Protein-RNA networks revealed through covalent RNA marks.

    abstract::Protein-RNA networks are ubiquitous and central in biological control. We present an approach termed RNA Tagging that enables the user to identify protein-RNA interactions in vivo by analyzing purified cellular RNA, without protein purification or cross-linking. An RNA-binding protein of interest is fused to an enzyme...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.3651

    authors: Lapointe CP,Wilinski D,Saunders HA,Wickens M

    update_date:2015-12-01 00:00:00

  • Atomic-accuracy models from 4.5-Å cryo-electron microscopy data with density-guided iterative local refinement.

    abstract::We describe a general approach for refining protein structure models on the basis of cryo-electron microscopy maps with near-atomic resolution. The method integrates Monte Carlo sampling with local density-guided optimization, Rosetta all-atom refinement and real-space B-factor fitting. In tests on experimental maps o...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.3286

    authors: DiMaio F,Song Y,Li X,Brunner MJ,Xu C,Conticello V,Egelman E,Marlovits T,Cheng Y,Baker D

    update_date:2015-04-01 00:00:00

  • scGen predicts single-cell perturbation responses.

    abstract::Accurately modeling cellular response to perturbations is a central goal of computational biology. While such modeling has been based on statistical, mechanistic and machine learning models in specific settings, no generalization of predictions to phenomena absent from training data (out-of-sample) has yet been demons...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/s41592-019-0494-8

    authors: Lotfollahi M,Wolf FA,Theis FJ

    update_date:2019-08-01 00:00:00

  • Tracking the wily transcription factor.

    abstract::A 'paired-end ditag' (PET) strategy for pinpointing protein binding sites can reveal a wealth of information about transcription factors and other DNA-binding proteins. ...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth0506-341

    authors: Eisenstein M

    update_date:2006-05-01 00:00:00

  • Deciphering laminar-specific neural inputs with line-scanning fMRI.

    abstract::Using a line-scanning method during functional magnetic resonance imaging (fMRI), we obtained high temporal (50-ms) and spatial (50-μm) resolution information along the cortical thickness and showed that the laminar position of fMRI onset coincides with distinct neural inputs in rat somatosensory and motor cortices. T...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.2730

    authors: Yu X,Qian C,Chen DY,Dodd SJ,Koretsky AP

    update_date:2014-01-01 00:00:00

  • NeuroGPS-Tree: automatic reconstruction of large-scale neuronal populations with dense neurites.

    abstract::The reconstruction of neuronal populations, a key step in understanding neural circuits, remains a challenge in the presence of densely packed neurites. Here we achieved automatic reconstruction of neuronal populations by partially mimicking human strategies to separate individual neurons. For populations not resolvab...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.3662

    authors: Quan T,Zhou H,Li J,Li S,Li A,Li Y,Lv X,Luo Q,Gong H,Zeng S

    update_date:2016-01-01 00:00:00

  • Membrane-protein structure determination by solid-state NMR spectroscopy of microcrystals.

    abstract::Membrane proteins are largely underrepresented among available atomic-resolution structures. The use of detergents in protein purification procedures hinders the formation of well-ordered crystals for X-ray crystallography and leads to slower molecular tumbling, impeding the application of solution-state NMR. Solid-st...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.2248

    authors: Shahid SA,Bardiaux B,Franks WT,Krabben L,Habeck M,van Rossum BJ,Linke D

    update_date:2012-12-01 00:00:00

  • Dynamic proteomics in individual human cells uncovers widespread cell-cycle dependence of nuclear proteins.

    abstract::We examined cell cycle-dependent changes in the proteome of human cells by systematically measuring protein dynamics in individual living cells. We used time-lapse microscopy to measure the dynamics of a random subset of 20 nuclear proteins, each tagged with yellow fluorescent protein (YFP) at its endogenous chromosom...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth892

    authors: Sigal A,Milo R,Cohen A,Geva-Zatorsky N,Klein Y,Alaluf I,Swerdlin N,Perzov N,Danon T,Liron Y,Raveh T,Carpenter AE,Lahav G,Alon U

    update_date:2006-07-01 00:00:00

  • High-frequency genome editing using ssDNA oligonucleotides with zinc-finger nucleases.

    abstract::Zinc-finger nucleases (ZFNs) have enabled highly efficient gene targeting in multiple cell types and organisms. Here we describe methods for using simple ssDNA oligonucleotides in tandem with ZFNs to efficiently produce human cell lines with three distinct genetic outcomes: (i) targeted point mutation, (ii) targeted g...

    journal_title:Nature methods

    pub_type: Journal Article

    doi:10.1038/nmeth.1653

    authors: Chen F,Pruett-Miller SM,Huang Y,Gjoka M,Duda K,Taunton J,Collingwood TN,Frodin M,Davis GD

    update_date:2011-07-17 00:00:00