Robustness of signal detection in cryo-electron microscopy via a bi-objective-function approach.

Abstract:

BACKGROUND: The detection of weak signals and selection of single particles from low-contrast micrographs of frozen hydrated biomolecules by cryo-electron microscopy (cryo-EM) represents a major practical bottleneck in cryo-EM data analysis. Template-based particle picking by an objective function using fast local correlation (FLC) allows computational extraction of a large number of candidate particles from micrographs. Another independent objective function based on maximum likelihood estimates (MLE) can be used to align the images and verify the presence of a signal in the selected particles. Despite the widespread applications of the two objective functions, an optimal combination of their utilities has not been exploited. Here we propose a bi-objective function (BOF) approach that combines both FLC and MLE and explore the potential advantages and limitations of BOF in signal detection from cryo-EM data.

RESULTS: The robustness of the BOF strategy in particle selection and verification was systematically examined with both simulated and experimental cryo-EM data. We investigated how the performance of the BOF approach is quantitatively affected by the signal-to-noise ratio (SNR) of cryo-EM data and by the choice of initialization for FLC and MLE. We quantitatively pinpointed the critical SNR (~0.005), at which the BOF approach starts losing its ability to select and verify particles reliably. We found that the use of a Gaussian model to initialize the MLE suppresses the adverse effects of reference dependency in the FLC function used for template-matching.

CONCLUSION: The BOF approach, which combines two distinct objective functions, provides a sensitive way to verify particles for downstream cryo-EM structure analysis. Importantly, reference dependency of the FLC does not necessarily transfer to the MLE, enabling the robust detection of weak signals. Our insights into the numerical behavior of the BOF approach can be used to improve automation efficiency in the cryo-EM data processing pipeline for high-resolution structural determination.

journal_name

BMC Bioinformatics

journal_title

BMC bioinformatics

authors

Wang WL,Yu Z,Castillo-Menendez LR,Sodroski J,Mao Y

doi

10.1186/s12859-019-2714-8

pub_date

2019-04-03 00:00:00

pages

169

issue

1

issn

1471-2105

pii

10.1186/s12859-019-2714-8

journal_volume

20

pub_type

Journal Article
  • A new method for 2D gel spot alignment: application to the analysis of large sample sets in clinical proteomics.

    abstract:BACKGROUND:In current comparative proteomics studies, the large number of images generated by 2D gels is currently compared using spot matching algorithms. Unfortunately, differences in gel migration and sample variability make efficient spot alignment very difficult to obtain, and, as consequence most of the software ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-460

    authors: Pérès S,Molina L,Salvetat N,Granier C,Molina F

    update_date:2008-10-28 00:00:00

  • imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters.

    abstract:BACKGROUND:The imputation of genotypes increases the power of genome-wide association studies. However, the imputation quality should be assessed in each particular case. Nevertheless, not all imputation softwares control the error of output, e.g., the last release of fastPHASE program (1.4.8) lacks such an option. In ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-020-03589-0

    authors: Khvorykh GV,Khrunin AV

    update_date:2020-07-24 00:00:00

  • Frnakenstein: multiple target inverse RNA folding.

    abstract:BACKGROUND:RNA secondary structure prediction, or folding, is a classic problem in bioinformatics: given a sequence of nucleotides, the aim is to predict the base pairs formed in its three dimensional conformation. The inverse problem of designing a sequence folding into a particular target structure has only more rece...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-260

    authors: Lyngsø RB,Anderson JW,Sizikova E,Badugu A,Hyland T,Hein J

    update_date:2012-10-09 00:00:00

  • MGOGP: a gene module-based heuristic algorithm for cancer-related gene prioritization.

    abstract:BACKGROUND:Prioritizing genes according to their associations with a cancer allows researchers to explore genes in more informed ways. By far, Gene-centric or network-centric gene prioritization methods are predominated. Genes and their protein products carry out cellular processes in the context of functional modules....

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2216-0

    authors: Su L,Liu G,Bai T,Meng X,Ma Q

    update_date:2018-06-05 00:00:00

  • Multi-label literature classification based on the Gene Ontology graph.

    abstract:BACKGROUND:The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-525

    authors: Jin B,Muller B,Zhai C,Lu X

    update_date:2008-12-08 00:00:00

  • Integrated olfactory receptor and microarray gene expression databases.

    abstract:BACKGROUND:Gene expression patterns of olfactory receptors (ORs) are an important component of the signal encoding mechanism in the olfactory system since they determine the interactions between odorant ligands and sensory neurons. We have developed the Olfactory Receptor Microarray Database (ORMD) to house OR gene exp...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-8-231

    authors: Liu N,Crasto CJ,Ma M

    update_date:2007-06-30 00:00:00

  • Robust detection of periodic time series measured from biological systems.

    abstract:BACKGROUND:Periodic phenomena are widespread in biology. The problem of finding periodicity in biological time series can be viewed as a multiple hypothesis testing of the spectral content of a given time series. The exact noise characteristics are unknown in many bioinformatics applications. Furthermore, the observed ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-117

    authors: Ahdesmäki M,Lähdesmäki H,Pearson R,Huttunen H,Yli-Harja O

    update_date:2005-05-13 00:00:00

  • Local functional descriptors for surface comparison based binding prediction.

    abstract:BACKGROUND:Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to p...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-314

    authors: Cipriano GM,Phillips GN Jr,Gleicher M

    update_date:2012-11-24 00:00:00

  • Comparison of methods to detect copy number alterations in cancer using simulated and real genotyping data.

    abstract:BACKGROUND:The detection of genomic copy number alterations (CNA) in cancer based on SNP arrays requires methods that take into account tumour specific factors such as normal cell contamination and tumour heterogeneity. A number of tools have been recently developed but their performance needs yet to be thoroughly asse...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-192

    authors: Mosén-Ansorena D,Aransay AM,Rodríguez-Ezpeleta N

    update_date:2012-08-07 00:00:00

  • mRNA:guanine-N7 cap methyltransferases: identification of novel members of the family, evolutionary analysis, homology modeling, and analysis of sequence-structure-function relationships.

    abstract:BACKGROUND:The 5'-terminal cap structure plays an important role in many aspects of mRNA metabolism. Capping enzymes encoded by viruses and pathogenic fungi are attractive targets for specific inhibitors. There is a large body of experimental data on viral and cellular methyltransferases (MTases) that carry out guanine...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-2-2

    authors: Bujnicki JM,Feder M,Radlinska M,Rychlewski L

    update_date:2001-01-01 00:00:00

  • Graph-representation of oxidative folding pathways.

    abstract:BACKGROUND:The process of oxidative folding combines the formation of native disulfide bond with conformational folding resulting in the native three-dimensional fold. Oxidative folding pathways can be described in terms of disulfide intermediate species (DIS) which can also be isolated and characterized. Each DIS corr...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-6-19

    authors: Agoston V,Cemazar M,Kaján L,Pongor S

    update_date:2005-01-27 00:00:00

  • NPBSS: a new PacBio sequencing simulator for generating the continuous long reads with an empirical model.

    abstract:BACKGROUND:PacBio sequencing platform offers longer read lengths than the second-generation sequencing technologies. It has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. Due to its extremely wide range of application areas, fast sequencing simulation syste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2208-0

    authors: Wei ZG,Zhang SW

    update_date:2018-05-22 00:00:00

  • The G protein-coupled receptors in the pufferfish Takifugu rubripes.

    abstract:BACKGROUND:Guanine protein-coupled receptors (GPCRs) constitute a eukaryotic transmembrane protein family and function as "molecular switches" in the second messenger cascades and are found in all organisms between yeast and humans. They form the single, biggest drug-target family due to their versatility of action and...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S1-S3

    authors: Sarkar A,Kumar S,Sundar D

    update_date:2011-02-15 00:00:00

  • ModuleOrganizer: detecting modules in families of transposable elements.

    abstract:BACKGROUND:Most known eukaryotic genomes contain mobile copied elements called transposable elements. In some species, these elements account for the majority of the genome sequence. They have been subject to many mutations and other genomic events (copies, deletions, captures) during transposition. The identification ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-474

    authors: Tempel S,Rousseau C,Tahi F,Nicolas J

    update_date:2010-09-22 00:00:00

  • JContextExplorer: a tree-based approach to facilitate cross-species genomic context comparison.

    abstract:BACKGROUND:Cross-species comparisons of gene neighborhoods (also called genomic contexts) in microbes may provide insight into determining functionally related or co-regulated sets of genes, suggest annotations of previously un-annotated genes, and help to identify horizontal gene transfer events across microbial speci...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-18

    authors: Seitzer P,Huynh TA,Facciotti MT

    update_date:2013-01-16 00:00:00

  • An improved distance measure between the expression profiles linking co-expression and co-regulation in mouse.

    abstract:BACKGROUND:Many statistical algorithms combine microarray expression data and genome sequence data to identify transcription factor binding motifs in the low eukaryotic genomes. Finding cis-regulatory elements in higher eukaryote genomes, however, remains a challenge, as searching in the promoter regions of genes with ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-44

    authors: Kim RS,Ji H,Wong WH

    update_date:2006-01-26 00:00:00

  • Visualizing complex feature interactions and feature sharing in genomic deep neural networks.

    abstract:BACKGROUND:Visualization tools for deep learning models typically focus on discovering key input features without considering how such low level features are combined in intermediate layers to make decisions. Moreover, many of these methods examine a network's response to specific input examples that may be insufficien...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-2957-4

    authors: Liu G,Zeng H,Gifford DK

    update_date:2019-07-19 00:00:00

  • Rearrangement analysis of multiple bacterial genomes.

    abstract:BACKGROUND:Genomes are subjected to rearrangements that change the orientation and ordering of genes during evolution. The most common rearrangements that occur in uni-chromosomal genomes are inversions (or reversals) to adapt to the changing environment. Since genome rearrangements are rarer than point mutations, gene...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-019-3293-4

    authors: Noureen M,Tada I,Kawashima T,Arita M

    update_date:2019-12-27 00:00:00

  • MD-SeeGH: a platform for integrative analysis of multi-dimensional genomic data.

    abstract:BACKGROUND:Recent advances in global genomic profiling methodologies have enabled multi-dimensional characterization of biological systems. Complete analysis of these genomic profiles require an in depth look at parallel profiles of segmental DNA copy number status, DNA methylation state, single nucleotide polymorphism...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-243

    authors: Chi B,deLeeuw RJ,Coe BP,Ng RT,MacAulay C,Lam WL

    update_date:2008-05-20 00:00:00

  • Integrating biological knowledge into variable selection: an empirical Bayes approach with an application in cancer biology.

    abstract:BACKGROUND:An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biolo...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-13-94

    authors: Hill SM,Neve RM,Bayani N,Kuo WL,Ziyad S,Spellman PT,Gray JW,Mukherjee S

    update_date:2012-05-11 00:00:00

  • MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    abstract:BACKGROUND:Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations fo...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2073-x

    authors: Hayashi T,Matsuzaki Y,Yanagisawa K,Ohue M,Akiyama Y

    update_date:2018-05-08 00:00:00

  • BRCA-Pathway: a structural integration and visualization system of TCGA breast cancer data on KEGG pathways.

    abstract:BACKGROUND:Bioinformatics research for finding biological mechanisms can be done by analysis of transcriptome data with pathway based interpretation. Therefore, researchers have tried to develop tools to analyze transcriptome data with pathway based interpretation. Over the years, the amount of omics data has become hu...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-018-2016-6

    authors: Kim I,Choi S,Kim S

    update_date:2018-02-19 00:00:00

  • SIMAT: GC-SIM-MS data analysis tool.

    abstract:BACKGROUND:Gas chromatography coupled with mass spectrometry (GC-MS) is one of the technologies widely used for qualitative and quantitative analysis of small molecules. In particular, GC coupled to single quadrupole MS can be utilized for targeted analysis by selected ion monitoring (SIM). However, to our knowledge, t...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/s12859-015-0681-2

    authors: Ranjbar MR,Di Poto C,Wang Y,Ressom HW

    update_date:2015-08-19 00:00:00

  • Random generalized linear model: a highly accurate and interpretable ensemble predictor.

    abstract:BACKGROUND:Ensemble predictors such as the random forest are known to have superior accuracy but their black-box predictions are difficult to interpret. In contrast, a generalized linear model (GLM) is very interpretable especially when forward feature selection is used to construct the model. However, forward feature ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-5

    authors: Song L,Langfelder P,Horvath S

    update_date:2013-01-16 00:00:00

  • Widespread evidence of viral miRNAs targeting host pathways.

    abstract:BACKGROUND:MicroRNAs (miRNA) are regulatory genes that target and repress other RNA molecules via sequence-specific binding. Several biological processes are regulated across many organisms by evolutionarily conserved miRNAs. Plants and invertebrates employ their miRNA in defense against viruses by targeting and degrad...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-14-S2-S3

    authors: Carl JW Jr,Trgovcich J,Hannenhalli S

    update_date:2013-01-01 00:00:00

  • Promoting ranking diversity for genomics search with relevance-novelty combined model.

    abstract:BACKGROUND:In the biomedical domain, the desired information of a question (query) asked by biologists usually is a list of a certain type of entities covering different aspects that are related to the question, such as genes, proteins, diseases, mutations, etc. Hence it is important for a biomedical information retrie...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-S5-S8

    authors: Yin X,Li Z,Huang JX,Hu X

    update_date:2011-01-01 00:00:00

  • STSE: Spatio-Temporal Simulation Environment Dedicated to Biology.

    abstract:BACKGROUND:Recently, the availability of high-resolution microscopy together with the advancements in the development of biomarkers as reporters of biomolecular interactions increased the importance of imaging methods in molecular cell biology. These techniques enable the investigation of cellular characteristics like ...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-12-126

    authors: Stoma S,Fröhlich M,Gerber S,Klipp E

    update_date:2011-04-28 00:00:00

  • Functional clustering of yeast proteins from the protein-protein interaction network.

    abstract:BACKGROUND:The abundant data available for protein interaction networks have not yet been fully understood. New types of analyses are needed to reveal organizational principles of these networks to investigate the details of functional and regulatory clusters of proteins. RESULTS:In the present work, individual cluste...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-7-355

    authors: Sen TZ,Kloczkowski A,Jernigan RL

    update_date:2006-07-24 00:00:00

  • Meta-analysis of breast cancer microarray studies in conjunction with conserved cis-elements suggest patterns for coordinate regulation.

    abstract:BACKGROUND:Gene expression measurements from breast cancer (BrCa) tumors are established clinical predictive tools to identify tumor subtypes, identify patients showing poor/good prognosis, and identify patients likely to have disease recurrence. However, diverse breast cancer datasets in conjunction with diagnostic cl...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-9-63

    authors: Smith DD,Saetrom P,Snøve O Jr,Lundberg C,Rivas GE,Glackin C,Larson GP

    update_date:2008-01-28 00:00:00

  • GOAL: a software tool for assessing biological significance of genes groups.

    abstract:BACKGROUND:Modern high throughput experimental techniques such as DNA microarrays often result in large lists of genes. Computational biology tools such as clustering are then used to group together genes based on their similarity in expression profiles. Genes in each group are probably functionally related. The functi...

    journal_title:BMC bioinformatics

    pub_type: Journal Article

    doi:10.1186/1471-2105-11-229

    authors: Tchagang AB,Gawronski A,Bérubé H,Phan S,Famili F,Pan Y

    update_date:2010-05-06 00:00:00