Deep learning embedder method and tool for mass spectra similarity search.

Abstract:

Spectral similarity calculation is widely used in protein identification tools and mass spectra clustering algorithms to compare theoretical or experimental spectra. The performance of the spectral similarity calculation plays an important role in these tools and algorithms, especially in the analysis of large-scale datasets. Recently, deep learning methods have been proposed to improve the performance of clustering algorithms and protein identification by training the algorithms on existing data and using multiple spectra and identified peptide features. While the efficiency of these algorithms compared with traditional approaches is still under study, their application in proteomics data analysis is becoming more common. Here, we propose the use of deep learning to improve spectral similarity comparison. We assessed the performance of deep learning for spectral similarity with GLEAMS and a newly trained embedder model (DLEAMSE), which uses high-quality spectra from PRIDE Cluster. We also developed a new bioinformatics tool (mslookup - https://github.com/bigbio/DLEAMSE/) that allows users to quickly search for spectra among previously identified mass spectra published in public repositories and spectral libraries. Finally, we released a human database to enable bioinformaticians and biologists to search for identified spectra on their own machines.

SIGNIFICANCE STATEMENT: Spectral similarity calculation plays an important role in proteomics data analysis. With deep learning's ability to learn implicit and effective features from large-scale training datasets, deep learning-based MS/MS spectra embedding models have emerged as a solution to improve mass spectral clustering similarity calculation algorithms. We compare multiple similarity scoring and deep learning methods in terms of accuracy (computing the similarity for a pair of mass spectra) and computing-time performance. The benchmark results showed no major differences in accuracy between DLEAMSE and the normalized dot product (NDP) for spectrum similarity calculations. The DLEAMSE GPU implementation is faster than NDP in preprocessing on the GPU server, and the DLEAMSE similarity calculation (Euclidean distance on 32-D vectors) takes about 1/3 of the time of the dot product calculation. The DLEAMSE encoding and embedding steps need to run only once for each spectrum, and the embedded 32-D points can be persisted in a repository, which speeds up future comparisons and large-scale analyses. Based on these results, we propose a new tool, mslookup, that enables researchers to find spectra previously identified in public data. The tool can also be used to generate in-house databases of previously identified spectra to share with other laboratories and consortia.
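
The two similarity measures compared in the abstract can be illustrated with a minimal Python sketch. This is not the DLEAMSE or mslookup implementation: the function names, the toy inputs, and the brute-force lookup are assumptions made for illustration, and the 32-D embeddings are filled in by hand here rather than produced by a trained model.

```python
import math
from typing import Dict, Sequence, Tuple


def normalized_dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized dot product of two equal-length binned intensity vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # An all-zero spectrum has no defined direction; score it as 0.
    return dot / norm if norm > 0 else 0.0


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two embeddings (e.g. 32-D vectors)."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def nearest_embedding(
    query: Sequence[float], library: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Brute-force lookup: return (spectrum id, distance) of the closest stored embedding."""
    if not library:
        raise ValueError("library is empty")
    scored = ((key, euclidean_distance(query, emb)) for key, emb in library.items())
    return min(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy binned spectra (hypothetical intensities per m/z bin).
    spectrum_a = [0.0, 3.0, 0.0, 1.0, 5.0]
    spectrum_b = [0.0, 2.5, 0.5, 1.0, 4.0]
    print("NDP:", normalized_dot_product(spectrum_a, spectrum_b))

    # Toy stored embeddings (hypothetical ids and values), computed once and reused.
    stored = {"spec_1": [0.0] * 32, "spec_2": [1.0] * 32}
    print("closest:", nearest_embedding([0.1] * 32, stored))
```

Because the embeddings are computed once and stored, a later search in this sketch only repeats the distance step, which is the property the abstract highlights for large-scale data.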

journal_name

J Proteomics

journal_title

Journal of proteomics

authors

Qin C,Luo X,Deng C,Shu K,Zhu W,Griss J,Hermjakob H,Bai M,Perez-Riverol Y

doi

10.1016/j.jprot.2020.104070

subject

Has Abstract

pub_date

2021-02-10 00:00:00

pages

104070

eissn

1876-7737

issn

1874-3919

pii

S1874-3919(20)30438-3

journal_volume

232

pub_type

Journal Article
  • A proteomics approach to identify proteins differentially expressed in Douglas-fir seedlings infected by Phellinus sulphurascens.

    abstract::We carried out a comparative proteomic study to explore the molecular mechanisms that underlie the defense response of Douglas-fir (DF, Pseudotsuga menziesii) to laminated root rot, a disease caused by Phellinus sulphurascens. 2-DE was conducted on proteins extracted from roots of laboratory-grown, young DF seedlings ...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2008.06.004

    authors: Islam MA,Sturrock RN,Ekramoddoullah AK

    update_date:2008-10-07 00:00:00

  • MHC class I presented antigens from malignancies: A perspective on analytical characterization & immunogenicity.

    abstract::The field of cancer immunotherapy has expanded rapidly in the past few years, with many new approaches entering the clinic for T cell mediated killing of tumors. Several of these clinical approaches involve the exploitation of a CD8 + T cell response against MHC I presented tumor antigens. Here, we describe the types ...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Review

    doi:10.1016/j.jprot.2018.04.021

    authors: Schmidt M,Lill JR

    update_date:2019-01-16 00:00:00

  • Proteomics of wine additives: mining for the invisible via combinatorial peptide ligand libraries.

    abstract::Combinatorial peptide ligand libraries (CPLLs) have been adopted for harvesting and identifying traces of casein (used as a fining agent) present in white wines. Although minute amounts (200 microL) of CPLL beads are added to the entire content of a wine bottle (750 mL), they are able to sequester with high efficiency...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2010.05.010

    authors: Cereda A,Kravchuk AV,D'Amato A,Bachi A,Righetti PG

    update_date:2010-08-05 00:00:00

  • Proteomics as a tool to explore human milk in health and disease.

    abstract::Proteins in milk have wide range of functions, they are carriers of minerals or chemically vulnerable and insoluble vitamins and other compounds, stabilisers of large aggregates or micelles of lipids, and components of both innate and acquired immune defence systems. Together with other components of milk, proteins ma...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Review

    doi:10.1016/j.jprot.2013.04.008

    authors: Roncada P,Stipetic LH,Bonizzi L,Burchmore RJ,Kennedy MW

    update_date:2013-08-02 00:00:00

  • Applicability of 2-DE to assess differences in the protein profile between cold storage and not cold storage in nectarine fruits.

    abstract::Cold storage is being used to increase nectarine fruits' postharvest life. However, low temperatures lead to chilling injury and limit their commercial quality and value. In this study a proteomic approach was used to compare the protein profile between control and cold storage nectarine fruits. Protein extracted from...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2012.08.005

    authors: Giraldo E,Díaz A,Corral JM,García A

    update_date:2012-10-22 00:00:00

  • Schistosome infections induce significant changes in the host biliary proteome.

    abstract::Schistosomiasis is a disease caused by blood trematodes affecting man and animals that represents an important human health and veterinary problem. Main damages caused by this infection are a consequence of the host inflammatory reaction against the parasite eggs trapped inside the liver. Despite that the he...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2014.11.009

    authors: de la Torre-Escudero E,Pérez-Sánchez R,Manzano-Román R,Oleaga A

    update_date:2015-01-30 00:00:00

  • Searching for specific motifs in affinity capture in proteome analysis.

    abstract::In analysing the red blood cell cytoplasmic proteome, in search for low abundance proteins, 15 amino acid (AA; Arg, Asn, Asp, Gln, Gly, His, Ile, Lys, Phe, Pro, Ser, Thr, Trp, Tyr, and Val) probes, used individually, captured a total of 787 unique gene products. Of those, 76 were found to be the common catch of all AA...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2009.04.001

    authors: Masseroli M,Bachi A,Boschetti E,Righetti PG

    update_date:2009-07-21 00:00:00

  • Proteomic expression profiles of virulent and avirulent strains of Listeria monocytogenes isolated from macrophages.

    abstract::Listeria monocytogenes is able to survive and proliferate within macrophages. In the current study, the ability of three L. monocytogenes strains (serovar 1/2a strain EGDe, serovar 4b strain F2365, and serovar 4a strain HCC23) to proliferate in the murine macrophage cell line J774.1 was analyzed. We found that the avi...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2011.05.008

    authors: Donaldson JR,Nanduri B,Pittman JR,Givaruangsawat S,Burgess SC,Lawrence ML

    update_date:2011-09-06 00:00:00

  • Systematic identification of mitochondrial lysine succinylome in silkworm (Bombyx mori) midgut during the larval gluttonous stage.

    abstract::Lysine succinylation is a newly identified protein post-translational modification (PTM) of lysine residues. Increasing evidences demonstrate that this modification is prevalent in mitochondria and regulates many vital cellular processes, especially metabolism. Here, we determined the succinylome of the silkworm (Bomb...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2017.12.019

    authors: Chen J,Li F,Liu Y,Shen W,Du X,He L,Meng Z,Ma X,Wang Y

    update_date:2018-03-01 00:00:00

  • Comparative proteomic analysis of four Bacillus clausii strains: proteomic expression signature distinguishes protein profile of the strains.

    abstract::A comparative proteomic approach, using two dimensional gel electrophoresis and mass spectrometry, has been developed to compare and elucidate the differences among the cellular proteomes of four closely related isogenic O/C, SIN, N/R and T, B. clausii strains during both exponential and stationary phases of growth. I...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2011.06.032

    authors: Lippolis R,Gnoni A,Abbrescia A,Panelli D,Maiorano S,Paternoster MS,Sardanelli AM,Papa S,Gaballo A

    update_date:2011-11-18 00:00:00

  • Could transformation mechanisms of acetylase-harboring pMdT1 plasmid be evaluated through proteomic tools in Escherichia coli?

    abstract::Escherichia coli is a commensal microorganism of the gastrointestinal tract of animals and humans and it is an excellent model organism for the study of antibiotic resistance mechanisms. The resistance transmission and other characteristics of bacteria are based on different types of gene transfer occurring ...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2016.03.042

    authors: Magalhães P,Pinto L,Gonçalves A,Araújo JE,Santos HM,Capelo JL,Saénz Y,de Toro M,Torres C,Chambon C,Hébraud M,Poeta P,Igrejas G

    update_date:2016-08-11 00:00:00

  • Integrated physiological and proteomic analysis reveals underlying response and defense mechanisms of Brachypodium distachyon seedling leaves under osmotic stress, cadmium and their combined stresses.

    abstract::Drought stress, a major abiotic stress, commonly occurs in metal-contaminated environments and affects crop growth and yield. In this study, we performed the first integrated phenotypic, physiological, and proteomic analysis of Brachypodium distachyon L. seedling leaves under polyethylene glycol (PEG) mock osmotic str...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2017.09.015

    authors: Cheng ZW,Chen ZY,Yan X,Bian YW,Deng X,Yan YM

    update_date:2018-01-06 00:00:00

  • Deregulation of smooth muscle cell cytoskeleton within the human atherosclerotic coronary media layer.

    abstract::Fatal events derived from coronary atherosclerosis are the major cause of mortality in the developed countries. Proteomic analysis of the atherosclerotic coronary artery has been mainly carried out with whole tissue extracts, making it difficult to distinguish the alterations present in every region of the p...

    journal_title:Journal of proteomics

    pub_type: Clinical Trial, Journal Article

    doi:10.1016/j.jprot.2013.01.032

    authors: de la Cuesta F,Zubiri I,Maroto AS,Posada M,Padial LR,Vivanco F,Alvarez-Llamas G,Barderas MG

    update_date:2013-04-26 00:00:00

  • Targeted quantitative proteomic investigation employing multiple reaction monitoring on quantitative changes in proteins that regulate volatile biosynthesis of strawberry fruit at different ripening stages.

    abstract::A targeted quantitative proteomic investigation employing the multiple reaction monitoring (MRM, SRM) technique was conducted on strawberry fruit at different development stages. We investigated 22 proteins and isoforms from 32 peptides with 111 peptide transitions, which may be involved in the volatile aroma biosynth...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2015.06.004

    authors: Song J,Du L,Li L,Palmer LC,Forney CF,Fillmore S,Zhang Z,Li X

    update_date:2015-08-03 00:00:00

  • Efferent intestinal lymph protein responses in nematode-resistant, -resilient and -susceptible lambs under challenge with Trichostrongylus colubriformis.

    abstract::The mechanisms underlying resistance to challenge by gastrointestinal nematode parasites in sheep are complex. Using DIGE, we profiled ovine lymph proteins in lambs with host resistance (R), resilience (Ri) or susceptibility (S) to a daily trickle challenge with the nematode Trichostrongylus colubriformis. E...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2014.07.017

    authors: Bond JJ,Pernthaner A,Zhang K,Rosanowski SM,Clerens S,Bisset SA,Sutherland IA,Koolaard JP,Hein WR

    update_date:2014-09-23 00:00:00

  • Proteomics identifies molecular networks affected by tetradecylthioacetic acid and fish oil supplemented diets.

    abstract::Fish oil (FO) and tetradecylthioacetic acid (TTA) - a synthetic modified fatty acid have beneficial effects in regulating lipid metabolism. In order to dissect the mechanisms underlying the molecular action of those two fatty acids we have investigated the changes in mitochondrial protein expression in a lon...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2013.03.027

    authors: Wrzesinski K,R León I,Kulej K,Sprenger RR,Bjørndal B,Christensen BJ,Berge RK,Jensen ON,Rogowska-Wrzesinska A

    update_date:2013-06-12 00:00:00

  • The identification and characterization of epitopes in the 30-34 kDa Trypanosoma cruzi proteins recognized by antibodies in the serum samples of chagasic patients.

    abstract::Trypanosoma cruzi proteins with molecular weight between 30 and 34 kDa have shown high reactivity in western blot assays with serum samples from chagasic individuals. However, in-depth analysis of the constituents of these protein fractions has not been performed. This is the first report of an immunoaffinity proteomi...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2012.11.001

    authors: Verissimo da Costa GC,Lery LM,da Silva ML,Moura H,Peralta RH,von Krüger WM,Bisch PM,Barr JR,Peralta JM

    update_date:2013-03-27 00:00:00

  • Vitamin D-binding protein as a biomarker of active disease in acute intermittent porphyria.

    abstract::Acute intermittent porphyria (AIP) is an autosomal dominant metabolic disorder caused by a deficiency of hepatic porphobilinogen deaminase (PBGD). The disease is characterized by life threatening acute neurovisceral attacks. The aim of this study was to identify metabolites secreted by the hepatocytes that r...

    journal_title:Journal of proteomics

    pub_type: Clinical Trial, Journal Article

    doi:10.1016/j.jprot.2015.05.004

    authors: Serrano-Mendioroz I,Sampedro A,Mora MI,Mauleón I,Segura V,Enríquez de Salamanca R,Harper P,Sardh E,Corrales FJ,Fontanellas A

    update_date:2015-09-08 00:00:00

  • Pipeline to assess the greatest source of technical variance in quantitative proteomics using metabolic labelling.

    abstract::The biological variance in protein expression of interest to biologists can only be accessed if the technical variance of the protein quantification method is low compared with the biological variance. Technical variance is dependent on the protocol employed within a quantitative proteomics experiment and accumulated ...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2012.09.020

    authors: Russell MR,Lilley KS

    update_date:2012-12-21 00:00:00

  • Ability of the marine bacterium Pseudomonas fluorescens BA3SM1 to counteract the toxicity of CdSe nanoparticles.

    abstract::In the marine environment, bacteria from estuarine and coastal sediments are among the first targets of nanoparticle pollution; it is therefore relevant to improve the knowledge of interactions between bacteria and nanoparticles. In this work, the response of the marine bacterium Pseudomonas fluorescens BA3S...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2016.07.021

    authors: Poirier I,Kuhn L,Demortière A,Mirvaux B,Hammann P,Chicher J,Caplat C,Pallud M,Bertrand M

    update_date:2016-10-04 00:00:00

  • The benefits (and misfortunes) of SDS in top-down proteomics.

    abstract::Top-down proteomics (TDP) has great potential for high throughput proteoform characterization. With significant advances in mass spectrometry (MS) instrumentation permitting tandem MS of large intact proteins, a limitation to the widespread adoption of TDP still resides on front-end sample preparation protocols (e.g. ...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Review

    doi:10.1016/j.jprot.2017.03.002

    authors: Kachuk C,Doucette AA

    update_date:2018-03-20 00:00:00

  • Quantitative proteomics reveals a role of JAZ7 in plant defense response to Pseudomonas syringae DC3000.

    abstract::Jasmonate ZIM-domain (JAZ) proteins are key transcriptional repressors regulating various biological processes. Although many studies have studied JAZ proteins by genetic and biochemical analyses, little is known about JAZ7-associated global protein networks and how JAZ7 contributes to bacterial pathogen defense. In t...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2018.01.002

    authors: Zhang T,Meng L,Kong W,Yin Z,Wang Y,Schneider JD,Chen S

    update_date:2018-03-20 00:00:00

  • Proteomics analysis reveals multiple regulatory mechanisms in response to selenium in rice.

    abstract::Selenium (Se) shows both beneficial and toxic effects on plant growth. Rice (Oryza sativa L.) seedlings cultivated under lower concentrations of sodium selenite showed enhanced growth, whereas higher concentrations of sodium selenite repressed seedling growth. To acquire detailed regulatory mechanisms underlying these...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2011.12.030

    authors: Wang YD,Wang X,Wong YS

    update_date:2012-03-16 00:00:00

  • Transcriptomics and proteomics-based analysis of heterosis on main economic traits of silkworm, Bombyx mori.

    abstract::The application of silkworm hybrids have promoted the innovation and development of agricultural technology, but the mechanism of heterosis in silkworm has not been explained clearly. In this study, the heterosis of silkworm in the aspects of body weight, silk gland and cocoon weight was investigated by means of silkw...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2020.103941

    authors: Xiao R,Yuan Y,Zhu F,He S,Ge Q,Wang X,Taha R,Chen K

    update_date:2020-10-30 00:00:00

  • Bioactive peptides in plant-derived foodstuffs.

    abstract::A literature survey covering the presence of bioactive peptides in plant-derived foodstuffs is presented. Examples are given of plant peptides associated with a beneficial effect on human health. The main bioactive effects of these peptides are defined and their mechanism of action described, when known. Cur...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Review

    doi:10.1016/j.jprot.2016.03.048

    authors: Maestri E,Marmiroli M,Marmiroli N

    update_date:2016-09-16 00:00:00

  • Proteomic analysis of the abomasal mucosal response following infection by the nematode, Haemonchus contortus, in genetically resistant and susceptible sheep.

    abstract::Sheep have a variable ability to resist gastrointestinal nematode infection, but the key factors mediating this response are poorly defined. Here we report the first large-scale application of quantitative proteomic technologies to define proteins that are differentially abundant between sheep selectively bred to have...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2012.01.016

    authors: Nagaraj SH,Harsha HC,Reverter A,Colgrave ML,Sharma R,Andronicos N,Hunt P,Menzies M,Lees MS,Sekhar NR,Pandey A,Ingham A

    update_date:2012-04-03 00:00:00

  • Functional characterization of RNA fragments using high-throughput interactome screening.

    abstract::Populations of small eukaryotic RNAs, in addition to relatively well recognized molecules such as miRNAs or siRNAs, also contain fragments derived from all classes of constitutively expressed non-coding RNAs. It has been recently demonstrated that the formation and accumulation of RNA fragments (RFs) is cell-/tissue-s...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2018.10.007

    authors: Jackowiak P,Lis A,Luczak M,Stolarek I,Figlerowicz M

    update_date:2019-02-20 00:00:00

  • Serological autoantibody profiling of type 1 diabetes by protein arrays.

    abstract::The need for biomarkers that illuminate the pathophysiology of type 1 diabetes (T1D), enhance early diagnosis and provide additional avenues for therapeutic intervention is well recognized in the scientific community. We conducted a proteome-scale, two-stage serological AAb screening followed by an independe...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Multicenter Study, Randomized Controlled Trial

    doi:10.1016/j.jprot.2013.10.018

    authors: Miersch S,Bian X,Wallstrom G,Sibani S,Logvinenko T,Wasserfall CH,Schatz D,Atkinson M,Qiu J,LaBaer J

    update_date:2013-12-06 00:00:00

  • Integration of quantitative proteomics and metabolomics reveals tissue hypoxia mechanisms in an ischemic-hypoxic rat model.

    abstract::Tissues hypoxia caused by hemorrhage is a common complication in many clinical diseases. However, its pathological mechanism remains largely unknown. To partly address this issue, an ischemic-hypoxic rat model was established and the plasma proteomic and metabolic profiles were quantified and analyzed using TMT-based ...

    journal_title:Journal of proteomics

    pub_type: Journal Article

    doi:10.1016/j.jprot.2020.103924

    authors: He R,Kong Y,Fang P,Li L,Shi H,Liu Z

    update_date:2020-09-30 00:00:00

  • Proteome studies of bacterial antibiotic resistance mechanisms.

    abstract::Ever since antibiotics were used to help humanity battle infectious diseases, microorganisms straight away fought back. Antibiotic resistance mechanisms indeed provide microbes with possibilities to by-pass and survive the action of antibiotic drugs. Several methods have been employed to identify these microbial resis...

    journal_title:Journal of proteomics

    pub_type: Journal Article, Review

    doi:10.1016/j.jprot.2013.10.027

    authors: Vranakis I,Goniotakis I,Psaroulaki A,Sandalakis V,Tselentis Y,Gevaert K,Tsiotis G

    update_date:2014-01-31 00:00:00