MS2CNN: predicting MS/MS spectrum based on protein sequence using deep convolutional neural networks.

Abstract:

BACKGROUND: Tandem mass spectrometry allows biologists to identify and quantify protein samples in the form of digested peptide sequences. For peptide identification, spectral library search is more sensitive than traditional database search but is limited to peptides that have been previously identified. An accurate tandem mass spectrum prediction tool is thus crucial for expanding the peptide space and increasing the coverage of spectral library search.

RESULTS: We propose MS2CNN, a non-linear regression model based on deep convolutional neural networks. The features for our model are amino acid composition, predicted secondary structure, and physicochemical features such as isoelectric point, aromaticity, helicity, hydrophobicity, and basicity. MS2CNN was trained with five-fold cross-validation on a three-way data split of the large-scale human HCD MS2 dataset of Orbitrap LC-MS/MS downloaded from the National Institute of Standards and Technology. It was then evaluated on a publicly available independent test dataset of human HeLa cell lysate from LC-MS experiments. On average, our model shows better cosine similarity and Pearson correlation coefficient (0.690 and 0.632) than MS2PIP (0.647 and 0.601) and is comparable with pDeep (0.692 and 0.642). Notably, for the more complex MS2 spectra of 3+ peptides, MS2CNN is significantly better than both MS2PIP and pDeep.

CONCLUSIONS: We showed that MS2CNN outperforms MS2PIP for 2+ and 3+ peptides and pDeep for 3+ peptides. This implies that MS2CNN, the proposed convolutional neural network model, generates highly accurate MS2 spectra for LC-MS/MS experiments using Orbitrap machines, which can be of great help in protein and peptide identification. The results suggest that incorporating more data into the deep learning model may improve performance.
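The two evaluation metrics named in the abstract, cosine similarity and Pearson correlation coefficient, can be sketched as follows. This is a minimal illustration and not the authors' evaluation code. It assumes that the predicted and observed spectra have already been aligned into equal-length intensity vectors (one entry per fragment ion). The function names and the choice to return 0.0 for an all-zero or constant vector are assumptions made for this example.

```python
import math


def cosine_similarity(pred, obs):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(pred) != len(obs):
        raise ValueError("spectra must be aligned to the same length")
    dot = sum(p * o for p, o in zip(pred, obs))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_o = math.sqrt(sum(o * o for o in obs))
    if norm_p == 0.0 or norm_o == 0.0:
        # Assumed convention: an all-zero vector has no similarity to anything.
        return 0.0
    return dot / (norm_p * norm_o)


def pearson_correlation(pred, obs):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    if len(pred) != len(obs):
        raise ValueError("spectra must be aligned to the same length")
    mean_p = sum(pred) / len(pred)
    mean_o = sum(obs) / len(obs)
    centered_p = [p - mean_p for p in pred]
    centered_o = [o - mean_o for o in obs]
    return cosine_similarity(centered_p, centered_o)


if __name__ == "__main__":
    # Hypothetical normalized fragment-ion intensities, for illustration only.
    predicted = [0.10, 0.80, 0.30, 0.00, 0.55]
    observed = [0.15, 0.70, 0.35, 0.05, 0.60]
    print("cosine similarity:", cosine_similarity(predicted, observed))
    print("Pearson correlation:", pearson_correlation(predicted, observed))
```

The two metrics differ only in the mean-centering step: cosine similarity rewards overall agreement in peak intensities, while Pearson correlation measures agreement in their variation around the mean.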

journal_name

BMC Genomics

authors

Lin YM,Chen CT,Chang JM

doi

10.1186/s12864-019-6297-6

pub_date

2019-12-24 00:00:00

pages

906

issue

Suppl 9

issn

1471-2164

pii

10.1186/s12864-019-6297-6

journal_volume

20

pub_type

Journal Article
  • Association of the matrix attachment region recognition signature with coding regions in Caenorhabditis elegans.

    abstract:BACKGROUND:Matrix attachment regions (MAR) are the sites on genomic DNA that interact with the nuclear matrix. There is increasing evidence for the involvement of MAR in regulation of gene expression. The unsuitability of experimental detection of MAR for genome-wide analyses has led to the development of computational...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-418

    authors: Anthony A,Blaxter M

    update_date:2007-11-15 00:00:00

  • The scale and evolutionary significance of horizontal gene transfer in the choanoflagellate Monosiga brevicollis.

    abstract:BACKGROUND:It is generally agreed that horizontal gene transfer (HGT) is common in phagotrophic protists. However, the overall scale of HGT and the cumulative impact of acquired genes on the evolution of these organisms remain largely unknown. RESULTS:Choanoflagellates are phagotrophs and the closest living relatives ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-729

    authors: Yue J,Sun G,Hu X,Huang J

    update_date:2013-10-25 00:00:00

  • A transcription map of the 6p22.3 reading disability locus identifying candidate genes.

    abstract:BACKGROUND:Reading disability (RD) is a common syndrome with a large genetic component. Chromosome 6 has been identified in several linkage studies as playing a significant role. A more recent study identified a peak of transmission disequilibrium to marker JA04 (G72384) on chromosome 6p22.3, suggesting that a gene is ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-4-25

    authors: Londin ER,Meng H,Gruen JR

    update_date:2003-06-30 00:00:00

  • Analyzing adjuvant radiotherapy suggests a non monotonic radio-sensitivity over tumor volumes.

    abstract:BACKGROUND:Adjuvant Radiotherapy (RT) after surgical removal of tumors proved beneficial in long-term tumor control and treatment planning. For many years, it has been well concluded that radio-sensitivities of tumors upon radiotherapy decrease according to the sizes of tumors and RT models based on Poisson statistics ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-S2-S9

    authors: Yang JY,Niemierko A,Yang MQ,Deng Y

    update_date:2008-09-16 00:00:00

  • Characterization of SR3 reveals abundance of non-LTR retrotransposons of the RTE clade in the genome of the human blood fluke, Schistosoma mansoni.

    abstract:BACKGROUND:It is becoming apparent that perhaps as much as half of the genome of the human blood fluke Schistosoma mansoni is constituted of mobile genetic element-related sequences. Non-long terminal repeat (LTR) retrotransposons, related to the LINE elements of mammals, comprise much of this repetitive component of t...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-6-154

    authors: Laha T,Kewgrai N,Loukas A,Brindley PJ

    update_date:2005-11-04 00:00:00

  • Transcriptome of Sphaerospora molnari (Cnidaria, Myxosporea) blood stages provides proteolytic arsenal as potential therapeutic targets against sphaerosporosis in common carp.

    abstract:BACKGROUND:Parasites employ proteases to evade host immune systems, feed and replicate and are often the target of anti-parasite strategies to disrupt these interactions. Myxozoans are obligate cnidarian parasites, alternating between invertebrate and fish hosts. Their genes are highly divergent from other metazoans, a...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-6705-y

    authors: Hartigan A,Kosakyan A,Pecková H,Eszterbauer E,Holzer AS

    update_date:2020-06-16 00:00:00

  • Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity.

    abstract:BACKGROUND:Pyrenophora tritici-repentis (Ptr) is a necrotrophic fungal pathogen that causes the major wheat disease, tan spot. We set out to provide essential genomics-based resources in order to better understand the pathogenicity mechanisms of this important pathogen. RESULTS:Here, we present eight new Ptr isolate g...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4680-3

    authors: Moolhuijzen P,See PT,Hane JK,Shi G,Liu Z,Oliver RP,Moffat CS

    update_date:2018-04-23 00:00:00

  • Gene expression profiling in the Cynomolgus macaque Macaca fascicularis shows variation within the normal birth range.

    abstract:BACKGROUND:Although an adverse early-life environment has been linked to an increased risk of developing the metabolic syndrome, the molecular mechanisms underlying altered disease susceptibility as well as their relevance to humans are largely unknown. Importantly, emerging evidence suggests that these effects operate...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-509

    authors: Emerald BS,Chng K,Masuda S,Sloboda DM,Vickers MH,Kambadur R,Gluckman PD

    update_date:2011-10-16 00:00:00

  • Combining gene expression, demographic and clinical data in modeling disease: a case study of bipolar disorder and schizophrenia.

    abstract:BACKGROUND:This paper presents a retrospective statistical study on the newly-released data set by the Stanley Neuropathology Consortium on gene expression in bipolar disorder and schizophrenia. This data set contains gene expression data as well as limited demographic and clinical data for each subject. Previous studi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-531

    authors: Struyf J,Dobrin S,Page D

    update_date:2008-11-07 00:00:00

  • Skin healing and scale regeneration in fed and unfed sea bream, Sparus auratus.

    abstract:BACKGROUND:Fish scales are an important reservoir of calcium and phosphorus and together with the skin function as an integrated barrier against environmental changes and external aggressors. Histological studies have revealed that the skin and scales regenerate rapidly in fish when they are lost or damaged. In the pre...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-490

    authors: Vieira FA,Gregório SF,Ferraresso S,Thorne MA,Costa R,Milan M,Bargelloni L,Clark MS,Canario AV,Power DM

    update_date:2011-10-07 00:00:00

  • Divergence of the SigB regulon and pathogenesis of the Bacillus cereus sensu lato group.

    abstract:BACKGROUND:The Bacillus cereus sensu lato group currently includes seven species (B. cereus, B. anthracis, B. mycoides, B. pseudomycoides, B. thuringiensis, B. weihenstephanensis and B. cytotoxicus) that recent phylogenetic and phylogenomic analyses suggest are likely a single species, despite their varied phenotypes. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-564

    authors: Scott E 2nd,Dyer DW

    update_date:2012-10-22 00:00:00

  • Quantitative proteomic analysis of host--pathogen interactions: a study of Acinetobacter baumannii responses to host airways.

    abstract:BACKGROUND:Acinetobacter baumannii is a major health problem. The most common infection caused by A. baumannii is hospital acquired pneumonia, and the associated mortality rate is approximately 50%. Neither in vivo nor ex vivo expression profiling has been performed at the proteomic or transcriptomic level for pneumoni...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1608-z

    authors: Méndez JA,Mateos J,Beceiro A,Lopez M,Tomás M,Poza M,Bou G

    update_date:2015-05-30 00:00:00

  • Ontology and diversity of transcript-associated microsatellites mined from a globe artichoke EST database.

    abstract:BACKGROUND:The globe artichoke (Cynara cardunculus var. scolymus L.) is a significant crop in the Mediterranean basin. Despite its commercial importance and its both dietary and pharmaceutical value, knowledge of its genetics and genomics remains scant. Microsatellite markers have become a key tool in genetic and genom...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-454

    authors: Scaglione D,Acquadro A,Portis E,Taylor CA,Lanteri S,Knapp SJ

    update_date:2009-09-28 00:00:00

  • Distribution and diversity of ribosome binding sites in prokaryotic genomes.

    abstract:BACKGROUND:Prokaryotic translation initiation involves the proper docking, anchoring, and accommodation of mRNA to the 30S ribosomal subunit. Three initiation factors (IF1, IF2, and IF3) and some ribosomal proteins mediate the assembly and activation of the translation initiation complex. Although the interaction betwe...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-1808-6

    authors: Omotajo D,Tate T,Cho H,Choudhary M

    update_date:2015-08-14 00:00:00

  • Determining multiallelic complex copy number and sequence variation from high coverage exome sequencing data.

    abstract:BACKGROUND:Copy number variation (CNV) is a major component of genomic variation, yet methods to accurately type genomic CNV lag behind methods that type single nucleotide variation. High-throughput sequencing can contribute to these methods by using sequence read depth, which takes the number of reads that map to a gi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2123-y

    authors: Forni D,Martin D,Abujaber R,Sharp AJ,Sironi M,Hollox EJ

    update_date:2015-11-02 00:00:00

  • Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties.

    abstract:BACKGROUND:Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has high amount of calcium, methionine, tryptophan, fiber, and sulphur containing amino acids. In addition, it has C4 photosynthetic carbon assimilation mech...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-017-3850-z

    authors: Hittalmani S,Mahesh HB,Shirke MD,Biradar H,Uday G,Aruna YR,Lohithaswa HC,Mohanrao A

    update_date:2017-06-15 00:00:00

  • Methods for high-throughput MethylCap-Seq data analysis.

    abstract:BACKGROUND:Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measu...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-S6-S14

    authors: Rodriguez BA,Frankhouser D,Murphy M,Trimarchi M,Tam HH,Curfman J,Huang R,Chan MW,Lai HC,Parikh D,Ball B,Schwind S,Blum W,Marcucci G,Yan P,Bundschuh R

    update_date:2012-01-01 00:00:00

  • Temporal transcriptome changes induced by MDV in Marek's disease-resistant and -susceptible inbred chickens.

    abstract:BACKGROUND:Marek's disease (MD) is a lymphoproliferative disease in chickens caused by Marek's disease virus (MDV) and characterized by T cell lymphoma and infiltration of lymphoid cells into various organs such as liver, spleen, peripheral nerves and muscle. Resistance to MD and disease risk have long been thought to ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-12-501

    authors: Yu Y,Luo J,Mitra A,Chang S,Tian F,Zhang H,Yuan P,Zhou H,Song J

    update_date:2011-10-12 00:00:00

  • Microarray analysis of Foxa2 mutant mouse embryos reveals novel gene expression and inductive roles for the gastrula organizer and its derivatives.

    abstract:BACKGROUND:The Spemann/Mangold organizer is a transient tissue critical for patterning the gastrula stage vertebrate embryo and formation of the three germ layers. Despite its important role during development, there are still relatively few genes with specific expression in the organizer and its derivatives. Foxa2 is ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-9-511

    authors: Tamplin OJ,Kinzel D,Cox BJ,Bell CE,Rossant J,Lickert H

    update_date:2008-10-30 00:00:00

  • Effect of CAR activation on selected metabolic pathways in normal and hyperlipidemic mouse livers.

    abstract:BACKGROUND:Detoxification in the liver involves activation of nuclear receptors, such as the constitutive androstane receptor (CAR), which regulate downstream genes of xenobiotic metabolism. Frequently, the metabolism of endobiotics is also modulated, resulting in potentially harmful effects. We therefore used 1,4-Bis ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-10-384

    authors: Rezen T,Tamasi V,Lövgren-Sandblom A,Björkhem I,Meyer UA,Rozman D

    update_date:2009-08-19 00:00:00

  • Multi-tissue transcriptomics of the black widow spider reveals expansions, co-options, and functional processes of the silk gland gene toolkit.

    abstract:BACKGROUND:Spiders (Order Araneae) are essential predators in every terrestrial ecosystem largely because they have evolved potent arsenals of silk and venom. Spider silks are high performance materials made almost entirely of proteins, and thus represent an ideal system for investigating genome level evolution of nove...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-15-365

    authors: Clarke TH,Garb JE,Hayashi CY,Haney RA,Lancaster AK,Corbett S,Ayoub NA

    update_date:2014-05-23 00:00:00

  • De novo transcriptome sequencing in a songbird, the dark-eyed junco (Junco hyemalis): genomic tools for an ecological model system.

    abstract:BACKGROUND:Though genomic-level data are becoming widely available, many of the metazoan species sequenced are laboratory systems whose natural history is not well documented. In contrast, the wide array of species with very well-characterized natural history have, until recently, lacked genomics tools. It is now possi...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-305

    authors: Peterson MP,Whittaker DJ,Ambreth S,Sureshchandra S,Buechlein A,Podicheti R,Choi JH,Lai Z,Mockatis K,Colbourne J,Tang H,Ketterson ED

    update_date:2012-07-09 00:00:00

  • High-throughput sequencing of circRNAs reveals novel insights into mechanisms of nigericin in pancreatic cancer.

    abstract:BACKGROUND:Our previous study had proved that nigericin could reduce colorectal cancer cell proliferation in dose- and time-dependent manners by targeting Wnt/β-catenin signaling. To better elucidate its potential anti-cancer mechanism, two pancreatic cancer (PC) cell lines were exposed to increasing concentrations of ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-019-6032-3

    authors: Xu Z,Shen J,Hua S,Wan D,Chen Q,Han Y,Ren R,Liu F,Du Z,Guo X,Shi J,Zhi Q

    update_date:2019-09-18 00:00:00

  • Transcriptomic analysis of Chinese bayberry (Myrica rubra) fruit development and ripening using RNA-Seq.

    abstract:BACKGROUND:Chinese bayberry (Myrica rubra Sieb. and Zucc.) is an important subtropical fruit crop and an ideal species for fruit quality research due to the rapid and substantial changes that occur during development and ripening, including changes in fruit color and taste. However, research at the molecular level is l...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-19

    authors: Feng C,Chen M,Xu CJ,Bai L,Yin XR,Li X,Allan AC,Ferguson IB,Chen KS

    update_date:2012-01-13 00:00:00

  • A comprehensive survey of integron-associated genes present in metagenomes.

    abstract:BACKGROUND:Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-020-06830-5

    authors: Buongermino Pereira M,Österlund T,Eriksson KM,Backhaus T,Axelson-Fisk M,Kristiansson E

    update_date:2020-07-20 00:00:00

  • Replicate exome-sequencing in a multiple-generation family: improved interpretation of next-generation sequencing data.

    abstract:BACKGROUND:Whole-exome sequencing (WES) is rapidly evolving into a tool of choice for rapid, and inexpensive identification of molecular genetic lesions within targeted regions of the human genome. While biases in WES coverage of nucleotides in targeted regions are recognized, it is not well understood how repetition o...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-015-2107-y

    authors: Cherukuri PF,Maduro V,Fuentes-Fajardo KV,Lam K,NISC Comparative Sequencing Program.,Adams DR,Tifft CJ,Mullikin JC,Gahl WA,Boerkoel CF

    update_date:2015-11-25 00:00:00

  • RNA-Seq quantification of the human small airway epithelium transcriptome.

    abstract:BACKGROUND:The small airway epithelium (SAE), the cell population that covers the human airway surface from the 6th generation of airway branching to the alveoli, is the major site of lung disease caused by smoking. The focus of this study is to provide quantitative assessment of the SAE transcriptome in the resting st...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-13-82

    authors: Hackett NR,Butler MW,Shaykhiev R,Salit J,Omberg L,Rodriguez-Flores JL,Mezey JG,Strulovici-Barel Y,Wang G,Didon L,Crystal RG

    update_date:2012-02-29 00:00:00

  • Comparative genome analysis of Streptococcus infantarius subsp. infantarius CJ18, an African fermented camel milk isolate with adaptations to dairy environment.

    abstract:BACKGROUND:Streptococcus infantarius subsp. infantarius (Sii) belongs to the Streptococcus bovis/Streptococcus equinus complex associated with several human and animal infections. Sii is a predominant bacterium in spontaneously fermented milk products in Africa. The genome sequence of Sii strain CJ18 was compared with ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-14-200

    authors: Jans C,Follador R,Hochstrasser M,Lacroix C,Meile L,Stevens MJ

    update_date:2013-03-22 00:00:00

  • Molecular mechanisms of an antimicrobial peptide piscidin (Lc-pis) in a parasitic protozoan, Cryptocaryon irritans.

    abstract:BACKGROUND:Cryptocaryon irritans is an obligate parasitic ciliate protozoan that can infect various commercially important mariculture fish species and cause high lethality and economic loss. Current methods of controlling this parasite with chemicals or antibiotics are widely considered to be environmentally harmful. ...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/s12864-018-4565-5

    authors: Chen R,Mao Y,Wang J,Liu M,Qiao Y,Zheng L,Su Y,Ke Q,Zheng W

    update_date:2018-03-12 00:00:00

  • STATc is a key regulator of the transcriptional response to hyperosmotic shock.

    abstract:BACKGROUND:Dictyostelium discoideum is frequently subjected to environmental changes in its natural habitat, the forest soil. In order to survive, the organism had to develop effective mechanisms to sense and respond to such changes. When cells are faced with a hypertonic environment a complex response is triggered. It...

    journal_title:BMC genomics

    pub_type: Journal Article

    doi:10.1186/1471-2164-8-123

    authors: Na J,Tunggal B,Eichinger L

    update_date:2007-05-21 00:00:00