Abstract:
BACKGROUND: Tandem mass spectrometry allows biologists to identify and quantify protein samples in the form of digested peptide sequences. In peptide identification, spectral library search is more sensitive than traditional database search but is limited to peptides that have been previously identified. An accurate tandem mass spectrum prediction tool is thus crucial for expanding the peptide space and increasing the coverage of spectral library search. RESULTS: We propose MS2CNN, a non-linear regression model based on deep convolutional neural networks. The features for our model are amino acid composition, predicted secondary structure, and physicochemical features such as isoelectric point, aromaticity, helicity, hydrophobicity, and basicity. MS2CNN was trained with five-fold cross validation on a three-way data split of the large-scale human HCD MS2 dataset of Orbitrap LC-MS/MS downloaded from the National Institute of Standards and Technology. It was then evaluated on a publicly available independent test dataset of human HeLa cell lysate from LC-MS experiments. On average, our model shows better cosine similarity and Pearson correlation coefficient (0.690 and 0.632) than MS2PIP (0.647 and 0.601) and is comparable with pDeep (0.692 and 0.642). Notably, for the more complex MS2 spectra of 3+ peptides, MS2CNN is significantly better than both MS2PIP and pDeep. CONCLUSIONS: We showed that MS2CNN outperforms MS2PIP for 2+ and 3+ peptides and pDeep for 3+ peptides. This implies that MS2CNN, the proposed convolutional neural network model, generates highly accurate MS2 spectra for LC-MS/MS experiments using Orbitrap machines, which can be of great help in protein and peptide identification. The results suggest that incorporating more data into the deep learning model may improve performance.
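The two evaluation metrics named in the abstract, cosine similarity and Pearson correlation coefficient, can be sketched as follows. This is a minimal illustration only: the abstract does not say how fragment-ion intensities are arranged into vectors, so the function names and the example intensity values below are assumptions, not the authors' code or data.

```python
import math


def cosine_similarity(predicted, observed):
    """Cosine of the angle between two equal-length intensity vectors."""
    if len(predicted) != len(observed):
        raise ValueError("vectors must have the same length")
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0 or norm_o == 0:
        return 0.0  # convention chosen here for an all-zero spectrum
    return dot / (norm_p * norm_o)


def pearson_correlation(predicted, observed):
    """Pearson correlation coefficient of two equal-length intensity vectors."""
    if len(predicted) != len(observed):
        raise ValueError("vectors must have the same length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        return 0.0  # convention chosen here for a constant vector
    return cov / math.sqrt(var_p * var_o)


# Hypothetical relative intensities for the same set of fragment ions.
predicted_spectrum = [0.10, 0.80, 0.35, 0.00, 0.55]
observed_spectrum = [0.15, 0.70, 0.40, 0.05, 0.50]

cos_score = cosine_similarity(predicted_spectrum, observed_spectrum)
pcc_score = pearson_correlation(predicted_spectrum, observed_spectrum)
print(f"cosine similarity: {cos_score:.3f}, Pearson correlation: {pcc_score:.3f}")
```

Cosine similarity depends only on the direction of the intensity vectors, while the Pearson coefficient also subtracts each vector's mean first, which is why the two scores reported for a given model differ.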
journal_name: BMC Genomics
journal_title: BMC genomics
authors: Lin YM, Chen CT, Chang JM
doi: 10.1186/s12864-019-6297-6
subject: Has Abstract
pub_date: 2019-12-24 00:00:00
pages: 906
issue: Suppl 9
issn: 1471-2164
pii: 10.1186/s12864-019-6297-6
journal_volume: 20
pub_type: Journal article
Related literature
BMC Genomics literature collection
abstract:BACKGROUND:Matrix attachment regions (MAR) are the sites on genomic DNA that interact with the nuclear matrix. There is increasing evidence for the involvement of MAR in regulation of gene expression. The unsuitability of experimental detection of MAR for genome-wide analyses has led to the development of computational...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-8-418
update_date:2007-11-15 00:00:00
abstract:BACKGROUND:It is generally agreed that horizontal gene transfer (HGT) is common in phagotrophic protists. However, the overall scale of HGT and the cumulative impact of acquired genes on the evolution of these organisms remain largely unknown. RESULTS:Choanoflagellates are phagotrophs and the closest living relatives ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-729
update_date:2013-10-25 00:00:00
abstract:BACKGROUND:Reading disability (RD) is a common syndrome with a large genetic component. Chromosome 6 has been identified in several linkage studies as playing a significant role. A more recent study identified a peak of transmission disequilibrium to marker JA04 (G72384) on chromosome 6p22.3, suggesting that a gene is ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-4-25
update_date:2003-06-30 00:00:00
abstract:BACKGROUND:Adjuvant Radiotherapy (RT) after surgical removal of tumors proved beneficial in long-term tumor control and treatment planning. For many years, it has been well concluded that radio-sensitivities of tumors upon radiotherapy decrease according to the sizes of tumors and RT models based on Poisson statistics ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-S2-S9
update_date:2008-09-16 00:00:00
abstract:BACKGROUND:It is becoming apparent that perhaps as much as half of the genome of the human blood fluke Schistosoma mansoni is constituted of mobile genetic element-related sequences. Non-long terminal repeat (LTR) retrotransposons, related to the LINE elements of mammals, comprise much of this repetitive component of t...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-6-154
update_date:2005-11-04 00:00:00
abstract:BACKGROUND:Parasites employ proteases to evade host immune systems, feed and replicate and are often the target of anti-parasite strategies to disrupt these interactions. Myxozoans are obligate cnidarian parasites, alternating between invertebrate and fish hosts. Their genes are highly divergent from other metazoans, a...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-6705-y
update_date:2020-06-16 00:00:00
abstract:BACKGROUND:Pyrenophora tritici-repentis (Ptr) is a necrotrophic fungal pathogen that causes the major wheat disease, tan spot. We set out to provide essential genomics-based resources in order to better understand the pathogenicity mechanisms of this important pathogen. RESULTS:Here, we present eight new Ptr isolate g...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4680-3
update_date:2018-04-23 00:00:00
abstract:BACKGROUND:Although an adverse early-life environment has been linked to an increased risk of developing the metabolic syndrome, the molecular mechanisms underlying altered disease susceptibility as well as their relevance to humans are largely unknown. Importantly, emerging evidence suggests that these effects operate...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-509
update_date:2011-10-16 00:00:00
abstract:BACKGROUND:This paper presents a retrospective statistical study on the newly-released data set by the Stanley Neuropathology Consortium on gene expression in bipolar disorder and schizophrenia. This data set contains gene expression data as well as limited demographic and clinical data for each subject. Previous studi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-531
update_date:2008-11-07 00:00:00
abstract:BACKGROUND:Fish scales are an important reservoir of calcium and phosphorus and together with the skin function as an integrated barrier against environmental changes and external aggressors. Histological studies have revealed that the skin and scales regenerate rapidly in fish when they are lost or damaged. In the pre...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-490
update_date:2011-10-07 00:00:00
abstract:BACKGROUND:The Bacillus cereus sensu lato group currently includes seven species (B. cereus, B. anthracis, B. mycoides, B. pseudomycoides, B. thuringiensis, B. weihenstephanensis and B. cytotoxicus) that recent phylogenetic and phylogenomic analyses suggest are likely a single species, despite their varied phenotypes. ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-564
update_date:2012-10-22 00:00:00
abstract:BACKGROUND:Acinetobacter baumannii is a major health problem. The most common infection caused by A. baumannii is hospital acquired pneumonia, and the associated mortality rate is approximately 50%. Neither in vivo nor ex vivo expression profiling has been performed at the proteomic or transcriptomic level for pneumoni...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1608-z
update_date:2015-05-30 00:00:00
abstract:BACKGROUND:The globe artichoke (Cynara cardunculus var. scolymus L.) is a significant crop in the Mediterranean basin. Despite its commercial importance and its both dietary and pharmaceutical value, knowledge of its genetics and genomics remains scant. Microsatellite markers have become a key tool in genetic and genom...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-454
update_date:2009-09-28 00:00:00
abstract:BACKGROUND:Prokaryotic translation initiation involves the proper docking, anchoring, and accommodation of mRNA to the 30S ribosomal subunit. Three initiation factors (IF1, IF2, and IF3) and some ribosomal proteins mediate the assembly and activation of the translation initiation complex. Although the interaction betwe...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-1808-6
update_date:2015-08-14 00:00:00
abstract:BACKGROUND:Copy number variation (CNV) is a major component of genomic variation, yet methods to accurately type genomic CNV lag behind methods that type single nucleotide variation. High-throughput sequencing can contribute to these methods by using sequence read depth, which takes the number of reads that map to a gi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-2123-y
update_date:2015-11-02 00:00:00
abstract:BACKGROUND:Finger millet (Eleusine coracana (L.) Gaertn.) is an important staple food crop widely grown in Africa and South Asia. Among the millets, finger millet has high amount of calcium, methionine, tryptophan, fiber, and sulphur containing amino acids. In addition, it has C4 photosynthetic carbon assimilation mech...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-017-3850-z
update_date:2017-06-15 00:00:00
abstract:BACKGROUND:Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measu...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-S6-S14
update_date:2012-01-01 00:00:00
abstract:BACKGROUND:Marek's disease (MD) is a lymphoproliferative disease in chickens caused by Marek's disease virus (MDV) and characterized by T cell lymphoma and infiltration of lymphoid cells into various organs such as liver, spleen, peripheral nerves and muscle. Resistance to MD and disease risk have long been thought to ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-12-501
update_date:2011-10-12 00:00:00
abstract:BACKGROUND:The Spemann/Mangold organizer is a transient tissue critical for patterning the gastrula stage vertebrate embryo and formation of the three germ layers. Despite its important role during development, there are still relatively few genes with specific expression in the organizer and its derivatives. Foxa2 is ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-9-511
update_date:2008-10-30 00:00:00
abstract:BACKGROUND:Detoxification in the liver involves activation of nuclear receptors, such as the constitutive androstane receptor (CAR), which regulate downstream genes of xenobiotic metabolism. Frequently, the metabolism of endobiotics is also modulated, resulting in potentially harmful effects. We therefore used 1,4-Bis ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-10-384
update_date:2009-08-19 00:00:00
abstract:BACKGROUND:Spiders (Order Araneae) are essential predators in every terrestrial ecosystem largely because they have evolved potent arsenals of silk and venom. Spider silks are high performance materials made almost entirely of proteins, and thus represent an ideal system for investigating genome level evolution of nove...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-15-365
update_date:2014-05-23 00:00:00
abstract:BACKGROUND:Though genomic-level data are becoming widely available, many of the metazoan species sequenced are laboratory systems whose natural history is not well documented. In contrast, the wide array of species with very well-characterized natural history have, until recently, lacked genomics tools. It is now possi...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-305
update_date:2012-07-09 00:00:00
abstract:BACKGROUND:Our previous study had proved that nigericin could reduce colorectal cancer cell proliferation in dose- and time-dependent manners by targeting Wnt/β-catenin signaling. To better elucidate its potential anti-cancer mechanism, two pancreatic cancer (PC) cell lines were exposed to increasing concentrations of ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-019-6032-3
update_date:2019-09-18 00:00:00
abstract:BACKGROUND:Chinese bayberry (Myrica rubra Sieb. and Zucc.) is an important subtropical fruit crop and an ideal species for fruit quality research due to the rapid and substantial changes that occur during development and ripening, including changes in fruit color and taste. However, research at the molecular level is l...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-19
update_date:2012-01-13 00:00:00
abstract:BACKGROUND:Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-020-06830-5
update_date:2020-07-20 00:00:00
abstract:BACKGROUND:Whole-exome sequencing (WES) is rapidly evolving into a tool of choice for rapid, and inexpensive identification of molecular genetic lesions within targeted regions of the human genome. While biases in WES coverage of nucleotides in targeted regions are recognized, it is not well understood how repetition o...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-015-2107-y
update_date:2015-11-25 00:00:00
abstract:BACKGROUND:The small airway epithelium (SAE), the cell population that covers the human airway surface from the 6th generation of airway branching to the alveoli, is the major site of lung disease caused by smoking. The focus of this study is to provide quantitative assessment of the SAE transcriptome in the resting st...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-13-82
update_date:2012-02-29 00:00:00
abstract:BACKGROUND:Streptococcus infantarius subsp. infantarius (Sii) belongs to the Streptococcus bovis/Streptococcus equinus complex associated with several human and animal infections. Sii is a predominant bacterium in spontaneously fermented milk products in Africa. The genome sequence of Sii strain CJ18 was compared with ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-14-200
update_date:2013-03-22 00:00:00
abstract:BACKGROUND:Cryptocaryon irritans is an obligate parasitic ciliate protozoan that can infect various commercially important mariculture fish species and cause high lethality and economic loss. Current methods of controlling this parasite with chemicals or antibiotics are widely considered to be environmentally harmful. ...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/s12864-018-4565-5
update_date:2018-03-12 00:00:00
abstract:BACKGROUND:Dictyostelium discoideum is frequently subjected to environmental changes in its natural habitat, the forest soil. In order to survive, the organism had to develop effective mechanisms to sense and respond to such changes. When cells are faced with a hypertonic environment a complex response is triggered. It...
journal_title:BMC genomics
pub_type: Journal article
doi:10.1186/1471-2164-8-123
update_date:2007-05-21 00:00:00