Degenerate peptides are peptides shared by multiple proteins, so detecting one implicates every protein that contains it (Huang et al.).
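As a minimal illustration of this definition, the sketch below flags the degenerate peptides in a toy peptide-to-protein mapping. The peptide strings, the protein names, and the helper `find_degenerate_peptides` are hypothetical and serve only as an example; they are not taken from any of the cited tools.

```python
from typing import Dict, Set


def find_degenerate_peptides(
    peptide_to_proteins: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return only the peptides that map to more than one protein."""
    return {
        peptide: proteins
        for peptide, proteins in peptide_to_proteins.items()
        if len(proteins) > 1
    }


# Hypothetical toy mapping (illustrative names, not real data).
peptide_to_proteins = {
    "PEPTIDEA": {"ProteinX"},
    "PEPTIDEB": {"ProteinX", "ProteinY"},
    "PEPTIDEC": {"ProteinY", "ProteinZ"},
    "PEPTIDED": {"ProteinZ"},
}

degenerate = find_degenerate_peptides(peptide_to_proteins)
for peptide in sorted(degenerate):
    # Each degenerate peptide is listed with every protein that shares it.
    print(peptide, sorted(degenerate[peptide]))
```

In this toy mapping, two of the four peptides are shared, so observing either of them cannot by itself distinguish between the proteins that contain it.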
The goal of “Structural Genomics” is the determination of the structure of all human proteins. Structural models of proteins help to increase our understanding of molecular interactions in cells. Therefore, the availability of high-resolution models for all human proteins will significantly increase our knowledge of cellular communication at the molecular level. Furthermore, protein 3D structures allow the functional annotation of protein-protein, DNA-protein, and RNA-protein interactions, as well as the interpretation of mutations, the development of small-molecule binders, guided enzyme design, and the prediction of protein-protein interactions, among others.
High-resolution mass spectrometry (MS) instruments have enabled chemical cross-linking methods, which are now used as high-throughput approaches for obtaining structural information on protein-protein, DNA-protein, and RNA-protein interactions. For example, MS-based proteomics approaches are heavily used to decipher protein-protein interaction networks and to study their interaction dynamics.
However, because degenerate peptides cannot be attributed to a single protein, their presence in mass spectrometry-generated proteomics data can make the quantitative interpretation of sizeable proteomic data sets difficult. Therefore, multiple approaches have been developed to address the issue of degenerate peptides.
The protein inference tool “ProteinProphet” retains, for each peptide, only the mass spectra associated with the highest peptide-spectrum match (PSM) scores. The scores of the remaining peptide mass spectra are then calculated as an approximation of their probabilities (Nesvizhskii et al.; Serang and Noble). “Scaffold”, on the other hand, first calculates protein scores using only peptides that are not degenerate. Next, each degenerate peptide is assigned to the highest-scoring protein among all the proteins that share it (Searle).
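The two strategies described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual implementation of ProteinProphet or Scaffold: the function names are hypothetical, the PSM scores are made-up values, and summing unique-peptide scores to obtain a protein score is an assumption made only to keep the example short.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def best_psm_per_peptide(psms: List[Tuple[str, float]]) -> Dict[str, float]:
    """Keep only the highest-scoring PSM for each peptide."""
    best: Dict[str, float] = {}
    for peptide, score in psms:
        if peptide not in best or score > best[peptide]:
            best[peptide] = score
    return best


def assign_degenerate_peptides(
    peptide_scores: Dict[str, float],
    peptide_to_proteins: Dict[str, Set[str]],
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Score proteins on unique peptides, then assign each shared peptide."""
    protein_scores: Dict[str, float] = defaultdict(float)

    # Pass 1: score proteins using non-degenerate peptides only.
    # Assumption: a protein score is the sum of its unique-peptide scores.
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) == 1 and peptide in peptide_scores:
            protein_scores[next(iter(proteins))] += peptide_scores[peptide]

    # Pass 2: give each degenerate peptide to its highest-scoring protein.
    # Ties are broken alphabetically so the result is deterministic.
    assignment: Dict[str, str] = {}
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) > 1 and peptide in peptide_scores:
            assignment[peptide] = max(
                sorted(proteins), key=lambda p: protein_scores.get(p, 0.0)
            )
    return dict(protein_scores), assignment


# Hypothetical PSMs as (peptide, score) pairs; values are illustrative only.
psms = [("PEPTIDEA", 0.90), ("PEPTIDEA", 0.60),
        ("PEPTIDEB", 0.80), ("PEPTIDED", 0.50)]
peptide_to_proteins = {
    "PEPTIDEA": {"ProteinX"},
    "PEPTIDEB": {"ProteinX", "ProteinY"},  # degenerate peptide
    "PEPTIDED": {"ProteinY"},
}

best = best_psm_per_peptide(psms)
protein_scores, assignment = assign_degenerate_peptides(best, peptide_to_proteins)
print(protein_scores)
print(assignment)
```

In this toy example the shared peptide goes to the protein with the stronger unique-peptide evidence. A real tool would use a proper statistical model rather than a plain sum of scores.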
Huang, T., Wang, J., Yu, W. & He, Z. Protein inference: a review. Briefings in Bioinformatics 13, 586-614 (2012).
Nesvizhskii, A.I., Keller, A., Kolker, E. & Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Analytical Chemistry 75, 4646-4658 (2003).
Searle, B.C. Scaffold: a bioinformatic tool for validating MS/MS-based proteomic studies. Proteomics 10, 1265-1269 (2010).
Serang, O. & Noble, W. A review of statistical methods for protein identification using tandem mass spectrometry. Statistics and Its Interface 5, 3-20 (2012).
Yu, H. et al. High-quality binary protein interaction map of the yeast interactome network. Science 322, 104 (2008).