As mass spectrometry proteomics has matured over the past few years, a growing emphasis has been placed on quality control (QC), which is becoming crucial for validating experimental results. Mass spectrometry is a highly complex technique, and because its results can be subject to significant variability, suitable QC is necessary to model the influence of this variability on experimental results. Nevertheless, extensive QC procedures are currently lacking, both because QC information is not provided alongside the experimental data and because this complex information is difficult to interpret.
For mass spectrometry proteomics to mature further, a systematic approach to quality control is essential. To this end, we will first provide the technical infrastructure to generate QC metrics as an integral element of a mass spectrometry experiment. We will develop the qcML standard file format for mass spectrometry QC data, and we will establish procedures to include detailed QC data alongside all data submissions to PRIDE, a leading public repository for proteomics data. Second, we will use this newly generated wealth of QC data to develop advanced machine learning techniques that uncover novel knowledge about the performance of a mass spectrometry experiment. This will make it possible to improve the experimental set-up, optimize the spectral acquisition, and increase confidence in the generated results, substantially empowering biological mass spectrometry.