PeptidoformViz is a Shiny app for processing, visualising and analysing mass spectrometry-based intensity data at the peptidoform level
Nina Demeulemeester, Lien Provez, Laura Corveleyn, Bart Van Puyvelde, Lennart Martens, Maarten Dhaenens, Lieven Clement
Ghent University - VIB
Abstract
Many histones are known to carry a plethora of post-translational modifications (PTMs) (1). Changes in these histone PTMs (hPTMs) have been linked to a variety of diseases (2). Nonetheless, the mapping of hPTMs with mass spectrometry (MS)-based proteomics has not been extensively described. Therefore, we have made the first untargeted map of the histone code for 21 T-cell acute lymphoblastic leukaemia cell lines. Data sharing and reuse are important to facilitate the transfer of knowledge, but are not trivial because of the complexity of the histone code (3). We have pre-processed the raw data into a single platform-independent Progenesis QIP project that is fully accessible, so that other researchers can use it as a browsable template (4). Because data visualisation and pre-processing are important first steps in any data analysis, but are often challenging for biologists without programming experience, we have developed an accompanying Shiny app in which users can intuitively visualise the different peptidoforms present in the dataset. The app facilitates importing, pre-processing and visualising peptide-level intensity data, and will be updated to include statistical analysis in future versions. Our tool, PeptidoformViz, is an interactive web app developed with the Shiny R package (https://shiny.rstudio.com/, version 1.7.1) and can be downloaded from https://github.com/statOmics/PeptidoformViz. An installation of R and RStudio is required (5, 6). The app currently consists of three main panels: data import, pre-processing and visualisation. As input, the user can upload the histone map described here, or their own data consisting of peptidoforms and their intensity values in one or more samples. In the pre-processing panel, the user can opt for several pre-processing steps: logarithmic transformation, normalisation and filtering based on missing intensity values.
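The three pre-processing steps named above can be illustrated with a minimal sketch. This is not the app's R code: the peptidoform names, the toy intensities, the 50% missingness threshold, the choice of median centring as the normalisation, and the order of the steps are all assumptions made purely for illustration.

```python
import math
from statistics import median

# Hypothetical toy data: peptidoform -> raw intensity per sample (None = missing).
intensities = {
    "H3_K9me3": [1024.0, 2048.0, 512.0],
    "H3_K27ac": [256.0, None, 128.0],
    "H4_K16ac": [None, None, 64.0],
}

def log2_transform(table):
    """Log2-transform intensities; missing or non-positive values stay missing."""
    return {
        pep: [math.log2(v) if v is not None and v > 0 else None for v in row]
        for pep, row in table.items()
    }

def filter_missing(table, max_missing_fraction=0.5):
    """Keep peptidoforms whose fraction of missing values is at most the threshold."""
    kept = {}
    for pep, row in table.items():
        n_missing = sum(v is None for v in row)
        if n_missing / len(row) <= max_missing_fraction:
            kept[pep] = row
    return kept

def median_center(table):
    """Subtract each sample's median log-intensity, computed over observed values."""
    if not table:
        return {}
    n_samples = len(next(iter(table.values())))
    centered = {pep: list(row) for pep, row in table.items()}
    for j in range(n_samples):
        observed = [row[j] for row in table.values() if row[j] is not None]
        if not observed:
            continue  # nothing observed in this sample, nothing to centre
        shift = median(observed)
        for row in centered.values():
            if row[j] is not None:
                row[j] -= shift
    return centered

# Assumed order: transform, then filter, then normalise.
processed = median_center(filter_missing(log2_transform(intensities)))
print(processed)
```

In this toy example the third peptidoform is dropped because two of its three values are missing, and the remaining log2 intensities are expressed relative to each sample's median.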
A density plot and a boxplot of the data are generated to show the user the impact of the pre-processing steps. The visualisation panel consists of a protein selection and a normalisation step, a data table, a line plot and a boxplot. Here, only the peptidoforms of a single protein, selected by the user, are shown. Users again have the option to normalise the peptidoform intensities, this time against the average or median peptidoform intensity of the selected protein in each sample. Hence, the user can choose to visualise absolute peptidoform abundances or relative abundances, also referred to as peptidoform usage. The user can also download the processed data as well as the line plots and boxplots generated in the app. Currently, we are expanding the app with formal statistical analyses, i.e. differential abundance (DA) and differential usage analysis at the peptidoform and PTM level. The statOmics group has previously published msqrob2, an R/Bioconductor package for DA at the peptide or protein level (7) (https://www.bioconductor.org/packages/release/bioc/html/msqrob2.html), and we show that it can be used to perform DA at the peptidoform and PTM level. The option to normalise against the average or median peptidoform intensity of the protein additionally enables the prioritisation of peptidoforms that are differentially used, which provides additional biological insight.
1. T. M. Maile et al., Mol. Cell. Proteomics 14, 1148–1158 (2015).
2. K. Helin, D. Dhanak, Nature 502, 480–488 (2013).
3. K. A. Janssen, S. Sidoli, B. A. Garcia, Methods Enzymol. 586, 359–378 (2017).
4. L. Provez et al., bioRxiv, in press, doi:10.1101/2022.05.05.490796.
5. R Core Team, R: A language and environment for statistical computing (2017) (available at https://www.r-project.org/).
6. RStudio Team (2021) (available at http://www.rstudio.com/).
7. L. J. E. Goeminne, K. Gevaert, L. Clement, Mol. Cell. Proteomics 15, 657–668 (2016).