The objective of this paper is to contribute to the methodology available for extracting and analyzing signal content from protein mass spectrometry data. Data from Matrix-Assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) or Surface-Enhanced Laser Desorption and Ionization Time-of-Flight (SELDI-TOF) spectra require considerable signal pre-processing, such as noise removal and baseline correction. After removing the noise with an invariant wavelet transform, we develop a background correction method based on penalized spline quantile regression and apply it to MALDI-TOF spectra. The results show that the wavelet transform technique combined with nonparametric quantile regression can handle all kinds of background as well as spectra with a low signal-to-background ratio; it requires no prior knowledge of the spectral composition, no selection of suitable background correction points, and no mathematical assumption about the background distribution. We further present a novel multi-scale spectra alignment methodology, which is useful within a functional analysis of variance method for identifying proteins that are differentially expressed between different tissue types. Our approaches are compared with several existing approaches from the literature and are tested on simulated and real data. The results indicate that the proposed schemes enable accurate diagnosis, with high sensitivity, based on the over-expression of a small number of identified proteins.
Keywords: Curve estimation, Wavelets, Regression quantiles, Robust point-matching, P-splines smoothing, Functional analysis of variance