Advanced Multivariable Statistical Analysis Interactive Tool for Handling
Missing Data and Confounding Covariates for Label-free LC-MS Proteomics
Experiments

Sudhir Srivastava; Michael L. Merchant; Craig J. McClain; Anil Rai; Krishna K. Chaturvedi; Ulavappa B. Angadi; Dwijesh C. Mishra; Shesh N. Rai

Abstract

Background: Detecting significant features (proteins or peptides) in LC-MS proteomics studies with multivariable regression analyses requires careful consideration. Missing values in proteomics data can arise from random errors, bad samples, or features that fall below the detection limit in specific samples, among other causes. Expression data are also prone to heterogeneity from technical or biological sources. Both missing values and heterogeneity can confound important findings. In addition, these studies often carry pre-clinical and clinical information (e.g., sex, exposure) that can be used to supplement the inference.

Methods: We introduce SATP (Statistical Analysis interactive Tool for label-free LC-MS Proteomics experiments), a user-friendly web application for differential expression analysis of proteomics data that scales to large clinical proteomic studies. The tool provides appropriate normalization and imputation methods, along with several statistical tests: the t-test, moderated t-test, linear fixed effect model, and linear mixed model with adjustment for the effects of additional covariates.
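To make the idea of covariate adjustment concrete, the following is a minimal sketch of a linear fixed effect model fitted to a single feature. It is not SATP's implementation: the function names (`invert`, `adjusted_group_effect`), the toy data, and the choice to drop samples with missing intensities (complete-case analysis, rather than the imputation the tool offers) are all assumptions made for illustration.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def adjusted_group_effect(log_intensity, group, covariate):
    """Fit y = b0 + b1*group + b2*covariate by ordinary least squares.

    Samples whose intensity is None are dropped (an illustrative
    complete-case choice). Returns (b1, t statistic for b1, residual df).
    """
    rows = [(y, g, c) for y, g, c in zip(log_intensity, group, covariate)
            if y is not None]
    X = [[1.0, float(g), float(c)] for _, g, c in rows]
    y = [yv for yv, _, _ in rows]
    n, p = len(X), 3
    if n <= p:
        raise ValueError("not enough observed samples to fit the model")
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yv for r, yv in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yv - sum(b * x for b, x in zip(beta, r)) for r, yv in zip(X, y)]
    df = n - p
    sigma2 = sum(e * e for e in resid) / df      # residual variance
    se_group = math.sqrt(sigma2 * inv[1][1])     # standard error of b1
    return beta[1], beta[1] / se_group, df


# Toy data (hypothetical): true group effect 2.0, covariate slope 0.5.
group = [0, 0, 0, 0, 1, 1, 1, 1]
covariate = [1, 2, 3, 4, 2, 3, 4, 5]
noise = [0.1, -0.1, 0.05, -0.05, -0.1, 0.1, -0.05, 0.05]
log_intensity = [10 + 2 * g + 0.5 * c + e
                 for g, c, e in zip(group, covariate, noise)]
log_intensity[3] = None  # one missing value

effect, t_stat, df = adjusted_group_effect(log_intensity, group, covariate)
print(f"adjusted group effect = {effect:.3f}, t = {t_stat:.2f}, df = {df}")
```

In a real analysis the same fit would be repeated for every protein or peptide, the t statistics converted to p-values, and the p-values corrected for multiple testing; those steps are omitted here to keep the sketch short.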

Results: Our tool has several advantages over existing tools, including an extension to multiple factor comparisons after adjusting for covariates.

Conclusion: SATP is a comprehensive tool for the analysis of complex experiments with multiple covariates, whereas most existing tools were developed for simple experiments, mostly comparing two groups without covariates.

Availability: The tool is freely accessible at https://ulbbf.shinyapps.io/satp/.

Keywords: ANOVA, ANCOVA, fixed effect model, linear mixed model, GLM, heterogeneity, imputation techniques, peptides, proteins, normalization techniques.

Cite As

Current Bioinformatics
