Ubiquitination is involved in many cellular processes, including protein degradation and stability, cell cycle progression, transcriptional regulation, antigen processing, DNA repair, inflammation, and regulation of apoptosis. In silico prediction of candidate lysine (K) residues for ubiquitination not only saves time and money but also generates valuable data for further research. We developed Ubipredictor (http://chemdp.com/ubipredictor.php), a tool that uses linear discriminant analysis (LDA) to predict potentially ubiquitinated lysines in human, mouse, and yeast protein sequences. The statistically significant features selected through LDA were amino acid dimers, the position-specific scoring matrix (PSSM), and physicochemical properties of amino acids such as electrostatic charge, heat capacity, codon diversity, and secondary structure. Testing on datasets from three model organisms (human, mouse, and yeast) showed that Ubipredictor had better predictive performance than two existing tools. On the human and mouse datasets, Ubipredictor was more sensitive than Ubipred and Ubpred. Unlike previously designed tools, Ubipredictor was trained specifically on experimentally verified ubiquitination datasets for each of the human, mouse, and yeast species.
Keywords: Ubiquitination, Machine Learning, Post-Translational Modifications (PTMs), Linear Discriminant Analysis (LDA), Protein Modifications.
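
Amino acid dimers are one of the feature types named in the abstract. As a rough illustration only, the Python sketch below shows one way dimer frequencies might be computed from a sequence window centred on each lysine before being passed to a classifier such as LDA. It is not the authors' implementation: the flank size of 6 residues, the "X" padding character, and the function names lysine_windows and dimer_frequencies are all assumptions made for this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard amino acids.
DIMERS = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def lysine_windows(sequence, flank=6, pad="X"):
    """Yield (1-based position, window) for every lysine in the sequence.

    The flank size and the padding character are assumptions for this
    sketch; the source does not state the window that was used.
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            # The lysine sits at the centre of a window of 2 * flank + 1.
            yield i + 1, padded[i : i + 2 * flank + 1]


def dimer_frequencies(window):
    """Return a 400-element vector of dimer frequencies for one window.

    Pairs containing a non-standard symbol (such as padding) are skipped.
    """
    counts = dict.fromkeys(DIMERS, 0)
    total = 0
    for first, second in zip(window, window[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIMERS]


if __name__ == "__main__":
    # Toy sequence, not taken from any dataset in the paper.
    for position, window in lysine_windows("MKTAYKLLGKR", flank=3):
        features = dimer_frequencies(window)
        nonzero = sum(1 for value in features if value > 0)
        print(position, window, nonzero)
```

Each lysine yields one fixed-length feature vector, so the vectors for all candidate sites can be stacked into a matrix and combined with other features (for example PSSM scores) before classification.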