Objective: To identify potential biomarkers of osteosarcoma (OS) and to further elucidate the molecular mechanisms underlying OS using the mAP-KL algorithm and a mutual information network.
Methods: The E-GEOD-33382 and E-GEOD-28974 datasets were downloaded from the EMBL-EBI database and merged, yielding microarray data for 84 OS samples and 15 controls. Next, the affinity propagation clustering (APCluster) package was used to perform cluster analysis and to identify the most representative genes of the clusters, termed exemplars. A support vector machine (SVM) with a linear kernel was employed to assess the classification performance of the mAP-KL method. Finally, hub genes were identified based on a mutual information network.
Results: Based on the pre-defined gene number (gene count ≤ 50), 10 clusters were identified among the top 200 genes, and 10 cluster genes were screened as exemplars. In particular, O-fucosylpeptide 3-beta-N-acetylglucosaminyltransferase (MFNG, degree = 154), hepatitis A virus cellular receptor 2 (HAVCR2, degree = 138), and lymphocyte cytosolic protein 2 (SH2 domain containing leukocyte protein of 76 kDa; LCP2, degree = 133) exhibited higher degrees of connectivity in the mutual information network of the top 200 genes. In the 5-fold cross-validation (5-CV) evaluation, all samples were classified correctly. The mAP-KL method achieved the highest AUC score of 1.00, with an MCC of 1.00, a specificity of 1.00, and a sensitivity of 1.00.
Conclusion: Several pivotal genes identified in our study, such as MFNG, HAVCR2, and LCP2, might be potential biomarkers for predicting OS development and therapeutic targets for OS patients.
Keywords: Osteosarcoma, support vector machine, mAP-KL algorithm, mutual information network, classification performance, exemplar, hub genes.
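
Illustrative note: the classification metrics reported above (sensitivity, specificity, and MCC) follow from the confusion-matrix counts of a binary classifier. The sketch below is not the code used in this study. It is a minimal Python illustration, and the function name `classification_metrics` and its argument names are hypothetical. It assumes that OS samples are the positive class and controls the negative class, and it shows that classifying all 84 OS samples and 15 controls correctly gives a value of 1.00 for each metric.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity, and MCC from confusion-matrix counts.

    Hypothetical helper for illustration only; OS samples are assumed to be
    the positive class and controls the negative class.
    """
    # Sensitivity: fraction of positive (OS) samples classified correctly.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of negative (control) samples classified correctly.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Perfect separation of 84 OS samples and 15 controls, as reported above.
perfect = classification_metrics(tp=84, tn=15, fp=0, fn=0)
print(perfect)  # all three values are 1.0
```

MCC is informative here because the two classes are unbalanced (84 versus 15): unlike plain accuracy, it reaches 1.00 only when both the OS samples and the controls are classified correctly.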