Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been shown to trigger autoimmunity, which can lead to several chronic human diseases such as type 1 diabetes, Crohn’s disease, vasculitis, and Guillain-Barré syndrome. The mechanism underlying the SARS-CoV-2-induced autoimmune response is unknown and remains an active area of research.
Objective: The primary objective of this study is to use a machine learning approach to identify autoantigen markers that distinguish COVID-19-positive from COVID-19-negative samples and that may trigger an immune response leading to autoimmunity, thereby providing information to support a more accurate diagnosis of COVID-induced diseases.
Materials and Methods: Our study reports the transcriptomic profiles of whole blood samples from long COVID patients, collected from day 0 to day 35 of acute infection, as described in the GSE215865 dataset (1,391 samples after preprocessing: 1,233 COVID-positive and 158 COVID-negative). Binary classification algorithms from the scikit-learn Python library, namely logistic regression and random forest, were applied to the processed data with 10-fold cross-validation. Recursive feature elimination was then used to select the 20 best gene features from a set of 10,719, yielding a classification accuracy of 87%.
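The workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration, not the study's actual code: the synthetic matrix from `make_classification` stands in for the GSE215865 expression data, and the standardization step, the use of logistic regression as the ranking estimator inside recursive feature elimination, the `step` size, and placing feature selection inside each cross-validation fold are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a samples x genes expression matrix.
# NOT the GSE215865 data; sizes are kept small so the sketch runs quickly.
X, y = make_classification(
    n_samples=200, n_features=100, n_informative=10,
    weights=[0.2, 0.8], random_state=0,
)

N_SELECTED = 20  # number of gene features to keep, as in the abstract


def build_pipeline(classifier):
    """Scale features, keep N_SELECTED of them via RFE, then classify."""
    return Pipeline([
        ("scale", StandardScaler()),  # assumption: standardized inputs
        # Assumption: logistic regression coefficients rank the features;
        # step=0.2 removes 20% of the original features per iteration.
        ("rfe", RFE(LogisticRegression(max_iter=1000),
                    n_features_to_select=N_SELECTED, step=0.2)),
        ("clf", classifier),
    ])


# Stratified 10-fold cross-validation preserves the class ratio in each fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in models.items():
    scores[name] = cross_val_score(
        build_pipeline(clf), X, y, cv=cv, scoring="accuracy"
    )
    print(f"{name}: mean accuracy = {scores[name].mean():.3f}")

# Refit on all samples to inspect which feature columns were retained.
final = build_pipeline(LogisticRegression(max_iter=1000)).fit(X, y)
selected = np.flatnonzero(final.named_steps["rfe"].support_)
print("Selected feature indices:", selected)
```

Wrapping the selector and classifier in one `Pipeline` means the feature ranking is recomputed on each training fold, so the held-out fold does not influence which features are kept. With real data, the column indices in `selected` would map back to gene identifiers.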
Results: Fidgetin, microtubule severing factor (FIGN); SH3 and cysteine-rich domain (STAC); cadherin-6 (CDH6); docking protein 6 (DOK6); nuclear RNA export factor 3 (NXF3); and maternally expressed 3 (MEG3) were identified as autoantigen markers for classifying COVID-positive and COVID-negative samples.
Conclusion: The autoantigen markers identified from transcriptomic datasets using machine learning techniques provide a deeper understanding of COVID-induced diseases and may serve as potential diagnostic markers and drug targets for COVID-19.
Keywords: Single-cell RNA sequencing, machine learning, SARS-CoV-2, biomarkers, autoantigens, type 1 diabetes.