Background: Open spina bifida (myelomeningocele) results from incomplete closure of the spinal cord and is the second most common and severe birth defect. Open neural tube defects are multifactorial, and because of this complexity the exact molecular mechanism of their pathogenesis remains unclear. Prenatal treatment options remain limited worldwide. Artificial intelligence techniques such as machine learning are increasingly used in precision diagnosis.
Objective: The primary objective of this study is to identify key genes for open neural tube defects using a machine learning approach, providing additional information about myelomeningocele to support a more accurate diagnosis.
Materials and Methods: We performed differential gene expression analysis on multiple datasets (GSE4182 and GSE101141) of amniotic fluid samples from pregnancies with open neural tube defects. Sample outliers in the datasets were detected using principal component analysis (PCA). We combined the differential gene expression analysis with recursive feature elimination (RFE), a machine learning approach, to obtain four key genes for open neural tube defects. The selected features were validated using five binary classifiers that distinguish diseased from healthy samples, each with 5-fold cross-validation: logistic regression (LR), decision tree (DT), support vector machine (SVM), random forest (RF), and K-nearest neighbour (KNN).
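The workflow described above (PCA outlier screening, RFE down to four features, then validation with five classifiers under 5-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch using scikit-learn on synthetic data, not the authors' pipeline: the synthetic matrix, the placeholder gene labels, the 3-standard-deviation outlier cutoff, the logistic-regression base estimator for RFE, and all hyperparameters are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an expression matrix (samples x genes).
# This is NOT the GSE4182 / GSE101141 data.
X, y = make_classification(n_samples=60, n_features=30, n_informative=4,
                           n_redundant=0, random_state=0)
gene_names = np.array([f"gene_{i}" for i in range(X.shape[1])])  # placeholder labels

# Step 1: PCA-based outlier screen. Flagging samples more than 3 SD from the
# centre on either of the first two components is an assumed rule.
X_scaled = StandardScaler().fit_transform(X)
pcs = PCA(n_components=2).fit_transform(X_scaled)
z = np.abs((pcs - pcs.mean(axis=0)) / pcs.std(axis=0))
keep = (z < 3).all(axis=1)
X_clean, y_clean = X_scaled[keep], y[keep]

# Step 2: recursive feature elimination down to four features.
# The base estimator (logistic regression) is an assumption.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=4, step=1)
rfe.fit(X_clean, y_clean)
selected = gene_names[rfe.support_]
X_sel = X_clean[:, rfe.support_]

# Step 3: validate the selected features with five binary classifiers
# using stratified 5-fold cross-validation (mean accuracy per classifier).
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(kernel="linear"),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {name: cross_val_score(clf, X_sel, y_clean, cv=cv).mean()
          for name, clf in classifiers.items()}

print("Selected features:", list(selected))
for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

In this sketch, scaling and feature selection are performed once on all retained samples before cross-validation, which mirrors the select-then-validate order described above but can give optimistic accuracy estimates. Wrapping the scaler and RFE in a scikit-learn `Pipeline` so that they are refit within each fold would be a stricter alternative.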
Results: Growth-associated protein 43 (GAP43), glial fibrillary acidic protein (GFAP), repetin (RPTN), and CD44 were identified as the key genes in this study. These genes are known to be involved in axon growth, astrocyte differentiation in the central nervous system, post-traumatic brain repair, neuroinflammation, and inflammation-linked neuronal injury. They represent a promising tool for further studies on the diagnosis and early detection of open neural tube defects.
Conclusion: These key biomarkers help in the diagnosis and early detection of open neural tube defects, and thus in evaluating the progression and severity of the disease. This study supports previous literature linking these biomarkers with open NTDs. Thus, alongside the prenatal treatment options available to date, these biomarkers help in the early detection of open neural tube defects, which supports both the treatment of these defects and the prevention of their progression to an advanced stage.
Keywords: Spina bifida, myelomeningocele, PCA, recursive feature elimination (RFE), machine learning-based classification, NTDs.