Background: Hypertension is one of the most significant conditions underlying cardiovascular disease; hence, methods that can accurately reveal the risk of hypertension at an early age are essential. In addition, one of the most important goals of personalized health care is to improve the accuracy of disease prediction by examining genetic variants.
Objective: Various clinical and genetics-based methods are used to predict the disease; however, the critical issue with these methods is the large number of input variables (genetic markers) relative to the small number of samples. Machine learning is one approach that can address this problem.
Methods: This study used the genetic markers of participants in the 20-year Tehran Cardiometabolic Genetic Study (TCGS). Several machine learning methods were applied: linear regression, neural network, random forest, decision tree, and support vector machine. The top ten genetic markers were identified using importance-based ranking methods: information gain, gain ratio, Gini index, χ², Relief, and the fast correlation-based filter (FCBF).
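As an illustration of one of the ranking methods named above, the following is a minimal sketch of information-gain ranking over discrete genotype columns. It is not the study's implementation: the marker names (`snp_a`, `snp_b`, `snp_c`), the 0/1/2 minor-allele coding, the toy labels, and the helper `top_k_markers` are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def top_k_markers(genotypes, labels, k=10):
    """Rank markers by information gain; return the k best (name, score) pairs."""
    scores = {name: information_gain(col, labels) for name, col in genotypes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]

# Hypothetical toy data: 1 = hypertensive, 0 = normotensive;
# genotypes are coded as minor-allele counts (0, 1, or 2).
labels = [1, 1, 1, 0, 0, 0]
genotypes = {
    "snp_a": [2, 2, 2, 0, 0, 0],  # separates the two classes perfectly
    "snp_b": [0, 1, 2, 0, 1, 2],  # carries no information about the label
    "snp_c": [1, 1, 0, 0, 0, 0],  # partially informative
}

ranking = top_k_markers(genotypes, labels, k=2)
print(ranking)  # snp_a ranks first, followed by snp_c
```

With far more markers than participants, a filter of this kind would typically be computed on training folds only, so that the selected markers do not leak information from the held-out data into the reported performance.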
Results: The best-performing model was a neural network with an AUC of 0.89. Its accuracy and F-measure were also both 0.89, indicating good predictive quality. These results indicate that the machine learning approach was successful.
Conclusion: This study shows that a machine learning approach can help predict the risk of hypertension (HTN) at a young age and identify significant single-nucleotide polymorphisms (SNPs) that affect HTN.