Hypertension Risk Prediction Based on SNPs by Machine Learning Models

S. Ali Lajevardi; Mehrdad Kargari; Maryam S. Daneshpour; Mahdi Akbarzadeh

Abstract

Background: Hypertension is one of the most significant conditions underlying cardiovascular disease; hence, methods that can accurately reveal the risk of hypertension at an early age are essential. In addition, improving the accuracy of disease prediction by examining genetic variants is one of the most important goals of personalized health care.

Objective: Various clinical and genetic methods are used to predict the disease; however, a critical issue with these methods is the large number of input variables (genetic markers) relative to the small number of samples. Machine learning is one approach that can address this problem.

Methods: This study used the genetic markers of participants in the 20-year Tehran Cardiometabolic Genetic Study (TCGS). Several machine learning methods were applied: linear regression, neural network, random forest, decision tree, and support vector machine. The top ten genetic markers were identified using importance-based ranking methods: information gain, gain ratio, Gini index, χ², Relief, and FCBF.
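
As a rough illustration of importance-based ranking (not the study's actual pipeline), the sketch below scores each SNP by information gain and by Gini impurity reduction, then keeps the ten highest-scoring markers. The genotype coding (0, 1, 2 copies of the minor allele), the synthetic data, and all names such as `snp_table` and `snp_causal` are assumptions made for this example.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_reduction(genotypes, labels, impurity):
    """Parent impurity minus the size-weighted impurity of each genotype group.

    With impurity=entropy this is information gain; with impurity=gini it is
    the Gini index decrease.
    """
    n = len(labels)
    groups = {}
    for g, y in zip(genotypes, labels):
        groups.setdefault(g, []).append(y)
    weighted = sum(len(ys) / n * impurity(ys) for ys in groups.values())
    return impurity(labels) - weighted


def rank_snps(snp_table, labels, impurity, top_k=10):
    """Return the names of the top_k SNPs, highest score first."""
    scores = {
        name: impurity_reduction(column, labels, impurity)
        for name, column in snp_table.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Synthetic data (hypothetical): 200 samples, 20 random SNPs, 1 informative SNP.
rng = random.Random(0)
n_samples = 200
labels = [rng.randint(0, 1) for _ in range(n_samples)]  # 1 = hypertensive
snp_table = {
    f"snp_{i}": [rng.randint(0, 2) for _ in range(n_samples)] for i in range(20)
}
# The informative SNP tracks the label, with about 10% of genotypes set to 1.
snp_table["snp_causal"] = [
    (2 if y == 1 else 0) if rng.random() > 0.1 else 1 for y in labels
]

top_by_info_gain = rank_snps(snp_table, labels, entropy)
top_by_gini = rank_snps(snp_table, labels, gini)
print(top_by_info_gain[:3])
print(top_by_gini[:3])
```

Both criteria measure how much splitting the samples by genotype reduces class impurity; gain ratio, χ², Relief, and FCBF use different scoring functions but fit the same rank-and-select pattern.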

Results: A neural network model achieved an AUC of 89%, with an accuracy and an F-measure of 0.89 each. These results indicate that the machine learning approach was successful.
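
For readers unfamiliar with the reported metrics, the following minimal sketch shows how accuracy, F-measure, and AUC can be computed for a binary classifier. The labels, scores, and the 0.5 decision threshold are made-up values for illustration and are unrelated to the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive sample is scored higher than a random negative one (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical true labels (1 = hypertensive) and predicted risk scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), f_measure(y_true, y_pred), auc(y_true, scores))
```

Accuracy and F-measure depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.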

Conclusion: This study shows that a machine learning approach helps predict the risk of hypertension at a young age and identifies significant SNPs that affect hypertension (HTN).

Keywords: Hypertension risk, machine learning, SNP markers, TCGS, cardiovascular disease, genetic markers.

Cite As

Current Bioinformatics
