Identification of Gene Signature Associated with Type 2 Diabetes Mellitus
by Integrating Mutation and Expression Data

Zijun Zhu; Xudong Han; Liang Cheng

Abstract

Background: Type 2 Diabetes Mellitus (T2DM) is a chronic disease, and molecular diagnosis should aid the treatment of T2DM patients. With the development of sequencing technology, a large number of differentially expressed genes have been identified from expression data. However, machine learning methods can only identify a locally optimal solution as the signature.

Objective: Inherited mutation information can better reflect the relationship between genes and diseases. We therefore aimed to integrate mutation information to identify the signature more accurately.

Methods: To this end, we integrated Genome-Wide Association Study (GWAS) data and expression data, combined with expression Quantitative Trait Loci (eQTL) analysis, to obtain a T2DM predictive signature (T2DMSig-10). First, we used GWAS data to obtain a list of T2DM susceptibility loci. Then, we used eQTL analysis to obtain risk Single Nucleotide Polymorphisms (SNPs) and combined them with pancreatic β-cell gene expression data to obtain 10 protein-coding genes. Finally, we combined these genes with equal weights.
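
As an illustration of the final step, equal-weight combination of signature genes can be sketched as below. This is a minimal sketch and not the authors' implementation: the function name `equal_weight_score`, the per-gene z-score standardization, and the optional `signs` argument (direction of each gene's effect) are all assumptions introduced here.

```python
import numpy as np

def equal_weight_score(expr, signs=None):
    """Combine signature genes into one score per sample with equal weights.

    expr  : array of shape (n_samples, n_genes), expression of the signature genes.
    signs : optional array of +1/-1 per gene (hypothetical; direction of effect).
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[1]

    # Standardize each gene across samples so no gene dominates by scale
    # (an assumption; the abstract does not state how genes are scaled).
    mean = expr.mean(axis=0)
    std = expr.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mean) / std

    if signs is None:
        signs = np.ones(n_genes)
    signs = np.asarray(signs, dtype=float)

    # Every gene receives the same absolute weight, 1 / n_genes.
    weights = signs / n_genes
    return z @ weights

# Toy data: 3 samples, 2 genes (values are made up for illustration).
toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = equal_weight_score(toy)
```

With equal weights the score is simply the mean of the standardized expression values, so no weights need to be fitted from the data.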

Results: The signature was verified using Receiver Operating Characteristic (ROC) analysis, the single-gene removal and addition method, gene ontology functional enrichment, and a protein-protein interaction network. The results showed that T2DMSig-10 had an excellent predictive effect on T2DM (AUC=0.99) and was highly robust.
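
The ROC and single-gene removal checks mentioned above can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' code: the AUC is computed with the rank-based (Mann-Whitney) formulation, the signature score is taken as the plain mean of the remaining genes, only the removal half of the robustness check is shown, and the function names and toy data are hypothetical.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    Counts the fraction of (case, control) pairs in which the case has
    the higher score; ties count as half.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one case and one control")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def leave_one_gene_out_auc(expr, labels):
    """AUC of the equal-weight score after dropping each gene in turn."""
    expr = np.asarray(expr, dtype=float)
    results = []
    for g in range(expr.shape[1]):
        kept = np.delete(expr, g, axis=1)   # remove gene g
        score = kept.mean(axis=1)           # equal weights on the rest
        results.append(auc(score, labels))
    return results

# Toy data: 4 samples (2 controls, 2 cases) and 3 genes, made up for illustration.
toy_expr = [[1, 1, 5], [2, 1, 0], [5, 6, 5], [6, 7, 0]]
toy_labels = [0, 0, 1, 1]
per_gene = leave_one_gene_out_auc(toy_expr, toy_labels)
```

If the AUC stays close to its full-signature value whichever gene is removed, the signature does not depend on any single gene, which is the sense of robustness referred to in the results.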

Conclusion: In short, we obtained a predictive signature for T2DM and further verified it.

Keywords: Type 2 diabetes mellitus, genome-wide association study, expression quantitative trait loci, predictive signature, AUC=0.99, ROC.

Cite As

Current Gene Therapy
