Identification of Gene Signature Associated with Type 2 Diabetes Mellitus by Integrating Mutation and Expression Data

Page: [51 - 58] Pages: 8

Abstract

Background: Type 2 Diabetes Mellitus (T2DM) is a chronic disease, and molecular diagnosis could aid the treatment of T2DM patients. With advances in sequencing technology, a large number of differentially expressed genes have been identified from expression data. However, machine learning methods applied to such data can identify only a locally optimal solution as the signature.

Objective: Inherited mutation information can better reflect the relationship between genes and diseases. We therefore aimed to integrate mutation information to identify the signature more accurately.

Methods: We integrated Genome-Wide Association Study (GWAS) data with expression data, using expression Quantitative Trait Loci (eQTL) analysis, to derive a predictive signature for T2DM (T2DMSig-10). First, we used GWAS data to obtain a list of T2DM susceptibility loci. Then, we used eQTL analysis to obtain risk Single Nucleotide Polymorphisms (SNPs) and combined them with pancreatic β-cell gene expression data to obtain 10 protein-coding genes. Finally, we combined these genes with equal weights.
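
The equal-weight combination in the last step can be illustrated with a minimal sketch. The abstract does not state how expression values were scaled or whether any gene's direction was inverted, so the sketch assumes each gene is z-scored across samples and the per-sample score is the unweighted mean of those z-scores. The gene names and values are hypothetical placeholders, not the T2DMSig-10 genes.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression values across samples."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A gene with no variation carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def equal_weight_score(expression):
    """Combine genes with equal weights.

    expression: dict mapping gene name -> list of per-sample values,
    with all lists in the same sample order.
    Returns one score per sample: the unweighted mean of per-gene z-scores
    (an assumed scaling; the abstract only says "equal weights").
    """
    standardized = [zscore(vals) for vals in expression.values()]
    n_samples = len(standardized[0])
    if any(len(z) != n_samples for z in standardized):
        raise ValueError("all genes must have the same number of samples")
    return [mean(gene[i] for gene in standardized) for i in range(n_samples)]


# Hypothetical toy data: two placeholder genes measured in four samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 2.0, 6.0, 6.0],
}
scores = equal_weight_score(toy_expression)
```

Standardizing before averaging keeps a highly expressed gene from dominating the score, which is one common reading of "equal weights"; averaging raw values would be another.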

Results: We verified the signature using Receiver Operating Characteristic (ROC) analysis, single-gene removal and addition, Gene Ontology functional enrichment, and protein-protein interaction network analysis. The results showed that T2DMSig-10 had an excellent predictive effect on T2DM (AUC=0.99) and was highly robust.
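
As a rough illustration of the ROC and single-gene removal checks, the sketch below computes the AUC as the fraction of (case, control) pairs in which the case scores higher, then recomputes it after dropping each gene in turn. It assumes expression values are already on a comparable scale and that the signature score is a plain mean over genes. The function names, gene names, labels, and values are all hypothetical, and the abstract does not describe the authors' exact procedure.

```python
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample (label 1)
    scores higher than a randomly chosen negative sample (label 0);
    ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def signature_score(expression, genes):
    """Equal-weight score: unweighted mean over the selected genes per sample."""
    n_samples = len(expression[genes[0]])
    return [mean(expression[g][i] for g in genes) for i in range(n_samples)]


def leave_one_gene_out(expression, labels):
    """Recompute the AUC with each gene removed from the signature in turn."""
    genes = list(expression)
    result = {}
    for dropped in genes:
        kept = [g for g in genes if g != dropped]
        result[dropped] = auc(signature_score(expression, kept), labels)
    return result


# Hypothetical toy data: three placeholder genes, three controls (0), three cases (1).
toy_labels = [0, 0, 0, 1, 1, 1]
toy_expression = {
    "GENE_A": [0.1, 0.4, 0.3, 0.9, 0.8, 0.7],
    "GENE_B": [0.2, 0.1, 0.6, 0.5, 0.9, 0.8],
    "GENE_C": [0.3, 0.2, 0.1, 0.6, 0.7, 0.9],
}
full_auc = auc(signature_score(toy_expression, list(toy_expression)), toy_labels)
loo_aucs = leave_one_gene_out(toy_expression, toy_labels)
```

A signature is robust in this sense when the leave-one-out AUCs stay close to the full-signature AUC, meaning that no single gene is responsible for the predictive performance.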

Conclusion: We obtained a predictive signature for T2DM and verified it through several complementary analyses.

Keywords: Type 2 diabetes mellitus, genome-wide association study, expression quantitative trait loci, predictive signature, AUC=0.99, ROC.
