Background: Hepatocellular carcinoma (HCC) is the most common primary liver cancer, and the mechanisms of hepatocarcinogenesis remain elusive.
Objective: This study aims to mine hub genes associated with HCC using multiple databases.
Methods: The datasets GSE45267, GSE60502, and GSE74656 were downloaded from the GEO database. Differentially expressed genes (DEGs) between HCC and control samples in each dataset were identified with the limma package. GO term and KEGG pathway enrichment of the DEGs aggregated across the datasets (aggregated DEGs) was analyzed using the DAVID and KOBAS 3.0 databases. A protein-protein interaction (PPI) network of the aggregated DEGs was constructed using the STRING database. GSEA software was used to verify the enriched biological processes. The association between hub genes and HCC prognosis was analyzed with the survminer R package using patients’ information from the TCGA database.
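The aggregation step can be illustrated with a minimal sketch. It assumes that "aggregated DEGs" means the genes called differentially expressed in every dataset (a set intersection); the abstract does not state the exact rule, so this is an assumption. The function name `aggregate_degs` and the toy gene lists are hypothetical and are not the study's data.

```python
from functools import reduce


def aggregate_degs(deg_sets):
    """Return the genes present in every per-dataset DEG set, sorted by name.

    Assumption: aggregation is a plain intersection across datasets.
    """
    sets = [set(s) for s in deg_sets]
    if not sets:
        return []
    return sorted(reduce(set.intersection, sets))


# Toy per-dataset DEG lists (illustrative only, not results from the study).
toy_degs = {
    "dataset_a": {"TOP2A", "CDK1", "CCNB1", "ALB"},
    "dataset_b": {"TOP2A", "CDK1", "TTK"},
    "dataset_c": {"CDK1", "TOP2A", "BUB1"},
}

shared = aggregate_degs(toy_degs.values())
print(shared)  # ['CDK1', 'TOP2A']
```

An intersection is the strictest choice: a gene is dropped if any single dataset fails to call it, so the smallest dataset bounds the size of the result.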
Results: A total of 7583, 2349, and 553 DEGs were identified from GSE45267, GSE60502, and GSE74656, respectively. Of these, 221 were aggregated DEGs, which were mainly enriched in 109 GO terms and 29 KEGG pathways. Cell cycle phase, mitotic cell cycle, cell division, nuclear division, and mitosis were the most significant GO terms. Metabolic pathways, cell cycle, chemical carcinogenesis, retinol metabolism, and fatty acid degradation were the main KEGG pathways. Nine hub genes (TOP2A, NDC80, CDK1, CCNB1, KIF11, BUB1, CCNB2, CCNA2, and TTK) were selected from the PPI network, and all of them were associated with the prognosis of HCC patients.
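One common way to pick hub genes from a PPI network is to rank nodes by degree (number of distinct interaction partners). The abstract does not state which criterion was used, so the sketch below is an assumption rather than the study's method. The function name `rank_by_degree` and the toy edge list are hypothetical; the edges are not STRING output.

```python
from collections import Counter


def rank_by_degree(edges, top_n):
    """Rank genes by number of distinct partners in an undirected edge list.

    Self-loops and duplicate edges (in either orientation) are ignored.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # skip self-interactions
        pair = frozenset((a, b))
        if pair in seen:
            continue  # skip duplicates such as (A, B) and (B, A)
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Toy undirected interactions (illustrative only).
toy_edges = [
    ("CDK1", "CCNB1"),
    ("CDK1", "CCNB2"),
    ("CDK1", "CCNA2"),
    ("CCNB1", "CCNB2"),
    ("CCNB1", "CDK1"),  # duplicate of the first edge, reversed
]

top = rank_by_degree(toy_edges, top_n=2)
print(top)  # [('CDK1', 3), ('CCNB1', 2)]
```

Degree is only one of several centrality measures; a different measure or cutoff could yield a different hub list.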
Conclusion: TOP2A, NDC80, CDK1, CCNB1, KIF11, BUB1, CCNB2, CCNA2, and TTK are hub genes in HCC and may serve as potential biomarkers and therapeutic targets for HCC.
Keywords: Hepatocellular carcinoma, hub gene, bioinformatics, differentially expressed gene, database, mRNA.