HCVS: Pinpointing Chromatin States Through Hierarchical Clustering and Visualization Scheme

Page: [148 - 156] Pages: 9

Abstract

Background: Specific combinations of histone modifications (HMs), which underpin the histone code hypothesis, give rise to various biological functions. Several studies have used HM combinations to partition the genome into distinct regions, which are classified as chromatin states. Hidden Markov Model (HMM) based techniques have mostly been used for this purpose, with data drawn from Next Generation Sequencing (NGS) platforms. Chromatin states derived from histone modification combinatorics are annotated by mapping them to functional regions of the genome. To date, the number of states predicted by HMM tools has been justified on biological grounds.

Objective: The present study aimed to provide a computational scheme for identifying the underlying hidden states in the data under consideration.

Methods: We propose HCVS, a computational scheme based on hierarchical clustering and a visualization strategy, to achieve this objective.

Results: We tested the proposed scheme on a real data set covering nine cell types and nine chromatin marks. The approach successfully identified the state numbers for various possibilities. The results were also compared with one of the existing models and showed good correlation.

Conclusion: The HCVS model not only helps in deciding the optimal number of states for a particular data set but also justifies the results biologically, thereby linking the computational and biological aspects.
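
As an illustration of the general idea only, the sketch below shows how hierarchical clustering can suggest a number of states from mark combinations. It is not the HCVS implementation, whose details are not given in this abstract. Everything here is an assumption made for the example: the synthetic binarized matrix of nine marks, the three hypothetical mark profiles, Ward linkage, and the "largest gap between successive merge heights" rule. The function names `make_toy_marks` and `suggest_state_count` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def make_toy_marks(n_per_state=50, seed=0):
    """Build a synthetic binarized matrix: rows are genomic bins, columns are 9 marks.

    The three profiles are invented for illustration; each entry is the
    probability that a mark is called present in bins of that state.
    """
    rng = np.random.default_rng(seed)
    profiles = np.array([
        [0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
    ])
    rows = [rng.random((n_per_state, profiles.shape[1])) < p for p in profiles]
    return np.vstack(rows).astype(float)


def suggest_state_count(X, max_states=10):
    """Suggest a cluster count from the largest jump in Ward merge heights.

    Assumption: a large gap between the height at which k clusters merge
    into k-1 and the height at which k+1 merge into k indicates that k
    clusters are well separated.
    """
    Z = linkage(X, method="ward")
    # Heights of the final merges, largest first: top[0] is the 2 -> 1 merge,
    # top[1] is the 3 -> 2 merge, and so on.
    top = Z[::-1, 2][:max_states]
    gaps = top[:-1] - top[1:]          # gaps[j] supports k = j + 2
    k = int(np.argmax(gaps)) + 2
    labels = fcluster(Z, t=k, criterion="maxclust")
    return k, labels


if __name__ == "__main__":
    X = make_toy_marks()
    k, labels = suggest_state_count(X)
    print("suggested number of states:", k)
    print("bins per cluster:", np.bincount(labels)[1:])
```

For the visualization side, the same linkage matrix could be passed to `scipy.cluster.hierarchy.dendrogram` to inspect the tree by eye. Real ChIP-Seq signal would also need binning and binarization or normalization before clustering, which this sketch does not cover.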

Keywords: ChIP-Seq, segmentation, epigenetic marks, histone modifications, histone code hypothesis, hidden Markov model, states, biological annotation.
