Cercis gigantea is one of the most beautiful garden trees. It belongs to the genus Cercis in the subfamily Caesalpinioideae of the Leguminosae. However, little genetic information on C. gigantea is available. In the present study, the C. gigantea transcriptome was characterized by RNA sequencing, which generated large expression datasets suitable for functional genomic analysis. Approximately 55.5 million high-quality clean reads were obtained and assembled into 44,660 unigenes and 77,024 unique transcripts. The unigenes, with an average length of 998 bp, were annotated by comparing them against the known proteins in four public databases using NCBI BLAST: the Kyoto Encyclopedia of Genes and Genomes (KEGG), the National Center for Biotechnology Information (NCBI) non-redundant protein database (NR), the Cluster of Orthologous Groups (COG), and Swiss-Prot. Of the 44,660 unigenes, 28,884 (64.7%) were annotated. In addition, an interaction network of C. gigantea unigenes was constructed. This study provides the first transcriptome survey not only for C. gigantea but for any Caesalpinioideae plant, and it offers an important platform for research on functional genomics, gene expression, and protein-protein interactions.
Keywords: Cercis gigantea, transcriptome, PPI, Caesalpinioideae, annotation, expressed gene.