Leucine-rich repeats (LRRs) are present in over 20,000 proteins from viruses to eukaryotes. Two to sixty-two LRRs occur in tandem. Each repeat is typically 20-30 residues long and can be divided into a highly conserved segment (HCS) and a variable segment (VS). The HCS consists of an eleven- or twelve-residue stretch, LxxLxLxxNx(x/-)L, in which “L” is Leu, Ile, Val, or Phe; “N” is Asn, Thr, Ser, or Cys; “x” is a non-conserved residue; and “-” is a possible deletion site. Eight classes of LRRs have been recognized. However, many LRRs remain unclassified or unrecognized. Here we searched for novel LRRs using a protein sequence database. Novel LRR domains are present in over three hundred proteins from unicellular eukaryotes and bacteria, including the fungal ECM33 protein and a Monosiga brevicollis LRR receptor kinase. Their HCS is clearly different from that of the known LRRs and consists of a twelve- or thirteen-residue stretch, VxGx(L/F)x(L/C)xxNx(x/-)L, characterized by the addition of Gly between the first conserved Val and the second conserved Leu. The novel LRRs identified here form a new family, and their domains were classified into four classes. The VS parts of two classes are consistent with those of the known “SDS22-like” and “IRREKO” classes, while the other two classes have unique VS parts. The structures, functions, and evolution of the novel LRR domains and their proteins are described. The present results should stimulate various experimental studies.
Keywords: Choanoflagellida, Fungal ECM33 protein, “IRREKO” LRR, Leucine-rich repeat, “SDS22-like” LRR, Vibrio cholerae.
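
The two HCS consensus patterns given in the abstract can be written as regular expressions. The following is a minimal illustrative sketch, not the search procedure used in this work. It assumes that in the novel pattern "N" and the final "L" stand for the same residue sets as in the known pattern (the abstract does not state this), and that "V", "G", "(L/F)", and "(L/C)" are literal. The names `KNOWN_HCS`, `NOVEL_HCS`, and `classify_hcs` are hypothetical.

```python
import re

# Residue sets defined in the abstract for the known consensus.
L_SET = "[LIVF]"   # "L" = Leu, Ile, Val, or Phe
N_SET = "[NTSC]"   # "N" = Asn, Thr, Ser, or Cys

# Known HCS: LxxLxLxxNx(x/-)L, 11 or 12 residues.
# "." is any residue (x); ".{1,2}" covers "x(x/-)", the possible deletion.
KNOWN_HCS = re.compile(
    f"{L_SET}.{{2}}{L_SET}.{L_SET}.{{2}}{N_SET}.{{1,2}}{L_SET}"
)

# Novel HCS: VxGx(L/F)x(L/C)xxNx(x/-)L, 12 or 13 residues.
# Assumption: "N" and the final "L" reuse the sets above.
NOVEL_HCS = re.compile(
    f"V.G.[LF].[LC].{{2}}{N_SET}.{{1,2}}{L_SET}"
)


def classify_hcs(segment):
    """Return 'novel', 'known', or None for a candidate HCS segment.

    The whole segment must match one consensus; the novel pattern is
    tested first because it is the more specific of the two.
    """
    segment = segment.upper()
    if NOVEL_HCS.fullmatch(segment):
        return "novel"
    if KNOWN_HCS.fullmatch(segment):
        return "known"
    return None


if __name__ == "__main__":
    # Toy segments with Ala at every non-conserved position.
    for seg in ("LAALALAANAL", "VAGALALAANAL", "VAGALALAANAAL"):
        print(seg, classify_hcs(seg))
```

A pattern match of this kind only flags candidate segments. Real LRR identification also depends on tandem arrangement and on the VS part, which this sketch does not model.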