Abstract
Background: The development of deep learning architectures has opened new horizons for the
machine learning paradigm. Deep learning is widely used on complex problems and has achieved significant
results in applications such as protein structure prediction, speech recognition,
traffic management, and health diagnostic systems. In particular, the convolutional neural network
(CNN) has revolutionized visual data processing tasks.
Objective: Protein structure is an important research topic in domains ranging from medical science
and the health sector to drug design. Fourier transform infrared spectroscopy (FTIR) is a leading
tool for protein structure determination. This review surveys the deep learning approaches
proposed in the literature for predicting the secondary structure of proteins and develops a conceptual
relation between FTIR spectra images and deep learning models for predicting protein structure.
Methods: Various pre-trained CNN models are identified and interpreted to correlate FTIR images
of proteins, which contain Amide-I and Amide-II absorbance values, with their secondary structure.
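As a rough illustration of the Amide-I and Amide-II inputs mentioned in the Methods, the sketch below slices those two bands out of a spectrum and rescales the absorbance in each band to [0, 1]. The band limits (about 1600-1700 cm^-1 for Amide-I and 1500-1600 cm^-1 for Amide-II) are commonly quoted approximate ranges and are an assumption here, not values taken from this review. The function names and the toy spectrum are hypothetical.

```python
# Approximate band limits in cm^-1 (assumed, commonly quoted ranges).
AMIDE_BANDS = {
    "amide_I": (1600.0, 1700.0),
    "amide_II": (1500.0, 1600.0),
}


def select_band(wavenumbers, absorbance, low, high):
    """Return the (wavenumber, absorbance) pairs with low <= wavenumber < high."""
    return [(w, a) for w, a in zip(wavenumbers, absorbance) if low <= w < high]


def min_max_scale(values):
    """Rescale values to [0, 1]; an empty or flat signal maps to zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def amide_features(wavenumbers, absorbance):
    """Concatenate the scaled Amide-I and Amide-II absorbance values."""
    features = []
    for low, high in AMIDE_BANDS.values():
        band = select_band(wavenumbers, absorbance, low, high)
        features.extend(min_max_scale([a for _, a in band]))
    return features


# Toy spectrum sampled every 25 cm^-1 (made-up numbers, for illustration only).
wavenumbers = [1500 + 25 * i for i in range(9)]  # 1500, 1525, ..., 1700
absorbance = [0.10, 0.30, 0.50, 0.20, 0.15, 0.40, 0.90, 0.60, 0.05]
features = amide_features(wavenumbers, absorbance)
```

Each band is scaled on its own so that a strong Amide-I peak does not flatten the weaker Amide-II band. Whether that is appropriate depends on the dataset and is a design choice of this sketch.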
Results: Transfer learning is incorporated using models such as Visual Geometry
Group (VGG), Inception, ResNet, and EfficientNet. A dataset of protein spectra images is supplied
as input, and these models predict the secondary structure of proteins effectively.
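The transfer-learning idea in the Results can be sketched in miniature: a frozen feature extractor is reused unchanged, and only a small new classification head is trained. The pure-Python example below is a toy under stated assumptions, not the reviewed method. The randomly weighted `make_frozen_backbone` merely stands in for a pre-trained CNN base such as VGG or ResNet, and the data, labels, and all function names are hypothetical.

```python
import math
import random


def make_frozen_backbone(n_inputs, n_features, seed=0):
    """Stand-in for a pre-trained CNN base: fixed weights that are never updated."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_features)]

    def extract(x):
        # tanh keeps every feature in [-1, 1].
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return extract


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(head, feats, labels):
    """Mean binary cross-entropy of a linear head on fixed features."""
    total = 0.0
    for f, y in zip(feats, labels):
        p = sigmoid(sum(w * v for w, v in zip(head, f)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def train_head(feats, labels, lr=0.5, epochs=200):
    """Fit only the new classification head by full-batch gradient descent."""
    head = [0.0] * len(feats[0])
    for _ in range(epochs):
        grad = [0.0] * len(head)
        for f, y in zip(feats, labels):
            err = sigmoid(sum(w * v for w, v in zip(head, f))) - y
            for j, v in enumerate(f):
                grad[j] += err * v / len(labels)
        head = [w - lr * g for w, g in zip(head, grad)]
    return head


# Toy data: 4-value "spectra" with binary labels (made up for illustration).
rng = random.Random(1)
inputs = [[rng.random() for _ in range(4)] for _ in range(20)]
labels = [1 if x[0] > x[1] else 0 for x in inputs]

backbone = make_frozen_backbone(n_inputs=4, n_features=3)
feats = [backbone(x) + [1.0] for x in inputs]  # frozen features plus a bias term
initial_loss = log_loss([0.0] * 4, feats, labels)
head = train_head(feats, labels)
final_loss = log_loss(head, feats, labels)
```

In a real workflow the frozen extractor would be a network with weights learned on a large image dataset, loaded from a deep learning library. Only the division of labour, fixed features and a trainable head, carries over from this sketch.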
Conclusion: Although deep learning has only recently been explored in this field of research, it has
worked remarkably well in this application, and it needs continuous improvement as new models are developed.
Keywords: Deep learning, transfer learning, pre-trained models, pre-processing, Fourier transform infrared spectroscopy, secondary structure.