An Automatic Text Summarization Method with the Concern of Covering Complete Formation

Page: [977 - 986] Pages: 10

Abstract

Background: Text summarization is the process of generating a short description of an entire document that would otherwise be difficult to read in full. It provides a convenient way to extract the most useful information as a short summary. Existing research addresses this task with the Fuzzy Rule-based Automated Summarization Method (FRASM). However, FRASM has limitations that restrict its applicability to real-world applications: it is suitable only for single-document summarization, whereas many applications, such as those in research industries, need to summarize information from multiple documents.

Methods: This paper proposes the Multi-document Automated Summarization Method (MDASM), a summarization framework intended to produce accurate summaries from multiple documents, whereas the existing system performs only single-document summarization. First, document clustering is performed using a modified k-means clustering algorithm to group documents with similar meaning; similarity is identified by frequent term measurement. After clustering, pre-processing is performed with a hybrid TF-IDF and singular value decomposition technique, which eliminates irrelevant content and retains the required content. Sentence measurement is then done by introducing an additional metric, namely title measurement, alongside the metrics of the existing work, to more accurately retrieve the sentences with the highest similarity. Finally, a fuzzy rule system is applied to perform text summarization.
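
The sentence-scoring stages described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation (their evaluation was carried out in a different environment), and it omits the clustering and singular value decomposition steps. The tokenizer, the smoothed IDF formula, the ramp-shaped membership function, and the two toy rules are all assumptions chosen for illustration, as are the function names.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(sentences):
    """Score each sentence by the sum of TF * IDF over its terms, scaled to [0, 1]."""
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    raw = []
    for doc in docs:
        if not doc:
            raw.append(0.0)
            continue
        tf = Counter(doc)
        # Assumed IDF variant: log(n / df) + 1, which is always positive.
        raw.append(sum((c / len(doc)) * (math.log(n / df[t]) + 1.0)
                       for t, c in tf.items()))
    top = max(raw, default=0.0)
    return [r / top if top > 0 else 0.0 for r in raw]


def title_score(sentence, title):
    """Fraction of distinct title terms that also appear in the sentence."""
    title_terms = set(tokenize(title))
    if not title_terms:
        return 0.0
    return len(title_terms & set(tokenize(sentence))) / len(title_terms)


def high(x, low=0.2, full=0.8):
    """Hypothetical membership in the fuzzy set 'high': a linear ramp."""
    return min(1.0, max(0.0, (x - low) / (full - low)))


def fuzzy_importance(tfidf, title):
    """Combine two toy rules with min as AND and max as OR."""
    # Rule 1: IF tfidf is high AND title match is high THEN important.
    strong = min(high(tfidf), high(title))
    # Rule 2: IF either is high THEN somewhat important (half weight).
    weak = 0.5 * max(high(tfidf), high(title))
    return max(strong, weak)


def summarize(title, sentences, k):
    """Return the k highest-scoring sentences in their original order."""
    tfidf = tfidf_scores(sentences)
    scores = [fuzzy_importance(tfidf[i], title_score(s, title))
              for i, s in enumerate(sentences)]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

Calling `summarize(title, sentences, k)` on the sentences of one document cluster returns the `k` sentences ranked highest by the toy rule base. A real system would replace the two rules with a fuller rule base and add the remaining sentence metrics.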

Results: The research work is evaluated in the MATLAB simulation environment, and the evaluation shows that the proposed method achieves a better outcome than the existing method in terms of accurate summarization. Compared with FRASM, MDASM produces 89.28% increased accuracy, 89.28% increased precision, 89.36% increased recall, and 70% increased F-measure.
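
For reference, the four reported measures can be computed as in the following sketch. It assumes an extractive setting in which a summary is a set of selected sentence indices compared against a reference set; the abstract does not state how the measures were computed, so this setup and the function name `summary_metrics` are assumptions.

```python
def summary_metrics(selected, reference, total):
    """Accuracy, precision, recall, and F-measure for sentence selection.

    selected:  indices of sentences chosen by the summarizer (assumed setup)
    reference: indices of sentences in the reference summary
    total:     number of sentences in the source document(s)
    """
    selected, reference = set(selected), set(reference)
    tp = len(selected & reference)   # chosen and in the reference
    fp = len(selected - reference)   # chosen but not in the reference
    fn = len(reference - selected)   # in the reference but missed
    tn = total - tp - fp - fn        # correctly left out
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}
```

The F-measure here is the harmonic mean of precision and recall, so it is high only when both are high.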

Conclusion: The summarization processes carried out in this work provide an accurate summarized outcome.

Keywords: Summarization, frequent term measurement, irrelevant content, multi documents, sentence measurement, TF-IDF.
