E. Rahm, and H.H. Do, "Data Cleaning: Problems and current approaches", Q. Bull. Comput. Soc. IEEE Tech. Comm. Data Eng., vol. 23, no. 4, pp. 3-13, 2000.
M. Karthigha, and S. Krishna Anand, "A survey on removal of duplicate records in database", Indian J. Sci. Technol., vol. 6, no. 4, pp. 4306-4311, 2013.
S.R. Alenazi, K. Ahmad, and A. Olowolayemo, "A review of similarity measurement for record duplication detection", 6th International Conference on Electrical Engineering and Informatics (ICEEI), Langkawi, Malaysia, 2017.
S. Sarawagi, and A. Bhamidipaty, "Interactive deduplication using active learning", Proc. Eighth ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining, pp. 269-278, 2002.
L. Philips, "Hanging on the Metaphone", Comput. Lang. Mag., vol. 7, no. 12, pp. 39-44, 1990.
L. Philips, "The Double Metaphone Search Algorithm", C/C++ Users J., vol. 18, no. 5, pp. 38-43, 2000.
W. Su, J. Wang, and F.H. Lochovsky, "Record Matching over Query Results from Multiple Web Databases", IEEE Trans. Knowl. Data Eng., vol. 22, no. 4, pp. 578-589, 2010.
S.B. Kotsiantis, "Supervised learning: A review of classification techniques", Informatica, vol. 1, no. 31, pp. 249-268, 2007.
S. Hemalatha, K. Raja, and A. Tholkappia, "Duplicate Detection of Query Results from Multiple Web Databases", IJCA Special Issue on Computational Science - New Dimension & Perspectives, 2011.
B. Daggupati, "Unsupervised Duplicate Detection (UDD) of query results from multiple web databases", M.S. thesis, California State University, Los Angeles, CA, 2011.
S. Gaikwad, and B. Nagaraju, "A survey analysis on duplicate detection in hierarchical data", 2015 International Conference on Pervasive Computing (ICPC), IEEE, Pune, India, 2015.
G. Li, Q. Wu, D. Tu, and S. Sun, "A sorted neighborhood approach for detecting duplicated regions in image forgeries based on DWT and SVD", in Multimedia and Expo, 2007 IEEE International Conference on, 2007.
H. Lu, X. Chen, X. Lan, and F. Zheng, "Duplicate data detection using GNN", in 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), 2016.
B. Forchhammer, "Duplicate detection on GPUs", HPI Future SOC Lab 70.3, Jan 2013.
J. Barateiro, and H. Galhardas, "A survey of data quality tools", Datenbank-Spektrum, vol. 14, no. 5, pp. 15-21, 2005.
W.E. Winkler, "Overview of record linkage and current research directions", Bureau of the Census, Feb 2006.
R. Baxter, P. Christen, and T. Churches, "A Comparison of Fast Blocking Methods for Record Linkage", Proc. KDD Workshop Data Cleaning, Record Linkage, and Object Consolidation, pp. 25-27, 2003.
W.E. Winkler, "The state of record linkage and current research problems", Technical Report RR99/04. US Census Bureau, 1999.
I.P. Fellegi, and A.B. Sunter, "A theory for record linkage", J. Am. Stat. Assoc., vol. 64, no. 328, pp. 1183-1210, 1969.
C.C. Chang, and C-J. Lin, "LIBSVM: A library for support vector machines, manual", ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 1-27, 2011.
P. Christen, "Automatic Record Linkage Using Seeded Nearest Neighbour and Support Vector Machine Classification", in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), 2008.
H. Yu, J. Han, and K.C.-C. Chang, "PEBL: Web page classification without negative examples", IEEE Trans. Knowl. Data Eng., vol. 16, no. 1, pp. 70-81, 2004.
M. Bilenko, and R.J. Mooney, "On evaluation and training-set construction for duplicate detection", Proceedings of the KDD-03 Workshop on Data Cleaning, Record Linkage, and Object Consolidation, pp. 7-12, 2003.
P. Ravikumar, and W.W. Cohen, "A hierarchical graphical model for record linkage", arXiv [cs.LG], 2012.