[1]
E. Rahm, and H.H. Do, "Data Cleaning: Problems and current approaches", Q. Bull. Comput. Soc. IEEE Tech. Comm. Data Eng., vol. 23, no. 4, pp. 3-13, 2000.
[3]
M. Karthigha, and S. Krishna Anand, "A survey on removal of duplicate records in database", Indian J. Sci. Technol., vol. 6, no. 4, pp. 4306-4311, 2013.
[5]
S.R. Alenazi, K. Ahmad, and A. Olowolayemo, "A review of similarity measurement for record duplication detection", 6th International Conference on Electrical Engineering and Informatics (ICEEI), Langkawi, Malaysia, 2017.
[6]
S. Sarawagi, and A. Bhamidipaty, "Interactive deduplication using active learning", Proc. Eighth ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining, 2002, pp. 269-278.
[8]
L. Philips, "Hanging on the Metaphone", Comput. Lang. Mag., vol. 7, no. 12, pp. 39-44, 1990.
[10]
L. Philips, "The Double Metaphone Search Algorithm", C/C++ Users J., vol. 18, no. 5, pp. 38-43, 2000.
[11]
W. Su, J. Wang, and F.H. Lochovsky, "Record Matching over Query Results from Multiple Web Databases", IEEE Trans. Knowl. Data Eng., vol. 22, no. 4, pp. 578-589, 2010.
[12]
S.B. Kotsiantis, "Supervised learning: A review of classification techniques", Informatica, vol. 1, no. 31, pp. 249-268, 2007.
[13]
S. Hemalatha, K. Raja, and A. Tholkappia, "Duplicate Detection of Query Results from Multiple Web Databases", IJCA Special Issue on Computational Science—New Dimension & Perspectives, 2011.
[15]
B. Daggupati, "Unsupervised Duplicate Detection (UDD) of query results from multiple web databases", M.S. thesis, California State University, Los Angeles, CA, 2011.
[16]
S. Gaikwad, and B. Nagaraju, "A survey analysis on duplicate detection in hierarchical data", 2015 International Conference on Pervasive Computing (ICPC), IEEE, Pune, India, 2015.
[18]
G. Li, Q. Wu, D. Tu, and S. Sun, "A sorted neighborhood approach for detecting duplicated regions in image forgeries based on DWT and SVD", in Multimedia and Expo, 2007 IEEE International Conference on, 2007.
[19]
H. Lu, X. Chen, X. Lan, and F. Zheng, "Duplicate data detection using GNN", in 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), 2016.
[20]
B. Forchhammer, "Duplicate detection on GPUs", HPI Future SOC Lab 70.3, Jan 2013.
[22]
J. Barateiro, and H. Galhardas, "A survey of data quality tools", Datenbank-Spektrum, vol. 14, no. 5, pp. 15-21, 2005.
[27]
W.E. Winkler, "Overview of record linkage and current research directions", Bureau of the Census, Feb 2006.
[28]
R. Baxter, P. Christen, and T. Churches, "A Comparison of Fast Blocking Methods for Record Linkage", Proc. KDD Workshop Data Cleaning, Record Linkage, and Object Consolidation, 2003, pp. 25-27.
[29]
W.E. Winkler, "The state of record linkage and current research problems", Technical Report RR99/04. US Census Bureau, 1999.
[30]
I.P. Fellegi, and A.B. Sunter, "A theory for record linkage", J. Am. Stat. Assoc., vol. 64, no. 328, pp. 1183-1210, 1969.
[31]
C.-C. Chang, and C.-J. Lin, "LIBSVM: A library for support vector machines, manual", ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 1-27, 2011.
[32]
P. Christen, "Automatic Record Linkage Using Seeded Nearest Neighbour and Support Vector Machine Classification", in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08), 2008.
[33]
H. Yu, J. Han, and K.C.-C. Chang, "PEBL: Web page classification without negative examples", IEEE Trans. Knowl. Data Eng., vol. 16, no. 1, pp. 70-81, 2004.
[37]
M. Bilenko, and R.J. Mooney, Proceedings of the KDD-03 Workshop on Data Cleaning, Record Linkage, and Object Consolidation, pp. 7-12, 2003.
[38]
P. Ravikumar, and W.W. Cohen, "A hierarchical graphical model for record linkage", arXiv [cs.LG], 2012.