18th International Conference on Data Engineering (ICDE'02)   p. 0017
TAILOR: A Record Linkage Tool Box

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDE.2002.994694

Abstract
Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate records), commonly known as record linkage, is one of the essential elements of data cleaning. In this paper, we address the record linkage problem by adopting a machine learning approach. Three models are proposed and are analyzed empirically. Since no existing model, including those proposed in this paper, has been proved to be superior, we have developed an interactive Record Linkage Toolbox named TAILOR. Users of TAILOR can build their own record linkage models by tuning system parameters and by plugging in in-house developed and public domain tools. The proposed toolbox serves as a framework for the record linkage process, and is designed in an extensible way to interface with existing and future record linkage models. We have conducted an extensive experimental study to evaluate our proposed models using not only synthetic but also real data. Results show that the proposed machine learning record linkage models outperform the existing ones both in accuracy and in performance.
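To make the record linkage task described above concrete, the following is a minimal sketch in Python. It is not one of the three models proposed in the paper and does not reflect TAILOR's interface: the toy records, the field names, the string-similarity measure (the standard library's difflib.SequenceMatcher) and the fixed threshold of 0.85 are all assumptions chosen for illustration. A learned decision model would take the place of the simple averaging rule shown here.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records; the field names are illustrative only.
records = [
    {"id": 1, "name": "John Smith", "city": "Lafayette"},
    {"id": 2, "name": "Jon Smith", "city": "Lafayette"},
    {"id": 3, "name": "Mary Jones", "city": "Chicago"},
]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def comparison_vector(r1: dict, r2: dict, fields=("name", "city")) -> list:
    """One similarity score per compared field for a record pair."""
    return [similarity(r1[f], r2[f]) for f in fields]


def link(recs: list, threshold: float = 0.85) -> list:
    """Return id pairs whose mean field similarity reaches the threshold.

    The threshold rule is an assumed stand-in for a trained classifier.
    """
    matches = []
    for r1, r2 in combinations(recs, 2):
        vec = comparison_vector(r1, r2)
        if sum(vec) / len(vec) >= threshold:
            matches.append((r1["id"], r2["id"]))
    return matches


matched_pairs = link(records)
```

With the toy data above, only records 1 and 2 are declared a match. Comparing every pair is quadratic in the number of records, so a sketch like this is suitable only for small inputs.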
Additional Information

Citation: Mohamed G. Elfeky, Ahmed K. Elmagarmid, Vassilios S. Verykios, "TAILOR: A Record Linkage Tool Box," 18th International Conference on Data Engineering (ICDE'02), p. 0017, 2002
