Seventh European Conference on Software Maintenance and Reengineering (CSMR'03)   p. 82
Towards a Benchmark for Web Site Extractors: A Call for Community Participation

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSMR.2003.1192414

Abstract
The purpose of this paper is to propose a benchmark for comparing fact extractors for Web sites and to invite interested researchers and practitioners to participate in its development. Fact extraction is a fundamental and difficult problem in both traditional software reverse engineering and Web site reverse engineering. In both domains, there are often irregularities in the input that violate an extractor’s unstated assumptions. Consequently, it is difficult to predict how an extractor will perform on a given input. To remedy this problem, we created a benchmark for comparing fact extractors for the C++ programming language. We found that this benchmark improved our understanding of fact extraction, the tools produced, and the maturity of the community. The same approach, we believe, will be beneficial for Web site extractors, and we propose WebETS (Web site Extractor Test Suite). In this paper, we give some starting points for the design of WebETS and ask others to join in the effort.
Additional Information
Index Terms: Reverse engineering, benchmark, Web sites, fact extraction

Citation: Holger M. Kienle, Susan Elliott Sim, "Towards a Benchmark for Web Site Extractors: A Call for Community Participation," Seventh European Conference on Software Maintenance and Reengineering (CSMR'03), p. 82, 2003
