The Fourth International Conference on Computer and Information Technology (CIT'04), pp. 970-977
A Maximal Frequent Itemset Approach for Web Document Clustering


DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CIT.2004.1357322

Abstract
Clustering web documents efficiently and accurately is of great interest to web users and is a key factor in the search accuracy of a web search engine. To this end, this paper introduces a new approach to web document clustering, called the Maximal Frequent Itemset (MFI) approach. Iterative clustering algorithms, such as K-means and Expectation-Maximization (EM), are sensitive to their initial conditions. The MFI approach first locates the center points of high-density clusters precisely. These center points are then used as the initial points for the K-means algorithm. Experimental results on three web document sets show that the MFI approach outperforms the other methods compared in most cases, particularly when the document sets contain a large number of categories.
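
The idea of seeding K-means from maximal frequent itemsets can be illustrated with a minimal sketch. The abstract does not say how center points are derived from the itemsets, so the rule used here (the centroid of the documents that contain every term of an MFI) is an assumption, as are all function names, the brute-force miner, the binary term vectors, and the toy data. It should not be read as the authors' implementation.

```python
from itertools import combinations

def maximal_frequent_itemsets(docs, min_support):
    """Brute-force miner for toy data: returns frequent term sets with no frequent superset."""
    vocab = sorted(set().union(*docs))
    frequent = []
    for size in range(1, len(vocab) + 1):
        level = [frozenset(c) for c in combinations(vocab, size)
                 if sum(1 for d in docs if set(c) <= d) >= min_support]
        if not level:
            break  # no frequent set of this size, so none larger exists
        frequent.extend(level)
    return [s for s in frequent if not any(s < t for t in frequent)]

def to_vector(doc, vocab):
    # Binary bag-of-words vector (an assumption; the paper may weight terms differently).
    return [1.0 if term in doc else 0.0 for term in vocab]

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def mfi_seeds(docs, vocab, min_support, k):
    """Assumed seeding rule: centroid of the documents covering each of the top-k MFIs."""
    mfis = maximal_frequent_itemsets(docs, min_support)
    support = lambda s: sum(1 for d in docs if s <= d)
    mfis.sort(key=lambda s: (-support(s), sorted(s)))  # deterministic order
    return [mean([to_vector(d, vocab) for d in docs if s <= d]) for s in mfis[:k]]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = mean(members)
    return labels, centers

# Toy "web documents" represented as sets of terms (hypothetical data).
docs = [
    {"python", "code", "bug"}, {"python", "code", "test"}, {"python", "code", "loop"},
    {"soccer", "goal", "team"}, {"soccer", "goal", "match"}, {"soccer", "goal", "coach"},
]
vocab = sorted(set().union(*docs))
points = [to_vector(d, vocab) for d in docs]
seeds = mfi_seeds(docs, vocab, min_support=3, k=2)
labels, centers = kmeans(points, seeds)
print(labels)
```

Because the seeds already sit inside the two dense groups, the iterations converge almost immediately on this toy input. A real miner would replace the exponential brute-force search with a dedicated MFI algorithm.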
Additional Information

Citation: Ling Zhuang, Honghua Dai, "A Maximal Frequent Itemset Approach for Web Document Clustering," The Fourth International Conference on Computer and Information Technology (CIT'04), pp. 970-977, 2004
