Abstract
Abstract: Mining the large Web based online distributed databases to discover new knowledge and financial gain is an important research problem. These computations require high performance distributed and parallel computing environments. Traditional data mining techniques such as classification, association, clustering can be extended to find new efficient solutions. The paper presents the scalable data mining problem, proposes the use of software DSM (distributed shared memory) with a new mechanism as an effective solution and discusses both the implementation and performance evaluation results. It is observed that the overhead of a software DSM is very large for scalable data mining programs. A new Log Based Consistency (LBC) mechanism, especially designed for scalable data mining on the software DSM is proposed to overcome this overhead. Traditional association rule based data mining programs frequently modify the same fields by count-up operations. In contrast, the LBC mechanism keeps up the consistency by broadcasting the count-up operation logs among the multiple nodes.