Clustering of Categorized Text Data Using Cobweb Algorithm

Kavita, Pallavi Bedi

Abstract: The objective of clustering is to partition an unstructured set of objects into clusters (groups). Clustering considers the distance and similarity between objects: similar objects should be grouped in the same cluster and dissimilar objects in different clusters. Clustering is a widely studied data mining problem in the text domain. In this paper we use the ‘Labor Dataset’, stored in ARFF (Attribute-Relation File Format) and containing 17 attributes and 57 instances, to apply clustering and classification techniques of data mining. We obtain results with a simple classification technique (the naïve Bayes classifier) and a clustering technique (the Cobweb algorithm), based on various parameters, using WEKA (Waikato Environment for Knowledge Analysis), a data mining tool. The results of the experiment show that clustering and classification give promising results, with high accuracy and robustness, even when the data set contains missing values.

Keywords: Data Mining; Naïve Bayes classifier; Cobweb algorithm; WEKA; Labor dataset.

International Journal of Computer Science and Information Technology Research

ISSN 2348-1196 (print), ISSN 2348-120X (online)

Research Publish Journals