Data Pre-processing For Performance Analysis

Shobha Tyagi, Komal Chaudhary

Abstract: Pre-processing is a necessity for today's data. Data must be converted into a valid form before it can be useful and yield reliable results. This paper focuses on cleaning data by filling in missing values and on identifying the class of each instance, i.e. classification. Two versions of a data set are compared in Weka in terms of error rate: the first with missing values, and the second with those missing values filled in by averaging and the class assigned using Random Forest. Two algorithms, J48 and Random Forest, are also compared in Weka in terms of performance (accuracy and error rate).

Keywords: Data mining, pre-processing, random forest, decision tree, classification, data cleaning.
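
The cleaning step and the two performance measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' Weka workflow: it is plain Python, it assumes numeric attributes with missing entries marked as None (Weka's ARFF format marks them with "?"), and the function names fill_missing_with_mean, error_rate and accuracy are hypothetical names chosen for this illustration.

```python
from statistics import mean


def fill_missing_with_mean(rows, missing=None):
    """Replace each missing entry with the mean of the observed values
    in the same column (assumption: all attributes are numeric)."""
    columns = list(zip(*rows))
    col_means = []
    for col in columns:
        observed = [v for v in col if v is not missing]
        # A column with no observed values cannot be averaged; leave it as is.
        col_means.append(mean(observed) if observed else missing)
    return [
        [m if v is missing else v for v, m in zip(row, col_means)]
        for row in rows
    ]


def error_rate(actual, predicted):
    """Fraction of instances whose predicted class differs from the actual class."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def accuracy(actual, predicted):
    """Fraction of correctly classified instances (1 - error rate)."""
    return 1.0 - error_rate(actual, predicted)
```

With such helpers, the class labels predicted by any two classifiers (for example J48 and Random Forest models exported from Weka) could be scored against the same ground truth, once on the raw data set and once on the mean-filled one.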

International Journal of Computer Science and Information Technology Research

ISSN 2348-1196 (print), ISSN 2348-120X (online)

Research Publish Journals