Finding Frequently Occurring Itemset Pair on Big Data

Raju K Gupta, Maheshwari R Tegampure, Purushottam K Singh, Shivani, Akshay Kumar, Prof. Prajakta Ugale

Abstract: Frequent itemset mining (FIM) is one of the most important techniques for extracting knowledge from data in many real-world applications. The volume of data available today, particularly on the internet, grows day by day, and with it grows the difficulty of extracting useful information from that data, a process known as data mining. A number of methods, including parallel programming approaches, have been proposed for this task, but they do not perform well when applied to Big Data. These methods also have problems of their own, relating to 1) balanced data distribution and 2) inter-communication costs. In this project, we investigate the applicability of FIM techniques on the MapReduce platform. We introduce two new methods for mining large datasets: 1) the FIC (Fastest Itemset Calculating) algorithm, which focuses on speed, and 2) the Ec-Apriori algorithm, which is optimized to run on very large datasets.
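
To illustrate what mining frequent itemset pairs in a MapReduce style can look like, the following is a minimal single-machine sketch in Python. It is not the FIC or Ec-Apriori algorithm, whose details the abstract does not give; it is a plain pair-counting baseline. The function names (`map_pairs`, `reduce_counts`, `frequent_pairs`), the sample transactions, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_pairs(transaction):
    """Map step (hypothetical helper): emit (pair, 1) for each unordered item pair."""
    # Deduplicate and sort so that each pair has one canonical key.
    items = sorted(set(transaction))
    for pair in combinations(items, 2):
        yield pair, 1


def reduce_counts(emitted):
    """Reduce step (hypothetical helper): sum the emitted values per pair key."""
    counts = Counter()
    for key, value in emitted:
        counts[key] += value
    return counts


def frequent_pairs(transactions, min_support):
    """Return the pairs that occur in at least `min_support` transactions."""
    emitted = (kv for t in transactions for kv in map_pairs(t))
    counts = reduce_counts(emitted)
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Illustrative data only; not taken from the paper.
transactions = [
    ["bread", "milk", "eggs"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
result = frequent_pairs(transactions, min_support=2)
print(result)
```

On a real MapReduce platform the map and reduce steps would run on separate nodes, with the framework grouping the emitted pairs by key between the two steps; that shuffle is where the data distribution and communication costs mentioned in the abstract would arise.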

Keywords: Big Data, MapReduce, Hadoop, frequent itemset.

International Journal of Computer Science and Information Technology Research

ISSN 2348-1196 (print), ISSN 2348-120X (online)

Research Publish Journals