Bigram Extraction and Sentiment Classification on Unstructured Movie Data

Shubham Tripathi

Abstract: Sentiment classification of text is now widely used across diverse fields. The number of movies released worldwide keeps growing, and as the movie market becomes more popular, reviews offer insight into individual movies. With reviews numbering in the thousands or even hundreds of thousands, it is imperative to classify them as accurately as possible so that the resulting ratings fit the scale in use. Conventionally, sentiment prediction systems look at words in isolation and assign positive and negative scores to sentences. In this research, we apply a statistical technique (Naïve Bayes classification) to movie review data to find the overall sentiment of a document, and we compare the conventional prediction system with a new technique for exploiting features. The research proceeds in parts: mining features from reviews posted on websites, identifying the relevant corpus on which to apply the algorithms, and finally drawing conclusions from the results. We apply the methods using RStudio. We conclude by examining factors and devising ways for feature selection in the above-mentioned technique.

Keywords: Sentiment classification, review, text mining, feature extraction.
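
The contrast the abstract draws between scoring words in isolation and exploiting richer features can be illustrated with a small sketch. The code below is not the author's implementation (the paper reports using R Studio); it is a minimal Python illustration, assuming whitespace tokenization, adjacent-word bigrams joined with an underscore, and a multinomial Naïve Bayes model with add-one smoothing. The function names and the four toy reviews are hypothetical and exist only for this example.

```python
import math
from collections import Counter

# Hypothetical toy reviews, invented for illustration only.
TRAIN_DOCS = [
    ("the plot was good", "pos"),
    ("a really good film", "pos"),
    ("the acting was not good", "neg"),
    ("not good at all", "neg"),
]

def features(text, use_bigrams=True):
    """Return unigram features, optionally followed by adjacent-word bigrams."""
    tokens = text.lower().split()  # assumption: plain whitespace tokenization
    feats = list(tokens)
    if use_bigrams:
        feats += [a + "_" + b for a, b in zip(tokens, tokens[1:])]
    return feats

def train(labeled_docs, use_bigrams=True):
    """Count features per class for a multinomial Naive Bayes model."""
    counts = {}            # label -> Counter of feature frequencies
    doc_counts = Counter() # label -> number of training documents
    vocab = set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        class_counts = counts.setdefault(label, Counter())
        for feat in features(text, use_bigrams):
            class_counts[feat] += 1
            vocab.add(feat)
    return counts, doc_counts, vocab

def classify(text, model, use_bigrams=True):
    """Pick the label with the highest log posterior (add-one smoothing)."""
    counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(counts):
        score = math.log(doc_counts[label] / total_docs)  # class prior
        total = sum(counts[label].values())
        for feat in features(text, use_bigrams):
            if feat in vocab:  # features unseen in training are skipped
                score += math.log((counts[label][feat] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    model = train(TRAIN_DOCS)
    print(classify("not good", model))
    print(classify("really good film", model))
```

With bigrams enabled, a phrase such as "not good" contributes the feature `not_good` in addition to `not` and `good`, so a negation can be counted as a unit rather than as two unrelated words. Passing `use_bigrams=False` to both `train` and `classify` gives the words-in-isolation baseline for comparison.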

International Journal of Electrical and Electronics Research

ISSN 2348-6988 (online)

Research Publish Journals

Vol. 3, Issue 3, July 2015 – September 2015
