Abstract: Hive is an open-source system designed to handle large amounts of data. It is built on top of Hadoop and stores data in tables, like a relational database management system. Today, many organizations use Apache Hive to process their data. Because Hive is being used increasingly widely, more efficient and flexible techniques are needed to improve query performance in Hive. The goal of this paper is to conduct a comparative analysis of two query optimization techniques used by Hive: map-reduce and cost-based optimization. We describe the methodology used to perform the test. After performing the test, we conclude that map-reduce gave the best query response time in Hive.
Keywords: Hive; query optimization; map-reduce; cost-based optimization.
Title: Query optimization techniques in Hive: Comparative Analysis
Authors: Khadeeja Alsolami, Fahad Alqurashi
International Journal of Engineering Research and Reviews
ISSN 2348-697X (Online)
Research Publish Journals