Data Management Using Hadoop and Hive
Abstract
Today is the era of digitization, visible in every sector of life from government offices to professional work. Social networking sites, web blogs, financial transactions, and clickstream applications generate huge volumes of data. Such continuously generated, high-velocity, voluminous data, both structured and unstructured and arriving from varied sources, is called big data. Traditional database systems are unable to manage such data, which creates the need to optimize and modernize traditional data management techniques.
Hadoop is an open-source distributed computing platform for storing and managing big data efficiently, and its ecosystem provides components that address big data problems. Hive, the data warehouse component of the Hadoop ecosystem, offers an SQL-like interface for querying, storing, and analyzing voluminous data. In this paper, the MovieLens dataset is used to demonstrate data management on the Hadoop platform using Hive.