An Approach to Analyze YouTube Data using Hadoop
Data volumes are growing in every domain, including education, medicine, and social networking sites. Data at this scale is called big data, and it may be structured, semi-structured, or unstructured. Such data is difficult to access with traditional methods, so organizations use big data analytics techniques to obtain results at lower cost and in less time. In this paper, Apache Hive is used to analyze YouTube data with lower latency and fewer data processing stages. Traditional methods are slow at accessing and processing data; moreover, they handle only structured data and only limited volumes of it. The Apache Hadoop framework was designed to overcome these drawbacks by accessing and processing huge amounts of data, and it provides many components for handling big data. In the existing method, big data is analyzed and processed with MapReduce in multiple stages. Implementing iterative MapReduce jobs is expensive because each job consumes a large amount of space. To overcome the drawbacks of the existing method, Hive is used to analyze the data, and it outperforms the state-of-the-art method. The YouTube information is retrieved with a generated API (Application Programming Interface) key and then analyzed in Hive using SQL queries. The approach is implemented on the Cloudera platform, which gives efficient results.
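As an illustration of the SQL-style analysis described above, the following minimal sketch runs one aggregate query over a few video records. It is not the paper's implementation: Python's built-in sqlite3 module stands in for Hive so that the example is self-contained, and the table name `youtube_videos`, its columns, and the sample rows are hypothetical. A statement of roughly the same shape (GROUP BY, ORDER BY, LIMIT) could be written in HiveQL against a table loaded from data retrieved with the API key.

```python
import sqlite3

# Hypothetical sample rows: (video_id, category, views, likes).
# In the setting described in the text, such rows would come from
# YouTube data retrieved with an API key; here they are invented.
SAMPLE_VIDEOS = [
    ("v1", "Music", 1500, 120),
    ("v2", "Music", 500, 30),
    ("v3", "Education", 900, 80),
    ("v4", "Comedy", 300, 10),
    ("v5", "Education", 100, 5),
]


def top_categories(rows, limit=2):
    """Return (category, video_count, total_views) for the most-viewed categories."""
    # sqlite3 is only a local stand-in for Hive (an assumption of this sketch).
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE youtube_videos ("
            "video_id TEXT, category TEXT, views INTEGER, likes INTEGER)"
        )
        conn.executemany("INSERT INTO youtube_videos VALUES (?, ?, ?, ?)", rows)
        # One declarative query replaces what would otherwise be a
        # hand-written grouping stage followed by a sorting stage.
        query = """
            SELECT category, COUNT(*) AS video_count, SUM(views) AS total_views
            FROM youtube_videos
            GROUP BY category
            ORDER BY total_views DESC
            LIMIT ?
        """
        return conn.execute(query, (limit,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for category, video_count, total_views in top_categories(SAMPLE_VIDEOS):
        print(category, video_count, total_views)
```

The point of the sketch is the shape of the query: grouping, aggregation, and ranking are expressed in a single statement rather than as a chain of separately written jobs.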