Resource Aware Execution of Speculated Tasks in Hadoop with SDN

  • Mir Wajahat Hussain, K Hemant Kumar Reddy, Diptendu Sinha Roy

Abstract

Software-defined networking (SDN) is a networking paradigm that separates the data plane from the control plane and targets networks in which large volumes of data must be moved between nodes. As the volume of generated data has grown, Hadoop has become a de facto standard for handling it, using its MapReduce computation engine to process the data. One important issue in MapReduce is how to identify and address the performance deterioration caused by slow-running tasks. Hadoop handles this automatically by scheduling a backup copy of the slow task on another node that has an empty slot. However, this might not improve performance, because backup tasks are launched on nodes blindly, without any knowledge of their computational capabilities. In this paper we discuss an approach to handling speculated tasks: the nodes in the cluster are profiled, each speculated task is scheduled on a node that has performed better, and network resources are improved by prioritizing the flow entries corresponding to those nodes with the help of SDN. Further care is taken when scheduling tasks that have data skew. Experiments conducted in the paper demonstrate an improvement of about 10-15% in job completion time.
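
The node-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `NodeProfile` type, the `avg_task_seconds` metric, and the `pick_backup_node` helper are hypothetical names introduced here, and the sketch assumes that "performed better" means a lower average completion time for past tasks. The SDN flow prioritization and data-skew handling are not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class NodeProfile:
    """Hypothetical per-node profile; field names are assumptions."""
    name: str
    free_slots: int
    avg_task_seconds: float  # assumed metric: mean duration of past tasks


def pick_backup_node(
    nodes: Sequence[NodeProfile], slow_node: str
) -> Optional[NodeProfile]:
    """Pick the best-profiled node with a free slot for a speculated task.

    The node already running the slow task is excluded. Returns None when
    no other node has an empty slot.
    """
    candidates = [
        n for n in nodes if n.free_slots > 0 and n.name != slow_node
    ]
    if not candidates:
        return None
    # Lower average task time is treated as better past performance.
    return min(candidates, key=lambda n: n.avg_task_seconds)


# Illustrative cluster; the numbers are made up for the example.
cluster = [
    NodeProfile("node-a", free_slots=0, avg_task_seconds=40.0),
    NodeProfile("node-b", free_slots=2, avg_task_seconds=95.0),
    NodeProfile("node-c", free_slots=1, avg_task_seconds=55.0),
    NodeProfile("node-d", free_slots=1, avg_task_seconds=30.0),
]
chosen = pick_backup_node(cluster, slow_node="node-d")
```

A scheduler that ignores profiles might place the backup task on node-b simply because it has the most empty slots. With the profile in hand, the sketch prefers node-c, the fastest node that has a free slot and is not the one already running the slow task.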

Published
2019-11-04
How to Cite
Hussain, M. W., Reddy, K. H. K., & Roy, D. S. (2019). Resource Aware Execution of Speculated Tasks in Hadoop with SDN. International Journal of Advanced Science and Technology, 28(13), 72-84. Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/1282
Section
Articles