Shape Recognition Based on MapReduce and In-Memory Processing on Distributed File System

  • Namkyun Baik
  • Dipankar Hazra
  • Debnath Bhattacharyya


This paper proposes two novel approaches to centroid-radii based shape retrieval on a distributed file system. A Modified Centroid-Radii model is used to calculate the shape features of the training images. These features are stored in the Hadoop Distributed File System (HDFS) rather than in a relational database, the usual choice for feature storage. HDFS can hold a collection of shapes too large to be stored on a single machine. The same Modified Centroid-Radii model is used to calculate the shape feature of the query image, which is then compared with the shape features stored in HDFS. The first approach uses a MapReduce query to recognize binary shapes. The second uses Apache Spark, whose in-memory processing is used to speed up the retrieval process. Spark-based image retrieval is faster than MapReduce-based image retrieval.
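
The feature extraction and comparison steps summarized above can be illustrated with a minimal sketch. It assumes the basic centroid-radii idea (radii measured from the shape centroid to the boundary at regular angular intervals, normalized by the longest radius), not the modified model of this paper, whose details are not given in the abstract. An in-memory dictionary stands in for the features stored in HDFS, and a linear scan stands in for the MapReduce or Spark job. All function names, the 36-angle resolution, and the Euclidean distance measure are hypothetical choices made for illustration.

```python
import math


def centroid_radii(boundary, n_angles=36):
    """Basic centroid-radii feature vector (illustrative, not the paper's modified model).

    boundary: list of (x, y) points on the outline of a binary shape.
    Returns n_angles radii, one per angular bin, scaled so the longest is 1.0.
    """
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    step = 2 * math.pi / n_angles
    radii = [0.0] * n_angles
    for x, y in boundary:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        k = int(angle / step) % n_angles  # angular bin of this boundary point
        radii[k] = max(radii[k], math.hypot(x - cx, y - cy))
    longest = max(radii) or 1.0  # avoid division by zero for a degenerate shape
    return [r / longest for r in radii]  # normalization gives scale invariance


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_shape(query, stored):
    """Return the label in `stored` whose feature vector is closest to `query`."""
    return min(stored, key=lambda label: distance(query, stored[label]))


def circle_points(radius, n=360):
    """Synthetic test outline: n points on a circle centered at the origin."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n)) for i in range(n)]


def square_points(half, n_per_side=90):
    """Synthetic test outline: points along an axis-aligned square of side 2*half."""
    pts = []
    for i in range(n_per_side):
        t = -half + 2 * half * i / n_per_side
        pts += [(t, -half), (half, t), (-t, half), (-half, -t)]
    return pts


# Stand-in for the feature store: label -> feature vector of a training shape.
stored = {
    "circle": centroid_radii(circle_points(2.0)),
    "square": centroid_radii(square_points(3.0)),
}

# A query circle of a different size should still match the stored circle.
match = nearest_shape(centroid_radii(circle_points(5.0)), stored)
```

In a distributed setting, the linear scan inside `nearest_shape` is the part that would be spread across workers: each worker computes distances for its share of the stored feature vectors, and the smallest distance is selected at the end.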