Multi-Modal Approach with Deep Embedded Clustering for Social Image Retrieval

  • M. Kiruthika et al.

Abstract

A Multi-Modal Approach (MMA) was previously proposed for social image retrieval, providing a relevant and diverse result set of images for a user query. It returned a set of images based on the text and visual features related to the images. However, for large collections of images it is difficult to identify the connection between text and visual feature descriptors in social media. An Improved MMA (IMMA) was therefore proposed to learn the relevance between text and visual feature descriptors using an optimized AlexNet for social image retrieval. In order to design a robust, accurate and computationally efficient deep learning method for social image retrieval, an IMMA with deep clustering (MMA-DeepCluster) is proposed in this paper. In MMA-DeepCluster, the fully connected layer of the optimized AlexNet is replaced with a clustering layer in which both text and visual feature descriptors are clustered. The clustered feature descriptors are used to classify images as relevant or non-relevant using an activation function. The feature vectors of the relevant images are then mapped into binary codes using a binary function. Finally, the Hamming distance between a relevant image and each database image is used to rank the images. Experiments are conducted in terms of precision, recall and accuracy to demonstrate the effectiveness of the proposed MMA-DeepCluster approach.
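The final ranking step of the pipeline described above can be sketched in a few lines. The following is a minimal illustration only, not the paper's implementation: the binary codes, image identifiers, and function names are all hypothetical, and the codes are assumed to be fixed-length bit strings stored as integers.

```python
# Illustrative sketch of Hamming-distance ranking over binary codes.
# All names and the 8-bit codes below are hypothetical examples, not
# taken from the MMA-DeepCluster paper.

def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-length binary codes stored as ints:
    XOR the codes, then count the differing bits."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query_code: int, db_codes: dict) -> list:
    """Return database image ids sorted by ascending Hamming distance
    to the query image's binary code (most similar first)."""
    return sorted(db_codes, key=lambda img_id: hamming(query_code, db_codes[img_id]))

if __name__ == "__main__":
    query = 0b10110010                   # binary code of the query image
    database = {"img_a": 0b10110011,     # differs in 1 bit
                "img_b": 0b01001101,     # differs in all 8 bits
                "img_c": 0b10100010}     # differs in 1 bit
    print(rank_by_hamming(query, database))  # → ['img_a', 'img_c', 'img_b']
```

Because the codes are short binary strings, this distance is far cheaper to compute at retrieval time than a distance over the original real-valued feature vectors, which is the usual motivation for hashing feature vectors into binary codes.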

Published
2019-12-21
How to Cite
Kiruthika, M., et al. (2019). Multi-Modal Approach with Deep Embedded Clustering for Social Image Retrieval. International Journal of Advanced Science and Technology, 28(17), 946-995. Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/2474