Camera-based Tri-lingual Script Identification at Word level using a Combination of SFTA and LBP Features

  • B. V. Dhandra, Satishkumar Mallappa, Gururaj Mukarambi

Abstract

This paper presents the identification of scripts at word level in camera-based multi-script document images. Camera-captured document images pick up noise during capture, and that noise makes the scripts harder to identify. Combinations of the following scripts are considered: Tamil, Punjabi, English, Oriya, Telugu, Gujarati, Malayalam, Kannada, Hindi, Bengali, and Urdu. The experiments were conducted on a large dataset of 77,000 word images, with 7,000 word images per script. Two texture features, SFTA and LBP, are combined to obtain the highest recognition accuracy. With the KNN and SVM classifiers respectively, the recognition rate is 77.94% and 82.39% for SFTA features and 89.82% and 93.94% for LBP features; for the combined feature vector, KNN gives 94.45% and SVM gives 93.88%.
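The idea of combining two texture descriptors into one feature vector and classifying it with KNN can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: a basic 3x3 LBP with a 256-bin histogram, plain concatenation as the combination step, Euclidean distance for KNN, and an SFTA vector supplied from elsewhere (SFTA extraction is not implemented here). The function names `lbp_histogram`, `combine_features`, and `knn_predict` are hypothetical and do not come from the paper.

```python
from collections import Counter

# Clockwise 8-neighbourhood used for the basic 3x3 LBP (an assumed variant).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(img):
    """Normalised 256-bin histogram of 3x3 LBP codes for a grey image (list of rows)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist) or 1  # guard against images with no interior pixels
    return [h / total for h in hist]

def combine_features(sfta_vec, lbp_vec):
    """Assumed combination step: concatenate the two feature vectors."""
    return list(sfta_vec) + list(lbp_vec)

def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    ranked = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    votes = [train_y[i] for i in ranked[:k]]
    return Counter(votes).most_common(1)[0][0]
```

In use, each word image would contribute one combined vector, for example `combine_features(sfta_vec, lbp_histogram(word_img))`, and the same vectors could equally be passed to an SVM in place of `knn_predict`.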

Published
2020-03-31
How to Cite
Dhandra, B. V., Mallappa, S., & Mukarambi, G. (2020). Camera-based Tri-lingual Script Identification at Word level using a Combination of SFTA and LBP Features. International Journal of Advanced Science and Technology, 29(3), 6609-6617. Retrieved from http://sersc.org/journals/index.php/IJAST/article/view/7251