Script Identification at Line Level Using SFTA and LBP Features from Bi-lingual and Tri-lingual Documents Captured from the Camera
This paper aims to identify scripts in multilingual document images captured by a camera. Segmentation-based Fractal Texture Analysis (SFTA) features, Local Binary Pattern (LBP) features, and a combination of both are considered for line-level script identification in camera-captured images. Eleven scripts (English, Hindi, Urdu, Telugu, Kannada, Tamil, Oriya, Gujarati, Bangla, Malayalam, and Punjabi) are considered in the experimental setup, in bi-script (English with a regional language) and tri-script (English and Hindi with a regional language) combinations. The experiment, conducted on 22,000 line-level images (2,000 line images per script), yielded an overall recognition rate of 98.78% for bi-script and 97.66% for tri-script combinations using an SVM classifier.
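To make the LBP feature concrete, the following is a minimal sketch of the basic 3x3 LBP operator and its normalized 256-bin histogram, written in plain Python. The abstract does not state which LBP variant, radius, or neighbourhood size was used, so the 8-neighbour form, the bit ordering, and the function names `lbp_codes` and `lbp_histogram` are assumptions made for illustration only, not the authors' implementation.

```python
from typing import List, Sequence

# Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
# The bit ordering is an illustrative assumption; any fixed order works.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the basic 3x3 LBP code of every interior pixel.

    A neighbour contributes a 1-bit when its grey value is greater than
    or equal to the centre value. Border pixels are skipped.
    """
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(image: Sequence[Sequence[int]]) -> List[float]:
    """Return a 256-bin histogram of LBP codes, normalized to sum to 1."""
    hist = [0.0] * 256
    total = 0
    for row in lbp_codes(image):
        for code in row:
            hist[code] += 1
            total += 1
    if total:
        hist = [h / total for h in hist]
    return hist


if __name__ == "__main__":
    # Tiny synthetic grey-level patch standing in for a text-line image.
    patch = [
        [10, 10, 10, 10],
        [10, 50, 20, 10],
        [10, 20, 50, 10],
        [10, 10, 10, 10],
    ]
    features = lbp_histogram(patch)
    print(len(features), sum(features))
```

Under these assumptions, each line image would be reduced to one such histogram, which could be used alone or concatenated with an SFTA feature vector before being passed to a classifier such as an SVM.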