CNN Based Feature Extraction and Classification for Degraded Historical Documents

  • Devendran K, Keerthika P, Manjula Devi R, Santhosh sivan P, Rebhashi A, Ragul M

Abstract

Preserving important historical records and papers is difficult because of their large volume and the degradation of the text over time, so automatic character recognition for such documents is essential. Image binarization, which converts a greyscale image into a binary image, is the pre-processing step for character recognition in historical records and is therefore very important to it. It is also a challenging task because of the complex backgrounds and noise in the images. Binarization and text line segmentation both play a vital role in character recognition. In this paper we present two additional steps: feature extraction and classification. Text line segmentation and binarization are implemented using global threshold values derived from Otsu’s algorithm and local threshold values from Niblack’s and Sauvola’s algorithms. Feature extraction and classification are implemented with a convolutional neural network. The results obtained with this methodology are less sensitive to noise and have high contrast.
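The thresholding rules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat list-of-pixels representation, and the default parameters (k = -0.2 for Niblack, k = 0.5 and R = 128 for Sauvola, which are commonly quoted values) are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def otsu_threshold(pixels, levels=256):
    """Global threshold: the grey level t that maximises the
    between-class variance when pixels <= t form one class."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]           # pixel count of the class <= t
        sum0 += t * hist[t]     # intensity sum of the class <= t
        if w0 == 0:
            continue            # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break               # every pixel is already in the first class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def niblack_threshold(window, k=-0.2):
    """Local threshold T = m + k * s over one window (k is an assumed default)."""
    return mean(window) + k * pstdev(window)


def sauvola_threshold(window, k=0.5, R=128):
    """Local threshold T = m * (1 + k * (s / R - 1)); k and R are assumed defaults."""
    m, s = mean(window), pstdev(window)
    return m * (1 + k * (s / R - 1))


def binarize(pixels, t):
    """Map each pixel to 1 (above threshold) or 0 (at or below it)."""
    return [1 if p > t else 0 for p in pixels]


# Hypothetical sample: 50 dark "ink" pixels and 50 bright "paper" pixels.
sample = [10] * 50 + [200] * 50
t = otsu_threshold(sample)
binary = binarize(sample, t)
```

In a full pipeline the local rules would be evaluated once per pixel over a sliding neighbourhood rather than over a single window, which is what lets them adapt to uneven backgrounds where one global threshold fails.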
Published
2020-06-01
How to Cite
Devendran K, Keerthika P, Manjula Devi R, Santhosh sivan P, Rebhashi A, Ragul M. (2020). CNN Based Feature Extraction and Classification for Degraded Historical Documents. International Journal of Control and Automation, 13(02), 1215-1221. Retrieved from https://sersc.org/journals/index.php/IJCA/article/view/26925
Section
Articles