Statistical Features Based Text Detection Using Modified OCR
Abstract
This paper presents a modified OCR-based scheme for detecting text in natural scene images. In the proposed scheme, statistical features of the image are used to train the OCR. To segment text areas accurately, Otsu-based thresholding is incorporated, which helps to overcome challenges such as blur, uneven illumination, complex texture, and occlusion. The main aim of the work is to detect text areas in the image, mark the boundaries of the detected areas, and then write the recognized text to a suitable file format such as a .txt file. The experimental analysis is performed on the standard ICDAR 2011 dataset. The proposed scheme performs particularly well on horizontal text in natural scene images and shows a clear improvement over the existing schemes it is compared with.
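
To illustrate the thresholding step named above, here is a minimal sketch of standard Otsu thresholding in Python with NumPy. It is not the authors' implementation. The function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes an 8-bit grayscale input in which text is brighter than the background.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the Otsu threshold (0-255) for an 8-bit grayscale image.

    Assumption: `gray` is a NumPy array of dtype uint8.
    """
    # Normalized histogram of the 256 possible gray levels.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    levels = np.arange(256, dtype=np.float64)
    omega = np.cumsum(prob)           # probability of the class "level <= t"
    mu = np.cumsum(prob * levels)     # cumulative first moment up to t
    mu_total = mu[-1]                 # global mean gray level

    # Between-class variance for every candidate threshold t.
    # Thresholds that leave one class empty are skipped (variance 0).
    denom = omega * (1.0 - omega)
    valid = denom > 1e-12
    sigma_b2 = np.zeros(256, dtype=np.float64)
    sigma_b2[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]

    # Otsu's criterion: pick the threshold maximizing between-class variance.
    return int(np.argmax(sigma_b2))


def binarize(gray):
    """Return a boolean mask that is True where a pixel exceeds the threshold.

    Assumption (for illustration only): text is brighter than background.
    """
    return gray > otsu_threshold(gray)
```

Calling `binarize` on a grayscale scene image would give a candidate text mask. Locating region boundaries, recognizing the characters, and writing the result to a .txt file are separate steps that this sketch does not cover.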