Real-Time Optical Character Recognition (OCR) for Regional Languages
Abstract
Optical character recognition (OCR) is a technique that converts printed or handwritten text in a JPEG image or PDF document into editable text. English character recognition and text-to-speech conversion are already well developed in OCR systems, and their accuracy is very high. For Indian regional languages, progress has been much slower. This research focuses on Gujarati and Hindi. Gujarati is widely used for communication in Gujarat, and Hindi is used across India. The primary aim of this project is to develop an Android application that recognizes the text in a JPEG image, a PDF document, or an image captured in real time and converts it into an editable text document, for regional languages such as Gujarati and Hindi as well as for English. The research also addresses the conversion of these text documents into speech.