IMAGE CAPTION GENERATOR USING DEEP LEARNING
Automatically describing the content of an image combines computer vision and natural language processing, two areas of artificial intelligence. A useful description must take the form of a well-formed English phrase. Automatic image description is especially helpful to visually impaired people, who can use it to better understand their surroundings. This paper aims to identify the objects in an image and to inform the user through audio and text messages. The system recognizes the image, converts the result to text using a Long Short-Term Memory (LSTM) network, and converts it to audio using the gTTS (Google Text-to-Speech) library. The input image is first converted to grayscale and then processed by a Convolutional Neural Network (CNN) to identify the objects it contains. Objects in the image are identified using OpenCV, and the results are then converted to audio and text messages. The proposed method is designed for blind people and is intended to extend to people with partial vision loss, helping them achieve their full potential.
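The pipeline described above can be illustrated with a minimal sketch of two of its steps: grayscale conversion and word-by-word caption generation. This is not the paper's implementation. The function names (`to_grayscale`, `greedy_caption`, `toy_next_word`), the `<start>` and `<end>` tokens, and the scripted stand-in for the trained CNN and LSTM are all assumptions made for illustration. The grayscale weights are the standard ITU-R BT.601 luminance coefficients.

```python
from typing import Callable, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def to_grayscale(image: Sequence[Sequence[Pixel]]) -> List[List[int]]:
    """Convert rows of RGB pixels to 8-bit luminance (BT.601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in image
    ]


def greedy_caption(
    next_word: Callable[[List[List[int]], List[str]], str],
    features: List[List[int]],
    max_len: int = 20,
) -> str:
    """Build a caption one word at a time until '<end>' or max_len words.

    `next_word` stands in for a trained decoder (for example an LSTM fed
    with CNN features); it receives the image features and the words
    generated so far, and returns the next word.
    """
    words = ["<start>"]
    for _ in range(max_len):
        word = next_word(features, words)
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words[1:])  # drop the '<start>' token


def toy_next_word(features: List[List[int]], words: List[str]) -> str:
    """Hypothetical stand-in model: ignores the image, replays a fixed script."""
    script = ["a", "dog", "on", "grass", "<end>"]
    return script[len(words) - 1]


if __name__ == "__main__":
    rgb = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
    gray = to_grayscale(rgb)
    print(gray)
    print(greedy_caption(toy_next_word, gray))
```

In a real system, `toy_next_word` would be replaced by a trained model, and the resulting caption string could be passed to gTTS, for example `gTTS(text=caption, lang="en").save("caption.mp3")`, to produce the audio message. That call requires the third-party `gtts` package and network access, so it is left out of the sketch.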