Semi-Automatic Image Annotation for Videos

  • A. Ancy Micheal, Chao-Hung Lin, K. Vani, S. Sanjeevi


The emergence of deep learning has transformed the field of computer vision. Annotating objects is an essential step in preparing data for deep-learning-based object detection, but image annotation is tedious and demands considerable human labour and time. This paper focuses on reducing manual annotation time, thereby saving labour cost. A novel semi-automatic image annotation methodology is proposed. The video frames are split into keyframes and remaining frames; keyframes are selected based on histograms and annotated manually. The objects in the keyframes are subjected to data augmentation, such as vertical and horizontal flipping, to produce new images. The keyframes, the augmented images and their respective bounding box locations are fed into Tiny-DSOD for fine-tuning. The fine-tuned Tiny-DSOD then performs object detection on the remaining frames, and missed and false detections are corrected manually. Using keyframes with Tiny-DSOD for image annotation yields a recall of 92.42% and a precision of 94.56%. The proposed methodology reduces manual annotation time from 2858 seconds to 551 seconds, a reduction of roughly 81%.
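
The two preprocessing steps named in the abstract, histogram-based keyframe selection and flip augmentation with matching bounding boxes, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the histogram measure, the threshold, or the box format, so the function names (gray_histogram, select_keyframes, hflip, vflip), the half-L1 histogram distance, the threshold of 0.3, the 32 bins, and the (x_min, y_min, x_max, y_max) box layout are all assumptions made for illustration. The sketch also flips whole frames, which may differ from how the paper augments individual objects. Fine-tuning and detection with Tiny-DSOD are not covered.

```python
import numpy as np


def gray_histogram(frame, bins=32):
    """Normalized intensity histogram of a grayscale uint8 frame (assumed input)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist / hist.sum()


def select_keyframes(frames, threshold=0.3):
    """Return indices of frames whose histogram differs from the most recent
    keyframe by more than `threshold`.

    The distance is half the L1 difference of normalized histograms, so it
    lies in [0, 1]. Both the measure and the threshold are assumptions.
    """
    keyframes = [0]  # the first frame is always treated as a keyframe
    ref = gray_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = gray_histogram(frames[i])
        distance = 0.5 * np.abs(hist - ref).sum()
        if distance > threshold:
            keyframes.append(i)
            ref = hist
    return keyframes


def hflip(image, boxes):
    """Flip an image left-right and mirror its boxes.

    Boxes are rows of (x_min, y_min, x_max, y_max) in pixel coordinates
    (an assumed format).
    """
    width = image.shape[1]
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    out[:, 0] = width - boxes[:, 2]  # new x_min comes from old x_max
    out[:, 2] = width - boxes[:, 0]  # new x_max comes from old x_min
    return image[:, ::-1], out


def vflip(image, boxes):
    """Flip an image top-bottom and mirror its boxes (same box format)."""
    height = image.shape[0]
    boxes = np.asarray(boxes, dtype=float)
    out = boxes.copy()
    out[:, 1] = height - boxes[:, 3]  # new y_min comes from old y_max
    out[:, 3] = height - boxes[:, 1]  # new y_max comes from old y_min
    return image[::-1, :], out
```

In such a setup, the frames returned by select_keyframes would be the ones annotated by hand, and each annotated keyframe could be passed through hflip and vflip to obtain extra training images with consistent box labels.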

How to Cite
A. Ancy Micheal, Chao-Hung Lin, K. Vani, S. Sanjeevi. (2020). Semi-Automatic Image Annotation for Videos. International Journal of Advanced Science and Technology, 29(10s), 6872 - 6878. Retrieved from