Clustering Video Frames with FaceNet Embeddings
Abstract
Video from many sources, such as recordings, surveillance footage, and video calls, is increasingly widespread, and so is the need to search for specific people in pictures and videos. As the number of such videos grows, manually examining every frame is no longer feasible, so some degree of automation is needed. An automated system can be designed to group all the similar faces present in the data. Face clustering is a technique that groups face images into clusters, each containing images of a single person or of people with similar features. Ideally, each person ends up with their own cluster containing only images of that person. The proposed model is built on several deep learning techniques that help cluster images of different people, which can further help in detecting fake images. Faces are detected with MTCNN, represented as FaceNet embeddings, and the embeddings of similar faces are clustered with DBSCAN. PCA is used to reduce the high-dimensional embeddings to two dimensions, and t-SNE is used to improve the visualization of the data.
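As an illustration of the clustering step only, the following minimal Python sketch runs a simplified DBSCAN over toy vectors that stand in for FaceNet embeddings. The toy data, the 128-dimensional embedding size, the helper names (make_toy_embeddings, dbscan), and the eps and min_samples values are all assumptions made for illustration and are not taken from the proposed model. Face detection with MTCNN, embedding with FaceNet, and the PCA and t-SNE visualization steps are not shown.

```python
import math
import random

NOISE = -1  # label for points that belong to no cluster


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbors_of(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j in range(len(points)) if euclidean(points[i], points[j]) <= eps]


def dbscan(points, eps, min_samples):
    """Simplified DBSCAN: returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors_of(points, i, eps)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors_of(points, j, eps)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def make_toy_embeddings(identities=3, faces_per_identity=10, dim=128, seed=0):
    """Hypothetical stand-in for FaceNet output: tight groups plus one outlier."""
    rng = random.Random(seed)
    points = []
    for k in range(identities):
        for _ in range(faces_per_identity):
            vec = [rng.gauss(0.0, 0.01) for _ in range(dim)]
            vec[k] += 1.0  # each identity is centered on a different axis
            points.append(vec)
    outlier = [0.0] * dim
    outlier[dim - 1] = 1.0  # a lone face far from every identity
    points.append(outlier)
    return points


embeddings = make_toy_embeddings()
labels = dbscan(embeddings, eps=0.5, min_samples=3)
print(labels)  # one label per face; -1 marks the unclustered outlier
```

On this toy data, the thirty faces fall into three clusters and the isolated vector is labeled as noise, which is the behavior that lets DBSCAN group faces without knowing the number of people in advance. In practice, a library implementation such as scikit-learn's DBSCAN would normally replace the hand-written loop, and eps would need tuning for real embeddings.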