Speaker Identification using GFCC with PITCH & ZCR
Abstract
Speaker recognition is a biometric technique used for audio classification. The sound produced by a person is considered unique and depends on the larynx, or voice box. Mel Frequency Cepstral Coefficients (MFCC) are reported to be less robust to noise than the less commonly used Gammatone Frequency Cepstral Coefficients (GFCC). Adaptive whitening noise filtering is applied to the input audio waveform, and the GFCC features computed from it, together with Pitch and Zero Crossing Rate (ZCR), are used as the input. Decision-making models, namely a neural network, Support Vector Machine (SVM), the XGBoost algorithm, and K-means clustering, are used to classify speakers, and their results are compared.
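As an illustration of the two auxiliary features named above, the following is a minimal sketch of how Zero Crossing Rate and pitch might be computed for a single frame. It is not taken from this work: the function names `zero_crossing_rate` and `autocorr_pitch`, the 50 to 400 Hz search range, the 40 ms frame length, and the choice of an autocorrelation-based pitch estimator are all assumptions made for the example.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))

def autocorr_pitch(frame, sample_rate, fmin=50.0, fmax=400.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    Assumption: an autocorrelation estimator with a hypothetical
    search range of [fmin, fmax]; the source does not name a method.
    """
    frame = frame - np.mean(frame)  # remove DC offset
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / lag

# Synthetic test frame: a 200 Hz sine sampled at 16 kHz for 40 ms.
sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
frame = np.sin(2 * np.pi * 200.0 * t)

zcr = zero_crossing_rate(frame)
pitch = autocorr_pitch(frame, sample_rate)
```

For a pure tone of frequency f sampled at rate fs, the ZCR is roughly 2f/fs, so the synthetic frame here should give a value near 0.025 and a pitch estimate near 200 Hz. In a full pipeline these two scalars would be appended to each frame's GFCC vector before classification; real speech would additionally need a voiced/unvoiced decision, which this sketch omits.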