Performance Evaluation of Neural Networks for Speaker Recognition

Abstract

Speaker recognition is one of the principal problems in speech processing. The performance of speaker recognition systems can be improved by carefully choosing and calculating suitable features, which is an arduous task. The learning-based approach has therefore been found to be simpler, more general and, with the rapid growth of artificial intelligence, more accurate. This paper is a comparative study of the performance of different neural networks in speaker recognition. The focus of this work is to find which of these learning algorithms is the most accurate, the least complex, and the most generic when it comes to speaker recognition. A database of 5000 utterances was used: 100 for each of 50 different speakers, recorded in both clean and noisy environments with varying levels of noise. The Mel Frequency Cepstral Coefficients (MFCC) of these utterances were used as features to train and evaluate the neural networks. As expected, the accuracy of all neural networks was very high (>90%) for clean data, with large variations appearing once noise was introduced and its level changed. The radial basis function neural network (RBFNN) performed consistently well under all conditions. The deep neural network (DNN) was the other consistent performer and has the potential to outperform the other techniques if trained on more data.
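
The abstract does not state how the MFCC features were configured. As a rough illustration of the mel-scale warping that MFCCs are built on, the sketch below converts between hertz and mels and places filter centre frequencies evenly on the mel scale. The conversion formula is the commonly used 2595·log10(1 + f/700) variant; the function names, the 26 filters, and the 0 to 8000 Hz range are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Return n_filters centre frequencies (Hz), equally spaced in mels.

    The two band edges (f_min_hz and f_max_hz) are excluded, so the
    spacing in mels is (mel_max - mel_min) / (n_filters + 1).
    """
    mel_min = hz_to_mel(f_min_hz)
    mel_max = hz_to_mel(f_max_hz)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


# Hypothetical configuration: 26 filters covering 0 to 8000 Hz.
centers = mel_center_frequencies(26, 0.0, 8000.0)
```

Because the centres are uniform in mels rather than in hertz, neighbouring filters sit close together at low frequencies and spread out at high frequencies. A full MFCC pipeline would additionally frame the signal, take a power spectrum, apply triangular filters at these centres, and compute a discrete cosine transform of the log filter energies.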

Publication
2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT)