Improving Speaker Identification in Reverberant Environments Using MFCCs and Comb Filtering with Neural Network Classification
Abstract
Reverberation presents a persistent challenge to the accuracy of speaker identification systems, especially in real-world acoustic settings. This paper proposes a robust, lightweight framework that enhances speaker recognition performance under reverberant conditions by combining comb filtering, Mel-Frequency Cepstral Coefficients (MFCCs), and a neural network classifier. The comb filter is applied as a preprocessing stage to suppress delayed reflections and reduce temporal smearing of the speech signal before feature extraction. Experimental evaluations were conducted across multiple reverberation levels (RT60 = 0.3 s to 0.9 s) and noise conditions (SNR from 30 dB to 0 dB). Results show that the proposed system outperforms baseline and transformation-based methods, achieving a recognition accuracy of 85.4% at RT60 = 0.9 s compared to 70.2% for the unfiltered baseline, and up to 97.6% in low-reverberation scenarios. Additionally, the comb filter introduces a non-invertible transformation that enables cancelable biometric templates, reinforcing the system's security. The proposed method balances effectiveness, simplicity, and privacy, making it well suited for real-time speaker identification in reverberant and noisy environments.