Liu, Ying (2010)
Ph.D. thesis, University of Birmingham.
The thesis presents a study exploring the role of dynamic features in speaker verification (SV). On the theory that speech dynamics carry important speaker information, modelling the dynamics should have the potential to improve speaker verification performance. Experiments on text-dependent speaker verification (TD-SV) using segmental hidden Markov models (SHMMs) on the YOHO database show a performance improvement. However, experiments on text-independent speaker verification (TI-SV) on the Switchboard database, using segmental Gaussian mixture models (GMMs), show no significant improvement. Analysis of the TD-SV results confirms that the speech dynamics modelled by SHMMs contribute more to the SV accuracy. Analysis of the TI-SV results indicates that the lack of speech dynamic information is a feature of GMM systems. It seems that the priority of the maximum likelihood training algorithm is to model stationary regions, and that the role of dynamic features in a GMM system is to ensure that the classification focuses on static regions rather than to model dynamics. A study on TI-SV was also carried out using conventional GMMs. Without RASTA filtering, the 'delta-only' system works best. However, after RASTA filtering, the 'static-plus-delta' system performs best. The results suggest that the good performance of the 'delta-only' system before RASTA filtering is mainly due to the noise robustness of the delta parameters.
This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder.