Speech analysis using very low-dimensional bottleneck features and phone-class dependent neural networks

Bai, Linxue (2018). Speech analysis using very low-dimensional bottleneck features and phone-class dependent neural networks. University of Birmingham. Ph.D.


The first part of this thesis focuses on very low-dimensional bottleneck features (BNFs) extracted from deep neural networks (DNNs) for speech analysis and recognition. These BNFs are analysed in terms of their capability to represent speech and their suitability for modelling speech dynamics. Nine-dimensional BNFs obtained from a phone discrimination DNN are shown to give phone recognition accuracy comparable to that of 39-dimensional MFCCs, and on average 34% higher than that of formant-based features of the same dimensionality. They also preserve trajectory continuity well and thus hold promise for modelling speech dynamics. Visualisations and interpretations of the BNFs are presented, together with phonetically motivated studies of the strategies that DNNs employ to create these features. The relationships between BNF representations resulting from different DNN initialisations are also explored.
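As a rough illustration of what extracting a bottleneck feature means, the sketch below runs one input frame through a small feed-forward network and reads off the activations of a narrow hidden layer. Only the nine-dimensional bottleneck is taken from the abstract; the other layer sizes, the sigmoid activations, the random (untrained) weights and all function names are assumptions made for this example, not the architecture used in the thesis.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Affine transform: one output per row of the weight matrix."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes: only BOTTLENECK_DIM = 9 comes from the abstract.
INPUT_DIM, HIDDEN_DIM, BOTTLENECK_DIM, NUM_PHONES = 260, 64, 9, 40

rng = random.Random(0)
layers = [
    make_layer(INPUT_DIM, HIDDEN_DIM, rng),
    make_layer(HIDDEN_DIM, BOTTLENECK_DIM, rng),  # narrow bottleneck layer
    make_layer(BOTTLENECK_DIM, HIDDEN_DIM, rng),
    make_layer(HIDDEN_DIM, NUM_PHONES, rng),      # phone output layer
]


def forward(frame):
    """Return (bottleneck_features, phone_posteriors) for one input frame."""
    hidden = sigmoid(dense(frame, layers[0]))
    bnf = sigmoid(dense(hidden, layers[1]))
    hidden = sigmoid(dense(bnf, layers[2]))
    posteriors = softmax(dense(hidden, layers[3]))
    return bnf, posteriors


# A random vector stands in for a spectra-in-context input frame.
frame = [rng.gauss(0.0, 1.0) for _ in range(INPUT_DIM)]
bnf, posteriors = forward(frame)
print(len(bnf), len(posteriors))
```

In a trained network of this shape, the phone posteriors would drive the training objective, while the bottleneck activations would be kept as the compact feature vector for each frame.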

The second part of this thesis considers BNFs from the perspective of feature extraction. It is motivated by two observations: that different types of speech sounds lend themselves to different acoustic analyses, and that the mapping from spectra-in-context to phone posterior probabilities implemented by the DNN is a continuous approximation to a discontinuous function. This suggests that it may be advantageous to replace the single DNN with a set of phone-class dependent DNNs. In this case, the appropriate mathematical structure is a manifold. It is shown that this approach leads to significant improvements in frame-level phone classification accuracy.
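One way to picture replacing a single network with a set of class-dependent ones is a lookup from phone class to a class-specific feature extractor. The sketch below is only a structural illustration: the class names, the input and output sizes, the assumption that a class label is available for each frame, and the random linear maps standing in for trained DNNs are all hypothetical and are not taken from the thesis.

```python
import random

INPUT_DIM = 20    # hypothetical input size
FEATURE_DIM = 9   # hypothetical output size


def make_projection(seed):
    """Stand-in for a trained class-specific DNN: a fixed random linear map."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(INPUT_DIM)] for _ in range(FEATURE_DIM)]


def project(frame, matrix):
    """Apply the linear map to one frame."""
    return [sum(w * v for w, v in zip(row, frame)) for row in matrix]


# Hypothetical broad phone classes; the abstract does not list the classes used.
class_networks = {
    "vowel": make_projection(1),
    "fricative": make_projection(2),
    "plosive": make_projection(3),
}


def extract_features(frame, phone_class):
    """Route the frame to the extractor associated with its phone class."""
    return project(frame, class_networks[phone_class])


frame = [0.5] * INPUT_DIM
vowel_features = extract_features(frame, "vowel")
fricative_features = extract_features(frame, "fricative")
```

The same frame yields different features depending on which class-specific extractor handles it, which is the sense in which each class gets its own analysis in this sketch.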

Type of Work: Thesis (Doctorates > Ph.D.)
Award Type: Doctorates > Ph.D.
College/Faculty: Colleges (2008 onwards) > College of Engineering & Physical Sciences
School or Department: School of Engineering, Department of Electronic, Electrical and Systems Engineering
Funders: None/not applicable
Subjects: T Technology > TK Electrical engineering. Electronics. Nuclear engineering
URI: http://etheses.bham.ac.uk/id/eprint/8137

