Melnikoff, Stephen Jonathan (2003)
Ph.D. thesis, University of Birmingham.
Speech recognition is a computationally demanding task, especially the decoding part, which converts pre-processed speech data into words or sub-word units, and which incorporates Viterbi decoding and Gaussian distribution calculations. In this thesis, this part of the recognition process is implemented in programmable logic, specifically, on a field-programmable gate array (FPGA). Relevant background material about speech recognition is presented, along with a critical review of previous hardware implementations. Designs for a decoder suitable for implementation in hardware are then described. These include details of how multiple speech files can be processed in parallel, and an original implementation of an algorithm for summing Gaussian mixture components in the log domain. These designs are then implemented on an FPGA. An assessment is made as to how appropriate it is to use hardware for speech recognition. It is concluded that while certain parts of the recognition algorithm are not well suited to this medium, much of it is, and so an efficient implementation is possible. Also presented is an original analysis of the requirements of speech recognition for hardware and software, which relates the parameters that dictate the complexity of the system to processing speed and bandwidth. The FPGA implementations are compared to equivalent software, written for that purpose. For a contemporary FPGA and processor, the FPGA outperforms the software by an order of magnitude.
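The log-domain summation of Gaussian mixture components mentioned in the abstract can be illustrated with the standard log-add identity, log(a + b) = max(log a, log b) + log(1 + exp(-|log a - log b|)). The following is a minimal software sketch of that general identity only; it is not the thesis's hardware algorithm, and the names `log_add` and `log_mixture` are hypothetical, chosen for illustration.

```python
import math


def log_add(log_a: float, log_b: float) -> float:
    """Return log(exp(log_a) + exp(log_b)) while staying in the log domain."""
    # A log value of -inf represents a probability of zero.
    if log_a == -math.inf:
        return log_b
    if log_b == -math.inf:
        return log_a
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    # lo - hi <= 0, so exp() cannot overflow; log1p keeps precision
    # when the correction term is small.
    return hi + math.log1p(math.exp(lo - hi))


def log_mixture(log_weights, log_densities):
    """Log-likelihood of a mixture, given per-component log weights and
    log densities (hypothetical helper, not taken from the thesis)."""
    total = -math.inf
    for log_w, log_d in zip(log_weights, log_densities):
        total = log_add(total, log_w + log_d)
    return total
```

Working this way avoids the underflow that would occur if very small component likelihoods were converted back to the linear domain before being added.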
|Type of Work:||Ph.D. thesis.|
|Supervisor(s):||Quigley, Steven Francis|
|School/Faculty:||Schools (1998 to 2008) > School of Engineering|
|Department:||Electronic, Electrical and Computer Engineering|
Publications in the Appendix are available at http://eprints.bham.ac.uk/23/ http://eprints.bham.ac.uk/24/ http://eprints.bham.ac.uk/25/ http://eprints.bham.ac.uk/26/ http://eprints.bham.ac.uk/27/ http://eprints.bham.ac.uk/28/
|Keywords:||Speech recognition, programmable logic, FPGA|
|Subjects:||TK Electrical engineering. Electronics. Nuclear engineering; QA75 Electronic computers. Computer science|
|Institution:||University of Birmingham|