Melnikoff, Stephen Jonathan (2003). Speech recognition in programmable logic. University of Birmingham. Ph.D.
Abstract
Speech recognition is a computationally demanding task, especially the decoding part, which converts pre-processed speech data into words or sub-word units, and which incorporates Viterbi decoding and Gaussian distribution calculations. In this thesis, this part of the recognition process is implemented in programmable logic, specifically, on a field-programmable gate array (FPGA). Relevant background material about speech recognition is presented, along with a critical review of previous hardware implementations. Designs for a decoder suitable for implementation in hardware are then described. These include details of how multiple speech files can be processed in parallel, and an original implementation of an algorithm for summing Gaussian mixture components in the log domain. These designs are then implemented on an FPGA. An assessment is made as to how appropriate it is to use hardware for speech recognition. It is concluded that while certain parts of the recognition algorithm are not well suited to this medium, much of it is, and so an efficient implementation is possible. Also presented is an original analysis of the requirements of speech recognition for hardware and software, which relates the parameters that dictate the complexity of the system to processing speed and bandwidth. The FPGA implementations are compared to equivalent software, written for that purpose. For a contemporary FPGA and processor, the FPGA outperforms the software by an order of magnitude.
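
The abstract mentions summing Gaussian mixture components in the log domain. As a rough illustration of that general idea, and not the algorithm developed in the thesis, the sketch below uses the standard identity log(a + b) = max(log a, log b) + log(1 + exp(-|log a - log b|)). The function names `log_add` and `log_sum_mixture` are hypothetical and chosen for this example only. A hardware design would most likely approximate the correction term rather than compute it exactly, which is an assumption here rather than something the record states.

```python
import math


def log_add(log_a, log_b):
    """Return log(a + b) given log(a) and log(b), without leaving the log domain."""
    # log(0) is represented as -inf; adding zero leaves the other term unchanged.
    if log_a == -math.inf:
        return log_b
    if log_b == -math.inf:
        return log_a
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    # The correction term depends only on the (non-positive) difference lo - hi,
    # so exp() never overflows even for very negative log-probabilities.
    return hi + math.log1p(math.exp(lo - hi))


def log_sum_mixture(log_terms):
    """Accumulate the log of a sum of mixture terms, each supplied as a log value."""
    total = -math.inf  # log of an empty sum
    for term in log_terms:
        total = log_add(total, term)
    return total


if __name__ == "__main__":
    # Hypothetical weighted component likelihoods, already in the log domain.
    terms = [math.log(0.2), math.log(0.3), math.log(0.5)]
    print(log_sum_mixture(terms))
```

Because the correction term is a function of a single bounded difference, it lends itself to approximation by a small table, which is one reason log-domain arithmetic is a common choice when multiplications and exponentials are expensive.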
Type of Work: | Thesis (Doctorates > Ph.D.)
---|---
Award Type: | Doctorates > Ph.D.
Supervisor(s): |
Licence: |
College/Faculty: | Schools (1998 to 2008) > School of Engineering
School or Department: | School of Engineering, Department of Electronic, Electrical and Systems Engineering
Funders: | None/not applicable
Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering; Q Science > QA Mathematics > QA75 Electronic computers. Computer science
URI: | http://etheses.bham.ac.uk/id/eprint/16