Publication Details

Improved MLP Structures for Data-Driven Feature Extraction for ASR

ZHU Qifeng, CHEN Barry, GRÉZL František and MORGAN Nelson. Improved MLP Structures for Data-Driven Feature Extraction for ASR. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon, 2005, p. 4. ISSN 1018-4074.
Czech title
Vylepšená struktura MLP pro datově-řízenou extrakci příznaků pro ASR
Type
conference paper
Language
English
Authors
Zhu Qifeng (ICSI Berkeley)
Chen Barry, Msc. (ICSI Berkeley)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Morgan Nelson, Prof. (ICSI Berkeley)
Keywords

feature extraction, MLP structure, time-frequency patterns

Abstract

This paper presents data-driven feature extraction for ASR using improved MLP structures. Four-layer MLPs are used for the feature extraction. It is shown that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane.

Annotation

In this paper, we present our recent progress on multi-layer perceptron (MLP) based data-driven feature extraction using improved MLP structures. Four-layer MLPs are used in this study. Different signal processing methods are applied before the input layer of the MLP. We show that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane. KLT-based dimension reduction along time is applied as a modulation frequency filter. The new feature extraction was tested on a large vocabulary continuous speech recognition (LVCSR) task using the NIST 2001 evaluation set. We achieved an 11.6% relative word error rate (WER) reduction compared to the traditional PLP-based baseline feature. This is also a significant improvement over our previously published results on the same task using MLP-based features with three-layer MLPs.
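
The KLT-based dimension reduction along time mentioned in the annotation can be illustrated with a minimal sketch. It is not the authors' implementation: the window length (51 frames), the number of retained components (16), the random data, and the helper names `fit_klt` and `apply_klt` are all assumptions made for illustration. The idea shown is that each retained KLT basis vector is applied along the temporal trajectory of one frequency band, so it behaves like a data-derived filter over time.

```python
import numpy as np


def fit_klt(trajectories, n_components):
    """Estimate a KLT (PCA) basis from temporal trajectories.

    trajectories: array of shape (n_examples, n_frames), one band's
    energy over a window of frames (hypothetical input layout).
    Returns the mean trajectory and the leading eigenvectors.
    """
    mean = trajectories.mean(axis=0)
    cov = np.cov(trajectories - mean, rowvar=False)  # (n_frames, n_frames)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_components]  # keep largest first
    return mean, eigvecs[:, order]                   # (n_frames, n_components)


def apply_klt(trajectories, mean, basis):
    """Project trajectories onto the KLT basis (dimension reduction in time)."""
    return (trajectories - mean) @ basis


# Illustrative data only: 1000 random trajectories, 51 frames each (assumed sizes).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 51))
mean, basis = fit_klt(X, n_components=16)
Y = apply_klt(X, mean, basis)  # shape (1000, 16)
```

In this sketch the reduced vectors `Y` would stand in for the temporal input presented to the MLP. The actual window length, number of components, and per-band processing used in the paper are not stated in this record.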

Published
2005
Pages
4
Journal
European Speech Communication, ISSN 1018-4074
Proceedings
Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology
Conference
Eurospeech 2005 - Lisboa, 9th European Conference on Speech Communication and Technology, Lisbon, PT
Place
Lisbon, PT
BibTeX
@INPROCEEDINGS{FITPUB7909,
   author = "Qifeng Zhu and Barry Chen and Franti\v{s}ek Gr\'{e}zl and Nelson Morgan",
   title = "Improved MLP Structures for Data-Driven Feature Extraction for ASR",
   pages = 4,
   booktitle = "Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology",
   journal = "European Speech Communication",
   year = 2005,
   location = "Lisabon, PT",
   ISSN = "1018-4074",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/7909"
}