GRÉZL František and FOUSEK Petr. Optimizing bottle-neck features for LVCSR. In: 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing. Las Vegas, Nevada: IEEE Signal Processing Society, 2008, pp. 4729-4732. ISBN 1-4244-1484-9.
Publication language: English
Publication title: Optimizing bottle-neck features for LVCSR
Title (cs): Optimalizace Bottle-neck parametrů pro LVCSR
Proceedings: 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing
Conference: 33rd International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Place of publication: Las Vegas, Nevada, US
Publisher: IEEE Signal Processing Society
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2008/grezl_BN_optim_icassp_2008.pdf [PDF]
Keywords
bottle-neck, MLP structure, features, LVCSR
This publication deals with optimizing the individual steps of Bottle-Neck feature extraction in order to lower the word error rate of large-vocabulary recognition systems for continuous spontaneous speech.
This work continues the development of the recently proposed Bottle-Neck features for ASR. The five-layer MLP used in bottle-neck feature extraction makes it possible to obtain features of arbitrary size without dimensionality reduction by transforms, independently of the MLP training targets. The MLP topology (number and sizes of layers), suitable training targets, the impact of output feature transforms, the need for delta features, and the dimensionality of the final feature vector are studied with respect to the best ASR result. The optimized features are employed in three LVCSR tasks: Arabic broadcast news, English conversational telephone speech, and English meetings. Improvements over standard cepstral features and probabilistic MLP features are shown for different tasks and different neural net input representations. A significant improvement is observed when phoneme MLP training targets are replaced by phoneme states and when delta features are added.
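The idea in the abstract, that the feature size is set by the width of the middle layer rather than by the number of training targets, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the layer sizes, the random untrained weights, the sigmoid hidden units, reading the bottleneck activations before the nonlinearity, and the simple central-difference deltas. The function names (`init_mlp`, `bottleneck_features`, `add_deltas`) are hypothetical.

```python
import math
import random


def init_mlp(layer_sizes, seed=0):
    """Create random (untrained) weights for a fully connected MLP."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def affine(x, weights, biases):
    """Compute x @ weights + biases for a single input vector."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + biases[j]
            for j in range(len(biases))]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def bottleneck_features(params, frame, bottleneck_index=1):
    """Forward one frame up to the bottleneck layer and return its activations.

    Assumption: the activations are read before the nonlinearity.
    The layers after the bottleneck are only needed during training.
    """
    h = frame
    for i, (weights, biases) in enumerate(params):
        a = affine(h, weights, biases)
        if i == bottleneck_index:
            return a
        h = sigmoid(a)
    raise ValueError("bottleneck_index is beyond the last layer")


def add_deltas(feats):
    """Append central-difference deltas (a simplification) to each frame."""
    padded = [feats[0]] + feats + [feats[-1]]  # repeat edge frames
    out = []
    for t in range(len(feats)):
        delta = [(nxt - prev) / 2.0 for prev, nxt in zip(padded[t], padded[t + 2])]
        out.append(feats[t] + delta)
    return out


# Hypothetical five-layer topology: input, hidden, bottleneck, hidden, targets.
layer_sizes = [39, 64, 13, 64, 45]
params = init_mlp(layer_sizes)

rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(layer_sizes[0])] for _ in range(20)]

bn = [bottleneck_features(params, f) for f in frames]
feats = add_deltas(bn)
print(len(bn[0]), len(feats[0]))  # 13 26
```

In this sketch, changing the last entry of `layer_sizes` (the number of training targets) leaves the feature dimensionality untouched. Only the bottleneck width, and the optional deltas, determine the size of the final feature vector.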
