| Karafiát, M., Grézl, F., Burget, L.: Combination of MFCC and TRAP features for LVCSR of meeting data, Martigny, CH, 2004, p. 1 | | Publication language: | English |
|---|
| Publication title: | Combination of MFCC and TRAP features for LVCSR of meeting data |
|---|
| Title (cs): | Kombinace MFCC a TRAP příznaků pro rozpoznávání meetingových dat |
|---|
| Pages: | 1 |
|---|
| Place of publication: | Martigny, CH |
|---|
| Year: | 2004 |
|---|
| Keywords |
|---|
speech recognition, TRAP, feature extraction, feature combination, HLDA
|
| Abstract |
|---|
The aim of this work is to examine TempoRAl Patterns (TRAPs) based
feature extraction for the task of large vocabulary continuous speech
recognition (LVCSR). Previously, TRAPs based features were mainly used
in conjunction with hybrid NN-HMM recognition systems (the connectionist
approach). In this work, we use a Tandem-TRAPS system to generate speech
features, which are then used as input to a standard GMM-HMM
system. This approach allows for more precise modeling of phonetic
context (context-dependent models), which is important for LVCSR.
Experiments are carried out on the ICSI meetings database. For TRAPs
processing, it is shown that the use of frequency differentiation and
local operators can significantly improve recognition performance.
The performance obtained with TRAPs based features is compared with
that of conventional MFCC features. Although stand-alone TRAPs based
features never outperform MFCC in our experiments, we report an
improvement over MFCC when TRAPs based features and MFCC features are
combined. The combined features are created by concatenating the
original feature streams and then applying Heteroscedastic Linear
Discriminant Analysis (HLDA) to perform decorrelation and
dimensionality reduction. Compared to previous work, the main advantage
comes from HLDA, which combines the two feature streams optimally
without the strong assumptions imposed on the data by previously used
transforms (such as PCA and LDA).
|
|
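The combination step described in the abstract (concatenating two feature streams, then applying a linear transform for decorrelation and dimensionality reduction) can be illustrated with a minimal sketch. This is not the authors' implementation: it substitutes classical LDA for HLDA, so it carries the shared-covariance assumption that the abstract says HLDA avoids. The function name, the feature dimensions, the class labels and the random data are all hypothetical and serve only to make the example runnable.

```python
import numpy as np


def concat_and_project(mfcc, trap, labels, out_dim):
    """Concatenate two per-frame feature streams and project them with LDA.

    Classical LDA is used here as a simplified stand-in for HLDA.
    """
    x = np.hstack([mfcc, trap])              # (frames, d_mfcc + d_trap)
    mean = x.mean(axis=0)
    d = x.shape[1]

    s_w = np.zeros((d, d))                   # within-class scatter
    s_b = np.zeros((d, d))                   # between-class scatter
    for c in np.unique(labels):
        xc = x[labels == c]
        mc = xc.mean(axis=0)
        s_w += (xc - mc).T @ (xc - mc)
        diff = (mc - mean)[:, None]
        s_b += len(xc) * (diff @ diff.T)
    s_w /= len(x)
    s_b /= len(x)
    s_w += 1e-6 * np.eye(d)                  # small ridge for numerical stability

    # Whiten the within-class scatter, then diagonalize the between-class one.
    chol = np.linalg.cholesky(s_w)           # s_w = chol @ chol.T
    chol_inv = np.linalg.inv(chol)
    eigvals, u = np.linalg.eigh(chol_inv @ s_b @ chol_inv.T)
    order = np.argsort(eigvals)[::-1][:out_dim]
    w = chol_inv.T @ u[:, order]             # (d, out_dim) projection matrix
    return (x - mean) @ w, w


# Hypothetical toy data: 300 frames, 13-dim "MFCC" and 10-dim "TRAP" streams.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=300)
mfcc = rng.normal(size=(300, 13)) + labels[:, None]
trap = rng.normal(size=(300, 10)) - 0.5 * labels[:, None]

features, w = concat_and_project(mfcc, trap, labels, out_dim=8)
```

In this sketch the projected features have an approximately identity within-class covariance, which is the decorrelation property a diagonal-covariance GMM-HMM system benefits from. A real HLDA transform would instead be estimated by maximum likelihood with class-specific covariances.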