Modern Methods of Speech Processing

Language of Instruction: Czech
Completion: examination (written)
Type of
Guarantor: Černocký Jan, doc. Dr. Ing. (DCGM)
Lecturer: Černocký Jan, doc. Dr. Ing. (DCGM)
Faculty: Faculty of Information Technology BUT
  From simple systems to stochastic modelling. Hidden Markov models. Large vocabulary continuous speech recognition. Language models. Speech production, speech perception: time and frequency. Data-driven methods for feature extraction. Speech databases. Excitation in speech coding, CELP. Speaker identification.
Learning outcomes and competencies:
  This course enables students to implement simple speech processing applications, such as voice control of a process. Above all, it prepares them to join the development of complex speech recognition and coding systems that use modern methods, in both academic and industrial environments.
Syllabus of lectures:
  1. Review of notions: signal vectors and parameter matrices, basic statistics.
  2. Stochastic modeling of parameters, modeling of time by state sequences.
  3. Hidden Markov models: basic structure, training.
  4. Recognition of speech using HMM: Viterbi search, token passing.
  5. Pronunciation dictionaries and language models.
  6. Speech production and derived parameters: LPC, Log area ratios, line spectral pairs.
  7. Speech perception and derived parameters: Mel-frequency cepstral coefficients, Perceptual linear prediction.
  8. Temporal properties of hearing - RASTA filtering.
  9. Training the feature extractor on the data - linear discriminant analysis.
  10. Speech databases: standards, contents, speakers, annotations.
  11. Vocoders and modeling of the excitation: multi-pulse and stochastic excitations (GSM coding).
  12. CELP coding: long-term predictor, codebooks. Very low bit-rate coders.
  13. Current methods of speaker identification and verification.
Fundamental literature:
  1. Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995
  2. Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000
  3. Texts from http://www.fit.vutbr.cz/~cernocky/speech/
Study literature:
  1. Moore, B.C.J.: An introduction to the psychology of hearing, Academic Press, 1989
  2. Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998
  3. Fukunaga, K.: Introduction to Statistical Pattern Recognition, Academic Press, 1990
  4. Vapnik, V. N.: Statistical Learning Theory, Wiley-Interscience, 1998
  5. Dutoit, T.: An Introduction to Text-To-Speech Synthesis, Kluwer Academic Publishers, 1997
