Title: Modern Methods of Speech Processing

Code: MZD
Ac. Year: 2003/2004
Sem: Summer
Language of Instruction: Czech
Completion: examination (written)
Type of instruction (hours per semester):
  Lectures: 39
  Seminar Exercises: 0
  Laboratory Exercises: 0
  Computer Exercises: 0
  Other: 0
Points:
  Exams: 0
  Tests: 0
  Exercises: 0
  Laboratories: 0
  Other: 0
Guarantor: Černocký Jan, doc. Dr. Ing. (DCGM)
Lecturer: Černocký Jan, doc. Dr. Ing. (DCGM)
Faculty: Faculty of Information Technology BUT
 
Description:
  From simple systems to stochastic modelling. Hidden Markov models. Large vocabulary continuous speech recognition. Language models. Speech production, speech perception: time and frequency. Data-driven methods for feature extraction. Speech databases. Excitation in speech coding, CELP. Speaker identification.
Learning outcomes and competencies:
  This course enables students to implement simple speech processing applications, such as voice command of a process. More importantly, it prepares them to join the development of complex speech recognition and coding systems that use modern methods, in both academic and industrial environments.
Syllabus of lectures:
 
  1. Review of notions: signal vectors and parameter matrices, basic statistics.
  2. Stochastic modeling of parameters, modeling of time by state sequences.
  3. Hidden Markov models: basic structure, training.
  4. Recognition of speech using HMM: Viterbi search, token passing.
  5. Pronunciation dictionaries and language models.
  6. Speech production and derived parameters: LPC, log area ratios, line spectral pairs.
  7. Speech perception and derived parameters: mel-frequency cepstral coefficients, perceptual linear prediction.
  8. Temporal properties of hearing - RASTA filtering.
  9. Training the feature extractor on the data - linear discriminant analysis.
  10. Speech databases: standards, contents, speakers, annotations.
  11. Vocoders and modeling of the excitation: multi-pulse and stochastic excitations (GSM coding).
  12. CELP coding: long-term predictor, codebooks. Very low bit-rate coders.
  13. Current methods of speaker identification and verification.
Fundamental literature:
 
  1. Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995
  2. Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000
  3. Texts from http://www.fit.vutbr.cz/~cernocky/speech/
Study literature:
 
  1. Moore, B.C.J.: An introduction to the psychology of hearing, Academic Press, 1989
  2. Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998
  3. Fukunaga, K.: Introduction to Statistical Pattern Recognition, Academic Press, 1990
  4. Vapnik, V. N.: Statistical Learning Theory, Wiley-Interscience, 1998
  5. Dutoit, T.: An Introduction to Text-To-Speech Synthesis, Kluwer Academic Publishers, 1997
 
