| Title: | Speech Processing Systems |
|---|
| Code: | SRE |
|---|
| Ac.Year: | 2009/2010 |
|---|
| Term: | Winter |
|---|
| Study plans: | |
|---|
| Language: | Czech, English |
|---|
| Public info: | http://www.fit.vutbr.cz/study/courses/SRE/public/ |
|---|
| Credits: | 5 |
|---|
| Completion: | examination (written) |
|---|
| Type of instruction (hours per semester): | Lectures | Sem. Exercises | Lab. exercises | Comp. exercises | Other |
|---|
| Hours: | 39 | 0 | 0 | 0 | 13 |
|---|
| | Examination | Tests | Exercises | Laboratories | Other |
|---|
| Points: | 50 | 15 | 0 | 0 | 35 |
|---|
| Guarantor: | Černocký Jan, doc. Dr. Ing., DCGM |
|---|
| Lecturer: | Burget Lukáš, Ing., Ph.D., DCGM; Černocký Jan, doc. Dr. Ing., DCGM; Fapšo Michal, Ing., DCGM; Glembek Ondřej, Ing., DCGM; Matějka Pavel, Ing., Ph.D., DCGM; Schwarz Petr, Ing., Ph.D., DCGM; Smrž Pavel, doc. RNDr., Ph.D., DCGM |
| Faculty: | Faculty of Information Technology BUT |
|---|
| Department: | Department of Computer Graphics and Multimedia FIT BUT |
|---|
| Prerequisites: | |
|---|
| | | Learning objectives: |
|---|
To extend the knowledge of the structure of language (phonetics,
phonology) and to acquire the basics of statistical classifiers. To get
acquainted with advanced methods of speech recognition and coding, and
with advanced methods of language modeling and syntactic analysis.
| | Description: |
|---|
Phonetics and phonology. Statistical pattern recognition. HMM training
and adaptation. HMM recognition. Phoneme recognition. Keyword spotting
and search. Speaker identification and verification. Language
identification. CELP speech coding. Language modeling.
Psycholinguistics. Probabilistic parsing.
| | Learning outcomes and competences: |
|---|
Students will extend the knowledge acquired in the basic speech signal
processing and natural language processing courses toward modern
methods. They will get acquainted with methods currently deployed in
industrial applications (GSM telephones or commercially available
recognizers) and with promising methods from the research environment.
They will deepen their knowledge of natural language processing and
language modeling. The course enables students to implement simple
speech processing applications, for example voice command of a process.
Above all, it enables them to join the development of complex speech
recognition and coding systems in both academic and industrial
environments.
| | Syllabus of lectures: |
|---|
- Phonetics and phonology - syllable structure, phonological processes and distinctive features.
- Statistical pattern classification I. - Bayesian framework, Maximum
likelihood learning, Gaussian mixture models. Features for GMM modeling.
- Statistical pattern classification II. - Artificial Neural
Networks, Support vector machines. Sequence modeling - Hidden Markov
models.
- HMM training and adaptation - MLLR, MAP, discriminative training.
- HMM recognition - pronunciation dictionaries and networks, language modeling, decoding, lattices.
- Phoneme recognition. Keyword spotting and search - LVCSR, acoustic and phonetic lattices. Figure of Merit.
- Speaker identification and verification - GMM, SVM. Channel
normalization and compensation - feature mapping, eigenvoices and
nuisance attribute projection (NAP). Evaluation of speaker verification: DET curves, EER, cost function.
- Language identification - acoustic vs. phonotactic, evaluation.
- Speech coding - CELP framework - adaptive and stochastic codebooks, GSM standards.
- Language modeling 1 - n-gram models, class-based models.
- Language modeling 2 - language-specific features, factored language models.
- Psycholinguistics - word recognition models, word associations.
- Probabilistic parsing - inside-outside algorithm, dependency parsing.
| | Fundamental literature: |
|---|
- Gussenhoven, C. and Jacobs, H.: Understanding Phonology, Oxford University Press, 1998, ISBN: 0-340-69218-9
- Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0.
- Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.
- Moore, B.C.J.: An introduction to the psychology of hearing, Academic Press, 1989, ISBN 0-12-505627-3.
- Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998, ISBN 0-262-10066-5.
- Manning, C. and Schütze, H.: Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 1999.
| | Study literature: |
|---|
- Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.
| | Progress assessment: |
|---|
- mid-term test - 20 pts
- presentation of projects - 30 pts
- exam - 50 pts