Title:

Speech Signal Processing

Code:ZREe
Ac.Year:2017/2018
Term:Summer
Curricula:
Programme   Branch  Year  Duty
IT-MGR-1H   MGH     -     Recommended
IT-MSC-2    MBI     -     Compulsory-Elective - group S
IT-MSC-2    MBS     -     Elective
IT-MSC-2    MGM     1st   Compulsory
IT-MSC-2    MIN     -     Compulsory-Elective - group C
IT-MSC-2    MIS     -     Elective
IT-MSC-2    MMI     -     Compulsory-Elective - group S
IT-MSC-2    MMM     -     Elective
IT-MSC-2    MPV     -     Compulsory-Elective - group G
IT-MSC-2    MSK     2nd   Compulsory-Elective - group B
Language:English
News:
* This course is intended for incoming Erasmus+ students only and is taught in English.
* The course will open only if at least five students enroll.

Credits:5
Completion:examination (written and oral)
Type of instruction:
Hour/sem  Lectures  Sem. exercises  Lab. exercises  Comp. exercises  Other
Hours:    26        0               0               26               0

          Examination  Tests  Exercises  Laboratories  Other
Points:   75           25     0          0             0
Guarantee:Černocký Jan, doc. Dr. Ing., DCGM
Lecturer:Černocký Jan, doc. Dr. Ing., DCGM
Grézl František, Ing., Ph.D., DCGM
Instructor:Grézl František, Ing., Ph.D., DCGM
Faculty:Faculty of Information Technology BUT
Department:Department of Computer Graphics and Multimedia FIT BUT
 
Learning objectives:
  To provide students with knowledge of the basic characteristics of the speech signal in relation to human speech production and hearing. To describe the basic algorithms of speech analysis that are common to many applications. To give an overview of applications (recognition, synthesis, coding) and to cover practical aspects of implementing speech algorithms.
Description:
  Applications of computer speech processing, digital processing of speech signals, speech production and hearing, introduction to phonetics, pre-processing and basic parameters, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM, synthesis. Software and libraries for speech processing.
Knowledge and skills required for the course:
  Solid knowledge of basic mathematics and signal processing (Fourier transform, linear filtering, random signals).
Learning outcomes and competences:
  Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic algorithms of speech analysis that are common to many applications. They will get an overview of applications (recognition, synthesis, coding) and of the practical aspects of implementing speech algorithms. They will be able to design a simple speech processing system (a speech activity detector, a recognizer for a limited number of isolated words) and integrate it into application programs.
Syllabus of lectures:
 
  • Introduction, applications of speech processing, disciplines relevant to speech processing, information content of speech.
  • Digital processing of speech signals.
  • Speech production and perception, basic notions from psycho-acoustics, applications in speech processing.
  • Introduction to phonetics, international standards for phoneme notation.
  • Pre-processing and basic parameters of speech.
  • Linear-predictive model, spectrum using LP, applications of LP.
  • Cepstral analysis, Mel-frequency cepstrum.
  • Determination of fundamental frequency.
  • Speech coding.
  • Speech recognition - dynamic programming (DTW), hidden Markov models (HMM).
  • Speech synthesis.
  • Software and libraries for speech processing.
Syllabus of numerical exercises:
 
  • Parameterization, DTW, HMM.
  • Presentation of projects.
Syllabus of computer exercises:
 
    Matlab is used in all labs except the last one.
  • Frames, windows, spectrum, pre-processing.
  • Linear prediction (LPC).
  • Fundamental frequency estimation.
  • Coding.
  • Recognition - dynamic time warping (DTW).
  • Recognition - hidden Markov models (Hidden Markov Model Toolkit - HTK).
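
For orientation, the DTW recognition topic listed above can be sketched as follows. This is a minimal illustration only, not the course's lab material: it is written in Python rather than the Matlab used in the labs, and the names `dtw_distance`, `recognize`, and `templates` are hypothetical. It assumes each utterance is already parameterized as a sequence of equal-length feature vectors, uses Euclidean local distance, and applies no path constraints or length normalization.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance
            # Allowed steps: vertical, horizontal, diagonal (assumed, unweighted)
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the reference template closest to `test`."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

An isolated-word recognizer in this style keeps one or more reference templates per word and reports the label with the lowest accumulated cost; a practical system would typically also normalize the cost by path length.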
Study literature:
 
  • Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7
Progress assessment:
  
  • mid-term test
  • presentation of projects
  • presentation of results in computer labs