Tutorials

  1. Auditory processing of speech in noise
  2. Introduction to Statistical Machine Translation
  3. Advanced learning techniques for NLP
  4. Vocal dialogue systems: an introduction and system development
  5. Speech recognition based on Hidden Markov Models
  6. Phoneme posterior estimation and acoustic keyword-spotting
  7. Dictionary Editor and Browser - How to make your own electronic dictionary
  8. Corpus Architect: Building text corpora
  9. Information Extraction and Ontology Learning from Texts
  10. Speech Synthesis Architectures and Speech Synthesis Evaluation

1. Auditory processing of speech in noise
    Robert Mill, Sheffield University, United Kingdom  [2 slots]

Speech and other sound sources are encoded as a time-varying pattern of spikes in the auditory nerve. However, it is not yet certain which features of this pattern are responsible for conveying information in speech. On the one hand, the average firing rate in each fibre can be considered to be an internal representation or an auditory spectrum. Alternatively, the time intervals between spikes also reflect locally dominant spectral components and can be processed to produce an interval-based spectrum. The situation is made more complex by the presence of a number of important nonlinearities in the processing chain within the cochlea. One of these ensures that the rate response saturates at moderate to high signal levels. Consequently, it is difficult to understand how a rate-based representation can encode stimuli such as vowels at high intensities, or with added noise, since most of the auditory nerve fibres will show a saturated rate response, leading to a flat spectral representation and an absence of formant peaks. An interval-based representation, on the other hand, shows no such saturation.
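
The saturation argument can be illustrated with a minimal Python sketch. It is not one of the MATLAB functions provided in the tutorial: the sigmoid rate-level function, its parameters (threshold, dynamic range, spontaneous and saturated rates) and the chosen levels are all hypothetical values picked only to show the effect. The same 15 dB difference between a formant peak and a spectral valley is presented at a moderate and at a high overall level.

```python
import math

def firing_rate(level_db, threshold_db=20.0, dynamic_range_db=30.0,
                spontaneous=5.0, saturated=200.0):
    """Hypothetical sigmoid rate-level function (spikes/s) for one fibre."""
    midpoint = threshold_db + dynamic_range_db / 2.0
    slope = 8.0 / dynamic_range_db  # assumed steepness, not fitted to data
    drive = 1.0 / (1.0 + math.exp(-slope * (level_db - midpoint)))
    return spontaneous + (saturated - spontaneous) * drive

def peak_to_valley(formant_db, valley_db):
    """Rate difference between a channel on a formant peak and one in a valley."""
    return firing_rate(formant_db) - firing_rate(valley_db)

# The same 15 dB spectral contrast at a moderate and at a high overall level.
moderate = peak_to_valley(formant_db=40.0, valley_db=25.0)
loud = peak_to_valley(formant_db=90.0, valley_db=75.0)
print(f"moderate level contrast: {moderate:.1f} spikes/s")
print(f"high level contrast:     {loud:.3f} spikes/s")
```

Under these assumptions the rate contrast almost vanishes at the high level, which is the flat rate-based spectrum described above.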

The purpose of this tutorial is to build computational models for both rate- and interval-based processing, and to evaluate their responses to speech in various levels of background noise. Results will be compared with physiological data. Various components of an auditory model will be provided as MATLAB functions. If time permits, a number of different approaches to the estimation of dominant frequencies can be explored, including Seneff's generalised synchrony detector, Ghitza's ensemble interval histogram and Cooke's instantaneous frequency strands.

The project is suited to either individual or paired study. Some basic familiarity with MATLAB is needed for at least one member of the team.

2. Introduction to Statistical Machine Translation
    Rafael E. Banchs,  [2 slots]

This tutorial presents the basic concepts in Statistical Machine Translation (SMT) from both theoretical and practical points of view. The theory session includes a brief review of machine translation history and a complete description of the state of the art. Different approaches to machine translation will be briefly reviewed, but particular attention will be paid to the statistical approach. Some of the key concepts in SMT, such as word alignment, model estimation, and decoding, will be discussed in detail. These concepts and their related problems will be illustrated with some simple exercises. The lab session includes guided SMT exercises using freely available software tools and data. After the tutorial, students will be familiar with some of the resources for SMT research that are available online.
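
As an illustration of word alignment and model estimation, here is a minimal Python sketch of EM training for IBM Model 1 lexical translation probabilities. It is not taken from the tutorial material: the function name, the three-sentence toy corpus and the omission of the NULL word are assumptions made to keep the example short.

```python
from collections import defaultdict

def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(f, e)] ~ P(f | e) with EM (no NULL word, for brevity)."""
    f_vocab = {f for _, f_words in pairs for f in f_words}
    # Uniform start over co-occurring word pairs only.
    t = {(f, e): 1.0 / len(f_vocab)
         for e_words, f_words in pairs for f in f_words for e in e_words}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for e_words, f_words in pairs:
            for f in f_words:
                # E-step: share one count for f among the words it may align to.
                norm = sum(t[(f, e)] for e in e_words)
                for e in e_words:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts per conditioning word e.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t

# Hypothetical toy corpus of (English, German) sentence pairs.
corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]
t = train_ibm_model1(corpus)
for f in ("das", "haus", "buch"):
    print(f"t({f} | the) = {t[(f, 'the')]:.3f}")
```

Although no alignment is given in the data, the probability mass for "the" concentrates on "das", because "das" is the only word that co-occurs with "the" in every sentence pair containing it.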

3. Advanced learning techniques for NLP
    Lubos Popelinsky, KDLab FI MU  [2 slots]

Inductive logic programming (ILP) aims at learning first-order predicate formulae from positive and, optionally, negative examples. Unlike most other learning methods, this technique is not limited to single-table data and is especially suitable for data of complex structure. ILP has been successful in part-of-speech tagging (English, Swedish, Spanish, Czech), error detection in a morphologically tagged Czech corpus, text categorization and information extraction.

The aim of the tutorial is to give participants practical experience of using ILP for several NLP tasks.

Summary

  1. A brief overview of ILP
  2. ILP for Part-of-Speech Tagging
    Case studies: POS tagging for English;
    Error detection in a Czech corpus
  3. ILP for Text filtering and Information Extraction
    Case studies: Filtering situations and actions from news reports;
    Learning agent-target from biomedical texts
  4. First-order frequent patterns and association rules for NLP

Exercises:

Learning rules from English text that has been morphologically and syntactically tagged with the Memory Based Shallow Tagger/Parser.

4. Vocal dialogue systems: an introduction and system development
    Martin Rajman and Miroslav Melichar, EPFL, Lausanne, Switzerland  [2 slots]

The tutorial provides an introduction to the domain of human-machine vocal dialogue systems. We give an overview of dialogue system architectures and approaches to dialogue management. Frame-based approaches are discussed in detail. During a hands-on exercise, participants will have a chance to build and test a dialogue system for controlling home devices by voice. The dialogue system will be created using a framework for rapid dialogue prototyping developed at EPFL.
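
The idea behind frame-based dialogue management can be sketched in a few lines of Python. This is not the EPFL rapid dialogue prototyping framework and does not reflect its API: the Frame class, the slot names and the prompts are hypothetical, and speech recognition and language understanding are replaced by ready-made dictionaries.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    """A frame is a set of named slots that must be filled before acting."""
    prompts: dict                      # slot name -> question asked to fill it
    values: dict = field(default_factory=dict)

    def update(self, parsed):
        # Accept any known slot the user mentioned, in any order.
        for slot, value in parsed.items():
            if slot in self.prompts:
                self.values[slot] = value

    def next_prompt(self):
        # Ask for the first slot that is still empty; None means the frame is full.
        for slot, prompt in self.prompts.items():
            if slot not in self.values:
                return prompt
        return None

frame = Frame(prompts={
    "device": "Which device do you want to control?",
    "action": "Should I switch it on or off?",
})
frame.update({"action": "on"})     # hypothetical output of an understanding module
first = frame.next_prompt()        # the device slot is still missing
frame.update({"device": "lamp"})
done = frame.next_prompt()         # nothing left to ask
print(first, done, frame.values)
```

The dialogue manager in this sketch is driven by what is still missing from the frame rather than by a fixed question order, so the user can supply the slots in any order.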

Further reading: http://liawww.epfl.ch/Publications/Archive/MartinRajman2004.pdf

Download

1. Introduction to Dialogue Systems
(http://icwww.epfl.ch/~melichar/RDP-course/IntroductionToDialogueSystems.pdf)

2. Rapid Dialogue Prototyping
(http://icwww.epfl.ch/~melichar/RDP-course/RapidDialogPrototyping.pdf)

3. Practical exercise
(http://icwww.epfl.ch/~melichar/RDP-course/)

5. Speech recognition based on Hidden Markov Models
    Honza Cernocky, Brno University of Technology, Czech Republic  [2 slots]

This tutorial will provide the student with the basics of speech recognition based on Hidden Markov Models (HMMs). The tutorial starts with a lecture giving basic notions of HMMs (structure, probabilities and probability density functions, Baum-Welch training, Viterbi decoding).
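
Viterbi decoding can be illustrated with a short Python sketch. It is independent of HTK and simplifies matters: emissions come from a discrete table instead of probability density functions, and the two-state model with its probabilities is a hypothetical toy example.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence and its log probability."""
    # best[t][s]: best log score of any path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def logs(table):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: silence versus speech, observed via frame energy.
states = ["sil", "speech"]
log_start = logs({"sil": 0.8, "speech": 0.2})
log_trans = {"sil": logs({"sil": 0.7, "speech": 0.3}),
             "speech": logs({"sil": 0.2, "speech": 0.8})}
log_emit = {"sil": logs({"low": 0.9, "high": 0.1}),
            "speech": logs({"low": 0.2, "high": 0.8})}

path, score = viterbi(["low", "high", "high", "low"],
                      states, log_start, log_trans, log_emit)
print(path, math.exp(score))
```

Working in log probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.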

In the PC-lab part, the students will create a simple connected-digits recognizer. The software used in this tutorial will be HTK. The lab will include all stages needed to build the recognizer, such as preparation of speech and label files, feature extraction, training, recognition and evaluation.

6. Phoneme posterior estimation and acoustic keyword-spotting
    Igor Szoke, Brno University of Technology, Czech Republic  [2 slots]

This tutorial will provide students with the theory and practice of phoneme recognition and posterior estimation, and with one useful application of phoneme posteriors - acoustic keyword spotting (KWS).

The first part of the lecture deals with phoneme recognition based on long temporal trajectories and neural networks. The second part concentrates on acoustic keyword spotting: keyword model, background model, log likelihood ratio and thresholding are covered.
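
The log likelihood ratio and thresholding step can be sketched in Python as follows. This is not SNet or SLRatio: the function name, the per-frame scores and the threshold are hypothetical, and a real system would typically score whole hypothesised keyword segments instead of single frames.

```python
def detect_keyword(keyword_loglik, background_loglik, threshold):
    """Flag frames where the keyword model beats the background model."""
    detections = []
    for frame, (kw, bg) in enumerate(zip(keyword_loglik, background_loglik)):
        llr = kw - bg                  # log likelihood ratio for this frame
        if llr > threshold:
            detections.append((frame, llr))
    return detections

# Hypothetical per-frame log likelihoods from the two models.
keyword_scores = [-12.0, -9.5, -4.0, -3.5, -11.0]
background_scores = [-8.0, -8.5, -7.0, -7.5, -8.0]

hits = detect_keyword(keyword_scores, background_scores, threshold=2.0)
print(hits)
```

Raising the threshold trades missed keywords for fewer false alarms, which is the trade-off summarised by measures such as the Figure-of-Merit.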

The practical work in the PC labs starts with a phonetically labeled database on which the students will train their own net for phoneme posterior estimation. The KWS part covers setting up a system for the detection of a few keywords and its evaluation using the Figure-of-Merit (FOM). The software used in the labs will be SNet and SLRatio (both part of the STK toolkit developed at Brno University of Technology).

7. Dictionary Editor and Browser - How to make your own electronic dictionary
    Ales Horak and Adam Rambousek, Masaryk University, Czech Republic  [1 slot]

During the tutorial, the Dictionary Editor and Browser (DEB) platform will be introduced with demonstrations of existing DEB applications (DEBDict, DEBVisDic), which have several hundred users all over the world. The new DEB administration interface will be presented in a practical exercise in which students prepare an instant dictionary writing application of their own design.

8. Corpus Architect: Building text corpora
    Jan Pomikalek, Masaryk University, Czech Republic  [1 slot]

The tutorial will explain how to compile text corpora from both your own textual sources and texts on the web using the Corpus Architect system. The architecture of the system will be briefly described and its functionality demonstrated on simple examples. We will also explain how to explore the built corpora in the Sketch Engine corpus manager and cover the basics of CQL (Corpus Query Language).

9. Information Extraction and Ontology Learning from Texts
    Pavel Smrz and Marek Schmidt, Brno University of Technology, Czech Republic  [1 slot]

This tutorial presents the basics of information extraction and ontology learning from texts. The common work will be task-driven: given a set of domain-specific texts (from Wikipedia), the systems developed by students should be able to answer simple questions such as: Who plays piano on the track "Morning Hollow" (Dreamt for Light Years in the Belly of a Mountain by Sparklehorse)? What musical instruments can one hear on The Jethro Tull Christmas Album? What is Coldplay's first album? We will briefly introduce the methods that can be applied, as well as available software tools. Students will learn how to work with GATE (http://gate.ac.uk/) and other advanced language-engineering packages to carry out the given tasks.

10. Speech Synthesis Architectures and Speech Synthesis Evaluation
    Petra Wagner, Universitaet Bonn, Germany  [1 slot]

This tutorial will offer a brief overview of speech synthesis technologies with a focus on unit selection systems. The second part of the tutorial will focus on the challenge of evaluating state-of-the-art synthesis systems. In the practical part, the plan is to carry out a diagnostic speech synthesis evaluation with the help of a perception test.