Publication Details

BUT System for the Second DIHARD Speech Diarization Challenge

LANDINI Federico Nicolás, WANG Shuai, DIEZ Sánchez Mireia, BURGET Lukáš, MATĚJKA Pavel, ŽMOLÍKOVÁ Kateřina, MOŠNER Ladislav, SILNOVA Anna, PLCHOT Oldřich, NOVOTNÝ Ondřej, ZEINALI Hossein and ROHDIN Johan A. BUT System for the Second DIHARD Speech Diarization Challenge. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020, pp. 6529-6533. ISBN 978-1-5090-6631-5. Available from: https://ieeexplore.ieee.org/document/9054251
Czech title
Systém VUT pro druhou soutěž DIHARD v diarizaci řeči
Type
conference paper
Language
English
Authors
Landini Federico Nicolás (DCGM FIT BUT)
Wang Shuai (DCGM FIT BUT)
Diez Sánchez Mireia, M.Sc., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Žmolíková Kateřina, Ing., Ph.D. (DCGM FIT BUT)
Mošner Ladislav, Ing. (DCGM FIT BUT)
Silnova Anna, MSc., Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Novotný Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Zeinali Hossein, Ph.D. (DCGM FIT BUT)
Rohdin Johan A., Dr. (DCGM FIT BUT)
URL
Keywords

Speaker Diarization, Variational Bayes, HMM, DIHARD, CHiME

Abstract

This paper describes the winning systems developed by the BUT team for the four tracks of the Second DIHARD Speech Diarization Challenge. For tracks 1 and 2, the systems were mainly based on agglomerative hierarchical clustering (AHC) of x-vectors, followed by a second x-vector clustering step based on a Bayesian hidden Markov model with variational Bayes inference. We compare the improvement given by each step and share the implementation of the core of the system. For tracks 3 and 4, which use recordings from the Fifth CHiME Challenge, we explored different approaches to multi-channel diarization; our best performance was obtained by applying AHC to the fusion of per-channel probabilistic linear discriminant analysis (PLDA) scores.
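
The multi-channel idea in the abstract (fuse per-channel pairwise scores, then run AHC on the fused matrix) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the fusion rule, the linkage criterion, or the stopping threshold, so the element-wise averaging, average linkage, and threshold of 0.0 below are assumptions, as are the names `fuse_scores` and `ahc`. The toy matrices stand in for PLDA scores between segment x-vectors.

```python
from itertools import combinations


def fuse_scores(per_channel_scores):
    """Average per-channel similarity matrices element-wise (assumed fusion rule)."""
    n_channels = len(per_channel_scores)
    n = len(per_channel_scores[0])
    return [
        [sum(ch[i][j] for ch in per_channel_scores) / n_channels for j in range(n)]
        for i in range(n)
    ]


def ahc(scores, threshold):
    """Average-linkage AHC on a similarity matrix (assumed linkage).

    Repeatedly merges the two clusters with the highest average pairwise
    similarity and stops once no pair scores above `threshold`.
    Returns one cluster label per segment.
    """
    clusters = [[i] for i in range(len(scores))]

    def linkage(a, b):
        return sum(scores[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # combinations() yields x < y, so deleting index y keeps x valid.
        (x, y), best = max(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best <= threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy similarity scores for 4 segments as seen from 2 channels (made-up values).
channel_a = [[1.0, 0.8, -0.5, -0.6],
             [0.8, 1.0, -0.4, -0.7],
             [-0.5, -0.4, 1.0, 0.9],
             [-0.6, -0.7, 0.9, 1.0]]
channel_b = [[1.0, 0.6, -0.3, -0.4],
             [0.6, 1.0, -0.6, -0.5],
             [-0.3, -0.6, 1.0, 0.7],
             [-0.4, -0.5, 0.7, 1.0]]

fused = fuse_scores([channel_a, channel_b])
labels = ahc(fused, threshold=0.0)
print(labels)
```

With these toy values, segments 0 and 1 end up in one cluster and segments 2 and 3 in another. In a real system the threshold would be tuned on development data.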

Published
2020
Pages
6529-6533
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), Barcelona, ES
ISBN
978-1-5090-6631-5
Publisher
IEEE Signal Processing Society
Place
Barcelona, ES
DOI
10.1109/ICASSP40776.2020.9054251
UT WoS
000615970406158
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB12281,
   author = "Nicol\'{a}s Federico Landini and Shuai Wang and Mireia S\'{a}nchez Diez and Luk\'{a}\v{s} Burget and Pavel Mat\v{e}jka and Kate\v{r}ina \v{Z}mol\'{i}kov\'{a} and Ladislav Mo\v{s}ner and Anna Silnova and Old\v{r}ich Plchot and Ond\v{r}ej Novotn\'{y} and Hossein Zeinali and A. Johan Rohdin",
   title = "{BUT} System for the Second {DIHARD} Speech Diarization Challenge",
   pages = "6529--6533",
   booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
   year = 2020,
   location = "Barcelona, ES",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5090-6631-5",
   doi = "10.1109/ICASSP40776.2020.9054251",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12281"
}