Publication Details

Noise-robust speech triage

BARTOS Anthony L., CIPR Tomáš, NELSON Douglas J., SCHWARZ Petr, BANOWETZ John and JERABEK Ladislav. Noise-robust speech triage. Journal of the Acoustical Society of America, vol. 143, no. 4, 2018, pp. 2313-2320. ISSN 1520-8524. Available from: https://asa.scitation.org/doi/10.1121/1.5031029

Czech title

Třídění řeči odolné vůči šumu

Type

journal article

Language

English

Authors

Bartos Anthony L. (SRMA)
Cipr Tomáš, Ing. (Phonexia)
Nelson Douglas J. (DOD)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Banowetz John (NRL)
Jerabek Ladislav (SRMA)

URL

https://asa.scitation.org/doi/10.1121/1.5031029

Keywords

speech algorithms, noisy environments, multiple speaker identification

Abstract

A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise ratio (SNR) levels and then performing SID using the appropriate SNR-dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort, multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization was significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
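
The selection logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an SNR estimate is already available for each recording, and it only shows how a recording might be routed to one of two SAD algorithms and to the SID model trained at the nearest SNR level. The lattice values, the function names, and the labels `envelope_sad` and `feature_sad` are hypothetical; only the 10 dB switching point comes from the abstract.

```python
from bisect import bisect_left

# Hypothetical lattice of SNR levels (dB) at which SID models were pre-trained.
# The abstract does not list the actual levels.
SNR_LATTICE_DB = [-10, -5, 0, 5, 10, 15, 20]

# Switching point between the two SAD algorithms, taken from the abstract.
SAD_SWITCH_DB = 10.0


def nearest_lattice_snr(snr_db, lattice=SNR_LATTICE_DB):
    """Return the lattice level closest to snr_db (ties go to the lower level)."""
    i = bisect_left(lattice, snr_db)
    if i == 0:
        return lattice[0]
    if i == len(lattice):
        return lattice[-1]
    lo, hi = lattice[i - 1], lattice[i]
    return lo if snr_db - lo <= hi - snr_db else hi


def choose_sad(snr_db):
    """Pick the SAD variant: envelope-based below 10 dB, feature-based otherwise."""
    return "envelope_sad" if snr_db < SAD_SWITCH_DB else "feature_sad"


def triage_plan(snr_db):
    """Combine both decisions for one recording with an estimated SNR."""
    return {
        "sad": choose_sad(snr_db),
        "sid_model_snr_db": nearest_lattice_snr(snr_db),
    }


if __name__ == "__main__":
    for snr in (-12.0, 3.2, 10.0, 27.5):
        print(snr, triage_plan(snr))
```

Matching each recording to the closest pre-trained level reflects the abstract's observation that SID performed best when the SNR of the testing and training data were close or identical.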

Published

2018

Pages

2313-2320

Journal

Journal of the Acoustical Society of America, vol. 143, no. 4, ISSN 1520-8524

Publisher

American Institute of Physics for the Acoustical Society of America

DOI

10.1121/1.5031029

UT WoS

000430570900039

EID Scopus

2-s2.0-85045888415

BibTeX

@ARTICLE{FITPUB11716,
   author = "Anthony L. Bartos and Tom\'{a}\v{s} Cipr and Douglas J. Nelson and Petr Schwarz and John Banowetz and Ladislav Jerabek",
   title = "Noise-robust speech triage",
   pages = "2313--2320",
   journal = "Journal of the Acoustical Society of America",
   volume = 143,
   number = 4,
   year = 2018,
   ISSN = "1520-8524",
   doi = "10.1121/1.5031029",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11716"
}