Conference paper

KESIRAJU Santosh, BURGET Lukáš, SZŐKE Igor and ČERNOCKÝ Jan. Learning document representations using subspace multinomial model. In: Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016, pp. 700-704. ISBN 978-1-5108-3313-5. Available from: https://www.researchgate.net/publication/307889473_Learning_Document_Representations_Using_Subspace_Multinomial_Model
Publication language: English
Original title: Learning document representations using subspace multinomial model
Title (cs): Učení reprezentací dokumentů pomocí podprostorového multinomiálního modelu
Pages: 700-704
Proceedings: Proceedings of Interspeech 2016
Conference: Interspeech 2016
Place: San Francisco, US
Year: 2016
URL: https://www.researchgate.net/publication/307889473_Learning_Document_Representations_Using_Subspace_Multinomial_Model
ISBN: 978-1-5108-3313-5
Publisher: International Speech Communication Association
URL: http://www.fit.vutbr.cz/research/groups/speech/publi/2016/kesiraju_interspeech2016_IS161634.pdf [PDF]
Keywords
Document representation, subspace modelling, topic identification, latent topic discovery
Abstract
Subspace multinomial model (SMM) is a log-linear model and can be used for learning low-dimensional continuous representations for discrete data. SMM and its variants have been used for speaker verification based on prosodic features and phonotactic language recognition. In this paper, we propose a new variant of SMM that introduces sparsity and call the resulting model ℓ1 SMM. We show that ℓ1 SMM can be used for learning document representations that are helpful in topic identification or classification and clustering tasks. Our experiments in document classification show that SMM achieves results comparable to models such as latent Dirichlet allocation and sparse topical coding, while having the useful property that the resulting document vectors are Gaussian distributed.
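To illustrate the kind of model the abstract describes (not the authors' actual implementation), the sketch below trains a subspace multinomial model on a word-count matrix: each document d gets a low-dimensional vector w_d, and word probabilities are modeled as softmax(m + T w_d), where m is a background log-unigram vector and T is the subspace basis matrix. The ℓ1 penalty on T is applied here via a soft-threshold (proximal) step after each gradient update; the function name, learning rate, and all hyperparameters are illustrative assumptions, and the paper's actual optimization details may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def smm_fit(X, K=2, n_iter=100, lr=0.05, l1=0.01, seed=0):
    """Minimal sketch of SMM training (hypothetical implementation).

    X : (D, V) matrix of word counts, one row per document.
    K : dimension of the document subspace.
    Returns (m, T, W): background log-probs, bases, document vectors.
    """
    rng = np.random.default_rng(seed)
    D, V = X.shape
    # Background m: smoothed log-unigram distribution over the vocabulary.
    m = np.log(X.sum(axis=0) + 1.0)
    m -= np.log(np.exp(m).sum())
    T = 0.01 * rng.standard_normal((V, K))
    W = np.zeros((D, K))
    for _ in range(n_iter):
        # Update each document vector w_d by gradient ascent on the
        # multinomial log-likelihood: grad = T^T (x_d - n_d * p_d).
        for d in range(D):
            p = softmax(m + T @ W[d])
            W[d] += lr * (T.T @ (X[d] - X[d].sum() * p))
        # Update the bases T with the accumulated gradient ...
        G = np.zeros_like(T)
        for d in range(D):
            p = softmax(m + T @ W[d])
            G += np.outer(X[d] - X[d].sum() * p, W[d])
        T += lr * G / D
        # ... followed by a soft-threshold step for the l1 penalty,
        # which drives many entries of T exactly to zero (sparsity).
        T = np.sign(T) * np.maximum(np.abs(T) - lr * l1, 0.0)
    return m, T, W
```

The learned rows of W are the continuous document representations; the soft-threshold step is what distinguishes this ℓ1-regularized variant from a plain SMM, zeroing out basis entries rather than merely shrinking them.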
BibTeX:
@INPROCEEDINGS{kesiraju2016learning,
   author = {Santosh Kesiraju and Luk{\'{a}}{\v{s}} Burget and Igor
	Sz{\H{o}}ke and Jan {\v{C}}ernock{\'{y}}},
   title = {Learning document representations using subspace multinomial
	model},
   pages = {700--704},
   booktitle = {Proceedings of Interspeech 2016},
   year = {2016},
   location = {San Francisco, US},
   publisher = {International Speech Communication Association},
   ISBN = {978-1-5108-3313-5},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=11269}
}
