Thesis Details
Pokročilé metody strojového učení pro klasifikaci textu
Bachelor's Thesis
Student: Dočekal Martin
Academic Year: 2016/2017
Supervisor: Smrž Pavel, doc. RNDr., Ph.D.
English title
Advanced Machine-Learning Methods for Text Classification
Language
Czech
Abstract
This thesis deals with advanced machine-learning methods for text classification. The methods are described first, and a text classification system is then built on them. The system also provides tools for document preprocessing and classifier evaluation. The thesis describes the use of the system in a real-life task.
Keywords
machine learning, feature extraction, Bag-of-words, TF-IDF, word2vec, doc2vec, feature hashing, k-nearest neighbors, Multinomial Naive Bayes, Support Vector Machine, classification, classifier evaluation, preprocessing, ensemble classification methods, balancing algorithms
Department
Degree Programme
Information Technology
Status
defended, grade B
Date
14 June 2017
Reviewer
Committee
Zbořil František V., doc. Ing., CSc. (DITS FIT BUT), chair
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT), member
Hliněná Dana, doc. RNDr., Ph.D. (DMAT FEEC BUT), member
Matoušek Petr, doc. Ing., Ph.D., M.A. (DIFS FIT BUT), member
Zachariášová Marcela, Ing., Ph.D. (DCSY FIT BUT), member
Citation
DOČEKAL, Martin. Pokročilé metody strojového učení pro klasifikaci textu. Brno, 2017. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2017-06-14. Supervised by Smrž Pavel. Available from: https://www.fit.vut.cz/study/thesis/19757/
BibTeX
@bachelorsthesis{FITBT19757,
  author = "Martin Do\v{c}ekal",
  type = "Bachelor's thesis",
  title = "Pokro\v{c}il\'{e} metody strojov\'{e}ho u\v{c}en\'{i} pro klasifikaci textu",
  school = "Brno University of Technology, Faculty of Information Technology",
  year = 2017,
  location = "Brno, CZ",
  language = "czech",
  url = "https://www.fit.vut.cz/study/thesis/19757/"
}