Thesis Details

Automatická detekce jazyka textového dokumentu

Bachelor's Thesis
Student
Cakl Jan
Academic Year
2015/2016
Supervisor
Szőke Igor, Ing., Ph.D.
English title
Language Identification of Text Document
Language
Czech
Abstract
The thesis deals with language identification of text documents. The resulting program implements three different identification methods. The first is based on N-gram frequency statistics, the second uses Markov chains, and the third uses a simulated neural network. The program is written in Python.
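To illustrate the first method named in the abstract, here is a minimal sketch of language identification from character N-gram frequency statistics. It is not the thesis implementation: the function names (`ngram_profile`, `cosine`, `identify`), the choice of trigrams, the cosine-similarity comparison, and the tiny training samples are all assumptions made for this example.

```python
import math
from collections import Counter


def ngram_profile(text, n=3):
    """Return relative frequencies of character n-grams in text."""
    # Lowercase and collapse whitespace so spacing and case do not matter.
    text = " ".join(text.lower().split())
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(gram, 0.0) for gram, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def identify(text, profiles, n=3):
    """Return the language whose profile is most similar to the text."""
    target = ngram_profile(text, n)
    return max(profiles, key=lambda lang: cosine(target, profiles[lang]))


# Hypothetical toy training data; a real system would use large corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog "
          "and the cat sleeps on the mat",
    "cs": "příliš žluťoučký kůň úpěl ďábelské ódy "
          "a pes spal na gauči v obývacím pokoji",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(identify("the dog sleeps on the mat", PROFILES))
    print(identify("pes spal na gauči", PROFILES))
```

With training text this small the sketch only separates inputs that share trigrams with one sample; larger per-language corpora and a rank-based or smoothed comparison would likely be needed in practice.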
Keywords

N-gram, artificial neural network, language identification, Markov chains

Department
Degree Programme
Information Technology
Status
defended, grade B
Date
15 June 2016
Reviewer
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Drahanský Martin, prof. Ing., Dipl.-Ing., Ph.D. (DITS FIT BUT), member
Rychlý Marek, RNDr., Ph.D. (DIFS FIT BUT), member
Španěl Michal, Ing., Ph.D. (DCGM FIT BUT), member
Citation
CAKL, Jan. Automatická detekce jazyka textového dokumentu. Brno, 2016. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2016-06-15. Supervised by Szőke Igor. Available from: https://www.fit.vut.cz/study/thesis/18569/
BibTeX
@bachelorsthesis{FITBT18569,
    author = "Jan Cakl",
    type = "Bachelor's thesis",
    title = "Automatick\'{a} detekce jazyka textov\'{e}ho dokumentu",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2016,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/18569/"
}