Thesis Details

Kontrola konzistence informací extrahovaných z textu

Bachelor's Thesis
Student: Stejskal Jakub
Academic Year: 2015/2016
Supervisor: Smrž Pavel, doc. RNDr., Ph.D.

English title

Consistency Checking of Relations Extracted from Text

Language

Czech

Abstract

This bachelor's thesis deals with automated techniques used in natural language processing and in information extraction from text. It first covers general methods for processing raw text and then moves on to extracting relations from the processed language constructs. It also outlines possible uses of the resulting relational data, as seen for example in the DBpedia project. A further part of the thesis is the design and implementation of an automated system for extracting information about entities that do not have their own article in the English version of Wikipedia. The thesis presents the algorithms developed for extracting named entities, for verifying whether an article exists for each extracted entity, and for extracting information about the individual entities, which can then be used for information consistency checking. It concludes with the results and with suggestions for further development of the system.

Keywords

Wikipedia, corpus, DBpedia, coreference, information extraction, NLP, named entity recognition, Open Information Extraction, consistency checking, entity extraction

Department

Department of Computer Graphics and Multimedia FIT BUT

Degree Programme

Information Technology

Status

defended, grade E

Date

15 June 2016

Reviewer

Otrusina Lubomír, Ing.

Committee

Kolář Dušan, doc. Dr. Ing. (DIFS FIT BUT), chair
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), member
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT), member
Vašíček Zdeněk, doc. Ing., Ph.D. (DCSY FIT BUT), member
Zbořil František, doc. Ing., Ph.D. (DITS FIT BUT), member

Citation

STEJSKAL, Jakub. Kontrola konzistence informací extrahovaných z textu. Brno, 2016. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2016-06-15. Supervised by Smrž Pavel. Available from: https://www.fit.vut.cz/study/thesis/18808/

BibTeX

@bachelorsthesis{FITBT18808,
    author = "Jakub Stejskal",
    type = "Bachelor's thesis",
    title = "Kontrola konzistence informac\'{i} extrahovan\'{y}ch z textu",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2016,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/18808/"
}
