Thesis Details
Kontrola konzistence informací extrahovaných z textu (Consistency Checking of Information Extracted from Text)
This bachelor's thesis deals with automated techniques used in natural language processing and in information extraction from text. It covers the general methods, starting with the processing of raw text and continuing to the extraction of relations from the processed language constructs, and it outlines how the resulting relational data can be used, as seen for example in the DBpedia project. A further part of the thesis is the design and implementation of an automated system for extracting information about entities that do not have their own article in the English version of Wikipedia. The thesis also presents the algorithms developed for extracting named entities, for verifying whether articles exist for the extracted entities, and finally for extracting information about individual entities, which can be used for information consistency checking. The thesis concludes with the results and suggestions for further development of the system.
Wikipedia, corpus, DBpedia, coreference, information extraction, NLP, named entity recognition, Open Information Extraction, consistency checking, entity extraction
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), member
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT), member
Vašíček Zdeněk, doc. Ing., Ph.D. (DCSY FIT BUT), member
Zbořil František, doc. Ing., Ph.D. (DITS FIT BUT), member
@bachelorsthesis{FITBT18808, author = "Jakub Stejskal", type = "Bachelor's thesis", title = "Kontrola konzistence informac\'{i} extrahovan\'{y}ch z textu", school = "Brno University of Technology, Faculty of Information Technology", year = 2016, location = "Brno, CZ", language = "czech", url = "https://www.fit.vut.cz/study/thesis/18808/" }