| Title: | Natural Language Processing |
|---|
| Code: | ZPJ |
|---|
| Ac.Year: | 2009/2010 |
|---|
| Term: | Winter |
|---|
| Study plans: | |
|---|
| Language: | Czech |
|---|
| Private info: | http://www.fit.vutbr.cz/study/courses/ZPJ/private/ |
|---|
| Credits: | 5 |
|---|
| Completion: | accreditation+exam (written) |
|---|
| Type of instruction: | Lectures | Sem. exercises | Lab. exercises | Comp. exercises | Other |
|---|---|---|---|---|---|
| Hours/sem: | 26 | 0 | 0 | 0 | 26 |
|---|
| | Examination | Tests | Exercises | Laboratories | Other |
|---|---|---|---|---|---|
| Points: | 50 | 10 | 0 | 0 | 40 |
|---|
| Guarantee: | Smrž Pavel, doc. RNDr., Ph.D., DCGM |
|---|
| Lecturer: | Schmidt Marek, Ing., DCGM |
| Instructor: | Schmidt Marek, Ing., DCGM |
|---|
| Faculty: | Faculty of Information Technology BUT |
|---|
| Department: | Department of Computer Graphics and Multimedia FIT BUT |
|---|
| | | Learning objectives: |
|---|
To understand natural language processing and to learn how to apply basic algorithms in this field. To get acquainted with the algorithmic description of the main language levels (morphology, syntax, semantics, and pragmatics) and with corpora as resources of natural language data. To grasp the basics of knowledge representation and inference, and their relation to artificial intelligence.
| | Description: |
|---|
Foundations of natural language processing, language data in corpora, levels of description: phonetics and phonology, morphology, syntax, semantics, and pragmatics. Traditional vs. formal grammars: representation of morphological and syntactic structures, meaning representation. Context-free grammars and their context-sensitive extensions, DCG (Definite Clause Grammars), the CKY (Cocke-Kasami-Younger) algorithm, chart parsing. The problem of ambiguity. Electronic dictionaries: representation of lexical knowledge. Types of machine-readable dictionaries. Semantic representation of sentence meaning. The Compositionality Principle, composition of meaning. Semantic classification: valency frames, predicates, ontologies, transparent intensional logic (TIL) and its application to the semantic analysis of sentences. Pragmatics: semantic and pragmatic nature of noun groups, discourse structure, deictic expressions, verbal and non-verbal contexts. Natural language understanding: semantic representation, inference, and knowledge representation.
| | Knowledge and skills required for the course: |
|---|
Basic knowledge of C/C++ programming or a scripting language (Perl, Python, Ruby).
| | Subject specific learning outcomes and competences: |
|---|
The students will get acquainted with natural language processing and learn how to apply basic algorithms in this field. They will understand the algorithmic description of the main language levels (morphology, syntax, semantics, and pragmatics) and will know corpora as resources of natural language data. They will also grasp the basics of knowledge representation and inference, and their relation to artificial intelligence.
| | Generic learning outcomes and competences: |
|---|
The students will learn to work in a team. They will also improve their programming skills and their knowledge of development tools.
| | Syllabus of lectures: |
|---|
- Introduction, history of NLP, subdisciplines
- How to build a Google-like search engine, text categorization, document similarity
- Morphological analysis, inflectional and derivational morphology, trie structure for dictionaries
- Syntactic analysis, constituent and dependency structures, feature structures, grammar specification formats
- Grammar formalisms, categorial grammars, LFG, HPSG, LTAG
- Methods of syntactic analysis, CKY algorithm, chart parsing
- Corpus linguistics, treebanks, TBL method
- Probabilistic context-free analysis, automatic alignment, machine translation
- Lexical semantics, dictionaries vs. encyclopedias, compositionality
- Transparent intensional logic for the description of meaning
- Pragmatics, contextual meaning relations, dynamic semantics
- Knowledge representation, possible-world semantics, inference
- Semantic Web technologies, ontologies, OWL
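
As an illustration of one syllabus topic, the CKY algorithm, the following is a minimal recognizer sketch in Python. It is not taken from the course materials: the toy grammar, the lexicon, and the names `BINARY_RULES`, `LEXICON`, and `cky_recognize` are assumptions made for this example, and the grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustration only).
# Binary rules map a pair of right-hand-side symbols (B, C)
# to the set of parents A such that A -> B C.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}

# Lexical rules map a word to the set of preterminals that can produce it.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cky_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence from `start`."""
    n = len(words)
    if n == 0:
        return False
    # chart[(i, j)] holds every nonterminal that derives words[i:j].
    chart = defaultdict(set)
    # Fill the diagonal with preterminals from the lexicon.
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(LEXICON.get(word, ()))
    # Combine shorter spans into longer ones, bottom-up.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two halves
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start in chart[(0, n)]
```

Calling `cky_recognize("the dog sees the cat".split())` checks whether the toy grammar accepts the sentence. A full parser would additionally store back-pointers in each chart cell so that parse trees can be reconstructed.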
| | Syllabus - others, projects and individual work of students: |
|---|
- Individually assigned projects
| | Fundamental literature: |
|---|
- Allen, J., Natural Language Understanding, 2nd ed., Benjamin/Cummings Publishing Company, Redwood City, 1995, ISBN 0-8053-0334-0.
- Manning, C. D., Schütze, H., Foundations of Statistical Natural Language Processing, MIT Press, 1999, ISBN 0-262-13360-1.
| | Study literature: |
|---|
- Manning, C. D., Schütze, H., Foundations of Statistical Natural Language Processing, MIT Press, 1999, ISBN 0-262-13360-1.
| | Controlled instruction: |
|---|
The evaluation consists of a mid-term test, an individual project, and the final exam. The mid-term test cannot be retaken; the final exam has two possible resit terms.
| | Progress assessment: |
|---|
- Mid-term test - up to 10 points
- Individual project - up to 40 points
- Written final exam - up to 50 points
| | Exam prerequisites: |
|---|
- Completed individual project