Title: Natural Language Processing (in English)
Code: ZPJa
Ac. Year: 2019/2020
Sem: Winter
Curriculums (Programme-Field/Specialization-Duty):
IT-MSC-2MBI-Compulsory-Elective - group S
IT-MSC-2MBS-Elective
IT-MSC-2MGM-Elective
IT-MSC-2MGMe-Elective
IT-MSC-2MIN-Elective
IT-MSC-2MIS-Elective
IT-MSC-2MMM-Elective
IT-MSC-2MPV-Elective
IT-MSC-2MSK-Elective
MITAINADE-Elective
MITAINBIO-Elective
MITAINCPS-Elective
MITAINEMB-Elective
MITAINGRI-Elective
MITAINHPC-Elective
MITAINIDE-Elective
MITAINISD-Elective
MITAINISY-Elective
MITAINMAL-Elective
MITAINMAT-Elective
MITAINNET-Elective
MITAINSEC-Elective
MITAINSEN-Elective
MITAINSPE-Compulsory
MITAINVER-Elective
MITAINVIZ-Elective
Language of Instruction: English
Credits: 5
Completion: examination (written)
Type of instruction (hours/sem):
Lectures: 26 | Seminar Exercises: 0 | Laboratory Exercises: 0 | Computer Exercises: 0 | Other: 26
Points:
Exams: 51 | Tests: 9 | Exercises: 0 | Laboratories: 0 | Other: 40
Guarantor: Smrž Pavel, doc. RNDr., Ph.D. (DCGM)
Deputy guarantor: Hradiš Michal, Ing., Ph.D. (DCGM)
Lecturer: Smrž Pavel, doc. RNDr., Ph.D. (DCGM)
Faculty: Faculty of Information Technology BUT
Department: Department of Computer Graphics and Multimedia FIT BUT
Schedule:
Day: Fri | Lesson: lecture | Week: lectures | Room: N104, N105 | Start: 09:00 | End: 10:50 | Lect.Gr.: 1EIT 1MIT 2EIT 2MIT INTE | Groups: xx
 
Learning objectives:
  To understand natural language processing and to learn how to apply modern machine learning methods in this field. To get acquainted with advanced deep learning architectures that have proved successful in various NLP tasks. To understand the use of neural networks for sequential language modelling, their use as conditional language models for transduction tasks, and approaches that combine these techniques with other mechanisms in advanced applications.
Description:
  Foundations of natural language processing, historical perspective, statistical NLP, and the modern era dominated by machine learning and, specifically, deep neural networks. Meaning of individual words, lexicology and lexicography, word senses and neural architectures for computing word embeddings, word sense classification and inference. Constituency and dependency parsing, syntactic ambiguity, neural dependency parsers. Language modelling and its applications in general architectures. Machine translation, historical perspective on the statistical approach, neural translation and evaluation scores. End-to-end models, attention mechanisms, limits of current seq2seq models. Question answering based on neural models, information extraction components, text understanding challenges, learning by reading and machine comprehension. Text classification and its modern applications, convolutional neural networks for sentence classification. Language-independent representations, non-standard texts from social networks, representing parts of words, subword models. Contextual representations and pretraining for context-dependent language modules. Transformers and self-attention for generative models. Communication agents and natural language generation. Coreference resolution and its interconnection with other text understanding components.
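As a small taste of the word-embedding topics mentioned above, the sketch below compares toy word vectors by cosine similarity. The 4-dimensional vectors are hypothetical illustrations only; real embeddings are learned from corpora (e.g. by word2vec or as a by-product of a language model) and have hundreds of dimensions.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 4-dimensional embeddings, chosen so that related words
# point in similar directions; not taken from any real model.
embeddings = {
    "king":  [0.8, 0.3, 0.1, 0.7],
    "queen": [0.7, 0.4, 0.2, 0.8],
    "apple": [0.1, 0.9, 0.8, 0.1],
}

print(cosine(embeddings["king"], embeddings["queen"]))  # close to 1.0
print(cosine(embeddings["king"], embeddings["apple"]))  # noticeably lower
```

In a trained embedding space, this same similarity measure is what makes "king" closer to "queen" than to "apple".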
Knowledge and skills required for the course:
  Good knowledge of artificial neural network models and Python programming.
Subject specific learning outcomes and competencies:
  The students will get acquainted with natural language processing and will understand a range of neural network models that are commonly applied in the field. They will also grasp the basics of neural implementations of attention mechanisms and sequence embedding models, and how these modular components can be combined to build state-of-the-art NLP systems. They will be able to implement and evaluate common neural network models for various NLP applications.
Generic learning outcomes and competencies:
  Students will improve their programming skills and their knowledge and practical experience with tools for deep learning as well as with general processing of textual data.
Why is the course taught:
  More and more people use natural language processing (NLP) in their everyday life - machine translators, virtual assistants, etc. Most NLP tasks have recently been realised by means of deep neural networks. Students of this course will learn how the computer translates texts between languages, how it recognizes what a review author likes or dislikes about a new product, and how virtual assistants can answer questions based on Wikipedia texts.
Syllabus of lectures:
 
  1. Introduction, history of NLP, and modern approaches based on deep learning
  2. Word senses and word vectors
  3. Dependency parsing
  4. Language models
  5. Machine translation
  6. Seq2seq models and attention
  7. Question answering
  8. Convolutional neural networks for sentence classification
  9. Information from parts of words: Subword models
  10. Modeling contexts of use: Contextual representations and pretraining
  11. Transformers and self-attention for generative models 
  12. Natural language generation 
  13. Coreference resolution
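The attention mechanism covered in lectures 6 and 11 can be sketched in a few lines of pure Python. This is a minimal scaled dot-product attention for a single query; the 2-dimensional keys and values are hypothetical toy numbers, and a real model would learn them via trained projection matrices.

```python
from math import exp, sqrt

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(query, keys, values):
    """Scaled dot-product attention for one query over a short sequence."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Toy example: the query matches the first key, so the output leans
# towards the first value vector.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
print(attention([1.0, 0.0], keys, values))
```

The same computation, batched over many queries and wrapped in learned projections, is the core building block of the seq2seq and Transformer models in the syllabus.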
Syllabus - others, projects and individual work of students:
 
  • Individually assigned project
Fundamental literature:
 
  • Goldberg, Yoav. "Neural network methods for natural language processing." Synthesis Lectures on Human Language Technologies 10, no. 1 (2017): 1-309.
  • Deng, Li, and Yang Liu, eds. Deep Learning in Natural Language Processing. Springer, 2018.
Study literature:
 
  • Géron, Aurélien. Hands-on Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media, 2017.
  • Raaijmakers, Stephan. Deep Learning for Natural Language Processing. Manning, 2019.
Controlled instruction:
  The evaluation includes a mid-term test, an individual project, and the final exam. The mid-term test does not have a correction option; the final exam has two possible correction runs.
Progress assessment:
  
  • Mid-term test - up to 9 points
  • Individual project - up to 40 points
  • Written final exam - up to 51 points
Exam prerequisites:
  
  • Completed individual project
 
