Title:  Knowledge Discovery in Databases 

Code:  ZZD 

Ac.Year:  2017/2018 

Term:  Winter 

Curriculums:  

Language of Instruction:  Czech 

Completion:  examination (oral) 

Type of instruction:  Lectures     Sem. exercises  Lab. exercises  Comp. exercises  Other
Hours/sem:            39           0               0               0                13

                      Examination  Tests           Exercises       Laboratories     Other
Points:               51           0               0               0                49



Guarantor:  Zendulka Jaroslav, doc. Ing., CSc., DIFS 

Lecturer:  Zendulka Jaroslav, doc. Ing., CSc., DIFS 
Faculty:  Faculty of Information Technology BUT 

Department:  Department of Information Systems FIT BUT 

Prerequisites:  

Substitute for:  

 Learning objectives: 

  To deepen students' knowledge in the field of knowledge discovery in databases and other data sources (KDD), with a special focus on the theoretical foundations of the techniques, algorithms and models used.

 Description: 

 
 Deepening of KDD basics: fundamentals of data preprocessing methods (statistical quantities used in data summarization; approaches to data cleaning, transformation and reduction), basics of data warehousing, basic methods and algorithms for mining frequent itemsets, patterns and association rules (Apriori algorithm, FP-tree, multilevel association rules, mining multidimensional association rules from relational databases), basic methods and algorithms of classification (decision trees, Bayesian classification, neural networks, SVM) and prediction (linear and nonlinear regression), basic methods and algorithms of cluster analysis (distance measures, partitioning methods, hierarchical methods, CF-tree, density-based methods, grid-based and model-based methods).
 Advanced data mining techniques: advanced techniques of data mining in 'classic' data sources; mining in data streams, time series and sequences; mining in biological data; mining in graphs; multi-relational data mining; mining in object, spatial and multimedia data; mining in text; mining on the Web.
 Knowledge and skills required for the course: 

  Students should have basic knowledge of statistics, database systems, information theory, machine learning and neural networks. It is assumed that they have already passed a course on KDD.

 Learning outcomes and competences: 

  Students gain a broad yet in-depth overview of the field of data mining and knowledge discovery. They gain deeper insight mainly in the area related to the topic of their thesis.

 Syllabus of lectures: 


 Data preprocessing.
 Data warehousing.
 Association analysis.
 Classification and prediction.
 Cluster analysis.
 Advanced data mining in 'classic' data sources.
 Mining in data streams.
 Data mining in time series and sequences.
 Mining in biological data.
 Data mining in graph structures.
 Mining in object, spatial and multimedia data.
 Text mining and Web mining.
 Mining moving object data.
 Syllabus - others, projects and individual work of students: 


 Study and elaboration of a selected topic concerning knowledge discovery in a field related to the student's PhD thesis.
 Fundamental literature: 


 Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Elsevier Inc., 2012, 703 p. ISBN 9780123814791.
 Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p. ISBN 1558609013.
 Study literature: 


 Bishop, C. M.: Pattern Recognition and Machine Learning. Springer, 2006, 738 p. ISBN 9780387310732.
 Aggarwal, Ch.C. (ed.): Data Streams: Models and Algorithms. Advances in Database Systems. Springer, 2006, 358 p. ISBN 0387287590.
 Papers in journals and conference proceedings (including those in ACM Digital library, IEEE Digital library and other electronic sources).
 Controlled instruction: 

  Consultations, elaboration of a given topic, written report and presentation at the final seminar.

 Progress assessment: 

  Review questions during consultations.  
