Course: Data Mining

Course title Data Mining
Course code MTI/DM
Organizational form of instruction Not filled in + Lesson
Level of course Master
Year of study not specified
Semester Summer
Number of ECTS credits 5
Language of instruction Czech
Status of course Compulsory, Compulsory-optional
Form of instruction Face-to-face
Work placements Course does not contain work placement
Recommended optional programme components None
Lecturer(s)
  • Tyl Pavel, Ing.
  • Lamr Marián, Ing. Ph.D.
Course content
Lectures
1. Process of knowledge discovery: history, goal definition, overview of methodologies.
2. Classification of data mining problems and examples of typical problems: customer relationship management (CRM), customer acquisition, prediction of customer migration to competitors (churn rate), success of marketing campaigns.
3. Fraud detection: fraud problems, credit risk, behavioral scores used to evaluate credit repayment risk.
4. Data preparation: data understanding, data set description, preparation of the data matrix, data selection and cleaning, building and merging of data sources, type homogeneity, data formatting.
5. Classification algorithms as tools for prediction from historical data: decision trees, the C&RT, C5.0, CHAID and QUEST algorithms, converting a tree to rules.
6. Discriminant analysis: classification of problems into classes, scoring.
7. Segmentation algorithms: discovery of unusual structures in data using the clustering algorithms K-Means, Two Step and Anomaly.
8. Association algorithms: discovery of association rules, Apriori model, Carma, implication statistics, and prediction model.
9. Introduction to neural networks: working with categorical and numeric variables; use in cases where classical linear methods are not sufficient.
10. Analysis and prediction of time series using DM models: data preparation, completing missing values, differencing, seasonality, moving averages, medians, smoothing of time series.
11. Modeling and evaluation of the solution, inclusion of scoring processes into the business decision workflow.
12. Web mining and text mining: analysis of typical data mining problems.

Practice
1.-2. Data processing and visualization in SPSS Modeler and its other functionalities, comparison with other open-source software.
3.-9. Model preparation for use cases, their analysis and interpretation of results; extension and modification of a sample study.
9.-12. Individual work on assignments.
13.-14. Presentation of solutions.

Learning activities and teaching methods
Dialogue methods (conversation, discussion, brainstorming)
  • Class attendance - 56 hours per semester
Learning outcomes
This course examines typical problems and techniques of data mining (DM). It teaches the basic DM algorithms used for classification, segmentation, association and other tasks that arise when working with large data sets. DM techniques make it possible to produce predictions that help optimize decision-making processes. Students will learn how to find anomalies, relationships, patterns and other hidden information. DM problems will be solved using a software tool that supports the CRISP-DM methodology. Students will learn all DM steps: problem formulation, data collection, data preparation and processing for modeling, model design, model evaluation and, finally, deployment of their solution. Emphasis will be placed on understanding and correctly interpreting the results.
Students will acquire general knowledge of how to develop a data mining task.
Prerequisites
Basic knowledge of Statistics and Database Systems.

Assessment methods and criteria
Combined examination

Students are expected to work through all assigned problems, submit and present their semester project, demonstrate knowledge of the topics taught, and obtain satisfactory results in the continuous assessments during the semester.
Recommended literature
  • Berka Petr. Dobývání znalostí z databází. Academia, Praha, 2006.
  • Hendl Jan. Přehled statistických metod zpracování dat. Praha, 2009.
  • Olivia Parr Rud. Datamining. Computer Press a.s., 2006.
  • Yong Yin, Ikou Kaku, Jiafu Tang. Data Mining. Springer London Ltd., 2011.


Study plans that include the course
Faculty: Faculty of Mechatronics, Informatics and Interdisciplinary Studies
Study plan (Version): Information Technology (2013)
Category of Branch/Specialization: Informatics courses 1
Recommended year of study: 1
Recommended semester: Summer