Course: Data analysis and knowledge mining

Course title Data analysis and knowledge mining
Course code MTI/DAKM
Form of instruction Lecture + Seminar
Course level Master's
Year of study not specified
Semester Summer
Number of ECTS credits 5
Language of instruction English
Course status Compulsory-optional
Mode of delivery Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturers
  • Lamr Marián, Ing. Ph.D.
Course content
Lectures (topics):
1. Analysis of current data sources, data types and their processing; data archiving; data import and export.
2. The process of mining data from large data structures; the CRISP-DM methodology.
3. Data understanding and preparation: description of data sets, preparation of data matrices, selecting and cleaning data, construction and merging of data sources, type homogeneity, formatting and common data transformations.
4. Categorical versus numerical data and their use in DM algorithms: categorisation, the method of optimal categorisation, handling missing values, multiple imputation, dependencies within data, dimensionality reduction.
5. Machine learning and data mining methods.
6. Building decision and classification trees.
7. Algorithms for finding association rules in large data structures.
8. Neural networks, genetic algorithms, particle swarm optimisation: procedures inspired by nature.
9. Unsupervised learning: the fundamental principles of cluster analysis, the significance of similarities and anomalies in data.
10. Advanced methods of model evaluation.
Seminars (topics):
In the seminars, students become acquainted with selected software tools for finding hidden information, knowledge and behavioural patterns in data of various types in order to support decision-making. "Knowledge" here means generalised information, presented for example as discovered rules. Students work with large sets of real numerical and non-numerical data: data generated by enterprise management and by the operation of production technologies, experimental data, customer and client data, and others. Following the lectures, they solve assignments in the IBM SPSS Modeler environment and in other, open-source data mining tools. The application of DM procedures and algorithms is discussed and studied through a wide range of assignments, e.g. marketing campaign targeting, customer churn, monitoring of test operation, and prediction of machine failures.

Learning activities and teaching methods
Lecture, Seminar
  • Class attendance - 56 hours per semester
  • Preparation for the exam - 44 hours per semester
  • Preparation for the credit - 20 hours per semester
  • Home preparation for classes - 30 hours per semester
Learning outcomes
The course aims to acquaint students with decision-making based on knowledge acquired from different types of data sources, especially through the analysis of large volumes of data. The individual steps of the knowledge acquisition process are demonstrated on practical assignments. Students become acquainted with the techniques, tools and algorithms used in this process.

Prerequisites
not specified

Assessment methods and criteria
Combined examination

Recommended literature
  • WITTEN, I. H. and Eibe FRANK. Data mining: practical machine learning tools and techniques. Cambridge: Morgan Kaufmann, 2017. ISBN 9780128042915.


Study plans that include the course
Faculty Study plan (Version) Category of field of study/specialisation Recommended year Recommended semester