Lectures
1. Process of knowledge discovery: history, goal definition, overview of methodologies.
2. Classification of data mining problems and examples of typical problems: customer relationship management (CRM), customer acquisition, prediction of customer migration to competitors (churn rate), success of marketing campaigns.
3. Fraud detection: fraud problems, credit risk, behavioral scores used to evaluate credit repayment risk.
4. Data preparation: data understanding, data set description, preparation of the data matrix, data selection and cleaning, building and merging of data sources, type homogeneity, data formatting.
5. Classification algorithms as tools for prediction from historical data. Decision trees; the C&RT, C5.0, CHAID and QUEST algorithms; converting a tree to rules.
6. Discriminant analysis: classification into classes, scoring.
7. Segmentation algorithms: discovery of unusual structures in data using the clustering algorithms K-Means, Two Step and Anomaly.
8. Association algorithms: discovery of association rules, Apriori model, Carma, implication statistics, and prediction model.
9. Introduction to neural networks: working with categorical and numeric variables; used in cases where classical linear methods are not sufficient.
10. Analysis and prediction of time series using DM models: data preparation, filling in missing values, differencing, seasonality, moving averages, medians, smoothing of time series.
11. Modeling and evaluation of the solution; inclusion of scoring processes in the business decision workflow.
12. Web mining and text mining: analysis of typical data mining problems.
Practice
1.-2. Data processing and visualization in SPSS Modeler and its other functionalities; comparison with other, open-source software.
3.-9. Model preparation for use cases, their analysis and interpretation of results; extension and modification of a sample study.
9.-12. Individual work on assignments.
13.-14. Presentation of solutions.
|
This course examines typical problems and techniques of data mining (DM). It teaches the basic DM algorithms used for classification, segmentation, association and other tasks that arise when working with large data sets. DM techniques make it possible to produce predictions that help optimize decision-making processes. Students will learn how to find anomalies, relationships, patterns and other hidden information. DM problems will be solved using a software tool that supports the CRISP-DM methodology. Students will learn all DM steps: problem formulation, data collection, data preparation and processing for modeling, model design, model evaluation and, finally, deployment of their solution. Emphasis is placed on understanding and correctly interpreting the results.
Students will acquire a general overview of how a data mining task is developed.
|
-
Berka Petr. Dobývání znalostí z databází. Academia, Praha, 2006.
-
Hendl Jan. Přehled statistických metod zpracování dat. Praha, 2009.
-
Olivia Parr Rud. Datamining. Computer Press a.s., 2006.
-
Yong Yin, Ikou Kaku, Jiafu Tang. Data Mining. Springer London Ltd., 2011.
|