Course, academic year 2023/2024
  
Data Mining - B500007
Title: Vytěžování znalostí z dat
Guaranteed by: Department of Informatics and Chemistry (143)
Faculty: Faculty of Chemical Technology
Actual: from 2021
Semester: both
Points: 4
E-Credits: 4
Examination process:
Hours per week, examination: 2/2, C+Ex [HT]
Capacity: winter: unlimited / unlimited (unknown)
summer: unknown / unknown (unknown)
Min. number of students: unlimited
Language: Czech
Teaching methods: full-time
Level:  
For type: Bachelor's
Note: you can enroll for the course in winter and in summer semester
Guarantor: Kordík Pavel doc. Ing. Ph.D.
Interchangeability: N500011
Is interchangeable with: B500010, B500011
Annotation -
Last update: Kubová Petra Ing. (02.01.2018)
Students are introduced to the basic methods of discovering knowledge in data. In particular, they learn the basic techniques of data preprocessing, multidimensional data visualization, statistical techniques of data transformation, and the fundamental principles of knowledge discovery methods. Students will understand the relationship between model bias and variance and know the fundamentals of assessing model quality. Data mining software is used extensively throughout the course. Students will be able to apply basic data mining tools to common problems (classification, regression, clustering).
Aim of the course -
Last update: Kubová Petra Ing. (02.01.2018)

Students will be able to:

Understand knowledge discovery in data.

Literature -
Last update: Svozil Daniel prof. Mgr. Ph.D. (26.03.2019)

R: Larose, D. T. Discovering Knowledge in Data: An Introduction to Data Mining. Wiley-Interscience, 2004. ISBN 0471666572.

R: Berka, P. Dobývání znalostí z databází. Praha: Academia, 2003

A: Pierson, L. Data Science for Dummies. 2nd ed. 2017.

Learning resources -
Last update: Kubová Petra Ing. (02.01.2018)

https://edux.fit.cvut.cz/courses/BI-VZD/

(login necessary)

Syllabus -
Last update: Kubová Petra Ing. (02.01.2018)

1. Introduction to data mining, data preparation, data visualization.

2. Statistical analysis of data.

3. Data model, nearest neighbour classifier.

4. Training, validation and testing, model's quality evaluation.

5. Artificial neural networks in data mining.

6. Unsupervised neural networks: competitive learning.

7. Probability and Bayesian classification.

8. Decision trees and rules.

9. Neural networks with supervised learning.

10. Cluster analysis.

11. Combining neural networks and models in general.

12. Data mining in the Clementine environment.

13. Text mining, Web mining, selected applications, new trends.
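
As an illustration of syllabus topics 3 and 4, the following is a minimal sketch of a nearest neighbour classifier evaluated on separate validation and test sets. It is not taken from the course materials: the toy data, the 60/20/20 split, and the function names nearest_neighbour and accuracy are assumptions made for this example only.

```python
import math
import random


def nearest_neighbour(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


def accuracy(train, evaluation):
    """Fraction of evaluation points whose 1-NN prediction matches the true label."""
    hits = sum(nearest_neighbour(train, x) == y for x, y in evaluation)
    return hits / len(evaluation)


# Hypothetical toy data: two well-separated 2-D clusters labelled "a" and "b".
rng = random.Random(0)
data = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), "a") for _ in range(30)]
data += [((rng.gauss(5, 0.5), rng.gauss(5, 0.5)), "b") for _ in range(30)]
rng.shuffle(data)

# Assumed 60/20/20 split: training, validation (model selection), test (final estimate).
train, validation, test = data[:36], data[36:48], data[48:]

print("validation accuracy:", accuracy(train, validation))
print("test accuracy:", accuracy(train, test))
```

The validation set would normally guide choices such as the number of neighbours, while the test set is held back for a single final quality estimate.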

Registration requirements -
Last update: Kubová Petra Ing. (02.01.2018)

none

Course completion requirements -
Last update: Svozil Daniel prof. Mgr. Ph.D. (07.02.2018)

To obtain the course credit, students need a sufficient number of points from the programming assignments and the test. The examination consists of a compulsory written part.

Teaching methods
Activity Credits Hours
Attendance at lectures 1 28
Preparation for lectures, seminars, laboratories, excursions or practice 1 28
Preparation for and taking the examination 1 28
Attendance at seminars 1 28
Total 4 / 4 112 / 112
 
VŠCHT Praha