Course, academic year 2023/2024
Text mining - P500001
Title: Text mining
Guaranteed by: Department of Informatics and Chemistry (143)
Faculty: Faculty of Chemical Technology
Actual: from 2020
Semester: winter
Points: winter s.:0
E-Credits: winter s.:0
Examination process: winter s.:
Hours per week, examination: winter s.:3/0, other [HT]
Capacity: unlimited / unknown (unknown)
Min. number of students: unlimited
Language: Czech
Teaching methods: full-time
For type: doctoral
Note: course is intended for doctoral students only
can be fulfilled in the future
Guarantor: Kroha Petr prof. Dr. Ing. CSc.
Is interchangeable with: AP500001
Annotation -
Last update: Svozil Daniel prof. Mgr. Ph.D. (25.05.2018)
The number of electronic documents grows much faster than humans can process them. While information retrieval methods help identify documents likely to contain given information based on keywords, text mining approaches deal with interpreting the information hidden in the documents. This task is difficult because it depends on the semantics of natural language, which is hard to interpret unambiguously even for trained experts. Text mining draws on various statistical and information retrieval methods, approaches from computational linguistics, and classification methods from artificial intelligence. Text mining addresses the following tasks:

  • Information extraction - the identification of key text components and of the relationships between them
  • Topic tracking - intelligent text filtering based on the user profile
  • Summarization - the summarization of text content
  • Sentence extraction - the identification of sentences that are important for understanding the text
  • Categorization, classification, clustering - text categorization based on content similarity
  • Concept linkage - the identification of relationships between texts with common concepts
Aim of the course -
Last update: Svozil Daniel prof. Mgr. Ph.D. (25.05.2018)

Students will be able to:

  • identify key text components and the relationships between them
  • automatically summarize text content
  • identify key factual sentences
  • categorize texts into classes based on the similarity of their contents
  • search for relationships between texts that share the same concepts
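
As a rough illustration of categorizing texts into classes, the following sketch trains a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. It is not taken from the course materials; the function names and the whitespace tokenization are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Collect counts from (text, label) pairs. Hypothetical helper."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = text.lower().split()   # naive whitespace tokenization
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log-posterior for the text."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

With a small labeled collection, `classify(text, train_naive_bayes(labeled_docs))` returns the most probable label; a realistic setting would need proper tokenization and far more training data.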
Literature -
Last update: Svozil Daniel prof. Mgr. Ph.D. (23.05.2018)

R: Weiss, S.M. et al.: Text Mining - Predictive Methods for Analyzing Unstructured Information. Springer, 2005

Learning resources -
Last update: Svozil Daniel prof. Mgr. Ph.D. (23.05.2018)

Lecturer materials

Syllabus -
Last update: Svozil Daniel prof. Mgr. Ph.D. (25.05.2018)

Text Mining, Data Mining, Knowledge Discovery, Text Processing - basic concepts

Information Retrieval - basic concepts, text documents and keywords, relevance and fuzzy logic, indexing, vector model

Latent semantic indexing and singular value decomposition

Clustering of keywords and documents

Text classification, probabilistic classification - Naive Bayes, k-nearest neighbors, decision trees, neural networks, support vector machines

Linguistics in text mining, lexicon, part-of-speech tagging, named entity recognition, parsing, co-references

Text mining applications, automatic content extraction, automatic question answering
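
As a minimal sketch of the vector model listed in the syllabus, the following example represents documents as TF-IDF vectors and ranks them by cosine similarity to a query. It is not part of the course materials; the function names are hypothetical, and including the query in the collection when computing document frequencies is a simplification chosen for brevity.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered from most to least similar to the query."""
    # Simplification: the query is treated as one more document for weighting.
    vectors = tf_idf_vectors(docs + [query])
    query_vec = vectors[-1]
    scores = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors[:-1])]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

Calling `rank(query, docs)` with a list of strings returns the positions of the documents in order of decreasing similarity; documents sharing no terms with the query score zero and keep their original relative order.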

Registration requirements -
Last update: Svozil Daniel prof. Mgr. Ph.D. (25.05.2018)

DSP lecture Information retrieval

Course completion requirements -
Last update: Svozil Daniel prof. Mgr. Ph.D. (23.05.2018)

oral exam