Subjects

Text mining - AP500001

Title:	Text mining
Form of teaching:	lecture
Guaranteed by:	Department of Informatics and Chemistry (143)
Faculty:	Faculty of Chemical Technology
Valid:	from 2019
Number of semesters taught:	1
Semester:	both
Points:	0
E-Credits:	0
Examination process:
Hours per week, examination:	3/0, other [HT]
Capacity:	winter: unlimited / unknown (unknown); summer: unknown / unknown (unknown)
Maximum course capacity:	unlimited
Min. number of students:	unlimited
State of the course:	taught
Language:	English
Teaching methods:	full-time
Level:
Enroll for the course repeatedly:	- / - / - / 9
Note:	can be fulfilled in the future; you can enroll for the course in both the winter and the summer semester

Guarantor:	Kroha Petr prof. Dr. Ing. CSc.
Interchangeability:	P500001

Annotation -

The number of electronic documents grows much faster than any human can process. Information retrieval methods help to identify documents that are likely to contain given information based on keywords, whereas text mining approaches deal with interpreting the information hidden in the documents. This task is difficult because it depends on the semantics of natural language, which even trained experts cannot always interpret unequivocally. Text mining adopts various statistical and information retrieval methods, approaches from computational linguistics, and classification methods from artificial intelligence. Text mining addresses the following tasks: information extraction (the identification of key text components and of the relationships between them); topic tracking (intelligent text filtering based on a user profile); summarization (condensing the text content); sentence extraction (the identification of sentences that are important for understanding the text); categorization, classification and clustering (text categorization based on content similarity); and concept linkage (the identification of relationships between texts with common concepts).
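
As an illustration of the sentence extraction task named in the annotation, the following is a minimal sketch, not a method taken from the course materials. It assumes a simple frequency heuristic: a sentence is scored by the average corpus frequency of its words. The stop-word list, the function names and the sample text are all hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens without stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def extract_sentences(text, k=1):
    """Return the k highest-scoring sentences, kept in their original order."""
    # Naive sentence splitting on '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average term frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


# Assumed sample text, for illustration only.
text = (
    "Text mining interprets information hidden in documents. "
    "The weather was pleasant yesterday. "
    "Text mining methods extract information from text documents."
)
summary = extract_sentences(text, k=1)
```

Realistic systems would typically add proper sentence segmentation, stemming or lemmatization, and a weighting scheme less sensitive to repeated words.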

Last update: Pátková Vlasta (08.06.2018)

Course completion requirements -

oral exam

Literature -

R: Weiss, S.M. et al.: Text Mining - Predictive Methods for Analyzing Unstructured Information. Springer, 2005

Syllabus -

Text Mining, Data Mining, Knowledge Discovery, Text Processing - basic concepts

Information Retrieval - basic concepts, text documents and keywords, relevance and fuzzy logic, indexing, vector model

Latent semantic indexing and singular value decomposition

Clustering of keywords and documents

Text classification, probabilistic classification - Naive Bayes, k nearest neighbors, decision trees, neural networks, support vector machines

Linguistics in text mining, lexicon, part-of-speech tagging, named entity recognition, parsing, co-references

Text mining applications, automatic content extraction, automatic question answering
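
The vector model listed in the syllabus can be sketched as follows. This is only an illustrative example under stated assumptions, not the formulation used in the lectures: it assumes the common TF-IDF weighting (relative term frequency multiplied by the logarithm of inverse document frequency) and cosine similarity, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Map each tokenized document to a sparse {term: weight} TF-IDF vector."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Assumed toy documents, already tokenized by whitespace.
docs = [
    "latent semantic indexing of documents".split(),
    "semantic indexing with singular value decomposition".split(),
    "support vector machines for classification".split(),
]
vecs = tf_idf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # documents share two terms
sim_unrelated = cosine(vecs[0], vecs[2])  # documents share no terms
```

In this sketch, documents that share terms receive a positive similarity and documents without common terms receive zero, which is the property that keyword-based retrieval and similarity-based clustering rely on.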

Learning resources -

Lecturer materials

Learning outcomes -

Students will be able to:

identify key text components and the relationships between them

automatically summarize text content

identify key factual sentences

categorize texts into classes based on the similarity of their contents

find relationships between texts that share concepts

Registration requirements -

DSP lecture Information retrieval
