SubjectsSubjects(version: 954)
Course, academic year 2019/2020
Efficient Text Pattern Matching - M500001
Title: Efektivní vyhledávání v textech
Guaranteed by: Department of Informatics and Chemistry (143)
Faculty: Faculty of Chemical Technology
Actual: from 2019
Semester: winter
Points: winter s.:4
E-Credits: winter s.:4
Examination process: winter s.:
Hours per week, examination: winter s.:2/1, C+Ex [HT]
Capacity: unlimited / unlimited (unknown)
Min. number of students: unlimited
State of the course: taught
Language: Czech
Teaching methods: full-time
Teaching methods: full-time
Guarantor: Holub Jan prof. Ing. Ph.D.
Interchangeability : N500001
Examination dates   Schedule   
Annotation -
Students get knowledge of efficient algorithms for text pattern matching. They learn to use so called succinct data structures that are efficient in both access time and memory complexity. They will be able to use the knowledge in design of applications that utilize pattern matching.
Last update: Hladíková Jana (05.01.2018)
Aim of the course -

Students will be able to:

Understand pattern matching as a key task in most applications.

Understand efficient algorithms and data structures for pattern matching (not only in information processing, but also in other fields like security - intrusion detection, virus scans).

Have an extensive overview of such algorithms and data structures.

Last update: Hladíková Jana (05.01.2018)
Literature -

R: Pokorný J., Dokumentografické informační systémy, Karolinum, 2005, ISBN 80-246-1148-1

R: M.Crochemore, C. Hancart, T. Lecroq: Algorithms on Strings. Cambridge University Press, 2014. ISBN 1107670993

A: Melichar B., Textové informační systémy, Praha, 1994, ISBN 8001012069, 9788001012062

A: W.F. Smyth: Computing Patterns in Strings, Pearson Addison Wesley (UK), 2003, 423 pp. ISBN 0201398397

A: M.Crochemore, W. Rytter: Jewels of Stringology. World Scientific Publishing Company, 2003. ISBN 9810248970

A: G.Navarro, M. Raffinot: Flexible Pattern Matching in Strings. Cambridge University Press, 2008. ISBN 0521039932

Last update: Svozil Daniel (04.11.2018)
Learning resources -

(login necessary)

Last update: Hladíková Jana (05.01.2018)
Syllabus -

1. Introduction, basic definitions, border array.

2. Text full index: Suffix array.

3. Text full index: Suffix tree, LCP construction.

4. Text full index: Factor, suffix automata, on-line construction.

5. Exact pattern matching algorithms.

6. FFT in pattern matching.

7. Succinct data structure: rank & select.

8. Succinct data structure: wavelet tree.

9. FM-Index.

10. Dictionary representation, spell checking.

11. Approximate pattern matching.

12. Pattern matching in bioinformatics and musicology.

13. Pattern matching in bioinformatics and musicology.

Last update: Hladíková Jana (05.01.2018)
Registration requirements -

Automata and grammars

Last update: Svozil Daniel (08.02.2018)
Course completion requirements - Czech

Pro zı́skánı́ zápočtu je potřeba dostatek bodů ze zápočtových testů. Zkouška se skládá z povinné pı́semné a povinné ústnı́ části.

Last update: Svozil Daniel (07.02.2018)
Teaching methods
Activity Credits Hours
Účast na přednáškách 1 28
Příprava na přednášky, semináře, laboratoře, exkurzi nebo praxi 1 28
Příprava na zkoušku a její absolvování 1.5 42
Účast na seminářích 0.5 14
4 / 4 112 / 112
Coursework assessment
Form Significance
Examination test 30
Continuous assessment of study performance and course -credit tests 40
Oral examination 30