Information Retrieval and Data Mining
Core course, 9 ECTS credits, winter semester 2015 – 2016
Basic Information
Teaching Assistants
- Tuesday 14-16 in lecture hall 002 in building E1.3
- Thursday 14-16 in lecture hall 002 in building E1.3
The first lecture is on Tuesday, October 20.
Tutorial Groups
| Group Name | Time | Place | Group Name | Time | Place |
|---|---|---|---|---|---|
| Group A | Tuesday 16-18 | Room 024 | Group B | Tuesday 16-18 | Room 021 |
| Group C | Thursday 16-18 | Room 021 | Group D | Thursday 16-18 | Room 021 |
| Group E | Friday 14-16 | Room 021 | | | |
All tutorial rooms are located in building E1.4.
Contact
Tentative Schedule and Lecture Slides
| Week/Date | Slides | Lecturer | Notes |
|---|---|---|---|
| | | JV & GW | First assignment handed out |
| | | GW | First assignment to be submitted |
| | | JV | Tutorials on first assignment |
| | | JV | Tutorials on second assignment |
| | | JV | Tutorials on Patterns |
| | | JV | Tutorials on Clusters |
| | | JV | Tutorials on Classification |
| | | JV | Tutorials on Sequences |
| | | JV & GW | Tutorials on Graphs |
| Holiday break: Dec 21 - Jan 1 | | | |
| | | GW | No tutorials |
| | | GW | Tutorials on text indexing and matching |
| | | GW | Tutorials on query processing |
| | | GW | Tutorials on language models |
| | | GW | Tutorials on web mining |
| | | GW | |
| | | JV & GW | The dates are preliminary. The exam is currently planned to be oral. |
| | | JV & GW | Repeat oral exams are only for students who fail the oral exam on Feb 15/16. |
Homework Assignments
- Homework 1. Sample Solutions.
- Homework 2. Sample Solutions.
- Homework 3. Sample Solutions.
- Homework 4. Sample Solutions.
- Homework 5. Sample Solutions.
- Homework 6. Sample Solutions.
- Homework 7. Sample Solutions.
- Homework 8. Sample Solutions.
- Homework 9. Sample Solutions.
- Homework 10. Sample Solutions.
- Homework 11. Sample Solutions.
Information on Exams
- Test1 solutions are online. Check here.
- Test2 solutions are online. Check here.
- Test3 solutions are online. Check here.
Course Contents
Information Retrieval (IR) and Data Mining (DM) are methodologies for organizing, searching, and analyzing digital content from the web, social media, and enterprises, as well as multivariate datasets in these contexts. IR models and algorithms include text indexing, query processing, search result ranking, and information extraction for semantic search. DM models and algorithms include pattern mining, rule mining, classification, and recommendation. Both fields build on mathematical foundations from linear algebra, graph theory, and probability and statistics.
Prerequisites
Good knowledge of undergraduate mathematics (linear algebra, probability theory) and basic algorithms.
Grading and Requirements for Passing the Course
To pass the course and earn 9 credit points, the following is required:
- Regular attendance of classes and tutorial groups
- Presentation of solutions in tutorial groups
- Passing 2 of 3 written tests (after each third of the semester)
- Passing the final exam (at the end of the semester)
The overall grade will be determined by the performance in the final exam combined with your bonus points.
Suggested Reading
The following textbooks will be used:
on data mining:
- primary: Charu Aggarwal: Data Mining - The Textbook
- secondary: Mohammed Zaki and Wagner Meira: Data Mining and Analysis
on information retrieval:
- primary: Stefan Büttcher, Charles Clarke, Gordon Cormack: Information Retrieval
- secondary: Chris Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to Information Retrieval
on probability and statistics:
- primary: Larry Wasserman: All of Statistics
- secondary: Arnold Allen: Probability, Statistics, and Queueing Theory
These and additional references are available in the library: