Tensors in Data Analysis

Block lecture, 5 ECTS credits, winter semester 2017–18

Organization

Lecturer

Dr. Pauli Miettinen

Time & Location

  • The lectures will take place in week 41 (9–13 October 2017) at 10:15–11:45 and 12:30–14:00 in room 023 of building E1.4 (MPI-INF)
  • Tutorial sessions will take place Tuesday–Friday in week 41 (10–13 October 2017) at 14:15–15:30 in room 023 of building E1.4 (MPI-INF)
  • Place for the lectures and tutorial sessions is to be announced
  • The final exam will be on Friday, 20 October (tentative). Time and place to be announced.

Registration

  • Attendance at the first lecture is mandatory
  • No pre-registration other than exam registration in HISPOS

Course Contents

Tensors are multidimensional extensions of matrices. In the past decade, interest in using tensors in data analysis has increased significantly. They can be used to store, for example, multi-relational data (subject-predicate-object triples, user-movie-tag triples, etc.), hyperspectral data (X-Y-spectrum images), or spatio-temporal data (X-Y-time data). Various tensor factorization methods have been developed and proposed for analysing such data sets and for finding their latent structure.
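To make the idea of storing triples in a tensor concrete, here is a minimal sketch in plain Python. The user, movie, and tag names are invented for illustration, and the nested-list representation is only an assumption chosen to keep the example dependency-free; in practice an array library would normally be used.

```python
# Hypothetical toy data: (user, movie, tag) triples, invented for illustration.
triples = [
    ("alice", "Alien", "scifi"),
    ("alice", "Alien", "horror"),
    ("bob", "Alien", "scifi"),
    ("bob", "Amelie", "romance"),
]


def index_map(values):
    """Map each distinct value to a position along one tensor mode."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


users = index_map(t[0] for t in triples)
movies = index_map(t[1] for t in triples)
tags = index_map(t[2] for t in triples)

# A 3-way binary tensor stored as nested lists: tensor[u][m][t] is 1
# if the triple (user u, movie m, tag t) was observed and 0 otherwise.
tensor = [[[0] * len(tags) for _ in movies] for _ in users]
for u, m, t in triples:
    tensor[users[u]][movies[m]][tags[t]] = 1

# One size per mode: (number of users, number of movies, number of tags).
shape = (len(users), len(movies), len(tags))
print(shape)
```

Each of the three modes corresponds to one element of the triple, so a user-movie matrix is obtained by fixing (or summing over) the tag mode.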

This course will cover the use of tensor factorizations in data analysis. We will cover a number of different factorizations, their applications, their strengths and weaknesses, and algorithms for finding them. We will also cover other important topics related to tensors in data analysis, such as how to select which factorization to use and how to interpret the results.

This is a block course: all the lectures will take place within one week (two 90-minute lectures per day) before the regular teaching period starts. In addition, there will be four tutorial sessions after the lectures.

The tentative list of contents for the course is:

  • Tensor algebra and tensor operations
  • The CANDECOMP/PARAFAC (CP) decomposition and tensor rank
  • Variants of the CP decomposition, its applications, and algorithms for computing it
  • The Tucker decompositions, their algorithms and applications
  • The tensor train decomposition
  • Choosing the factorization and rank
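
As a rough illustration of the second topic in the list, a rank-R CP decomposition represents a 3-way tensor as a sum of R rank-one tensors, so that entry (i, j, k) equals the sum over r of A[i][r] · B[j][r] · C[k][r]. The sketch below only rebuilds a tensor from given factor matrices; it does not compute a decomposition. The function name `cp_reconstruct` and the factor values are hypothetical, chosen for this example.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factor matrices.

    A, B, and C are nested lists with one row per index of their mode
    and one column per rank-one component.
    """
    rank = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
             for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical rank-2 factors for a 2 x 2 x 2 tensor.
A = [[1, 0], [0, 1]]
B = [[1, 2], [3, 4]]
C = [[1, 1], [0, 2]]

X = cp_reconstruct(A, B, C)
print(X[1][1][1])  # 0*3*0 + 1*4*2 = 8
```

Finding factor matrices that approximate a given tensor well is the harder problem, and is what the algorithms covered in the course address.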

Prerequisites

The students are expected to have a good knowledge of linear algebra and matrix analysis. Basic knowledge of matrix factorization methods such as SVD, PCA, and NMF is expected. Taking the course Data Mining and Matrices is not mandatory, but knowing its contents is recommended.
 

Learning Objectives

The course aims to teach the theory behind tensor decompositions, as well as the practical side: which decomposition to use when, and how to interpret the results. After the course, the students should know the most common tensor decompositions in data analysis and be able to use them in their own work. The students should also be able to understand new decompositions and how they relate to the ones they already know. The students should be able to understand the basic algorithmic ideas used in computing the decompositions, and to read and implement basic tensor decomposition algorithms. Finally, the students should be able to choose the correct decomposition for a given data analysis task, to interpret the results, and to name the strengths and weaknesses of the most common decompositions.

Reading Material

The following material provides background information on the topic. It is not mandatory for the course, but it is definitely helpful. 

  1. Skillicorn, D., 2007. Understanding Complex Datasets: Data Mining with Matrix Decompositions, Chapter 9. Boca Raton: Chapman & Hall/CRC.
  2. Kolda, T.G. & Bader, B.W., 2009. Tensor decompositions and applications. SIAM Review, 51(3), pp.455–500.