This course introduces the basic concepts of probabilistic graphical models. Graphical models are a unified framework that allows complex probability distributions to be expressed in a compact way. Many machine learning applications are tackled with these models; in this course we will highlight the possibilities with computer vision applications.
The main goal of the class is to understand the concepts behind graphical models and to gain hands-on knowledge, so that one is able to design models for computer vision applications as well as for other domains. The lecture is therefore roughly divided into two parts: learning about graphical models and seeing them in action.
In the first part of the lecture we will discuss the basics of solving these models: exact inference for special kinds of graphs where it is efficient, and approximate methods for the general case. In the second part we will then discuss prominent applications for both low- and high-level computer vision problems. Some examples are statistical models of images (e.g. denoising), body pose estimation, person tracking, object detection, and semantic image segmentation.
The exercises will be a mix of theoretical and practical assignments.
Lecture start: Wednesday, October 17
Time and Location:
Lecture: Wednesday 2 pm - 4 pm at MPI for Informatics in room 024
Exercise: Friday 10 am - 12 pm at MPI for Informatics in room 024
Weekly hours: 2 lecture + 2 exercise (VL 2 + Ü 2)
Mailing List: Send an email with your matriculation number and full name to jlange[at]mpi-inf.mpg.de with [pgm-subscribe] in the subject.
Exam: February 19
Registration: send an email to mohomran[at]mpi-inf.mpg.de with [pgm exam registration] in the subject
TA(s): Mohamed Omran (office hour: Wednesdays 16:00-17:00)
Jan-Hendrik Lange (office hour: Fridays 15:00-16:00)
There will be a lecture on January 11th at 10 am (in the exercise time slot)
Lectures start Wednesday October 17th
Exercises start Friday October 19th
Half-time project presentations: January 18th
Final project presentations: February 8th
Exam date: February 19th
October 17: Discrete Graphical Models - An Optimization Perspective, Chapters 1 & 2, Slides
October 24: Introduction to probabilities and directed/undirected graphs. Slides
October 31: Discrete Graphical Models - An Optimization Perspective, Chapter 3, Slides
November 7: Discrete Graphical Models - An Optimization Perspective, Chapter 4, (Local) Marginal Polytope
November 14: Discrete Graphical Models - An Optimization Perspective, Chapter 5, Duality for Local Marginal Polytope, Slides
November 21: Discrete Graphical Models - An Optimization Perspective, Chapter 8, Dual block coordinate ascent
December 19: Body_Models_2. Human models of pose and shape.
January 7: Body_Models_3. Registration as energy minimization and differentiable rendering.
January 23: Stereo
Script for first part
Oct 19: Assignment 1
Nov 2: Assignment 2
Nov 9: Matlab Intro
Nov 30: Assignment 4
Sample Project Report
This report is from last year's PGM course project and can be used for reference. Thanks to Maksym and Yue for making this available. File
Note: You are welcome to design your own project or adapt from the ones below.
The list will continue to be filled with more projects.
Pictorial Structures for Pose Estimation
Given an image of a person, infer the optimal pose using a tree-structured graphical model that incorporates a unary term (evidence for individual parts) as well as a pairwise term (that models connections between parts). As a follow-up step, fit a 3D body model (SMPL) to the inferred pose.
Pictorial Structures Revisited
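For a tree-structured model with unary and pairwise terms, the optimal configuration can be found exactly by dynamic programming (min-sum message passing). The following is a minimal sketch, assuming every part has a small discrete set of candidate states and that costs (negative log-potentials) are given as arrays. The function names, the toy skeleton, and the random costs are hypothetical and only serve as illustration; this is not the implementation from the referenced paper.

```python
import itertools
import numpy as np


def tree_min_sum(parent, unary, pairwise):
    """Exact minimum-cost labelling of a tree by min-sum dynamic programming.

    parent[i]   : index of the parent of part i, -1 for the root (part 0).
                  Parents must appear before their children.
    unary[i]    : array (K_i,), cost of each candidate state of part i.
    pairwise[i] : array (K_parent, K_i), cost of the edge (parent[i], i);
                  unused for the root.
    """
    n = len(parent)
    msg = [np.array(u, dtype=float) for u in unary]  # accumulated subtree costs
    best_child_state = [None] * n
    # Upward pass (leaves to root): children have larger indices than parents.
    for i in range(n - 1, 0, -1):
        total = pairwise[i] + msg[i][None, :]        # (K_parent, K_i)
        best_child_state[i] = total.argmin(axis=1)   # best child state per parent state
        msg[parent[i]] += total.min(axis=1)
    # Downward pass (root to leaves): read off the optimal states.
    states = [0] * n
    states[0] = int(msg[0].argmin())
    for i in range(1, n):
        states[i] = int(best_child_state[i][states[parent[i]]])
    return states, float(msg[0].min())


def total_cost(states, parent, unary, pairwise):
    """Energy of a full configuration: sum of unary and pairwise costs."""
    cost = sum(unary[i][s] for i, s in enumerate(states))
    cost += sum(pairwise[i][states[parent[i]], states[i]]
                for i in range(1, len(parent)))
    return float(cost)


# Hypothetical toy skeleton: torso(0) -> upper arm(1) -> lower arm(2), torso(0) -> head(3)
rng = np.random.default_rng(0)
parent = [-1, 0, 1, 0]
K = 4  # candidate states (e.g. image locations) per part
unary = [rng.random(K) for _ in parent]
pairwise = [None] + [rng.random((K, K)) for _ in parent[1:]]

states, best_cost = tree_min_sum(parent, unary, pairwise)
# Sanity check against exhaustive search, feasible only for tiny problems.
brute_cost = min(total_cost(s, parent, unary, pairwise)
                 for s in itertools.product(range(K), repeat=len(parent)))
print(states, best_cost, brute_cost)
```

The dynamic program costs on the order of K^2 operations per edge, whereas the exhaustive check grows as K to the power of the number of parts.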
- Associating 3D poses to 2D pose detections: The goal is to track 1 or 2 persons in scenes with multiple people. Initial 3D poses per frame, which are computed using IMU sensors on the body, are given as input. The task is to associate the 3D poses with their corresponding 2D pose detections in the image. The challenge is that there are multiple 2D poses per frame. To start with this project, we provide in "Data" the frame-wise 3D poses and the 2D pose detections as arrays. The assignment part of the following paper is relevant: virtualhumans.mpi-inf.mpg.de/papers/vonmarcardECCV18/vonmarcardECCV18.pdf
Data: poseCandidateFittings.zip (the data is in MATLAB format, but it is recommended to load the .mat files in Python: https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html)
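A possible starting point is sketched below: load arrays from a .mat file with scipy.io.loadmat and match tracked persons to 2D detections frame by frame with a minimum-cost bipartite assignment. The file name, the variable names poses_2d and projected_3d, and the array shapes are assumptions made for this example (the actual contents of the zip file may differ), and the 3D poses are assumed to be already projected into the image. This frame-wise baseline is not the method of the paper linked above.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat
from scipy.optimize import linear_sum_assignment

# Write a synthetic stand-in for one frame (hypothetical layout):
# 3 detected 2D poses and 2 tracked persons, 14 joints each.
rng = np.random.default_rng(0)
n_joints = 14
poses_2d = rng.random((3, n_joints, 2)) * 100
# Tracked persons 0 and 1 are noisy copies of detections 2 and 0.
projected_3d = poses_2d[[2, 0]] + rng.normal(scale=0.5, size=(2, n_joints, 2))

path = os.path.join(tempfile.mkdtemp(), "toy_frame.mat")
savemat(path, {"poses_2d": poses_2d, "projected_3d": projected_3d})

# loadmat returns a dict mapping MATLAB variable names to NumPy arrays.
data = loadmat(path)
det = data["poses_2d"]        # (n_detections, n_joints, 2)
proj = data["projected_3d"]   # (n_persons, n_joints, 2)

# cost[i, j] = mean joint distance between tracked person i and detection j.
diff = proj[:, None, :, :] - det[None, :, :, :]
cost = np.linalg.norm(diff, axis=-1).mean(axis=-1)

# Minimum-cost one-to-one matching; extra detections stay unassigned.
rows, cols = linear_sum_assignment(cost)
assignment = dict(zip(rows.tolist(), cols.tolist()))
print(assignment)
```

Matching each frame independently ignores temporal consistency; a graphical model over frames could add pairwise terms that penalise identity switches.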
Predicting text from 3x4 numeric keypad inputs (T9)
Given a key sequence entered by a user (e.g. 1 2 2), build a model that can predict what the user wants to express (e.g. add, bee, or bed).
A simple solution will condition the output on the input of the last letter, yet it might be more powerful to incorporate specific user statistics as well and condition on previous (or in hindsight even following) words.
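A very small baseline for this idea is sketched here: collect all dictionary words that match the key sequence and rank them by how often they followed the previous word in a training text, backing off to plain word frequency. The sketch assumes the standard phone layout (2 = abc, 3 = def, and so on), under which add, bee, and bed all share the key sequence 2 3 3; the key numbering used in the example above may follow a different convention. The toy corpus, the function names, and the weight alpha are hypothetical.

```python
from collections import defaultdict

# Standard phone keypad layout (an assumption of this sketch).
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_KEY = {ch: key for key, letters in KEYPAD.items() for ch in letters}


def word_to_keys(word):
    """Key sequence a user would type for the given word."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def train(corpus):
    """Count words (unigrams) and word pairs (bigrams) in a whitespace-split text."""
    unigram = defaultdict(int)
    bigram = defaultdict(int)
    prev = "<s>"  # start-of-text marker
    for word in corpus.lower().split():
        unigram[word] += 1
        bigram[(prev, word)] += 1
        prev = word
    return unigram, bigram


def predict(keys, prev_word, unigram, bigram, alpha=0.1):
    """Rank candidate words for a key sequence, conditioned on the previous word."""
    candidates = [w for w in unigram if word_to_keys(w) == keys]

    def score(w):
        # Bigram count dominates; the unigram count breaks ties for unseen pairs.
        return bigram.get((prev_word, w), 0) + alpha * unigram[w]

    return sorted(candidates, key=score, reverse=True)


corpus = "the bee sat on the bed then i add milk to the bed"
unigram, bigram = train(corpus)
print(predict("233", "i", unigram, bigram))    # candidates after "i"
print(predict("233", "the", unigram, bigram))  # candidates after "the"
```

User-specific statistics could enter the same scheme by training the counts on that user's own messages.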
Image denoising with spatially dependent noise and color channels
Multiple problems can be solved:
Denoising of spatially dependent noise
Denoising images of various types of sources (e.g. color images or x-ray medical images)
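As a baseline to build on, the sketch below denoises a binary image with a pairwise Markov random field (Ising prior) using iterated conditional modes (ICM), a simple coordinate-descent scheme. It assumes pixel values in {-1, +1} and independent flip noise, which is the simplest case; spatially dependent noise or colour channels would need a different data term. The parameter values and the synthetic test image are assumptions for illustration.

```python
import numpy as np


def energy(x, noisy, beta, eta):
    """E(x) = -eta * sum_i x_i y_i - beta * sum_{i~j} x_i x_j on a 4-connected grid."""
    data = -eta * np.sum(x * noisy)
    smooth = -beta * (np.sum(x[:, :-1] * x[:, 1:]) + np.sum(x[:-1, :] * x[1:, :]))
    return float(data + smooth)


def icm_denoise(noisy, beta=1.0, eta=1.5, sweeps=5):
    """Iterated conditional modes: set each pixel to its locally optimal value."""
    x = noisy.copy()
    rows, cols = x.shape
    for _ in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                neighbours = 0
                if r > 0:
                    neighbours += x[r - 1, c]
                if r < rows - 1:
                    neighbours += x[r + 1, c]
                if c > 0:
                    neighbours += x[r, c - 1]
                if c < cols - 1:
                    neighbours += x[r, c + 1]
                # Local energy of value s is -s * field, so pick the sign of field.
                field = eta * noisy[r, c] + beta * neighbours
                x[r, c] = 1 if field >= 0 else -1
    return x


# Synthetic test image: a bright square on a dark background, 10% of pixels flipped.
rng = np.random.default_rng(0)
clean = -np.ones((32, 32), dtype=int)
clean[8:24, 8:24] = 1
flip = rng.random(clean.shape) < 0.1
noisy = np.where(flip, -clean, clean)

denoised = icm_denoise(noisy)
errors_before = int(np.sum(noisy != clean))
errors_after = int(np.sum(denoised != clean))
print(errors_before, errors_after)
```

ICM never increases the energy but can get stuck in local minima; graph cuts or message passing are natural alternatives to compare against in a project.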
Image segmentation to partition an image into segments
(for example on: archive.ics.uci.edu/ml/datasets/image+segmentation)
Assumption: data is governed by a number of latent variables (topics)
The task would be to train a graphical topic model, for instance
On images (visual topics)
On text documents (semantic topics)
Probabilistic game playing agent
Train a graphical model representing a stochastic game, such as Blackjack (one or more players)
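Before learning anything from data, it can help to write down the known part of the game as a chain-structured model. The sketch below treats the dealer's hand in Blackjack as a Markov chain over (total, number of aces counted as 11) and computes the distribution of the dealer's final outcome exactly. It rests on simplifying assumptions that are not part of the project description: cards are drawn from an infinite deck, and the dealer stands on every 17, including soft 17.

```python
from fractions import Fraction
from functools import lru_cache

# Infinite-deck card distribution (assumption): 2-9 once each, four ten-valued
# ranks, and the ace, which is first counted as 11.
CARD_PROBS = {value: Fraction(1, 13) for value in range(2, 10)}
CARD_PROBS[10] = Fraction(4, 13)
CARD_PROBS[11] = Fraction(1, 13)


@lru_cache(maxsize=None)
def dealer_outcomes(total, soft_aces):
    """Distribution over the dealer's final result, as (outcome, probability) pairs.

    outcome is 'bust' or a final total in 17..21.
    """
    # Re-count aces as 1 instead of 11 while the hand would otherwise bust.
    while total > 21 and soft_aces > 0:
        total -= 10
        soft_aces -= 1
    if total > 21:
        return (("bust", Fraction(1)),)
    if total >= 17:  # dealer stands (also on soft 17, by assumption)
        return ((total, Fraction(1)),)
    dist = {}
    for value, p in CARD_PROBS.items():
        next_soft = soft_aces + (1 if value == 11 else 0)
        for outcome, q in dealer_outcomes(total + value, next_soft):
            dist[outcome] = dist.get(outcome, Fraction(0)) + p * q
    return tuple(dist.items())


def bust_probability(upcard):
    """Probability that the dealer busts given the visible card (11 = ace)."""
    start = dealer_outcomes(upcard, 1 if upcard == 11 else 0)
    return dict(start).get("bust", Fraction(0))


for upcard in (6, 10, 11):
    print(upcard, float(bust_probability(upcard)))
```

A player model can then be added on top, for example by choosing hit or stand to maximise the expected win against this dealer distribution.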
Local statistical shape model:
Build local statistical shape models for each part of the body and define a graph that connects the parts to produce coherent shapes. The graph structure can be a tree or contain loops.
The end goal is then to sample from the graph to produce realistic shapes with more variation than using a global shape model.
http://smpl.is.tue.mpg.de/ (server might be down for 1-2 days)
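For the tree-structured case, sampling from the graph can be done by ancestral sampling: draw the root part first, then each child conditioned on its parent. The sketch below is a deliberately reduced illustration, assuming that every part is described by a single scalar shape coefficient with a linear-Gaussian dependence on its parent. Real local shape models would use vectors of coefficients per part, and graphs with loops would need a different sampler such as Gibbs sampling. All names and parameter values are hypothetical.

```python
import numpy as np


def sample_tree_gaussian(parent, coupling, noise_std, rng):
    """Ancestral sampling of one scalar coefficient per part on a tree.

    parent[i] : index of the parent part (-1 for the root); parents come first.
    Root:  z_root ~ N(0, 1)
    Child: z_i    ~ N(coupling * z_parent, noise_std**2)
    """
    z = np.zeros(len(parent))
    for i, p in enumerate(parent):
        if p < 0:
            z[i] = rng.normal(0.0, 1.0)
        else:
            z[i] = rng.normal(coupling * z[p], noise_std)
    return z


# Hypothetical part tree: torso(0) -> upper arm(1) -> lower arm(2), torso(0) -> head(3)
parent = [-1, 0, 1, 0]
coupling, noise_std = 0.8, 0.6  # 0.8**2 + 0.6**2 = 1 keeps unit variance per part
rng = np.random.default_rng(0)
samples = np.stack([sample_tree_gaussian(parent, coupling, noise_std, rng)
                    for _ in range(5000)])

# Neighbouring parts should be correlated (about 0.8 here), distant parts less so.
corr = np.corrcoef(samples, rowvar=False)
print(np.round(corr, 2))
```

The coupling strength controls the trade-off the project is after: weaker coupling gives more variation between parts, stronger coupling gives more coherent overall shapes.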
Shape registration using distributed inference:
Given two 3D shapes (for example of humans), split the shapes into parts and find the alignment between them. In the lecture exercises you have found the registration by treating the shape as a whole. In this project the idea is to consider the shape as a union of weakly connected parts. The goal is to find the optimal alignment by designing an energy function that can be decomposed into unaries (part registration cost) and pairwise terms (part to part cost).
3D pose and shape estimation using particle message passing (Note: very difficult):
Using a variation of particle filters for graphs, track the pose and shape of people in images. The pose of the person is represented in 3D as a collection of loosely connected parts.
NOTE: To make the problem easier, use synthetic renderings of a human with colored parts as input images, i.e. virtual images of people.
https://github.com/mattloper/opendr/wiki -- code to render images of people from the SMPL model. You will also need this code to evaluate the likelihood of the images given the current state estimation.
Sample code to work with 3D human shapes:
- Discrete Graphical Models - An Optimization Perspective, Bogdan Savchynskyy, preprint, available online at Textbook website
- Probabilistic Graphical Models: Principles and Techniques, Daphne Koller and Nir Friedman, MIT Press, 2009 (careful: 1300 pages)
- Bayesian Reasoning and Machine Learning, David Barber, Cambridge University Press, available online at Textbook website (thanks David)
- Pattern Recognition and Machine Learning, Chris Bishop, Springer, 2006