Computer Vision and Machine Learning

Scalable Multitask Representation Learning for Scene Classification

  title = {Scalable Multitask Representation Learning for Scene Classification},
  author = {Maksim Lapin and Bernt Schiele and Matthias Hein},
  booktitle = {CVPR},
  year = {2014}

MTL-SDCA achieves state of the art on SUN397

In [1], we propose a multitask learning approach (MTL-SDCA) that jointly trains a low-dimensional representation and the corresponding classifiers. The approach scales to high-dimensional image descriptors, such as the Fisher Vector [2], and consistently outperforms the state of the art (as of November 2013) on the SUN397 scene classification benchmark [3] across varying amounts of training data.

Consistent improvement over single task learning (STL)

The proposed method is evaluated on the SUN397 scene classification benchmark. It consistently outperforms single task learning (STL) with and without color information, across varying amounts of training data, and for varying K when classification performance is measured by top-K accuracy.
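For reference, top-K accuracy counts a prediction as correct when the true class is among the K highest-scoring classes. The following is a minimal NumPy sketch of that metric. The function name `top_k_accuracy` and the array layout (one row of class scores per image) are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: array of shape (n_samples, n_classes), higher means more confident.
    labels: integer array of shape (n_samples,) with the true class indices.
    k:      number of top predictions to consider, 1 <= k <= n_classes.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    # Indices of the k largest scores in each row; the order inside the
    # top k does not matter for this metric, so a partial sort is enough.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    # A sample is a hit if its true label appears anywhere in its top k.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())
```

With K = 1 this reduces to the usual classification accuracy. Ties between scores are broken arbitrarily in this sketch.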

More human-like confusions

We observed a slight tendency for MTL-SDCA to produce more human-like predictions. Figure 4 shows a hand-picked example in which the top-5 predictions of MTL-SDCA improve on those of single task learning (STL). See the supplementary material for details.

Technical details

The proposed multitask representation learning method learns a linear mapping into a lower-dimensional subspace, which is then used to build one-vs-all classifiers, one per class. The mapping and the classifiers are trained jointly. The resulting optimization problem is solved with an adaptation of the recently developed stochastic dual coordinate ascent (SDCA) algorithm [4]. For further details, please consult the paper and the supplementary material.
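To make the two ingredients above concrete, here is a rough sketch under strong simplifying assumptions. It is not the MTL-SDCA algorithm from the paper. It shows (a) plain SDCA for a binary L2-regularized hinge-loss SVM, using the closed-form coordinate update from [4], and (b) one-vs-all classifiers built on features projected by a linear map `U`. In this sketch `U` is a fixed random projection, whereas the paper learns the mapping jointly with the classifiers. All names (`sdca_hinge`, `train_one_vs_all`, `U`, `V`) and the toy data are hypothetical.

```python
import numpy as np


def sdca_hinge(X, y, lam=0.01, epochs=20, seed=0):
    """Plain SDCA for a binary hinge-loss SVM with labels in {-1, +1}.

    Minimizes (1/n) * sum_i max(0, 1 - y_i * w.x_i) + (lam / 2) * ||w||^2.
    No bias term is used, to keep the sketch short.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)  # dual variables; y_i * alpha_i stays in [0, 1]
    w = np.zeros(d)      # primal iterate, kept equal to X.T @ alpha / (lam * n)
    sq_norms = (X ** 2).sum(axis=1)
    for _ in range(epochs):
        for i in rng.permutation(n):
            if sq_norms[i] == 0.0:
                continue  # an all-zero example cannot change w
            margin = y[i] * (X[i] @ w)
            # Closed-form maximizer of the dual in coordinate i,
            # clipped so that y_i * alpha_i remains in [0, 1].
            step = (1.0 - margin) * lam * n / sq_norms[i] + y[i] * alpha[i]
            new_alpha = y[i] * min(1.0, max(0.0, step))
            w += (new_alpha - alpha[i]) * X[i] / (lam * n)
            alpha[i] = new_alpha
    return w


def train_one_vs_all(Z, labels, n_classes, **kwargs):
    """Solve one binary SDCA problem per class; returns V of shape (k, n_classes)."""
    columns = [
        sdca_hinge(Z, np.where(labels == c, 1.0, -1.0), **kwargs)
        for c in range(n_classes)
    ]
    return np.stack(columns, axis=1)


# Toy data: three Gaussian classes in D = 50 dimensions (purely illustrative).
rng = np.random.default_rng(0)
n_classes, D, k = 3, 50, 5
centers = 3.0 * rng.normal(size=(n_classes, D))
labels = np.repeat(np.arange(n_classes), 40)
X = centers[labels] + rng.normal(size=(labels.size, D))

# Assumption: a fixed random projection stands in for the learned mapping.
U = rng.normal(size=(D, k)) / np.sqrt(D)
Z = X @ U                                   # low-dimensional representation
V = train_one_vs_all(Z, labels, n_classes, lam=0.01, epochs=30)
scores = Z @ V                              # one-vs-all scores, shape (n, n_classes)
train_accuracy = float((scores.argmax(axis=1) == labels).mean())
```

The point of the factored form is that each class needs only k weights on top of the shared D-by-k mapping, instead of D weights per class. How the mapping is updated alongside the classifiers is described in the paper and its supplementary material.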


[1] M. Lapin, B. Schiele and M. Hein, Scalable Multitask Representation Learning for Scene Classification. CVPR 2014.

[2] J. Sánchez, F. Perronnin, T. Mensink and J. Verbeek, Image Classification with the Fisher Vector: Theory and Practice. International Journal of Computer Vision, 105(3):222-245, 2013.

[3] J. Xiao, J. Hays, K. A. Ehinger, A. Oliva and A. Torralba, SUN Database: Large-Scale Scene Recognition from Abbey to Zoo. CVPR 2010, pp. 3485-3492.

[4] S. Shalev-Shwartz and T. Zhang, Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization. Journal of Machine Learning Research, 14:567-599, 2013.