Simple Does It: Weakly Supervised Instance and Semantic Segmentation

Anna Khoreva, Rodrigo Benenson, Jan Hosang, Matthias Hein and Bernt Schiele


Semantic labelling and instance segmentation are two tasks that require particularly costly annotations. Starting from weak supervision in the form of bounding box detection annotations, we propose a new approach that does not require modification of the segmentation training procedure. We show that when carefully designing the input labels from given bounding boxes, even a single round of training is enough to improve over previously reported weakly supervised results. Overall, our weak supervision approach reaches ~95% of the quality of the fully supervised model, both for semantic labelling and instance segmentation.



Figure 1. Example results of using only rectangle segments and recursive training (using convnet predictions as supervision for the next round).
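The "rectangle segments" mentioned in the caption can be illustrated with a minimal sketch that turns bounding box annotations into a per-pixel label map by filling each box with its class id. This is an illustration under stated assumptions, not the authors' released code: the function name `boxes_to_rectangle_labels`, the box tuple layout, the background id 0, and the rule of painting larger boxes first when boxes overlap are all hypothetical choices made for this example.

```python
BACKGROUND = 0  # assumed label id for background pixels


def boxes_to_rectangle_labels(height, width, boxes):
    """Build a per-pixel label map by filling each bounding box with its class id.

    boxes: list of (class_id, x_min, y_min, x_max, y_max) in pixel coordinates,
    with x_max and y_max exclusive (an assumed convention for this sketch).
    """
    labels = [[BACKGROUND] * width for _ in range(height)]

    def box_area(box):
        _, x0, y0, x1, y1 = box
        return max(x1 - x0, 0) * max(y1 - y0, 0)

    # Assumption: paint larger boxes first so that smaller boxes
    # remain visible where annotations overlap.
    for class_id, x0, y0, x1, y1 in sorted(boxes, key=box_area, reverse=True):
        # Clip the box to the image bounds.
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, width), min(y1, height)
        for y in range(y0, y1):
            for x in range(x0, x1):
                labels[y][x] = class_id
    return labels


if __name__ == "__main__":
    # Two overlapping boxes on a 4x6 image: class 1 (large) and class 2 (small).
    label_map = boxes_to_rectangle_labels(4, 6, [(1, 0, 0, 4, 4), (2, 1, 1, 3, 3)])
    for row in label_map:
        print(row)
```

A label map built this way could serve as the training target for the first round; in the recursive setting described in the caption, the predictions of the trained network would then replace it as supervision for the next round.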


Examples of generated data


Segmentation quality versus number of training rounds

Our M&G+ improves over previously reported results and reaches 95% of the full supervision quality.


Qualitative results on VOC12

Visually, the results from our weakly supervised method M&G+ are hardly distinguishable from the fully supervised ones.


Generated segmentations: 

Trained models:


For further information or data, please contact Anna Khoreva <khoreva at>.


[Khoreva et al., 2017] Simple Does It: Weakly Supervised Instance and Semantic Segmentation, A. Khoreva, R. Benenson, J. Hosang, M. Hein and B. Schiele, CVPR, 2017.

@inproceedings{Khoreva2017,
  title={Simple Does It: Weakly Supervised Instance and Semantic Segmentation},
  author={A. Khoreva and R. Benenson and J. Hosang and M. Hein and B. Schiele},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017}
}