Jan Hosang (PhD Student)

Research Interests

Computer Vision (object recognition and localization)
Machine Learning (deep learning)

Research Projects

Learning non-maximum suppression
Pedestrian detection: CVPR 2015, CVPR 2016/PAMI 2017
Detection proposal evaluation

For additional and more up-to-date information, please visit my personal homepage (or Google Scholar, GitHub, arXiv).

Publications

2018

Article

S. Zhang, R. Benenson, M. Omran, J. Hosang, and B. Schiele

“Towards Reaching Human Performance in Pedestrian Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, 2018.

Abstract

Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the “perfect single frame detector”. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors.

To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance.

Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.

BibTeX

@article{ZBOHS2017,
TITLE = {Towards Reaching Human Performance in Pedestrian Detection},
AUTHOR = {Zhang, Shanshan and Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISSN = {0162-8828},
DOI = {10.1109/TPAMI.2017.2700460},
PUBLISHER = {IEEE Computer Society},
ADDRESS = {Los Alamitos, CA},
YEAR = {2018},
DATE = {2018},
ABSTRACT = {Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the {\textquotedblleft}perfect single frame detector{\textquotedblright}. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors. To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance. Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {40},
NUMBER = {4},
PAGES = {973--986},
}

Endnote

%0 Journal Article
%A Zhang, Shanshan
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Towards Reaching Human Performance in Pedestrian Detection
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002D-440B-2
%R 10.1109/TPAMI.2017.2700460
%7 2017-05-02
%D 2018
%X Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the “perfect single frame detector”. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors. To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance. Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%O IEEE Trans. Pattern Anal. Mach. Intell. TPAMI
%V 40
%N 4
%& 973
%P 973 - 986
%I IEEE Computer Society
%C Los Alamitos, CA
%@ 0162-8828

2017

Conference paper

J. Hosang, R. Benenson, and B. Schiele

“Learning Non-maximum Suppression,” in 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, HI, USA, 2017.

@inproceedings{HBSCVPR17,
TITLE = {Learning Non-maximum Suppression},
AUTHOR = {Hosang, Jan and Benenson, Rodrigo and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-5386-0458-8},
DOI = {10.1109/CVPR.2017.685},
PUBLISHER = {IEEE Computer Society},
YEAR = {2017},
DATE = {2017},
BOOKTITLE = {30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)},
PAGES = {6469--6477},
ADDRESS = {Honolulu, HI, USA},
}

Endnote

%0 Conference Proceedings
%A Hosang, Jan
%A Benenson, Rodrigo
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Learning Non-maximum Suppression
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002D-39E9-1
%R 10.1109/CVPR.2017.685
%D 2017
%B 30th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2017-07-22 - 2017-07-25
%C Honolulu, HI, USA
%B 30th IEEE Conference on Computer Vision and Pattern Recognition 
%P 6469 - 6477
%I IEEE Computer Society
%@ 978-1-5386-0458-8

Conference paper

A. Khoreva, R. Benenson, J. Hosang, M. Hein, and B. Schiele

“Simple Does It: Weakly Supervised Instance and Semantic Segmentation,” in 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), Honolulu, HI, USA, 2017.

@inproceedings{Khoreva1603.07485,
TITLE = {Simple Does It: Weakly Supervised Instance and Semantic Segmentation},
AUTHOR = {Khoreva, Anna and Benenson, Rodrigo and Hosang, Jan and Hein, Matthias and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-5386-0458-8},
DOI = {10.1109/CVPR.2017.181},
PUBLISHER = {IEEE Computer Society},
YEAR = {2017},
DATE = {2017},
BOOKTITLE = {30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)},
PAGES = {1665--1674},
ADDRESS = {Honolulu, HI, USA},
}

Endnote

%0 Conference Proceedings
%A Khoreva, Anna
%A Benenson, Rodrigo
%A Hosang, Jan
%A Hein, Matthias
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Simple Does It: Weakly Supervised Instance and Semantic Segmentation
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-1A7D-5
%R 10.1109/CVPR.2017.181
%D 2017
%B 30th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2017-07-22 - 2017-07-25
%C Honolulu, HI, USA
%B 30th IEEE Conference on Computer Vision and Pattern Recognition
%P 1665 - 1674
%I IEEE Computer Society
%@ 978-1-5386-0458-8

Thesis

J. Hosang

“Analysis and Improvement of the Visual Object Detection Pipeline,” Universität des Saarlandes, Saarbrücken, 2017.

Abstract

Visual object detection has seen substantial improvements during the last years due to the possibilities enabled by deep learning. While research on image classification provides continuous progress on how to learn image representations and classifiers jointly, object detection research focuses on identifying how to properly use deep learning technology to effectively localise objects. In this thesis, we analyse and improve different aspects of the commonly used detection pipeline. We analyse ten years of research on pedestrian detection and find that improvement of feature representations was the driving factor. Motivated by this finding, we adapt an end-to-end learned detector architecture from general object detection to pedestrian detection. Our deep network outperforms all previous neural networks for pedestrian detection by a large margin, even without using additional training data. After substantial improvements on pedestrian detection in recent years, we investigate the gap between human performance and state-of-the-art pedestrian detectors. We find that pedestrian detectors still have a long way to go before they reach human performance, and we diagnose failure modes of several top performing detectors, giving direction to future research. As a side-effect we publish new, better localised annotations for the Caltech pedestrian benchmark. We analyse detection proposals as a preprocessing step for object detectors. We establish different metrics and compare a wide range of methods according to these metrics. By examining the relationship between localisation of proposals and final object detection performance, we define and experimentally verify a metric that can be used as a proxy for detector performance. Furthermore, we address a structural weakness of virtually all object detection pipelines: non-maximum suppression. We analyse why it is necessary and what the shortcomings of the most common approach are. 
To address these problems, we present work to overcome these shortcomings and to replace typical non-maximum suppression with a learnable alternative. The introduced paradigm paves the way to true end-to-end learning of object detectors without any post-processing. In summary, this thesis provides analyses of recent pedestrian detectors and detection proposals, improves pedestrian detection by employing deep neural networks, and presents a viable alternative to traditional non-maximum suppression.

BibTeX

@phdthesis{Hosangphd17,
TITLE = {Analysis and Improvement of the Visual Object Detection Pipeline},
AUTHOR = {Hosang, Jan},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291-scidok-69080},
DOI = {10.22028/D291-26774},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2017},
DATE = {2017},
ABSTRACT = {Visual object detection has seen substantial improvements during the last years due to the possibilities enabled by deep learning. While research on image classification provides continuous progress on how to learn image representations and classifiers jointly, object detection research focuses on identifying how to properly use deep learning technology to effectively localise objects. In this thesis, we analyse and improve different aspects of the commonly used detection pipeline. We analyse ten years of research on pedestrian detection and find that improvement of feature representations was the driving factor. Motivated by this finding, we adapt an end-to-end learned detector architecture from general object detection to pedestrian detection. Our deep network outperforms all previous neural networks for pedestrian detection by a large margin, even without using additional training data. After substantial improvements on pedestrian detection in recent years, we investigate the gap between human performance and state-of-the-art pedestrian detectors. We find that pedestrian detectors still have a long way to go before they reach human performance, and we diagnose failure modes of several top performing detectors, giving direction to future research. As a side-effect we publish new, better localised annotations for the Caltech pedestrian benchmark. We analyse detection proposals as a preprocessing step for object detectors. We establish different metrics and compare a wide range of methods according to these metrics. By examining the relationship between localisation of proposals and final object detection performance, we define and experimentally verify a metric that can be used as a proxy for detector performance. Furthermore, we address a structural weakness of virtually all object detection pipelines: non-maximum suppression. We analyse why it is necessary and what the shortcomings of the most common approach are. 
To address these problems, we present work to overcome these shortcomings and to replace typical non-maximum suppression with a learnable alternative. The introduced paradigm paves the way to true end-to-end learning of object detectors without any post-processing. In summary, this thesis provides analyses of recent pedestrian detectors and detection proposals, improves pedestrian detection by employing deep neural networks, and presents a viable alternative to traditional non-maximum suppression.},
}

Endnote

%0 Thesis
%A Hosang, Jan
%Y Schiele, Bernt
%A referee: Ferrari, Vittorio
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
%T Analysis and Improvement of the Visual Object Detection Pipeline
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002D-8CC9-B
%U urn:nbn:de:bsz:291-scidok-69080
%R 10.22028/D291-26774
%F OTHER: hdl:20.500.11880/26787
%I Universität des Saarlandes
%C Saarbrücken
%D 2017
%P 205 p.
%V phd
%9 phd
%X Visual object detection has seen substantial improvements during the last years due to the possibilities enabled by deep learning. While research on image classification provides continuous progress on how to learn image representations and classifiers jointly, object detection research focuses on identifying how to properly use deep learning technology to effectively localise objects. In this thesis, we analyse and improve different aspects of the commonly used detection pipeline. We analyse ten years of research on pedestrian detection and find that improvement of feature representations was the driving factor. Motivated by this finding, we adapt an end-to-end learned detector architecture from general object detection to pedestrian detection. Our deep network outperforms all previous neural networks for pedestrian detection by a large margin, even without using additional training data. After substantial improvements on pedestrian detection in recent years, we investigate the gap between human performance and state-of-the-art pedestrian detectors. We find that pedestrian detectors still have a long way to go before they reach human performance, and we diagnose failure modes of several top performing detectors, giving direction to future research. As a side-effect we publish new, better localised annotations for the Caltech pedestrian benchmark. We analyse detection proposals as a preprocessing step for object detectors. We establish different metrics and compare a wide range of methods according to these metrics. By examining the relationship between localisation of proposals and final object detection performance, we define and experimentally verify a metric that can be used as a proxy for detector performance. Furthermore, we address a structural weakness of virtually all object detection pipelines: non-maximum suppression. We analyse why it is necessary and what the shortcomings of the most common approach are. 
To address these problems, we present work to overcome these shortcomings and to replace typical non-maximum suppression with a learnable alternative. The introduced paradigm paves the way to true end-to-end learning of object detectors without any post-processing. In summary, this thesis provides analyses of recent pedestrian detectors and detection proposals, improves pedestrian detection by employing deep neural networks, and presents a viable alternative to traditional non-maximum suppression.
%U http://scidok.sulb.uni-saarland.de/doku/lic_ohne_pod.php?la=de
%U http://scidok.sulb.uni-saarland.de/volltexte/2017/6908/

2016

Conference paper

S. Zhang, R. Benenson, M. Omran, J. Hosang, and B. Schiele

“How Far are We from Solving Pedestrian Detection?,” in 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, USA, 2016.

@inproceedings{Shanshan2016CVPR,
TITLE = {How Far are We from Solving Pedestrian Detection?},
AUTHOR = {Zhang, Shanshan and Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-4673-8852-8},
DOI = {10.1109/CVPR.2016.141},
PUBLISHER = {IEEE Computer Society},
YEAR = {2016},
DATE = {2016},
BOOKTITLE = {29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016)},
PAGES = {1259--1267},
ADDRESS = {Las Vegas, NV, USA},
}

Endnote

%0 Conference Proceedings
%A Zhang, Shanshan
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T How Far are We from Solving Pedestrian Detection?
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-D905-C
%R 10.1109/CVPR.2016.141
%D 2016
%B 29th IEEE Conference on Computer Vision and Pattern Recognition 
%Z date of event: 2016-06-26 - 2016-07-01
%C Las Vegas, NV, USA
%B 29th IEEE Conference on Computer Vision and Pattern Recognition 
%P 1259 - 1267
%I IEEE Computer Society
%@ 978-1-4673-8852-8

Article

J. Hosang, R. Benenson, P. Dollár, and B. Schiele

“What Makes for Effective Detection Proposals?,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 4, 2016.

@article{945,
TITLE = {What Makes for Effective Detection Proposals?},
AUTHOR = {Hosang, Jan and Benenson, Rodrigo and Doll{\'a}r, Piotr and Schiele, Bernt},
LANGUAGE = {eng},
ISSN = {0162-8828},
DOI = {10.1109/TPAMI.2015.2465908},
PUBLISHER = {IEEE Computer Society},
ADDRESS = {New York, NY},
YEAR = {2016},
DATE = {2016},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {38},
NUMBER = {4},
PAGES = {814--830},
}

Endnote

%0 Journal Article
%A Hosang, Jan
%A Benenson, Rodrigo
%A Dollár, Piotr
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T What Makes for Effective Detection Proposals?
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-D3C7-D
%R 10.1109/TPAMI.2015.2465908
%7 2015-08-07
%D 2016
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%O IEEE Trans. Pattern Anal. Mach. Intell. TPAMI
%V 38
%N 4
%& 814
%P 814 - 830
%I IEEE Computer Society
%C New York, NY
%@ 0162-8828

Conference paper

J. Hosang, R. Benenson, and B. Schiele

“A Convnet for Non-maximum Suppression,” in Pattern Recognition (GCPR 2016), Hannover, Germany, 2016.

Abstract

Non-maximum suppression (NMS) is used in virtually all state-of-the-art object detection pipelines. While essential object detection ingredients such as features, classifiers, and proposal methods have been extensively researched surprisingly little work has aimed to systematically address NMS. The de-facto standard for NMS is based on greedy clustering with a fixed distance threshold, which forces to trade-off recall versus precision. We propose a convnet designed to perform NMS of a given set of detections. We report experiments on a synthetic setup, and results on crowded pedestrian detection scenes. Our approach overcomes the intrinsic limitations of greedy NMS, obtaining better recall and precision.
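
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of greedy NMS with a fixed overlap threshold. It is an illustration only and not the code used in the paper: the box format `(x1, y1, x2, y2)`, the function names `iou` and `greedy_nms`, and the default threshold of 0.5 are all assumptions made here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept detections, highest score first.

    A detection is kept only if it does not overlap any already kept
    (higher scoring) detection by more than the fixed threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one isolated box (hypothetical data).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, iou_threshold=0.5))
print(greedy_nms(boxes, scores, iou_threshold=0.7))
```

In this toy example the first two boxes overlap with an IoU of about 0.68, so a threshold of 0.5 suppresses the second box while a threshold of 0.7 keeps it. If the second box were a different person in a crowd, the lower threshold would cost recall; if it were a duplicate, the higher threshold would cost precision. This is the trade-off the abstract describes.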

BibTeX

@inproceedings{Hosang2016Gcpr,
TITLE = {A Convnet for Non-maximum Suppression},
AUTHOR = {Hosang, Jan and Benenson, Rodrigo and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-319-45885-4},
DOI = {10.1007/978-3-319-45886-1_16},
PUBLISHER = {Springer},
YEAR = {2016},
DATE = {2016},
ABSTRACT = {Non-maximum suppression (NMS) is used in virtually all state-of-the-art object detection pipelines. While essential object detection ingredients such as features, classifiers, and proposal methods have been extensively researched surprisingly little work has aimed to systematically address NMS. The de-facto standard for NMS is based on greedy clustering with a fixed distance threshold, which forces to trade-off recall versus precision. We propose a convnet designed to perform NMS of a given set of detections. We report experiments on a synthetic setup, and results on crowded pedestrian detection scenes. Our approach overcomes the intrinsic limitations of greedy NMS, obtaining better recall and precision.},
BOOKTITLE = {Pattern Recognition (GCPR 2016)},
EDITOR = {Rosenhahn, Bodo and Andres, Bj{\"o}rn},
PAGES = {192--204},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {9796},
ADDRESS = {Hannover, Germany},
}

Endnote

%0 Conference Proceedings
%A Hosang, Jan
%A Benenson, Rodrigo
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T A Convnet for Non-maximum Suppression
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0029-594D-A
%R 10.1007/978-3-319-45886-1_16
%D 2016
%B 38th German Conference on Pattern Recognition
%Z date of event: 2016-09-12 - 2016-09-15
%C Hannover, Germany
%X Non-maximum suppression (NMS) is used in virtually all state-of-the-art object detection pipelines. While essential object detection ingredients such as features, classifiers, and proposal methods have been extensively researched surprisingly little work has aimed to systematically address NMS. The de-facto standard for NMS is based on greedy clustering with a fixed distance threshold, which forces to trade-off recall versus precision. We propose a convnet designed to perform NMS of a given set of detections. We report experiments on a synthetic setup, and results on crowded pedestrian detection scenes. Our approach overcomes the intrinsic limitations of greedy NMS, obtaining better recall and precision.
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV,Computer Science, Learning, cs.LG
%B Pattern Recognition
%E Rosenhahn, Bodo; Andres, Björn
%P 192 - 204
%I Springer
%@ 978-3-319-45885-4
%B Lecture Notes in Computer Science
%N 9796

2015

Conference paper

J. Hosang, M. Omran, R. Benenson, and B. Schiele

“Taking a Deeper Look at Pedestrians,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), Boston, MA, USA, 2015.

@inproceedings{Hosang15cvpr,
TITLE = {Taking a Deeper Look at Pedestrians},
AUTHOR = {Hosang, Jan and Omran, Mohamed and Benenson, Rodrigo and Schiele, Bernt},
LANGUAGE = {eng},
DOI = {10.1109/CVPR.2015.7299034},
PUBLISHER = {IEEE Computer Society},
YEAR = {2015},
DATE = {2015},
BOOKTITLE = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)},
PAGES = {4073--4082},
ADDRESS = {Boston, MA, USA},
}

Endnote

%0 Conference Proceedings
%A Hosang, Jan
%A Omran, Mohamed
%A Benenson, Rodrigo
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Taking a Deeper Look at Pedestrians
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0025-01BF-0
%R 10.1109/CVPR.2015.7299034
%D 2015
%B IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2015-06-08 - 2015-06-10
%C Boston, MA, USA
%B IEEE Conference on Computer Vision and Pattern Recognition
%P 4073 - 4082
%I IEEE Computer Society

Article

T. Deselaers, D. Keysers, J. Hosang, and H. Rowley

“GyroPen: Gyroscopes for Pen-Input with Mobile Phones,” IEEE Transactions on Human-Machine Systems, vol. 45, no. 2, 2015.

@article{Deselaers2014Thms,
TITLE = {{GyroPen}: {Gyroscopes} for Pen-Input with Mobile Phones},
AUTHOR = {Deselaers, Thomas and Keysers, Daniel and Hosang, Jan and Rowley, Henry},
LANGUAGE = {eng},
ISSN = {2168-2291},
DOI = {10.1109/THMS.2014.2365723},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2015},
DATE = {2015},
JOURNAL = {IEEE Transactions on Human-Machine Systems},
VOLUME = {45},
NUMBER = {2},
PAGES = {263--271},
}

Endnote

%0 Journal Article
%A Deselaers, Thomas
%A Keysers, Daniel
%A Hosang, Jan
%A Rowley, Henry
%+ External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
%T GyroPen: Gyroscopes for Pen-Input with Mobile Phones
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-4BD2-4
%R 10.1109/THMS.2014.2365723
%7 2014-12-04
%D 2015
%J IEEE Transactions on Human-Machine Systems
%V 45
%N 2
%& 263
%P 263 - 271
%I IEEE
%C Piscataway, NJ
%@ 2168-2291

Paper

J. Hosang, R. Benenson, P. Dollár, and B. Schiele

“What Makes for Effective Detection Proposals?,” 2015. [Online]. Available: http://arxiv.org/abs/1502.05082.

Abstract

Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images. Despite the popularity and widespread use of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in-depth analysis of twelve proposal methods along with four baselines regarding proposal repeatability, ground truth annotation recall on PASCAL and ImageNet, and impact on DPM and R-CNN detection performance. Our analysis shows that for object detection improving proposal localisation accuracy is as important as improving recall. We introduce a novel metric, the average recall (AR), which rewards both high recall and good localisation and correlates surprisingly well with detector performance. Our findings show common strengths and weaknesses of existing methods, and provide insights and metrics for selecting and tuning proposal methods.
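
The abstract does not spell out how AR is computed, so the sketch below rests on an assumption: AR is taken as recall averaged over all IoU thresholds between 0.5 and 1. Under that assumption the average has a simple closed form per ground truth box. The function names and the input format (one best-matching IoU per ground truth annotation) are hypothetical and chosen only for illustration.

```python
def recall_at(best_ious, threshold):
    """Fraction of ground truth boxes covered by a proposal at the given IoU."""
    if not best_ious:
        return 0.0
    return sum(1 for o in best_ious if o >= threshold) / len(best_ious)


def average_recall(best_ious):
    """Recall averaged over IoU thresholds in [0.5, 1] (assumed definition).

    best_ious holds, for each ground truth box, the IoU of its best
    matching proposal. Averaging recall over the interval [0.5, 1] equals
    2 * mean(max(iou - 0.5, 0)), which is what is computed here.
    """
    if not best_ious:
        return 0.0
    return 2.0 * sum(max(o - 0.5, 0.0) for o in best_ious) / len(best_ious)


# Both sets reach full recall at IoU 0.5, but the second is better localised.
loose = [0.55, 0.6, 0.5, 0.65]
tight = [0.9, 0.95, 0.85, 0.9]
print(recall_at(loose, 0.5), average_recall(loose))
print(recall_at(tight, 0.5), average_recall(tight))
```

The two hypothetical proposal sets have identical recall at the usual 0.5 threshold, yet the better localised set scores a much higher AR, which illustrates why such a metric rewards localisation as well as recall.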

BibTeX

@online{Hosang2015arXiv,
TITLE = {What Makes for Effective Detection Proposals?},
AUTHOR = {Hosang, Jan and Benenson, Rodrigo and Doll{\'a}r, Piotr and Schiele, Bernt},
LANGUAGE = {eng},
URL = {http://arxiv.org/abs/1502.05082},
EPRINT = {1502.05082},
EPRINTTYPE = {arXiv},
YEAR = {2015},
ABSTRACT = {Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images. Despite the popularity and widespread use of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in-depth analysis of twelve proposal methods along with four baselines regarding proposal repeatability, ground truth annotation recall on PASCAL and ImageNet, and impact on DPM and R-CNN detection performance. Our analysis shows that for object detection improving proposal localisation accuracy is as important as improving recall. We introduce a novel metric, the average recall (AR), which rewards both high recall and good localisation and correlates surprisingly well with detector performance. Our findings show common strengths and weaknesses of existing methods, and provide insights and metrics for selecting and tuning proposal methods.},
}

Endnote

%0 Report
%A Hosang, Jan
%A Benenson, Rodrigo
%A Dollár, Piotr
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T What Makes for Effective Detection Proposals?
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-CCFC-6
%U http://arxiv.org/abs/1502.05082
%D 2015
%X Current top performing object detectors employ detection proposals to guide the search for objects, thereby avoiding exhaustive sliding window search across images. Despite the popularity and widespread use of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in-depth analysis of twelve proposal methods along with four baselines regarding proposal repeatability, ground truth annotation recall on PASCAL and ImageNet, and impact on DPM and R-CNN detection performance. Our analysis shows that for object detection improving proposal localisation accuracy is as important as improving recall. We introduce a novel metric, the average recall (AR), which rewards both high recall and good localisation and correlates surprisingly well with detector performance. Our findings show common strengths and weaknesses of existing methods, and provide insights and metrics for selecting and tuning proposal methods.
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV

2014

Conference paper

R. Benenson, M. Omran, J. Hosang, and B. Schiele

“Ten Years of Pedestrian Detection, What Have We Learned?,” in Computer Vision - ECCV 2014 Workshops (ECCV 2014 Workshop CVRSUAD), Zürich, Switzerland, 2015.

@inproceedings{890,
TITLE = {Ten Years of Pedestrian Detection, What Have We Learned?},
AUTHOR = {Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-319-16180-8},
DOI = {10.1007/978-3-319-16181-5_47},
PUBLISHER = {Springer},
YEAR = {2014},
DATE = {2015},
BOOKTITLE = {Computer Vision -- ECCV 2014 Workshops (ECCV 2014 Workshop CVRSUAD)},
EDITOR = {Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten},
PAGES = {613--627},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {8926},
ADDRESS = {Z{\"u}rich, Switzerland},
}

Endnote

%0 Conference Proceedings
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Ten Years of Pedestrian Detection, What Have We Learned?
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-4C6A-8
%R 10.1007/978-3-319-16181-5_47
%D 2015
%B 2nd Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving
%Z date of event: 2014-09-07 - 2014-09-07
%C Zürich, Switzerland
%B Computer Vision - ECCV 2014 Workshops
%E Agapito, Lourdes; Bronstein, Michael M.; Rother, Carsten
%P 613 - 627
%I Springer
%@ 978-3-319-16180-8
%B Lecture Notes in Computer Science
%N 8926

Conference paper

J. Hosang, R. Benenson, and B. Schiele

“How Good are Detection Proposals, really?,” in Proceedings of the British Machine Vision Conference (BMVC 2014), Nottingham, UK, 2014.

Abstract

Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. Despite the popularity of detection proposals, it is unclear which trade‐offs are made when using them during object detection. We provide an in depth analysis of ten object proposal methods along with four baselines regarding ground truth annotation recall (on Pascal VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance. Our findings show common weaknesses of existing methods, and provide insights to choose the most adequate method for different settings.

BibTeX

@inproceedings{Hosang2013Bmvc,
TITLE = {How Good are Detection Proposals, really?},
AUTHOR = {Hosang, Jan and Benenson, Rodrigo and Schiele, Bernt},
LANGUAGE = {eng},
PUBLISHER = {BMVA Press},
YEAR = {2014},
ABSTRACT = {Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. Despite the popularity of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in depth analysis of ten object proposal methods along with four baselines regarding ground truth annotation recall (on Pascal VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance. Our findings show common weaknesses of existing methods, and provide insights to choose the most adequate method for different settings.},
BOOKTITLE = {Proceedings of the British Machine Vision Conference (BMVC 2014)},
EDITOR = {Valstar, Michel and French, Andrew and Pridmore, Tony},
PAGES = {1--12},
EID = {82},
ADDRESS = {Nottingham, UK},
}

Endnote

%0 Conference Proceedings
%A Hosang, Jan
%A Benenson, Rodrigo
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T How Good are Detection Proposals, really?
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-3C2E-2
%D 2014
%B 25th British Machine Vision Conference
%Z date of event: 2014-09-01 - 2014-09-05
%C Nottingham, UK
%X Current top performing Pascal VOC object detectors employ detection proposals to guide the search for objects thereby avoiding exhaustive sliding window search across images. Despite the popularity of detection proposals, it is unclear which trade-offs are made when using them during object detection. We provide an in depth analysis of ten object proposal methods along with four baselines regarding ground truth annotation recall (on Pascal VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance. Our findings show common weaknesses of existing methods, and provide insights to choose the most adequate method for different settings.
%B Proceedings of the British Machine Vision Conference
%E Valstar, Michel; French, Andrew; Pridmore, Tony
%P 1 - 12
%Z sequence number: 82
%I BMVA Press