Mohamed Omran (PhD Student)

Personal Information
Research Interests
- Computer Vision (looking at people, model-based recognition, detection and grouping, analysis-by-synthesis and early vision)
- Machine Learning (deep learning, structured output prediction)
Education
- 2014–present, Ph.D. student in Computer Science, Max Planck Institute for Informatics
- 2014, M.Sc. in Visual Computing, Saarland University
- 2011, B.Sc. in Media Informatics, Ulm University
Recent Positions
- Research Assistant, Daimler Research and Development, Ulm, 2010/11 (supervisors: Prof. Dariu Gavrila, Dr. Markus Enzweiler)
Teaching
- Machine Learning (Dr. Mario Fritz and Dr. Bjoern Andres, 2014/15)
- Image Acquisition Methods (Pascal Peter and Prof. Joachim Weickert, 2014)
- Differential Equations in Image Processing and Computer Vision (Prof. Joachim Weickert, 2013)
- Image Processing and Computer Vision (Prof. Joachim Weickert, 2012/13)
Reviewing Activities
- IEEE Intelligent Vehicles Symposium 2015
- CVPR 2018/2019
- ACCV 2018
- ICCV 2019
- 3DV 2019
- IEEE TPAMI
- IEEE TIP
- IJCV
- IEEE Robotics and Automation Letters
- IEEE TCSVT
Awards
- 3DV '18 Best Student Paper Award
- Outstanding Reviewer (CVPR 2018, CVPR 2019, ICCV 2019)
- IMPRS-CS M.Sc. Fellowship
Publications
From Pixels to People
M. Omran
PhD Thesis, Universität des Saarlandes, 2021
Abstract
Humans are at the centre of a significant amount of research in computer vision. Endowing machines with the ability to perceive people from visual data is an immense scientific challenge with a high degree of direct practical relevance. Success in automatic perception can be measured at different levels of abstraction, and this will depend on which intelligent behaviour we are trying to replicate: the ability to localise persons in an image or in the environment, understanding how persons are moving at the skeleton and at the surface level, interpreting their interactions with the environment including with other people, and perhaps even anticipating future actions. In this thesis we tackle different sub-problems of the broad research area referred to as "looking at people", aiming to perceive humans in images at different levels of granularity.

We start with bounding box-level pedestrian detection: We present a retrospective analysis of methods published in the decade preceding our work, identifying various strands of research that have advanced the state of the art. With quantitative experiments, we demonstrate the critical role of developing better feature representations and having the right training distribution. We then contribute two methods based on the insights derived from our analysis: one that combines the strongest aspects of past detectors and another that focuses purely on learning representations. The latter method outperforms more complicated approaches, especially those based on hand-crafted features. We conclude our work on pedestrian detection with a forward-looking analysis that maps out potential avenues for future research.

We then turn to pixel-level methods: Perceiving humans requires us to both separate them precisely from the background and identify their surroundings. To this end, we introduce Cityscapes, a large-scale dataset for street scene understanding. This has since established itself as a go-to benchmark for segmentation and detection. We additionally develop methods that relax the requirement for expensive pixel-level annotations, focusing on the task of boundary detection, i.e. identifying the outlines of relevant objects and surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and labelling to fine-grained spatial understanding. We contribute a method for recovering 3D human shape and pose, which marries the advantages of learning-based and model-based approaches.

We conclude the thesis with a detailed discussion of benchmarking practices in computer vision. Among other things, we argue that the design of future datasets should be driven by the general goal of combinatorial robustness besides task-specific considerations.
Export
BibTeX
@phdthesis{Omranphd2021,
TITLE = {From Pixels to People},
AUTHOR = {Omran, Mohamed},
LANGUAGE = {eng},
URL = {nbn:de:bsz:291--ds-366053},
DOI = {10.22028/D291-36605},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2021},
MARGINALMARK = {$\bullet$},
DATE = {2021},
ABSTRACT = {Humans are at the centre of a significant amount of research in computer vision. Endowing machines with the ability to perceive people from visual data is an immense scientific challenge with a high degree of direct practical relevance. Success in automatic perception can be measured at different levels of abstraction, and this will depend on which intelligent behaviour we are trying to replicate: the ability to localise persons in an image or in the environment, understanding how persons are moving at the skeleton and at the surface level, interpreting their interactions with the environment including with other people, and perhaps even anticipating future actions. In this thesis we tackle different sub-problems of the broad research area referred to as "looking at people", aiming to perceive humans in images at different levels of granularity. We start with bounding box-level pedestrian detection: We present a retrospective analysis of methods published in the decade preceding our work, identifying various strands of research that have advanced the state of the art. With quantitative experiments, we demonstrate the critical role of developing better feature representations and having the right training distribution. We then contribute two methods based on the insights derived from our analysis: one that combines the strongest aspects of past detectors and another that focuses purely on learning representations. The latter method outperforms more complicated approaches, especially those based on hand-crafted features. We conclude our work on pedestrian detection with a forward-looking analysis that maps out potential avenues for future research. We then turn to pixel-level methods: Perceiving humans requires us to both separate them precisely from the background and identify their surroundings. To this end, we introduce Cityscapes, a large-scale dataset for street scene understanding. This has since established itself as a go-to benchmark for segmentation and detection. We additionally develop methods that relax the requirement for expensive pixel-level annotations, focusing on the task of boundary detection, i.e. identifying the outlines of relevant objects and surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and labelling to fine-grained spatial understanding. We contribute a method for recovering 3D human shape and pose, which marries the advantages of learning-based and model-based approaches. We conclude the thesis with a detailed discussion of benchmarking practices in computer vision. Among other things, we argue that the design of future datasets should be driven by the general goal of combinatorial robustness besides task-specific considerations.},
}
Endnote
%0 Thesis
%A Omran, Mohamed
%Y Schiele, Bernt
%A referee: Gall, Jürgen
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Graphics, MPI for Informatics, Max Planck Society
%T From Pixels to People : Recovering Location, Shape and Pose of Humans in Images
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-CDBF-9
%R 10.22028/D291-36605
%U nbn:de:bsz:291--ds-366053
%F OTHER: hdl:20.500.11880/33466
%I Universität des Saarlandes
%C Saarbrücken
%D 2021
%P 252 p.
%V phd
%9 phd
%X Humans are at the centre of a significant amount of research in computer vision. Endowing machines with the ability to perceive people from visual data is an immense scientific challenge with a high degree of direct practical relevance. Success in automatic perception can be measured at different levels of abstraction, and this will depend on which intelligent behaviour we are trying to replicate: the ability to localise persons in an image or in the environment, understanding how persons are moving at the skeleton and at the surface level, interpreting their interactions with the environment including with other people, and perhaps even anticipating future actions. In this thesis we tackle different sub-problems of the broad research area referred to as "looking at people", aiming to perceive humans in images at different levels of granularity. We start with bounding box-level pedestrian detection: We present a retrospective analysis of methods published in the decade preceding our work, identifying various strands of research that have advanced the state of the art. With quantitative experiments, we demonstrate the critical role of developing better feature representations and having the right training distribution. We then contribute two methods based on the insights derived from our analysis: one that combines the strongest aspects of past detectors and another that focuses purely on learning representations. The latter method outperforms more complicated approaches, especially those based on hand-crafted features. We conclude our work on pedestrian detection with a forward-looking analysis that maps out potential avenues for future research. We then turn to pixel-level methods: Perceiving humans requires us to both separate them precisely from the background and identify their surroundings. To this end, we introduce Cityscapes, a large-scale dataset for street scene understanding. This has since established itself as a go-to benchmark for segmentation and detection. We additionally develop methods that relax the requirement for expensive pixel-level annotations, focusing on the task of boundary detection, i.e. identifying the outlines of relevant objects and surfaces. Next, we make the jump from pixels to 3D surfaces, from localising and labelling to fine-grained spatial understanding. We contribute a method for recovering 3D human shape and pose, which marries the advantages of learning-based and model-based approaches. We conclude the thesis with a detailed discussion of benchmarking practices in computer vision. Among other things, we argue that the design of future datasets should be driven by the general goal of combinatorial robustness besides task-specific considerations.
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/33466
Towards Reaching Human Performance in Pedestrian Detection
S. Zhang, R. Benenson, M. Omran, J. Hosang and B. Schiele
IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 40, Number 4, 2018
Abstract
Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the “perfect single frame detector”. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors.

To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance.

Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.
Export
BibTeX
@article{ZBOHS2017,
TITLE = {Towards Reaching Human Performance in Pedestrian Detection},
AUTHOR = {Zhang, Shanshan and Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISSN = {0162-8828},
DOI = {10.1109/TPAMI.2017.2700460},
PUBLISHER = {IEEE Computer Society},
ADDRESS = {Los Alamitos, CA},
YEAR = {2018},
DATE = {2018},
ABSTRACT = {Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the {\textquotedblleft}perfect single frame detector{\textquotedblright}. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors. To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance. Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {40},
NUMBER = {4},
PAGES = {973--986},
}
Endnote
%0 Journal Article
%A Zhang, Shanshan
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Towards Reaching Human Performance in Pedestrian Detection :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002D-440B-2
%R 10.1109/TPAMI.2017.2700460
%7 2017-05-02
%D 2018
%X Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the “perfect single frame detector”. We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors. To address localisation errors we study the impact of training annotation noise on the detector performance, and show that we can improve results even with a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection, and discuss which factors affect their performance. Other than our in-depth analysis, we report top performance on the Caltech pedestrian dataset, and provide a new sanitised set of training and test annotations.
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%O IEEE Trans. Pattern Anal. Mach. Intell. TPAMI
%V 40
%N 4
%& 973
%P 973 - 986
%I IEEE Computer Society
%C Los Alamitos, CA
%@ false
Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation
M. Omran, C. Lassner, G. Pons-Moll, P. Gehler and B. Schiele
3DV 2018, International Conference on 3D Vision, 2018
Export
BibTeX
@inproceedings{omran2018nbf,
TITLE = {Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation},
AUTHOR = {Omran, Mohamed and Lassner, Christoph and Pons-Moll, Gerard and Gehler, Peter and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-5386-8425-2 ; 978-1-5386-8426-9},
DOI = {10.1109/3DV.2018.00062},
PUBLISHER = {IEEE},
YEAR = {2018},
DATE = {2018},
BOOKTITLE = {3DV 2018, International Conference on 3D Vision},
PAGES = {484--494},
ADDRESS = {Verona, Italy},
}
Endnote
%0 Conference Proceedings
%A Omran, Mohamed
%A Lassner, Christoph
%A Pons-Moll, Gerard
%A Gehler, Peter
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation :
%G eng
%U http://hdl.handle.net/21.11116/0000-0001-E564-C
%R 10.1109/3DV.2018.00062
%D 2018
%B International Conference on 3D Vision
%Z date of event: 2018-09-05 - 2018-09-08
%C Verona, Italy
%B 3DV 2018
%P 484 - 494
%I IEEE
%@ 978-1-5386-8425-2 978-1-5386-8426-9
Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications
E. Levinkov, J. Uhrig, S. Tang, M. Omran, E. Insafutdinov, A. Kirillov, C. Rother, T. Brox, B. Schiele and B. Andres
30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 2017
Export
BibTeX
@inproceedings{levinkov-2017-cvpr,
TITLE = {Joint Graph Decomposition and Node Labeling: {P}roblem, Algorithms, Applications},
AUTHOR = {Levinkov, Evgeny and Uhrig, Jonas and Tang, Siyu and Omran, Mohamed and Insafutdinov, Eldar and Kirillov, Alexander and Rother, Carsten and Brox, Thomas and Schiele, Bernt and Andres, Bjoern},
LANGUAGE = {eng},
ISBN = {978-1-5386-0458-8},
DOI = {10.1109/CVPR.2017.206},
PUBLISHER = {IEEE Computer Society},
YEAR = {2017},
DATE = {2017},
BOOKTITLE = {30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)},
PAGES = {1904--1912},
ADDRESS = {Honolulu, HI, USA},
}
Endnote
%0 Conference Proceedings
%A Levinkov, Evgeny
%A Uhrig, Jonas
%A Tang, Siyu
%A Omran, Mohamed
%A Insafutdinov, Eldar
%A Kirillov, Alexander
%A Rother, Carsten
%A Brox, Thomas
%A Schiele, Bernt
%A Andres, Bjoern
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Joint Graph Decomposition and Node Labeling: Problem, Algorithms, Applications :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002D-05DB-2
%R 10.1109/CVPR.2017.206
%D 2017
%B 30th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2017-07-22 - 2017-07-25
%C Honolulu, HI, USA
%B 30th IEEE Conference on Computer Vision and Pattern Recognition
%P 1904 - 1912
%I IEEE Computer Society
%@ 978-1-5386-0458-8
How Far are We from Solving Pedestrian Detection?
S. Zhang, R. Benenson, M. Omran, J. Hosang and B. Schiele
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 2016
Export
BibTeX
@inproceedings{Shanshan2016CVPR,
TITLE = {How Far are We from Solving Pedestrian Detection?},
AUTHOR = {Zhang, Shanshan and Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-4673-8852-8},
DOI = {10.1109/CVPR.2016.141},
PUBLISHER = {IEEE Computer Society},
YEAR = {2016},
DATE = {2016},
BOOKTITLE = {29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016)},
PAGES = {1259--1267},
ADDRESS = {Las Vegas, NV, USA},
}
Endnote
%0 Conference Proceedings
%A Zhang, Shanshan
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T How Far are We from Solving Pedestrian Detection? :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-D905-C
%R 10.1109/CVPR.2016.141
%D 2016
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2016-06-26 - 2016-07-01
%C Las Vegas, NV, USA
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%P 1259 - 1267
%I IEEE Computer Society
%@ 978-1-4673-8852-8
The Cityscapes Dataset for Semantic Urban Scene Understanding
M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth and B. Schiele
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 2016
Export
BibTeX
@inproceedings{Cordts2016,
TITLE = {The Cityscapes Dataset for Semantic Urban Scene Understanding},
AUTHOR = {Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-4673-8852-8},
DOI = {10.1109/CVPR.2016.350},
PUBLISHER = {IEEE Computer Society},
YEAR = {2016},
DATE = {2016},
BOOKTITLE = {29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016)},
PAGES = {3213--3223},
ADDRESS = {Las Vegas, NV, USA},
}
Endnote
%0 Conference Proceedings
%A Cordts, Marius
%A Omran, Mohamed
%A Ramos, Sebastian
%A Rehfeld, Timo
%A Enzweiler, Markus
%A Benenson, Rodrigo
%A Franke, Uwe
%A Roth, Stefan
%A Schiele, Bernt
%+ External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T The Cityscapes Dataset for Semantic Urban Scene Understanding :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-FCED-A
%R 10.1109/CVPR.2016.350
%D 2016
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2016-06-26 - 2016-07-01
%C Las Vegas, NV, USA
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%P 3213 - 3223
%I IEEE Computer Society
%@ 978-1-4673-8852-8
Weakly Supervised Object Boundaries
A. Khoreva, R. Benenson, M. Omran, M. Hein and B. Schiele
29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016), 2016
Abstract
State-of-the-art learning based boundary detection methods require extensive training data. Since labelling object boundaries is one of the most expensive types of annotations, there is a need to relax the requirement to carefully annotate images to make both the training more affordable and to extend the amount of training data. In this paper we propose a technique to generate weakly supervised annotations and show that bounding box annotations alone suffice to reach high-quality object boundaries without using any object-specific boundary annotations. With the proposed weak supervision techniques we achieve the top performance on the object boundary detection task, outperforming by a large margin the current fully supervised state-of-the-art methods.
Export
BibTeX
@inproceedings{khoreva_cvpr16,
TITLE = {Weakly Supervised Object Boundaries},
AUTHOR = {Khoreva, Anna and Benenson, Rodrigo and Omran, Mohamed and Hein, Matthias and Schiele, Bernt},
ISBN = {978-1-4673-8852-8},
DOI = {10.1109/CVPR.2016.27},
PUBLISHER = {IEEE Computer Society},
YEAR = {2016},
DATE = {2016},
ABSTRACT = {State-of-the-art learning based boundary detection methods require extensive training data. Since labelling object boundaries is one of the most expensive types of annotations, there is a need to relax the requirement to carefully annotate images to make both the training more affordable and to extend the amount of training data. In this paper we propose a technique to generate weakly supervised annotations and show that bounding box annotations alone suffice to reach high-quality object boundaries without using any object-specific boundary annotations. With the proposed weak supervision techniques we achieve the top performance on the object boundary detection task, outperforming by a large margin the current fully supervised state-of-the-art methods.},
BOOKTITLE = {29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016)},
PAGES = {183--192},
ADDRESS = {Las Vegas, NV, USA},
}
Endnote
%0 Conference Proceedings
%A Khoreva, Anna
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hein, Matthias
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Weakly Supervised Object Boundaries :
%U http://hdl.handle.net/11858/00-001M-0000-002B-2645-2
%R 10.1109/CVPR.2016.27
%D 2016
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2016-06-26 - 2016-07-01
%C Las Vegas, NV, USA
%X State-of-the-art learning based boundary detection methods require extensive training data. Since labelling object boundaries is one of the most expensive types of annotations, there is a need to relax the requirement to carefully annotate images to make both the training more affordable and to extend the amount of training data. In this paper we propose a technique to generate weakly supervised annotations and show that bounding box annotations alone suffice to reach high-quality object boundaries without using any object-specific boundary annotations. With the proposed weak supervision techniques we achieve the top performance on the object boundary detection task, outperforming by a large margin the current fully supervised state-of-the-art methods.
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV
%B 29th IEEE Conference on Computer Vision and Pattern Recognition
%P 183 - 192
%I IEEE Computer Society
%@ 978-1-4673-8852-8
The Cityscapes Dataset
M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth and B. Schiele
The Future of Datasets in Vision 2015 (CVPR 2015 Workshop), 2015
Export
BibTeX
@inproceedings{cordts2015cityscapes,
TITLE = {The Cityscapes Dataset},
AUTHOR = {Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Scharw{\"a}chter, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
LANGUAGE = {eng},
YEAR = {2015},
BOOKTITLE = {The Future of Datasets in Vision 2015 (CVPR 2015 Workshop)},
ADDRESS = {Boston, MA, USA},
}
Endnote
%0 Generic
%A Cordts, Marius
%A Omran, Mohamed
%A Ramos, Sebastian
%A Scharwächter, Timo
%A Enzweiler, Markus
%A Benenson, Rodrigo
%A Franke, Uwe
%A Roth, Stefan
%A Schiele, Bernt
%+ External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T The Cityscapes Dataset :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-002A-E733-4
%D 2015
%Z name of event: CVPR Workshop on The Future of Datasets in Vision
%Z date of event: 2015-06-11 - 2015-06-11
%Z place of event: Boston, MA, USA
%B The Future of Datasets in Vision 2015
Detecting Surgical Tools by Modelling Local Appearance and Global Shape
D. Bouget, R. Benenson, M. Omran, L. Riffaud, B. Schiele and P. Jannin
IEEE Transactions on Medical Imaging, Volume 34, Number 12, 2015
Export
BibTeX
@article{958,
TITLE = {Detecting Surgical Tools by Modelling Local Appearance and Global Shape},
AUTHOR = {Bouget, David and Benenson, Rodrigo and Omran, Mohamed and Riffaud, Laurent and Schiele, Bernt and Jannin, Pierre},
LANGUAGE = {eng},
ISSN = {0278-0062},
DOI = {10.1109/TMI.2015.2450831},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2015},
DATE = {2015},
JOURNAL = {IEEE Transactions on Medical Imaging},
VOLUME = {34},
NUMBER = {12},
PAGES = {2603--2617},
}
Endnote
%0 Journal Article
%A Bouget, David
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Riffaud, Laurent
%A Schiele, Bernt
%A Jannin, Pierre
%+ External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
External Organizations
%T Detecting Surgical Tools by Modelling Local Appearance and Global Shape :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0028-E35A-6
%R 10.1109/TMI.2015.2450831
%7 2015
%D 2015
%J IEEE Transactions on Medical Imaging
%O IEEE Trans. Med. Imaging
%V 34
%N 12
%& 2603
%P 2603 - 2617
%I IEEE
%C Piscataway, NJ
%@ false
Ten Years of Pedestrian Detection, What Have We Learned?
R. Benenson, M. Omran, J. Hosang and B. Schiele
Computer Vision - ECCV 2014 Workshops (ECCV 2014 Workshop CVRSUAD), 2014
Export
BibTeX
@inproceedings{890,
TITLE = {Ten Years of Pedestrian Detection, What Have We Learned?},
AUTHOR = {Benenson, Rodrigo and Omran, Mohamed and Hosang, Jan and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-319-16180-8},
DOI = {10.1007/978-3-319-16181-5_47},
PUBLISHER = {Springer},
YEAR = {2014},
DATE = {2015},
BOOKTITLE = {Computer Vision -- ECCV 2014 Workshops (ECCV 2014 Workshop CVRSUAD)},
EDITOR = {Agapito, Lourdes and Bronstein, Michael M. and Rother, Carsten},
PAGES = {613--627},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {8926},
ADDRESS = {Z{\"u}rich, Switzerland},
}
Endnote
%0 Conference Proceedings
%A Benenson, Rodrigo
%A Omran, Mohamed
%A Hosang, Jan
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Ten Years of Pedestrian Detection, What Have We Learned? :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-4C6A-8
%R 10.1007/978-3-319-16181-5_47
%D 2015
%B 2nd Workshop on Computer Vision for Road Scene Understanding and Autonomous Driving
%Z date of event: 2014-09-07 - 2014-09-07
%C Zürich, Switzerland
%B Computer Vision - ECCV 2014 Workshops
%E Agapito, Lourdes; Bronstein, Michael M.; Rother, Carsten
%P 613 - 627
%I Springer
%@ 978-3-319-16180-8
%B Lecture Notes in Computer Science
%N 8926
Taking a Deeper Look at Pedestrians
J. Hosang, M. Omran, R. Benenson and B. Schiele
IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), 2015
Export
BibTeX
@inproceedings{Hosang15cvpr,
TITLE = {Taking a Deeper Look at Pedestrians},
AUTHOR = {Hosang, Jan and Omran, Mohamed and Benenson, Rodrigo and Schiele, Bernt},
LANGUAGE = {eng},
DOI = {10.1109/CVPR.2015.7299034},
PUBLISHER = {IEEE Computer Society},
YEAR = {2015},
DATE = {2015},
BOOKTITLE = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)},
PAGES = {4073--4082},
ADDRESS = {Boston, MA, USA},
}
Endnote
%0 Conference Proceedings
%A Hosang, Jan
%A Omran, Mohamed
%A Benenson, Rodrigo
%A Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Taking a Deeper Look at Pedestrians :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0025-01BF-0
%R 10.1109/CVPR.2015.7299034
%D 2015
%B IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2015-06-08 - 2015-06-10
%C Boston, MA, USA
%B IEEE Conference on Computer Vision and Pattern Recognition
%P 4073 - 4082
%I IEEE Computer Society
Pedestrian Detection Meets Stuff
M. Omran
M.Sc. Thesis, Universität des Saarlandes, 2014
Export
BibTeX
@mastersthesis{OmranMaster,
TITLE = {Pedestrian Detection Meets Stuff},
AUTHOR = {Omran, Mohamed},
LANGUAGE = {eng},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2014},
DATE = {2014},
}
Endnote
%0 Thesis
%A Omran, Mohamed
%Y Benenson, Rodrigo
%A referee: Schiele, Bernt
%+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society
%T Pedestrian Detection Meets Stuff :
%G eng
%U http://hdl.handle.net/11858/00-001M-0000-0024-5477-D
%I Universität des Saarlandes
%C Saarbrücken
%D 2014
%P VII, 75 p.
%V master
%9 master