Dr. Gerard Pons-Moll (Research Leader)

Address
Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
Location
E1 4 - Room 605
Phone
+49 681 9325 2135
Fax
+49 681 9325 2099

Group Homepage


Please visit my group website: http://virtualhumans.mpi-inf.mpg.de


    Offers

    We have two open PhD positions in related areas. If you are interested, contact me directly or send your application to d2-application@mpi-inf.mpg.de. We also offer projects for bachelor and master theses, as well as six-month research internships.

    Publications -- MPII only

    Omran, M., Lassner, C., Pons-Moll, G., Gehler, P., & Schiele, B. (2018). Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation. In International Conference on 3D Vision. Verona, Italy: IEEE.
    (Accepted/in press)
    Export
    BibTeX
    @inproceedings{omran2018nbf,
      TITLE      = {Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation},
      AUTHOR     = {Omran, Mohamed and Lassner, Christoph and Pons-Moll, Gerard and Gehler, Peter and Schiele, Bernt},
      LANGUAGE   = {eng},
      PUBLISHER  = {IEEE},
      YEAR       = {2018},
      PUBLREMARK = {Accepted},
      BOOKTITLE  = {International Conference on 3D Vision},
      ADDRESS    = {Verona, Italy},
    }
    Endnote
    %0 Conference Proceedings %A Omran, Mohamed %A Lassner, Christoph %A Pons-Moll, Gerard %A Gehler, Peter %A Schiele, Bernt %+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society External Organizations Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society External Organizations Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society %T Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation : %G eng %U http://hdl.handle.net/21.11116/0000-0001-E564-C %D 2018 %B International Conference on 3D Vision %Z date of event: 2018-09-05 - 2018-09-08 %C Verona, Italy %B International Conference on 3D Vision %I IEEE
    Yu, T., Zheng, Z., Guo, K., Zhao, J., Dai, Q., Li, H., … Liu, Y. (2018). DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor. In 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018). Salt Lake City, UT, USA.
    (Accepted/in press)
    Export
    BibTeX
    @inproceedings{DoubleFusion,
      TITLE      = {{DoubleFusion}: {R}eal-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor},
      AUTHOR     = {Yu, Tao and Zheng, Zherong and Guo, Kaiwen and Zhao, Jianhui and Dai, Qionghai and Li, Hao and Pons-Moll, Gerard and Liu, Yebin},
      LANGUAGE   = {eng},
      YEAR       = {2018},
      PUBLREMARK = {Accepted},
      BOOKTITLE  = {31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018)},
      ADDRESS    = {Salt Lake City, UT, USA},
    }
    Endnote
    %0 Conference Proceedings %A Yu, Tao %A Zheng, Zherong %A Guo, Kaiwen %A Zhao, Jianhui %A Dai, Qionghai %A Li, Hao %A Pons-Moll, Gerard %A Liu, Yebin %+ External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society External Organizations %T DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor : %G eng %U http://hdl.handle.net/21.11116/0000-0001-1E30-8 %D 2018 %B 31st IEEE Conference on Computer Vision and Pattern Recognition %Z date of event: 2018-06-18 - 2018-06-22 %C Salt Lake City, UT, USA %B 31st IEEE Conference on Computer Vision and Pattern Recognition
    Alldieck, T., Magnor, M. A., Xu, W., Theobalt, C., & Pons-Moll, G. (2018). Video Based Reconstruction of 3D People Models. In 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018). Salt Lake City, UT, USA.
    (Accepted/in press)
    Export
    BibTeX
    @inproceedings{alldieck2018video,
      TITLE      = {Video Based Reconstruction of {3D} People Models},
      AUTHOR     = {Alldieck, Thiemo and Magnor, Marcus A. and Xu, Weipeng and Theobalt, Christian and Pons-Moll, Gerard},
      LANGUAGE   = {eng},
      YEAR       = {2018},
      PUBLREMARK = {Accepted},
      BOOKTITLE  = {31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018)},
      ADDRESS    = {Salt Lake City, UT, USA},
    }
    Endnote
    %0 Conference Proceedings %A Alldieck, Thiemo %A Magnor, Marcus A. %A Xu, Weipeng %A Theobalt, Christian %A Pons-Moll, Gerard %+ External Organizations External Organizations External Organizations Computer Graphics, MPI for Informatics, Max Planck Society Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society %T Video Based Reconstruction of 3D People Models : %G eng %U http://hdl.handle.net/21.11116/0000-0001-1E24-6 %D 2018 %B 31st IEEE Conference on Computer Vision and Pattern Recognition %Z date of event: 2018-06-18 - 2018-06-22 %C Salt Lake City, UT, USA %B 31st IEEE Conference on Computer Vision and Pattern Recognition
    Sattar, H., Pons-Moll, G., & Fritz, M. (2018). Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources. Retrieved from http://arxiv.org/abs/1807.03235
    (arXiv: 1807.03235)
    Abstract
    To study the correlation between clothing garments and body shape, we collected a new dataset (Fashion Takes Shape), which includes images of users with clothing category annotations. We employ our multi-photo approach to estimate body shapes of each user and build a conditional model of clothing categories given body-shape. We demonstrate that in real-world data, clothing categories and body-shapes are correlated and show that our multi-photo approach leads to a better predictive model for clothing categories compared to models based on single-view shape estimates or manually annotated body types. We see our method as the first step towards the large-scale understanding of clothing preferences from body shape.
    Export
    BibTeX
    @online{arxiv1807.03235,
      TITLE      = {Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources},
      AUTHOR     = {Sattar, Hosnieh and Pons-Moll, Gerard and Fritz, Mario},
      LANGUAGE   = {eng},
      URL        = {http://arxiv.org/abs/1807.03235},
      EPRINT     = {1807.03235},
      EPRINTTYPE = {arXiv},
      YEAR       = {2018},
      ABSTRACT   = {To study the correlation between clothing garments and body shape, we collected a new dataset (Fashion Takes Shape), which includes images of users with clothing category annotations. We employ our multi-photo approach to estimate body shapes of each user and build a conditional model of clothing categories given body-shape. We demonstrate that in real-world data, clothing categories and body-shapes are correlated and show that our multi-photo approach leads to a better predictive model for clothing categories compared to models based on single-view shape estimates or manually annotated body types. We see our method as the first step towards the large-scale understanding of clothing preferences from body shape.},
    }
    Endnote
    %0 Report %A Sattar, Hosnieh %A Pons-Moll, Gerard %A Fritz, Mario %+ Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society %T Fashion is Taking Shape: Understanding Clothing Preference Based on Body Shape From Online Sources : %G eng %U http://hdl.handle.net/21.11116/0000-0001-B309-B %U http://arxiv.org/abs/1807.03235 %D 2018 %X To study the correlation between clothing garments and body shape, we collected a new dataset (Fashion Takes Shape), which includes images of users with clothing category annotations. We employ our multi-photo approach to estimate body shapes of each user and build a conditional model of clothing categories given body-shape. We demonstrate that in real-world data, clothing categories and body-shapes are correlated and show that our multi-photo approach leads to a better predictive model for clothing categories compared to models based on single-view shape estimates or manually annotated body types. We see our method as the first step towards the large-scale understanding of clothing preferences from body shape. %K Computer Science, Computer Vision and Pattern Recognition, cs.CV,Computer Science, Graphics, cs.GR
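    The conditional model described in the abstract above, clothing-category frequencies given an estimated body shape, can be sketched under simplifying assumptions as a normalized co-occurrence table over discrete shape clusters. The cluster labels, category names, and toy data below are illustrative only, not the paper's representation or dataset:

    ```python
    from collections import Counter, defaultdict

    def conditional_clothing_model(observations):
        """Estimate p(clothing category | body-shape cluster) from
        (shape_cluster, category) pairs via normalized co-occurrence counts."""
        counts = defaultdict(Counter)
        for shape, category in observations:
            counts[shape][category] += 1
        model = {}
        for shape, cat_counts in counts.items():
            total = sum(cat_counts.values())
            model[shape] = {c: n / total for c, n in cat_counts.items()}
        return model

    # Toy data: two hypothetical shape clusters with clothing annotations.
    obs = [("A", "dress"), ("A", "dress"), ("A", "jeans"),
           ("B", "jeans"), ("B", "jeans"), ("B", "coat")]
    model = conditional_clothing_model(obs)
    print(model["A"]["dress"])  # fraction of cluster-A observations labeled "dress"
    ```

    In the paper the conditioning variable is a continuous body-shape estimate from multiple photos; the discrete clusters here merely illustrate the conditional-frequency idea.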
    Mehta, D., Sotnychenko, O., Mueller, F., Xu, W., Sridhar, S., Pons-Moll, G., & Theobalt, C. (2017). Single-Shot Multi-Person 3D Body Pose Estimation From Monocular RGB Input. Retrieved from http://arxiv.org/abs/1712.03453
    (arXiv: 1712.03453)
    Abstract
    We propose a new efficient single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our fully convolutional DNN-based approach jointly infers 2D and 3D joint locations on the basis of an extended 3D location map supported by body part associations. This new formulation enables the readout of full body poses at a subset of visible joints without the need for explicit bounding box tracking. It therefore succeeds even under strong partial body occlusions by other people and objects in the scene. We also contribute the first training data set showing real images of sophisticated multi-person interactions and occlusions. To this end, we leverage multi-view video-based performance capture of individual people for ground truth annotation and a new image compositing for user-controlled synthesis of large corpora of real multi-person images. We also propose a new video-recorded multi-person test set with ground truth 3D annotations. Our method achieves state-of-the-art performance on challenging multi-person scenes.
    Export
    BibTeX
    @online{Mehta1712.03453,
      TITLE        = {Single-Shot Multi-Person {3D} Body Pose Estimation From Monocular {RGB} Input},
      AUTHOR       = {Mehta, Dushyant and Sotnychenko, Oleksandr and Mueller, Franziska and Xu, Weipeng and Sridhar, Srinath and Pons-Moll, Gerard and Theobalt, Christian},
      LANGUAGE     = {eng},
      URL          = {http://arxiv.org/abs/1712.03453},
      EPRINT       = {1712.03453},
      EPRINTTYPE   = {arXiv},
      YEAR         = {2017},
      MARGINALMARK = {$\bullet$},
      ABSTRACT     = {We propose a new efficient single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our fully convolutional DNN-based approach jointly infers 2D and 3D joint locations on the basis of an extended 3D location map supported by body part associations. This new formulation enables the readout of full body poses at a subset of visible joints without the need for explicit bounding box tracking. It therefore succeeds even under strong partial body occlusions by other people and objects in the scene. We also contribute the first training data set showing real images of sophisticated multi-person interactions and occlusions. To this end, we leverage multi-view video-based performance capture of individual people for ground truth annotation and a new image compositing for user-controlled synthesis of large corpora of real multi-person images. We also propose a new video-recorded multi-person test set with ground truth 3D annotations. Our method achieves state-of-the-art performance on challenging multi-person scenes.},
    }
    Endnote
    %0 Report %A Mehta, Dushyant %A Sotnychenko, Oleksandr %A Mueller, Franziska %A Xu, Weipeng %A Sridhar, Srinath %A Pons-Moll, Gerard %A Theobalt, Christian %+ Computer Graphics, MPI for Informatics, Max Planck Society Computer Graphics, MPI for Informatics, Max Planck Society Computer Graphics, MPI for Informatics, Max Planck Society Computer Graphics, MPI for Informatics, Max Planck Society External Organizations Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society Computer Graphics, MPI for Informatics, Max Planck Society %T Single-Shot Multi-Person 3D Body Pose Estimation From Monocular RGB Input : %G eng %U http://hdl.handle.net/21.11116/0000-0000-438F-4 %U http://arxiv.org/abs/1712.03453 %D 2017 %X We propose a new efficient single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our fully convolutional DNN-based approach jointly infers 2D and 3D joint locations on the basis of an extended 3D location map supported by body part associations. This new formulation enables the readout of full body poses at a subset of visible joints without the need for explicit bounding box tracking. It therefore succeeds even under strong partial body occlusions by other people and objects in the scene. We also contribute the first training data set showing real images of sophisticated multi-person interactions and occlusions. To this end, we leverage multi-view video-based performance capture of individual people for ground truth annotation and a new image compositing for user-controlled synthesis of large corpora of real multi-person images. We also propose a new video-recorded multi-person test set with ground truth 3D annotations. Our method achieves state-of-the-art performance on challenging multi-person scenes. %K Computer Science, Computer Vision and Pattern Recognition, cs.CV
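    The readout step described in the abstract above, fetching each visible joint's 3D coordinates from a per-joint location map at its 2D detection, can be sketched roughly as follows. The array shapes, function name, and visibility handling are assumptions for illustration, not the paper's implementation:

    ```python
    import numpy as np

    def read_out_pose(location_maps, joints_2d, visible):
        """Read one 3D coordinate per joint from per-joint location maps at the
        joint's 2D detection, skipping occluded joints (no bounding-box tracking).

        location_maps: (n_joints, 3, H, W) array; channel axis holds (x, y, z).
        joints_2d:     list of (u, v) pixel coordinates per joint.
        visible:       list of booleans; occluded joints are left as NaN.
        """
        n_joints = location_maps.shape[0]
        pose = np.full((n_joints, 3), np.nan)
        for j in range(n_joints):
            if visible[j]:
                u, v = joints_2d[j]
                pose[j] = location_maps[j, :, v, u]  # (x, y, z) stored at pixel (u, v)
        return pose

    # Toy example: 2 joints, 3-channel (x, y, z) maps of size 4x4.
    maps = np.zeros((2, 3, 4, 4))
    maps[0, :, 1, 2] = [0.1, 0.2, 0.3]   # joint 0's 3D location stored at (u=2, v=1)
    pose = read_out_pose(maps, [(2, 1), (0, 0)], [True, False])
    print(pose[0])  # joint 0's 3D coordinates; joint 1 stays NaN (occluded)
    ```

    The point of the formulation is that only the joints actually detected in 2D need to be read out, which is why the method tolerates strong partial occlusion.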