Generative Models of 3D People (Images and 3D Meshes)

NOTE: For more information and full list of publications visit the Real Virtual Humans site: https://virtualhumans.mpi-inf.mpg.de

NOTE: Most of this work was performed while Gerard Pons-Moll was at MPI for Intelligent Systems, Tuebingen.

While our detailed models of the body capture important aspects of body shape, they are missing something important -- clothing. Most people appear in images wearing clothing, so to analyze such images we need models of both the body and clothing. Modeling clothing is hard, however, because of the variety of garments, their varied topology and appearance, and the complex physical properties of cloth.

Standard methods for clothing 3D bodies rely on 2D patterns and physics simulation. These require expert knowledge, do not work for all types of clothing, and can be difficult to apply to arbitrary body shapes and poses. Consequently, we take a data-driven approach to learn the shape, movement, and appearance of clothing.

ClothNet [1] is a conditional generative model learned directly from images of people in clothing. Given a body silhouette, the model generates people with similar pose and shape wearing different clothing styles, using a variational autoencoder followed by an image-to-image translation network that synthesizes the texture of the outfit.
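The following is a minimal sketch (not the authors' released code) of the two-stage idea: a conditional VAE samples clothed-person layouts given a body silhouette, and such samples would then be handed to an image-to-image translation network for texture. Module names, layer sizes, and image resolutions are illustrative assumptions.

```python
# Minimal conditional-VAE sketch: sample different "clothing layouts" for one silhouette.
import torch
import torch.nn as nn

class CondVAE(nn.Module):
    def __init__(self, img_ch=1, cond_ch=1, z_dim=64):
        super().__init__()
        # Encoder sees the clothed-person map concatenated with the silhouette condition.
        self.enc = nn.Sequential(
            nn.Conv2d(img_ch + cond_ch, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_mu = nn.Linear(64, z_dim)
        self.to_logvar = nn.Linear(64, z_dim)
        # Decoder maps (z, condition) back to a clothed-person map.
        self.fc = nn.Linear(z_dim, 64 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64 + cond_ch, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, img_ch, 4, 2, 1), nn.Sigmoid(),
        )

    def decode(self, z, cond):
        h = self.fc(z).view(-1, 64, 16, 16)
        cond_small = nn.functional.interpolate(cond, size=(16, 16))
        return self.dec(torch.cat([h, cond_small], dim=1))

    def forward(self, x, cond):
        h = self.enc(torch.cat([x, cond], dim=1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.decode(z, cond), mu, logvar

# At test time, different clothing styles for the same body come from sampling z.
vae = CondVAE()
silhouette = torch.rand(1, 1, 64, 64)          # conditioning body silhouette (toy input)
samples = [vae.decode(torch.randn(1, 64), silhouette) for _ in range(5)]
```

In this sketch the second stage (texture synthesis) is omitted; any image-to-image translation network could be applied to the sampled layouts.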

To dress people in 3D, the minimally-clothed body shape is needed. To estimate this from clothed bodies, our BUFF method [2] estimates body shape under clothing from a sequence of 3D scans. In a scan sequence, different poses pull the clothing tight against the body in different regions. All frames in a sequence are brought into an unposed canonical space and fused into a single point cloud. We then optimize the body shape so that it robustly fits inside this fused point cloud, which produces a remarkably accurate personalized body shape.
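As a rough illustration of this fitting idea, the toy sketch below optimizes shape parameters so that the body surface lies just inside a fused point cloud, penalizing vertices that fall outside the cloud much more than vertices that sit inside it. An ellipsoid stands in for a statistical body model (e.g. SMPL), and the inside/outside test and loss weights are simplifying assumptions, not the BUFF objective itself.

```python
# Toy "fit the body inside the fused cloud" optimization with an asymmetric loss.
import math
import torch

def body_surface(radii, n=500):
    """Toy stand-in for a body model: random points on an ellipsoid with the given radii."""
    theta = torch.rand(n) * math.pi
    phi = torch.rand(n) * 2 * math.pi
    dirs = torch.stack([torch.sin(theta) * torch.cos(phi),
                        torch.sin(theta) * torch.sin(phi),
                        torch.cos(theta)], dim=1)
    return dirs * radii                              # (n, 3) surface points

# Fused "scan" cloud: a slightly larger, noisy ellipsoid playing the clothing layer.
true_radii = torch.tensor([0.9, 1.1, 1.3])
cloud = body_surface(true_radii * 1.05, n=2000) + 0.02 * torch.randn(2000, 3)

shape = torch.ones(3, requires_grad=True)            # unknown body shape parameters
opt = torch.optim.Adam([shape], lr=0.05)
for _ in range(300):
    verts = body_surface(shape)
    dist = torch.cdist(verts, cloud)                  # (n_body, n_cloud) distances
    d, idx = dist.min(dim=1)
    # Crude inside/outside proxy: a vertex farther from the origin than its
    # nearest scan point is treated as lying outside the clothing layer.
    outside = (verts.norm(dim=1) > cloud[idx].norm(dim=1)).float()
    # Asymmetric loss: being outside the cloud costs far more than being
    # inside but close to it, so the estimated body shrinks to just under the scan.
    loss = (d * (1.0 + 20.0 * outside)).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("estimated radii:", shape.detach())             # should end up close to true_radii
```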

Given the underlying body shape, we can then model how clothing deviates from the body by capturing 3D and 4D scans of clothed people. ClothCap [3] is a pipeline that captures dynamic clothing on humans from 4D scans, segments the clothing from the body, segments it into pieces, and models how the clothing deviates from the body. ClothCap can then retarget the captured clothing to different body shapes paving the way towards virtual clothing try-on.
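A minimal sketch of the retargeting idea: once clothing is represented as per-vertex displacements from the underlying body surface, the same displacements can be re-applied on top of a different body. The per-vertex correspondence and normals are assumed given, and the arrays below are toy placeholders rather than registered scans; a real pipeline would operate on registered body and garment meshes.

```python
# Transfer clothing, encoded as offsets from the body surface, to a new body shape.
import numpy as np

def retarget_clothing(src_body, src_cloth, src_normals, tgt_body, tgt_normals):
    """All arrays are (V, 3) and assumed to be in per-vertex correspondence.

    Clothing is encoded as the offset of each cloth vertex from the body
    surface along the body normal, then re-applied on the target body.
    """
    offsets = np.einsum('ij,ij->i', src_cloth - src_body, src_normals)
    return tgt_body + offsets[:, None] * tgt_normals

# Toy example with random stand-ins for registered meshes.
V = 100
src_body = np.random.randn(V, 3)
src_normals = np.random.randn(V, 3)
src_normals /= np.linalg.norm(src_normals, axis=1, keepdims=True)
src_cloth = src_body + 0.02 * src_normals            # clothing sits ~2 cm off the body
tgt_body = src_body * 1.1                            # a slightly larger target body
cloth_on_target = retarget_clothing(src_body, src_cloth, src_normals, tgt_body, src_normals)
```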


References

[1] A Generative Model of People in Clothing
Christoph Lassner, Gerard Pons-Moll, Peter V. Gehler
in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017. Spotlight.

[2] Detailed, Accurate, Human Shape Estimation from Clothed 3D Scan Sequences
Chao Zhang, Sergi Pujades, Michael Black, Gerard Pons-Moll
in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. Spotlight.

[3] ClothCap: Seamless 4D Clothing Capture and Retargeting
Gerard Pons-Moll, Sergi Pujades, Sonny Hu, Michael Black
in ACM Transactions on Graphics (Proc. SIGGRAPH), 2017.