D2
Computer Vision and Machine Learning

Bharat Lal Bhatnagar (PhD Student)

MSc Bharat Lal Bhatnagar

Address
Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
Location
E1 4 - 628
Phone
+49 681 9325 2128
Fax
+49 681 9325 2099

Publications

2022
TOCH: Spatio-Temporal Object Correspondence to Hand for Motion Refinement
K. Zhou, B. L. Bhatnagar, J. E. Lenssen and G. Pons-Moll
European Conference on Computer Vision (ECCV 2022), 2022
(Accepted/in press)
Abstract
We present TOCH, a method for refining incorrect 3D hand-object interaction sequences using a data prior. Existing hand trackers, especially those that rely on very few cameras, often produce visually unrealistic results with hand-object intersection or missing contacts. Although correcting such errors requires reasoning about temporal aspects of interaction, most previous work focus on static grasps and contacts. The core of our method are TOCH fields, a novel spatio-temporal representation for modeling correspondences between hands and objects during interaction. The key component is a point-wise object-centric representation which encodes the hand position relative to the object. Leveraging this novel representation, we learn a latent manifold of plausible TOCH fields with a temporal denoising auto-encoder. Experiments demonstrate that TOCH outperforms state-of-the-art (SOTA) 3D hand-object interaction models, which are limited to static grasps and contacts. More importantly, our method produces smooth interactions even before and after contact. Using a single trained TOCH model, we quantitatively and qualitatively demonstrate its usefulness for 1) correcting erroneous reconstruction results from off-the-shelf RGB/RGB-D hand-object reconstruction methods, 2) de-noising, and 3) grasp transfer across objects. We will release our code and trained model on our project page at http://virtualhumans.mpi-inf.mpg.de/toch/
BEHAVE: Dataset and Method for Tracking Human Object Interactions
B. L. Bhatnagar, X. Xie, I. Petrov, C. Sminchisescu, C. Theobalt and G. Pons-Moll
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 2022
TOCH: Spatio-Temporal Object Correspondence to Hand for Motion Refinement
K. Zhou, B. L. Bhatnagar, J. E. Lenssen and G. Pons-Moll
Technical Report, 2022
(arXiv: 2205.07982)
2021
Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes
K. Zhou, B. L. Bhatnagar, B. Schiele and G. Pons-Moll
Technical Report, 2021
(arXiv: 2102.01161)
Abstract
Most learning methods for 3D data (point clouds, meshes) suffer significant performance drops when the data is not carefully aligned to a canonical orientation. Aligning real world 3D data collected from different sources is non-trivial and requires manual intervention. In this paper, we propose the Adjoint Rigid Transform (ART) Network, a neural module which can be integrated with a variety of 3D networks to significantly boost their performance. ART learns to rotate input shapes to a learned canonical orientation, which is crucial for a lot of tasks such as shape reconstruction, interpolation, non-rigid registration, and latent disentanglement. ART achieves this with self-supervision and a rotation equivariance constraint on predicted rotations. The remarkable result is that with only self-supervision, ART facilitates learning a unique canonical orientation for both rigid and nonrigid shapes, which leads to a notable boost in performance of aforementioned tasks. We will release our code and pre-trained models for further research.
2020
LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration
B. L. Bhatnagar, C. Sminchisescu, C. Theobalt and G. Pons-Moll
Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction
B. L. Bhatnagar, C. Sminchisescu, C. Theobalt and G. Pons-Moll
Computer Vision – ECCV 2020, 2020
SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing
G. Tiwari, B. L. Bhatnagar, T. Tung and G. Pons-Moll
Computer Vision – ECCV 2020, 2020
Unsupervised Shape and Pose Disentanglement for 3D Meshes
K. Zhou, B. L. Bhatnagar and G. Pons-Moll
Computer Vision – ECCV 2020, 2020
2019
Learning to Reconstruct People in Clothing from a Single RGB Camera
T. Alldieck, M. A. Magnor, B. L. Bhatnagar, C. Theobalt and G. Pons-Moll
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019
Multi-Garment Net: Learning to Dress 3D People from Images
B. L. Bhatnagar, G. Tiwari, C. Theobalt and G. Pons-Moll
International Conference on Computer Vision (ICCV 2019), 2019