Research Engineer, Facebook AI Research, London, England, United Kingdom
Research areas: Gestures and body pose; 3D from a single image and shape-from-x; Transfer/Low-shot/Semi/Unsupervised Learning
We tackle the problem of monocular 3D reconstruction of articulated objects such as humans and animals. Our key contribution is DensePose 3D, a novel parametric model of an articulated mesh that can be learned in a self-supervised fashion from 2D image annotations alone. This is in stark contrast with previous human body reconstruction methods, which rely on a parametric model such as SMPL pre-trained on a large dataset of 3D body scans that had to be captured in a controlled environment. DensePose 3D can therefore be applied to modelling a broad range of articulated categories, such as animal species. In an end-to-end fashion, it automatically learns to softly assign each vertex of a category-specific 3D template mesh to one of several rigidly moving latent parts, and trains a single-view network that predicts the rigid motions of these parts to deform the template so that it re-projects correctly onto the dense 2D surface annotations of objects (such as DensePose). To prevent unrealistic template deformations, we further propose to align the motions of nearby mesh vertices by expressing the part assignment as a function of the smooth eigenfunctions of the Laplace--Beltrami operator computed on the template mesh. Our experiments demonstrate improvements over state-of-the-art non-rigid structure-from-motion baselines on both synthetic and real data for categories of humans and animals.
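The core mechanism described above, a smooth soft assignment of template vertices to rigidly moving parts expressed through low-frequency Laplacian eigenfunctions, with the deformed shape given by blending per-part rigid motions, can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the authors' code: the umbrella (graph) Laplacian stands in for a proper cotangent Laplace--Beltrami discretization, and all function names, the toy tetrahedron mesh, and the part/eigenfunction counts are assumptions made for the sketch.

```python
# Illustrative sketch (not the authors' implementation) of smooth soft part
# assignment via Laplacian eigenfunctions and blended per-part rigid motions.
import numpy as np
from scipy.sparse import coo_matrix, diags

def mesh_laplacian(n_verts, edges):
    """Graph Laplacian L = D - A of the template mesh (umbrella weights,
    a common stand-in for the cotangent Laplace--Beltrami operator)."""
    i, j = edges[:, 0], edges[:, 1]
    rows = np.concatenate([i, j])
    cols = np.concatenate([j, i])
    vals = np.ones(len(rows))
    A = coo_matrix((vals, (rows, cols)), shape=(n_verts, n_verts)).tocsr()
    return diags(np.asarray(A.sum(axis=1)).ravel()) - A

def soft_parts(phi, C):
    """Soft assignment W (n_verts x n_parts): softmax of a linear map of the
    first K eigenfunctions phi. Because phi varies smoothly over the mesh,
    W (and hence the part motions) varies smoothly too."""
    logits = phi @ C
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def deform(V, W, R, t):
    """Blend per-part rigid motions: v' = sum_p W[v, p] * (R_p @ v + t_p)."""
    moved = np.einsum('pij,nj->npi', R, V) + t[None]  # (n_verts, P, 3)
    return np.einsum('np,npi->ni', W, moved)

# Toy template: a tetrahedron with 4 vertices and 6 edges.
V = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
edges = np.array([[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]])
L = mesh_laplacian(4, edges)

# Low-frequency eigenbasis of the Laplacian (dense eigh is fine at toy scale).
K, P = 3, 2  # number of eigenfunctions / latent parts (assumed values)
eigvals, eigvecs = np.linalg.eigh(L.toarray())
phi = eigvecs[:, :K]

# In the paper the mixing weights are learned; here they are random.
C = np.random.default_rng(0).normal(size=(K, P))
W = soft_parts(phi, C)

# Identity rigid motions leave the template unchanged (W rows sum to 1).
R = np.stack([np.eye(3)] * P)
t = np.zeros((P, 3))
V_deformed = deform(V, W, R, t)
```

In the full model, the per-part rotations and translations `R`, `t` would be predicted by the single-view network and the mixing matrix `C` learned end-to-end against the 2D re-projection loss; the sketch only shows the geometric pipeline.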