AvatarsFTW: 3D Human Avatars From The Wild

Carnegie Mellon University
AvatarsFTW Overview Image

We propose a two-part pipeline, consisting of inpainting and body fitting, that alleviates 3D human reconstruction failures caused by human-object interactions, occlusions, and dynamic poses. The inpainting pipeline combines keypoint detection with a novel keypoint estimation technique, LaMa for removing occluding objects, Stable Diffusion with ControlNets for generating missing areas, and a GAN inversion step to create a seamless, plausible human reconstruction. The body fitting pipeline uses an improved regressor and adds more losses to the iterative fitting stage to achieve a better human mesh fit in dynamic poses. The figure above demonstrates our work's ability to inpaint human images, generate improved meshes for incomplete images, and fit better human meshes to a variety of highly dynamic poses.

Abstract

Recently, a plethora of pipelines have emerged to generate 3D clothed human avatars from single, in-the-wild images. However, all of them are limited to full-body, front-facing human images with minimal occlusions, minimal interacting objects, and simple poses. To address these limitations, we propose a two-part pipeline consisting of inpainting and body fitting. The inpainting pipeline combines keypoint detection with a novel keypoint estimation technique, LaMa for removing occluding objects, Stable Diffusion with ControlNets for generating missing areas, and a GAN inversion step to create a seamless, plausible human reconstruction. The body fitting pipeline uses an improved regressor and adds more losses to the iterative fitting stage to achieve a better human mesh fit in dynamic poses. In qualitative comparisons, our pipeline shows improvements in mesh textures and SMPL-X fit over previous methods.

SIFU Failure Cases

SIFU failure case for human-object interactions and large occlusions. SIFU failure case for highly dynamic pose.

Seen here are key failure cases for SIFU: human-object interactions, large occlusions, and highly dynamic pose estimation.

Inpainting Results

Inpainting results on normal maps.

More outputs from the inpainting process and the resultant improvements to the final normal maps.
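
The order of the inpainting stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's code: the stage names, the `run_stages` and `composite` helpers, and the identity functions standing in for LaMa, Stable Diffusion with ControlNets, and GAN inversion are all hypothetical placeholders. The mask-based compositing step is also an assumption, included only to show how generated pixels might be restricted to the missing areas.

```python
from typing import Callable, List, Tuple

import numpy as np

Image = np.ndarray
Stage = Tuple[str, Callable[[Image], Image]]


def run_stages(image: Image, stages: List[Stage]) -> Tuple[Image, List[str]]:
    """Apply each stage in order and record the names that ran."""
    ran = []
    for name, fn in stages:
        image = fn(image)
        ran.append(name)
    return image, ran


def composite(original: Image, generated: Image, mask: np.ndarray) -> Image:
    """Take generated pixels where mask == 1, original pixels elsewhere.

    Assumption: the source does not state how generated content is merged.
    """
    m = mask.astype(original.dtype)[..., None]  # (H, W) -> (H, W, 1)
    return original * (1.0 - m) + generated * m


# Hypothetical placeholders: identity functions standing in for real models.
STAGES: List[Stage] = [
    ("remove_occluder", lambda img: img),   # stand-in for LaMa
    ("generate_missing", lambda img: img),  # stand-in for SD + ControlNets
    ("gan_inversion", lambda img: img),     # stand-in for GAN inversion
]
```

Swapping each placeholder for a real model call would leave the driver loop unchanged, which is the only design point this sketch is meant to convey.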

Improved Body Fitting

SIFU vs. improved body fits. More successful examples.

Results from the improvements to the body fitting pipeline.
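
The idea of adding loss terms to an iterative fitting stage can be illustrated with a toy example. The sketch below is not the SMPL-X fitting used in this work: it swaps in a 2D similarity transform (scale and translation) in place of a body model, and the two loss terms (a keypoint term and a prior that pulls toward the regressor's initial estimate) and their weights are assumptions chosen for illustration. The function name `fit` is hypothetical.

```python
import numpy as np


def fit(model_kpts, target_kpts, init, weights, steps=500, lr=0.1):
    """Gradient descent on a weighted sum of losses.

    Parameters are [scale, tx, ty]. Losses (both illustrative assumptions):
      keypoint: mean squared distance between transformed and target keypoints
      prior:    squared distance from the initial (regressor) estimate
    """
    init = np.asarray(init, dtype=float)
    p = init.copy()
    for _ in range(steps):
        scale, trans = p[0], p[1:]
        residual = scale * model_kpts + trans - target_kpts  # (N, 2)

        # Analytic gradients of the keypoint loss.
        g_scale = 2.0 * np.mean(np.sum(residual * model_kpts, axis=1))
        g_trans = 2.0 * np.mean(residual, axis=0)
        grad_keypoint = np.concatenate([[g_scale], g_trans])

        # Gradient of the prior term.
        grad_prior = 2.0 * (p - init)

        p -= lr * (weights["keypoint"] * grad_keypoint
                   + weights["prior"] * grad_prior)
    return p


# Usage: recover a known transform from synthetic keypoints.
model = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
target = 2.0 * model + np.array([3.0, -1.0])
params = fit(model, target, init=[1.0, 0.0, 0.0],
             weights={"keypoint": 1.0, "prior": 0.0})
```

Raising the `prior` weight keeps the result closer to the initial estimate, which shows how extra terms trade off against the keypoint fit.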

Report