MrNeRF (@janusch_patas)

2025-07-18 | ❤️ 452 | 🔁 57


Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Contributions:

• We introduce Diffuman4D, a novel diffusion model that generates spatio-temporally consistent and high-resolution (1024p) human videos from sparse-view video inputs.

• We propose a sliding iterative denoising mechanism that enhances both the spatial and temporal consistency of generated long-term videos while maintaining efficient inference.

• We design a human pose conditioning scheme to enhance the appearance quality and motion accuracy of generated human videos.

• We plan to release our processed version of the DNA-Rendering dataset, which we believe will benefit future research in this area.



Tags

type-paper, domain-rendering, domain-genai