MrNeRF (@janusch_patas)
2025-07-18 | ❤️ 452 | 🔁 57
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Contributions:
• We introduce Diffuman4D, a novel diffusion model that generates spatio-temporally consistent and high-resolution (1024p) human videos from sparse-view video inputs.
• We propose a sliding iterative denoising mechanism that enhances both the spatial and temporal consistency of generated long-term videos while maintaining efficient inference.
• We design a human pose conditioning scheme to enhance the appearance quality and motion accuracy of generated human videos.
• We plan to release our processed version of the DNA-Rendering dataset, which we believe will benefit future research in this area.