'HunyuanWorld-Voyager' can generate videos in which the viewpoint moves within a 3D scene generated from a single image



Tencent, a major Chinese IT company, has released 'HunyuanWorld-Voyager,' an AI framework that generates coherent 3D scenes from a single image, on GitHub. HunyuanWorld-Voyager extends a scene while preserving its context, and can generate videos in which the viewpoint moves through the generated 3D scene.

GitHub - Tencent-Hunyuan/HunyuanWorld-Voyager: Voyager is an interactive RGBD video generation model conditioned on camera trajectory, and supports real-time 3D reconstruction.

https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager



HunyuanWorld-Voyager is a 3D scene generation AI framework trained on a dataset of over 100,000 video clips. The dataset combines real-world footage with synthetic footage rendered in Unreal Engine, and was built with a reconstruction pipeline that automates camera pose estimation and metric depth prediction for arbitrary videos.

HunyuanWorld-Voyager consists of two main components:

1: A unified architecture that generates aligned RGB and depth video sequences from the input image while keeping the scene consistent.
2: Autoregressive inference with smooth video sampling, backed by an efficient world cache with point culling, which extends the scene iteratively while keeping it consistent with its context.

These components enable HunyuanWorld-Voyager to generate a coherent 3D scene from a single image, generate video of the scene as the camera moves, and reconstruct a 3D point cloud from the generated 3D scene.
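How a frame with aligned depth becomes a point cloud can be illustrated with a minimal sketch. The code below is not taken from the HunyuanWorld-Voyager repository. It assumes a simple pinhole camera, and the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names and values chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]], fx: float, fy: float, cx: float, cy: float
) -> List[Point]:
    """Lift every pixel with positive depth to a 3D point in camera coordinates.

    Assumes a pinhole camera: (fx, fy) are focal lengths in pixels and
    (cx, cy) is the principal point. These are illustrative parameters,
    not values used by HunyuanWorld-Voyager.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row
        for u, z in enumerate(row):        # u: pixel column, z: depth
            if z <= 0:                     # treat non-positive depth as invalid
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map; one pixel has no valid depth.
depth = [[2.0, 2.0], [0.0, 4.0]]
cloud = backproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Repeating this for each generated frame, and transforming the points by that frame's camera pose, would accumulate them into a single cloud for the scene.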

The input images given to HunyuanWorld-Voyager and the videos generated from them are published on GitHub. Below is one of the input images; the image at the bottom right shows the camera movement within the 3D scene, which the user can specify.



The generated video is below:

Camera movement in the 3D scene generated by 'HunyuanWorld-Voyager' 01 - YouTube
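A user-specified camera movement like the one in this video can be thought of as a sequence of camera poses, one per frame. The sketch below shows one hypothetical way to describe a straight forward move as 4x4 camera-to-world matrices. The function name `forward_trajectory` and its parameters are assumptions for illustration; the input format that HunyuanWorld-Voyager actually expects is not described here and may differ.

```python
from typing import List

Matrix = List[List[float]]


def forward_trajectory(num_frames: int, step: float) -> List[Matrix]:
    """Return one 4x4 camera-to-world matrix per frame.

    The camera keeps its orientation (identity rotation) and moves
    `step` units along +z between consecutive frames. This is an
    illustrative representation, not the repository's own format.
    """
    poses: List[Matrix] = []
    for i in range(num_frames):
        poses.append([
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, i * step],  # translation along the viewing axis
            [0.0, 0.0, 0.0, 1.0],
        ])
    return poses


poses = forward_trajectory(num_frames=4, step=0.25)
```

Turns or sideways moves would be expressed the same way, by changing the rotation block or the other translation entries of each matrix.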


Next, the following image is given as input:



The generated video looks like this:

Camera movement in the 3D scene generated by 'HunyuanWorld-Voyager' 02 - YouTube


Below is a 3D point cloud reconstructed from a video generated by HunyuanWorld-Voyager. Although it is rough, the scene has clearly been reconstructed as a 3D point cloud.

3D point cloud reconstructed from video generated by 'HunyuanWorld-Voyager' - YouTube


in AI, Video, Software, Posted by log1h_ik