WHAT I CANNOT CREATE, I DO NOT UNDERSTAND
Droplet-3D
Combines a large-scale video dataset with multi-view annotations (Droplet3D-4M) and a generative model (Droplet3D) that accepts image and dense text input. The model produces spatially consistent and semantically plausible 3D content.
Droplet-Video
Explores integral spatio-temporal consistency in video generation. The DropletVideo-10M dataset contains 10 million videos featuring camera movement. The accompanying model excels at preserving spatio-temporal coherence.