VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches
via In-Context Conditioning
More AnyP2V Showcases
Here we show our application of any-timestamp patch-to-video generation (AnyP2V), where conditioning patches can be placed at arbitrary timestamps and in arbitrary spatial layouts within their frames.
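To make the conditioning format concrete, the sketch below assembles arbitrary spatiotemporal patches into a pixel canvas plus a binary mask. The helper name build_condition_canvas, the shapes, and the example values are our own illustrative assumptions, not the released VideoCanvas interface.

```python
# Minimal sketch, assuming video tensors of shape (T, H, W, C) in [0, 1].
import numpy as np

def build_condition_canvas(patches, num_frames, height, width, channels=3):
    """Place each patch at its (t, y, x) location on an empty canvas.

    patches: list of (t, y, x, patch) tuples, where `patch` is an
             (h, w, channels) array of conditioning pixels.
    Returns (canvas, mask): the canvas holds the given pixels and the
    binary mask marks which spatiotemporal positions are conditioned.
    """
    canvas = np.zeros((num_frames, height, width, channels), dtype=np.float32)
    mask = np.zeros((num_frames, height, width, 1), dtype=np.float32)
    for t, y, x, patch in patches:
        h, w = patch.shape[:2]
        canvas[t, y:y + h, x:x + w] = patch
        mask[t, y:y + h, x:x + w] = 1.0  # 1 = given, 0 = to be generated
    return canvas, mask

# Example: one 64x64 patch on frame 0 and another on frame 40.
patch = np.random.rand(64, 64, 3).astype(np.float32)
canvas, mask = build_condition_canvas(
    [(0, 32, 32, patch), (40, 128, 200, patch)],
    num_frames=81, height=256, width=448,
)
```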
More AnyI2V Showcases
Here we show our application of any-timestamp image-to-video generation with full conditioning frames, including I2V, FLF2V, and other flexible AnyI2V tasks.
Condition / Generated Video pairs, with the conditioning frame placed at frame 0, frame 40, and frame 76 respectively.
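Reusing the hypothetical build_condition_canvas helper sketched above, the AnyI2V setting is simply the special case where each conditioning patch is a full frame placed at its timestamp (0, 40, and 76 in this showcase):

```python
import numpy as np

# AnyI2V as a special case of the patch sketch: each condition is a full
# frame covering the whole spatial extent at its timestamp.
H, W = 256, 448
frames = {t: np.random.rand(H, W, 3).astype(np.float32) for t in (0, 40, 76)}
canvas, mask = build_condition_canvas(
    [(t, 0, 0, f) for t, f in frames.items()],
    num_frames=81, height=H, width=W,
)
```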
More AnyV2V Showcases
Here we show our applications to video inpainting, outpainting, transition, extension, and video camera control.
Video Inpainting
(Source / Generated Video pairs)
Video Outpainting
(Source / Generated Video pairs)
Video Camera Control
This method simulates camera movements by progressively shifting or scaling the original video content, frame by frame, on a spatiotemporal canvas. It unlocks a range of common camera motions, including zoom in/out and panning left, right, up, or down.
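As a rough illustration, the sketch below simulates a rightward pan by shifting the source content left by a growing per-frame offset, leaving the newly revealed band unmasked for the model to complete; the helper name, shift schedule, and resolution are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def pan_right_condition(video, pixels_per_frame=4):
    """Shift each frame left by a growing offset to mimic a rightward pan.

    The shifted content fills the left of the canvas; the revealed band on
    the right stays unmasked (mask = 0) for the model to outpaint.
    """
    T, H, W, C = video.shape
    canvas = np.zeros_like(video)
    mask = np.zeros((T, H, W, 1), dtype=video.dtype)
    for t in range(T):
        offset = min(t * pixels_per_frame, W)
        if offset < W:
            canvas[t, :, :W - offset] = video[t, :, offset:]
            mask[t, :, :W - offset] = 1.0
    return canvas, mask

# Example: an 81-frame clip panning right at 4 pixels per frame.
video = np.random.rand(81, 256, 448, 3).astype(np.float32)
canvas, mask = pan_right_condition(video)
```

Zooming works the same way, except each frame is rescaled about the canvas center instead of translated.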
(Source / Generated Video pairs)
Video Extension (Minute-Level Looping Video)
Our model enables interactive shot extension without cuts, extending short videos to minute-long durations while maintaining visual consistency. To create a seamless loop, we transition the final generated segment back to the initial clip. Unlike simple first-last-frame looping, which often stutters due to mismatched motion, our approach uses the last clip's motion information to ensure a smooth, consistent, and truly seamless loop.
The diagram above demonstrates how we can extend the narrative from an initial Sora video by interactively providing text prompts. The complete extended looping video is presented below:
Generated Video (>1K frames)
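For readers who want the control flow, here is a hedged sketch of the extension loop; the generate callable and its prefix/suffix arguments are hypothetical stand-ins for the model interface, not the actual API.

```python
import numpy as np

def extend_looping_video(generate, clip, prompts, overlap=16):
    """Sketch of interactive, cut-free extension with seamless loop closure.

    `generate(prefix, suffix, prompt)` is assumed to return a segment that
    starts from the `prefix` frames and, if `suffix` is given, ends at those
    frames. Conditioning on the last `overlap` frames carries motion forward,
    so each new segment continues the previous one instead of restarting.
    """
    segments = [clip]
    for prompt in prompts:
        context = segments[-1][-overlap:]              # motion context
        segment = generate(prefix=context, suffix=None, prompt=prompt)
        segments.append(segment[overlap:])             # drop duplicated prefix
    # Close the loop: transition from the final context back to the
    # opening frames rather than hard-cutting to them.
    closing = generate(prefix=segments[-1][-overlap:],
                       suffix=clip[:overlap],
                       prompt=prompts[-1])
    segments.append(closing[overlap:-overlap])         # avoid duplicate frames
    return np.concatenate(segments, axis=0)
```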
Video Transition
(Source / Generated Video pairs)
Method Comparison
Contact Us
Feel free to contact Minghong Cai at minghongcai@link.cuhk.edu.hk for any questions, cooperation, or communication.
If you find this work useful, please consider citing:
@article{cai2025videocanvas,
  title={VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning},
  author={Minghong Cai and Qiulin Wang and Zongli Ye and Wenze Liu and Quande Liu and Weicai Ye and Xintao Wang and Pengfei Wan and Kun Gai and Xiangyu Yue},
  journal={arXiv preprint},
  year={2025}
}