Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling

by Xiaoyu Shi1*, Zhaoyang Huang1*, Fu-Yun Wang1*, Weikang Bian1*, Dasong Li1, Yi Zhang1, Manyuan Zhang1, Ka Chun Cheung2, Simon See2, Hongwei Qin3, Jifeng Dai4, Hongsheng Li1

1CUHK-MMLab 2NVIDIA 3SenseTime 4Tsinghua University

@article{shi2024motion,
  title={Motion-i2v: Consistent and controllable image-to-video generation with explicit motion modeling},
  author={Shi, Xiaoyu and Huang, Zhaoyang and Wang, Fu-Yun and Bian, Weikang and Li, Dasong and Zhang, Yi and Zhang, Manyuan and Cheung, Ka Chun and See, Simon and Qin, Hongwei and others},
  journal={SIGGRAPH 2024},
  year={2024}
}

Overview of Motion-I2V. The first stage deduces motions that can plausibly animate the reference image: conditioned on the reference image and a text prompt, it predicts the motion field maps between the reference frame and all future frames. The second stage propagates the reference image's content to synthesize the frames. A novel motion-augmented temporal layer enhances 1-D temporal attention with warped features, which enlarges the temporal receptive field and reduces the difficulty of directly learning complicated spatial-temporal patterns.
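
To make the second-stage idea more concrete, here is a minimal NumPy sketch of one plausible reading of "1-D temporal attention enhanced with warped features". It is not the repository's implementation. The function names (`warp_nearest`, `motion_augmented_temporal_attention`), the backward-flow convention, the nearest-neighbour sampling, and the absence of learned query/key/value projections are all assumptions made for illustration.

```python
import numpy as np


def warp_nearest(ref, flow):
    """Backward-warp `ref` (H, W, C) with `flow` (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy), so output pixel (x, y)
    is read from ref at (x + dx, y + dy). Nearest-neighbour sampling,
    with out-of-range coordinates clamped to the image border.
    """
    h, w = ref.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return ref[src_y, src_x]


def motion_augmented_temporal_attention(feats, flows):
    """Per-pixel attention along time, with warped reference features added.

    feats: (T, H, W, C) feature maps; frame 0 is treated as the reference.
    flows: (T, H, W, 2) assumed motion fields relating each frame to frame 0.
    Returns an array of shape (T, H, W, C).
    """
    t, h, w, c = feats.shape
    # Warp the reference features to every frame using its motion field.
    warped = np.stack([warp_nearest(feats[0], flows[i]) for i in range(t)])

    # Queries: the T tokens at each pixel. Keys/values: those T tokens
    # plus the T warped-reference tokens at the same pixel.
    q = feats.transpose(1, 2, 0, 3)                                      # (H, W, T, C)
    kv = np.concatenate([feats, warped], axis=0).transpose(1, 2, 0, 3)   # (H, W, 2T, C)

    # Scaled dot-product attention with a numerically stable softmax.
    scores = q @ kv.transpose(0, 1, 3, 2) / np.sqrt(c)                   # (H, W, T, 2T)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    out = weights @ kv                                                   # (H, W, T, C)
    return out.transpose(2, 0, 1, 3)                                     # (T, H, W, C)
```

Plain 1-D temporal attention only lets a pixel see features at the same spatial location in other frames. Because the warped tokens are sampled from wherever the motion field points in the reference frame, each pixel can also reach displaced content. This matches the overview's claim that warping enlarges the temporal receptive field.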

Usage

  1. Install environments
conda env create -f environment.yaml
  2. Download models
git clone https://huggingface.co/wangfuyun/Motion-I2V
  3. Run the code
python -m scripts.app

ComfyUI

ComfyUI-IG-Motion-I2V
