---
license: apache-2.0
language:
- en
pipeline_tag: image-to-video
datasets:
- BestWishYsh/ConsisID-preview-Data
base_model:
- THUDM/CogVideoX-5b
- THUDM/CogVideoX1.5-5B-I2V
base_model_relation: finetune
library_name: diffusers
tags:
- IPT2V
---

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.

## 😍 Gallery

Identity-Preserving Text-to-Video Generation.

[![Demo Video of ConsisID](https://github.com/user-attachments/assets/634248f6-1b54-4963-88d6-34fa7263750b)](https://www.youtube.com/watch?v=PhlgC-bI5SQ)

Click the thumbnail above to watch the demo video on YouTube.

## Description

- **Repository:** [Code](https://github.com/PKU-YuanGroup/ConsisID), [Page](https://pku-yuangroup.github.io/ConsisID/), [Data](https://huggingface.co/datasets/BestWishYsh/ConsisID-preview-Data)
- **Paper:** [arxiv.org/abs/2411.17440](https://arxiv.org/abs/2411.17440)
- **Point of Contact:** [Shenghai Yuan](mailto:shyuan-cs@hotmail.com)

## ✏️ Citation

If you find our paper and code useful in your research, please consider giving the repository a star and citing the paper.

```BibTeX
@article{yuan2024identity,
  title={Identity-Preserving Text-to-Video Generation by Frequency Decomposition},
  author={Yuan, Shenghai and Huang, Jinfa and He, Xianyi and Ge, Yunyuan and Shi, Yujun and Chen, Liuhan and Luo, Jiebo and Yuan, Li},
  journal={arXiv preprint arXiv:2411.17440},
  year={2024}
}
```