361-cogvideox-text-to-video-diffusion-models-with-an-expert-transformer

https://github.com/szu-advtech-2024/361-cogvideox-text-to-video-diffusion-models-with-an-expert-transformer

Science Score: 41.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (5.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: SZU-AdvTech-2024
Default Branch: main
Size: 0 Bytes

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Citation

https://github.com/SZU-AdvTech-2024/361-CogVideoX-Text-to-Video-Diffusion-Models-with-An-Expert-Transformer/blob/main/

# CogVideoX

![](./overview.png)

This repository contains the CogVideoX code and the VBench evaluation suite.
## Usage
To generate a video, first download the pretrained model from Hugging Face (https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space).
Then run `cli_demo.py`, located in the `inference` directory of `CogVideoX-main`, with a prompt:
```bash
python cli_demo.py --prompt "A girl riding a bike." --model_path THUDM/CogVideoX-5b
```
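
When several prompts need to be rendered, the single command above can be generated once per prompt. The following is a minimal sketch, not part of the repository: the helper name `build_generate_commands` and the `outputs` directory are hypothetical, and the `--output_path` flag is an assumption about `cli_demo.py` that should be checked against the script's argument parser before use.

```python
import shlex
from pathlib import Path


def build_generate_commands(prompts, model_path="THUDM/CogVideoX-5b", out_dir="outputs"):
    """Return one cli_demo.py argument list per non-empty prompt."""
    commands = []
    for index, prompt in enumerate(prompts):
        prompt = prompt.strip()
        if not prompt:
            continue  # skip blank lines, e.g. from a prompt file
        # Hypothetical naming scheme: one numbered .mp4 per prompt.
        output_path = Path(out_dir) / f"{index:04d}.mp4"
        commands.append([
            "python", "cli_demo.py",
            "--prompt", prompt,
            "--model_path", model_path,
            # Assumption: cli_demo.py accepts --output_path.
            "--output_path", output_path.as_posix(),
        ])
    return commands


if __name__ == "__main__":
    demo_prompts = ["A girl riding a bike.", "A cat sleeping in the sun."]
    for cmd in build_generate_commands(demo_prompts):
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

Each returned list can also be passed directly to `subprocess.run(cmd, check=True)` from inside the `inference` directory, which avoids shell quoting issues with prompts that contain quotes or spaces.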

To evaluate perceptual quality with VBench, first generate videos from the prompt lists in the `prompts` directory of `Vbench_master`.
Then pass the generated videos to `evaluate.py`:
```bash
python evaluate.py \
    --dimension $DIMENSION \
    --videos_path /path/to/folder_or_video/ \
    --mode=custom_input
```
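
The command above scores one dimension per run, so covering several dimensions means repeating it with a different `$DIMENSION` each time. The sketch below builds those commands; it is illustrative only. The helper name `build_eval_commands` is hypothetical, and the dimension list is an assumption about which dimensions VBench accepts in `custom_input` mode, so it should be checked against the VBench documentation.

```python
import shlex

# Assumption: dimensions that VBench supports in custom_input mode.
CUSTOM_INPUT_DIMENSIONS = [
    "subject_consistency",
    "background_consistency",
    "motion_smoothness",
    "dynamic_degree",
    "aesthetic_quality",
    "imaging_quality",
]


def build_eval_commands(videos_path, dimensions=CUSTOM_INPUT_DIMENSIONS):
    """Return one evaluate.py argument list per dimension."""
    return [
        [
            "python", "evaluate.py",
            "--dimension", dimension,
            "--videos_path", videos_path,
            "--mode=custom_input",
        ]
        for dimension in dimensions
    ]


if __name__ == "__main__":
    for cmd in build_eval_commands("/path/to/folder_or_video/"):
        print(shlex.join(cmd))
```

Keeping the dimension list in one place makes it easy to drop a dimension that the installed VBench version does not support.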

## Citation
```bib
@misc{yang2024cogvideoxtexttovideodiffusionmodels,
      title={CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer}, 
      author={Zhuoyi Yang and Jiayan Teng and Wendi Zheng and Ming Ding and Shiyu Huang and Jiazheng Xu and Yuanming Yang and Wenyi Hong and Xiaohan Zhang and Guanyu Feng and Da Yin and Xiaotao Gu and Yuxuan Zhang and Weihan Wang and Yean Cheng and Ting Liu and Bin Xu and Yuxiao Dong and Jie Tang},
      year={2024},
      eprint={2408.06072},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2408.06072}, 
}
```

Owner

Name: SZU-AdvTech-2024
Login: SZU-AdvTech-2024
Kind: organization

Repositories: 1
Profile: https://github.com/SZU-AdvTech-2024

Citation (citation.txt)

@article{REPO361,
    author = "Yang, Zhuoyi and Teng, Jiayan and Zheng, Wendi and Ding, Ming and Huang, Shiyu and Xu, Jiazheng and Yang, Yuanming and Hong, Wenyi and Zhang, Xiaohan and Feng, Guanyu and others",
    journal = "arXiv preprint arXiv:2408.06072",
    title = "{CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer}",
    year = "2024"
}
