vip
[ACCV 2024 Poster] official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model"
Repository
Statistics
- Stars: 9
- Watchers: 1
- Forks: 1
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
VIP-Versatile-Image-Outpainting-Empowered-by-Multimodal-Large-Language-Model
This repository is the official implementation of VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model
📜 News
🚀 [2024/9/26] Our paper has been accepted by ACCV 2024!
🚀 [2024/7/22] The training and inference code has been released!
🚀 [2024/6/3] The paper is released!
🛠️ Usage
Requirements
shell
- torch==1.13.1
- torchvision==0.14.1
- transformers==4.39.3
Note that our method modifies UNet2DConditionModel in diffusers, so please do not install the official diffusers package.
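Since the versions above are pinned exactly, it can help to confirm the environment before training. The following is a minimal sketch, not part of the repository: the names PINNED, installed_version, and check_pins are hypothetical, and it assumes that a local build suffix such as "+cu117" should be ignored when comparing versions.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Requirements list above.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "transformers": "4.39.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # Builds such as "1.13.1+cu117" carry a local suffix (assumption:
        # only the public part of the version matters for this check).
        public = found.split("+", 1)[0] if found else None
        if public != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in check_pins(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty result means all three pins are satisfied. The check deliberately leaves out diffusers, because the note above asks that the official package not be installed.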
For training
shell
cd examples/VIP_ours/
bash train_on_enhanced_prompt.sh
For inference
shell
cd examples/VIP_ours/
python3 inference_*.py
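The wildcard in the inference command stands in for a concrete script name. If more than one file matches, the shell passes all of them to a single python3 call, and only the first is run as the script. The sketch below, which is not part of the repository, lists the matching files so that one can be chosen explicitly; the helper name find_inference_scripts is hypothetical, and the README does not say how many such scripts exist.

```python
from pathlib import Path


def find_inference_scripts(directory="."):
    """Return the sorted names of files matching inference_*.py in `directory`."""
    return sorted(path.name for path in Path(directory).glob("inference_*.py"))


if __name__ == "__main__":
    # Run from examples/VIP_ours/ to see which scripts the wildcard covers.
    for name in find_inference_scripts():
        print(f"python3 {name}")
```

Each printed line is a command that runs exactly one of the matching scripts.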
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1
GitHub Events
Total
- Watch event: 2
- Issue comment event: 1
Last Year
- Watch event: 2
- Issue comment event: 1