vip

[ACCV 2024 Poster] official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model"

https://github.com/ucasyjz/vip

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
✓ Academic publication links: arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (4.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: ucasyjz
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 7.21 MB

Statistics

Stars: 9
Watchers: 1
Forks: 1
Open Issues: 2
Releases: 0

Created almost 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme Contributing License Code of conduct Citation

VIP-Versatile-Image-Outpainting-Empowered-by-Multimodal-Large-Language-Model

This repository is the official implementation of "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model".

📜 News

🚀 [2024/9/26] Our paper has been accepted by ACCV 2024!

🚀 [2024/7/22] The training and inference code are released!

🚀 [2024/6/3] The paper is released!

🛠️ Usage

Requirements

- torch==1.13.1
- torchvision==0.14.1
- transformers==4.39.3

Note that our method makes some changes to UNet2DConditionModel in diffusers, so please don't install the official diffusers dependency package.
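Before training or inference it can help to confirm that the environment matches the versions pinned above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, and the script only reads package metadata through the standard-library `importlib.metadata` module. It ignores a local build tag such as `+cu117`, which PyTorch wheels often append to the version string.

```python
from importlib import metadata

# Pins copied from the Requirements list above.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "transformers": "4.39.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # Drop a local build tag such as "+cu117" before comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for package, (expected, found) in problems.items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

The check deliberately leaves out diffusers, since the note above asks that the official package not be installed.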

For training

    cd examples/VIP_ours/
    bash train_on_enhanced_prompt.sh

For inference

    cd examples/VIP_ours/
    python3 inference_*.py
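The `inference_*.py` pattern above stands for the inference scripts in that directory. If a shell expands the pattern to several files, `python3` runs only the first one and passes the rest as arguments, so it is usually clearer to pick one script by name. The sketch below is an illustration rather than repository code: the helper name `find_inference_scripts` is hypothetical, and it assumes it is run from the repository root. It makes no assumption about which scripts exist and only lists whatever matches.

```python
from pathlib import Path


def find_inference_scripts(directory="examples/VIP_ours"):
    """Return the sorted inference_*.py files in `directory` (empty list if none)."""
    base = Path(directory)
    if not base.is_dir():
        return []
    return sorted(base.glob("inference_*.py"))


if __name__ == "__main__":
    scripts = find_inference_scripts()
    if not scripts:
        print("No inference_*.py scripts found; run this from the repository root.")
    for script in scripts:
        # Print one ready-to-copy command per script.
        print(f"python3 {script.name}")
```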

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1
