vip

[ACCV 2024 Poster] official code for "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model"

https://github.com/ucasyjz/vip

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (4.0%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: ucasyjz
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 7.21 MB
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 2
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

VIP-Versatile-Image-Outpainting-Empowered-by-Multimodal-Large-Language-Model

This repository is the official implementation of "VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model".

📜 News

🚀 [2024/9/26] Our paper has been accepted by ACCV 2024!

🚀 [2024/7/22] The training and inference code is released!

🚀 [2024/6/3] The paper is released!

🛠️ Usage

Requirements

  • torch==1.13.1
  • torchvision==0.14.1
  • transformers==4.39.3

Note that our method modifies UNet2DConditionModel in diffusers, so please do not install the official diffusers package.
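Before training or inference, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the helper names `base_version` and `check_pins` are hypothetical, and stripping a local build suffix such as `+cu117` before comparing is an assumption about how the pins should be read.

```python
from importlib import metadata

# Pins copied from the requirements listed in this README.
PINNED = {
    "torch": "1.13.1",
    "torchvision": "0.14.1",
    "transformers": "4.39.3",
}


def base_version(version):
    """Drop a local build suffix such as '+cu117' before comparing."""
    return version.split("+", 1)[0]


def check_pins(pins, get_version=metadata.version):
    """Return a list of readable problems; an empty list means every pin matches."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        if base_version(found) != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    issues = check_pins(PINNED)
    print("\n".join(issues) if issues else "All pinned packages match.")
```

The script deliberately does not check diffusers, since the README asks for the modified code rather than a published release.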

For training

    cd examples/VIP_ours/
    bash train_on_enhanced_prompt.sh

For inference

    cd examples/VIP_ours/
    python3 inference_*.py
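The wildcard in `inference_*.py` matches every inference script in the directory. If more than one file matches, the shell passes the later ones as arguments to the first, so the pattern is probably a placeholder for a single script. The sketch below lists the candidates so that one can be run by name. The helper `find_inference_scripts` is hypothetical, and the default path assumes the script is started from the repository root.

```python
import glob
import os


def find_inference_scripts(directory="examples/VIP_ours"):
    """Return the sorted paths in `directory` that match inference_*.py."""
    pattern = os.path.join(directory, "inference_*.py")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    scripts = find_inference_scripts()
    if not scripts:
        print("No inference_*.py scripts found; run this from the repository root.")
    for path in scripts:
        print(path)
```

Once the list is printed, pick one entry and run it with `python3` from inside `examples/VIP_ours/`.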

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'Diffusers: State-of-the-art diffusion models'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Patrick
    family-names: von Platen
  - given-names: Suraj
    family-names: Patil
  - given-names: Anton
    family-names: Lozhkov
  - given-names: Pedro
    family-names: Cuenca
  - given-names: Nathan
    family-names: Lambert
  - given-names: Kashif
    family-names: Rasul
  - given-names: Mishig
    family-names: Davaadorj
  - given-names: Thomas
    family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
abstract: >-
  Diffusers provides pretrained diffusion models across
  multiple modalities, such as vision and audio, and serves
  as a modular toolbox for inference and training of
  diffusion models.
keywords:
  - deep-learning
  - pytorch
  - image-generation
  - hacktoberfest
  - diffusion
  - text2image
  - image2image
  - score-based-generative-modeling
  - stable-diffusion
  - stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1

GitHub Events

Total
  • Watch event: 2
  • Issue comment event: 1
Last Year
  • Watch event: 2
  • Issue comment event: 1