Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: yaga1183
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 11.9 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed over 3 years ago
Metadata Files
Readme Funding License Citation

README.md

DALL·E Mini

How to use it?

You can use the model on 🖍️ craiyon

How does it work?

Refer to our reports:

Development

Dependencies Installation

For inference only, use pip install dalle-mini.

For development, clone the repo and use pip install -e ".[dev]". Before making a PR, check style with make style.
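Before running any inference code, it can help to confirm that the package is importable in the current environment. The following is a minimal sketch that uses only the standard library; it assumes the `dalle-mini` distribution exposes a top-level module named `dalle_mini`, and the helper names `is_installed` and `install_hint` are hypothetical and not part of the project.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def install_hint(module_name: str = "dalle_mini", dist_name: str = "dalle-mini") -> str:
    """Describe whether the module is available, with a pip command if it is not."""
    if is_installed(module_name):
        return f"{module_name} is available"
    return f"{module_name} not found; try: pip install {dist_name}"


# Assumption: the module name is dalle_mini (underscore), the distribution dalle-mini (hyphen).
print(install_hint())
```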

You can experiment with the pipeline step by step through our inference pipeline notebook.


Training of DALL·E mini

Use tools/train/train.py.

You can also adjust the sweep configuration file if you need to perform a hyperparameter search.
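As a rough illustration of what a hyperparameter search does, the sketch below draws random configurations from a small search space. It does not reproduce the repository's sweep file: the parameter names (`learning_rate`, `warmup_steps`), their ranges, and the helper `sample_config` are hypothetical and chosen only for the example.

```python
import math
import random

# Hypothetical search space; names and ranges are illustrative only.
SEARCH_SPACE = {
    "learning_rate": ("log_uniform", 1e-5, 1e-2),
    "warmup_steps": ("choice", [500, 1000, 2000]),
}


def sample_config(space, rng):
    """Draw one configuration from the search space (random search)."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "log_uniform":
            # Sample uniformly in log space so each order of magnitude is equally likely.
            low, high = spec[1], spec[2]
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        elif kind == "choice":
            config[name] = rng.choice(spec[1])
        else:
            raise ValueError(f"unknown distribution: {kind}")
    return config


rng = random.Random(0)  # fixed seed so the trials are repeatable
trials = [sample_config(SEARCH_SPACE, rng) for _ in range(3)]
for trial in trials:
    print(trial)
```

Each sampled dictionary would then be passed to one training run, and the run with the best evaluation metric kept.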

FAQ

Where to find the latest models?

Trained models are on 🤗 Model Hub:

Where does the logo come from?

The "armchair in the shape of an avocado" was used by OpenAI when releasing DALL·E to illustrate the model's capabilities. Having successful predictions on this prompt represents a big milestone for us.

Contributing

Join the community on the LAION Discord. Any contribution is welcome, from reporting issues to proposing fixes/improvements or testing the model with cool prompts!

You can also use these great projects from the community:


  • run on Replicate, in the browser or via API

Acknowledgements

Authors & Contributors

DALL·E mini was initially developed by:

Many thanks to the people who helped make it better:

Citing DALL·E mini

If you find DALL·E mini useful in your research or wish to refer to it, please use the following BibTeX entry.

@misc{Dayma_DALL·E_Mini_2021,
  author = {Dayma, Boris and Patil, Suraj and Cuenca, Pedro and Saifullah, Khalid and Abraham, Tanishq and Lê Khắc, Phúc and Melas, Luke and Ghosh, Ritobrata},
  doi = {10.5281/zenodo.5146400},
  month = {7},
  title = {DALL·E Mini},
  url = {https://github.com/borisdayma/dalle-mini},
  year = {2021}
}

References

Original DALL·E from "Zero-Shot Text-to-Image Generation" with image quantization from "Learning Transferable Visual Models From Natural Language Supervision".

Image encoder from "Taming Transformers for High-Resolution Image Synthesis".

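Sequence to sequence model based on "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension" with implementation of a few variants:

  • "GLU Variants Improve Transformer"
  • "DeepNet: Scaling transformers to 1,000 layers"
  • "NormFormer: Improved Transformer Pretraining with Extra Normalization"
  • "Swin Transformer V2: Scaling Up Capacity and Resolution"
  • "CogView: Mastering Text-to-Image Generation via Transformers"
  • "Root Mean Square Layer Normalization"
  • "Sinkformers: Transformers with Doubly Stochastic Attention"
  • "Smooth activations and reproducibility in deep networks"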

Main optimizer (Distributed Shampoo) from "Scalable Second Order Optimization for Deep Learning".
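Of the papers cited under Citations, "Root Mean Square Layer Normalization" describes an operation small enough to show in a few lines. The sketch below is a plain-Python illustration of that idea, not the code used in this repository: it divides a vector by its root mean square and applies an optional per-element gain. The function name `rms_norm`, the default `eps`, and where `eps` is added are assumptions made for the example.

```python
import math


def rms_norm(x, gain=None, eps=1e-8):
    """Divide x by its root mean square, then scale each element by its gain."""
    # eps inside the square root guards against division by zero for an all-zero input.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if gain is None:
        gain = [1.0] * len(x)
    return [g * v / rms for v, g in zip(x, gain)]


y = rms_norm([3.0, 4.0])
# With unit gain, the output has a root mean square close to 1.
print(math.sqrt(sum(v * v for v in y) / len(y)))
```

Unlike standard layer normalization, this variant does not subtract the mean, so only the scale of the input is changed.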

Citations

@misc{
  title = {Zero-Shot Text-to-Image Generation},
  author = {Aditya Ramesh and Mikhail Pavlov and Gabriel Goh and Scott Gray and Chelsea Voss and Alec Radford and Mark Chen and Ilya Sutskever},
  year = {2021},
  eprint = {2102.12092},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

@misc{
  title = {Learning Transferable Visual Models From Natural Language Supervision},
  author = {Alec Radford and Jong Wook Kim and Chris Hallacy and Aditya Ramesh and Gabriel Goh and Sandhini Agarwal and Girish Sastry and Amanda Askell and Pamela Mishkin and Jack Clark and Gretchen Krueger and Ilya Sutskever},
  year = {2021},
  eprint = {2103.00020},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

@misc{
  title = {Taming Transformers for High-Resolution Image Synthesis},
  author = {Patrick Esser and Robin Rombach and Björn Ommer},
  year = {2021},
  eprint = {2012.09841},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

@misc{
  title = {BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
  author = {Mike Lewis and Yinhan Liu and Naman Goyal and Marjan Ghazvininejad and Abdelrahman Mohamed and Omer Levy and Ves Stoyanov and Luke Zettlemoyer},
  year = {2019},
  eprint = {1910.13461},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}

@misc{
  title = {Scalable Second Order Optimization for Deep Learning},
  author = {Rohan Anil and Vineet Gupta and Tomer Koren and Kevin Regan and Yoram Singer},
  year = {2021},
  eprint = {2002.09018},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}

@misc{
  title = {GLU Variants Improve Transformer},
  author = {Noam Shazeer},
  year = {2020},
  url = {https://arxiv.org/abs/2002.05202}
}

@misc{
  title = {DeepNet: Scaling transformers to 1,000 layers},
  author = {Wang, Hongyu and Ma, Shuming and Dong, Li and Huang, Shaohan and Zhang, Dongdong and Wei, Furu},
  year = {2022},
  eprint = {2203.00555},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}

@misc{
  title = {NormFormer: Improved Transformer Pretraining with Extra Normalization},
  author = {Sam Shleifer and Jason Weston and Myle Ott},
  year = {2021},
  eprint = {2110.09456},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}

@inproceedings{
  title = {Swin Transformer V2: Scaling Up Capacity and Resolution},
  author = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
  booktitle = {International Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}
}

@misc{
  title = {CogView: Mastering Text-to-Image Generation via Transformers},
  author = {Ming Ding and Zhuoyi Yang and Wenyi Hong and Wendi Zheng and Chang Zhou and Da Yin and Junyang Lin and Xu Zou and Zhou Shao and Hongxia Yang and Jie Tang},
  year = {2021},
  eprint = {2105.13290},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

@misc{
  title = {Root Mean Square Layer Normalization},
  author = {Biao Zhang and Rico Sennrich},
  year = {2019},
  eprint = {1910.07467},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}

@misc{
  title = {Sinkformers: Transformers with Doubly Stochastic Attention},
  url = {https://arxiv.org/abs/2110.11773},
  author = {Sander, Michael E. and Ablin, Pierre and Blondel, Mathieu and Peyré, Gabriel},
  publisher = {arXiv},
  year = {2021}
}

@misc{
  title = {Smooth activations and reproducibility in deep networks},
  url = {https://arxiv.org/abs/2010.09931},
  author = {Shamir, Gil I. and Lin, Dong and Coviello, Lorenzo},
  publisher = {arXiv},
  year = {2020}
}

Owner

  • Name: Jay Wen
  • Login: yaga1183
  • Kind: user
  • Location: Taiwan
  • Company: Meiho University

Citation (CITATION.cff)

# YAML 1.2
---
abstract: "DALL·E mini is a JAX/Flax reimplementation of OpenAI's DALL·E that requires much smaller hardware resources. By simplifying the architecture and model memory requirements, as well as leveraging open-source code and pre-trained models, we were able to create a model that is 27 times smaller than the original DALL·E and train it on a single TPU v3-8 for only 3 days. DALL·E mini achieves impressive results, albeit of a lower quality than the original system. It can be used for exploration and further experimentation on commodity hardware."
authors: 
  -
    family-names: Dayma
    given-names: Boris
  -
    family-names: Patil
    given-names: Suraj
  -
    family-names: Cuenca
    given-names: Pedro
  -
    family-names: Saifullah
    given-names: Khalid
  -
    family-names: Abraham
    given-names: Tanishq
  -
    family-names: "Lê Khắc"
    given-names: "Phúc"
  -
    family-names: Melas
    given-names: Luke
  -
    family-names: Ghosh
    given-names: Ritobrata
cff-version: "1.1.0"
date-released: 2021-07-29
identifiers: 
keywords: 
  - dalle
  - "text-to-image generation"
  - transformer
  - "zero-shot"
  - JAX
license: "Apache-2.0"
doi: 10.5281/zenodo.5146400
message: "If you use this project, please cite it using these metadata."
repository-code: "https://github.com/borisdayma/dalle-mini"
title: "DALL·E Mini"
version: "v0.1-alpha"
...
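The CITATION.cff fields above line up with the BibTeX entry given in the README. As a rough sketch of that mapping, the snippet below builds a `@misc` entry from a Python dictionary holding the same metadata. The helper `cff_to_bibtex` is hypothetical, the dictionary is typed in by hand rather than parsed from the YAML file, and the citation key is copied from the README entry.

```python
def cff_to_bibtex(meta: dict, key: str) -> str:
    """Render a subset of CITATION.cff metadata as a BibTeX @misc entry."""
    authors = " and ".join(
        f'{a["family-names"]}, {a["given-names"]}' for a in meta["authors"]
    )
    year, month, _day = meta["date-released"].split("-")
    fields = {
        "author": authors,
        "doi": meta["doi"],
        "month": str(int(month)),  # "07" becomes "7"
        "title": meta["title"],
        "url": meta["repository-code"],
        "year": year,
    }
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"


# Metadata copied by hand from the CITATION.cff shown above.
names = [
    ("Dayma", "Boris"), ("Patil", "Suraj"), ("Cuenca", "Pedro"),
    ("Saifullah", "Khalid"), ("Abraham", "Tanishq"), ("Lê Khắc", "Phúc"),
    ("Melas", "Luke"), ("Ghosh", "Ritobrata"),
]
meta = {
    "authors": [{"family-names": f, "given-names": g} for f, g in names],
    "date-released": "2021-07-29",
    "doi": "10.5281/zenodo.5146400",
    "title": "DALL·E Mini",
    "repository-code": "https://github.com/borisdayma/dalle-mini",
}

bib = cff_to_bibtex(meta, key="Dayma_DALL·E_Mini_2021")
print(bib)
```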

Dependencies

.github/workflows/check_size.yml actions
  • ActionsDesk/lfs-warning v2.0 composite
.github/workflows/pypi_release.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/style.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • jamescurtin/isort-action master composite
  • psf/black stable composite
.github/workflows/sync_to_hub.yml.backup actions
  • actions/checkout v2 composite
.github/workflows/sync_to_hub_debug.yml actions
  • actions/checkout v2 composite
Docker/Dockerfile docker
  • nvidia/cuda 11.6.2-cudnn8-devel-ubuntu20.04 build
pyproject.toml pypi
setup.py pypi