https://github.com/kuleshov-group/awesome-discrete-diffusion-models
A curated list for awesome discrete diffusion models resources.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ✓ Institutional organization owner: Organization kuleshov-group has institutional domain (www.cs.cornell.edu)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.8%) to scientific vocabulary
Repository
A curated list for awesome discrete diffusion models resources.
Statistics
- Stars: 474
- Watchers: 19
- Forks: 19
- Open Issues: 4
- Releases: 0
Metadata Files
README.md
awesome-discrete-diffusion-models
A curated list of awesome discrete diffusion models resources.
Contribution
This repo is maintained by Subham Sahoo, Yingheng Wang, and Yair Schiff. Feel free to send pull requests to add more papers! List papers in reverse chronological order (most recent first), and place accepted papers ahead of unaccepted ones. Please use the following format:
{paper-name}, {conference} {year} [[link-to-the-abstract-page], [code-if-available]]
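As a rough illustration of the format and ordering rules above, the sketch below renders an entry and sorts a list of papers. It is not part of the repository: the `Paper` class, `format_entry`, and `sort_papers` are hypothetical helpers, the Markdown link syntax and the default `arXiv` link label are assumptions based on how existing entries appear, and treating "arXiv" as the marker for an unaccepted paper is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paper:
    """Hypothetical record for one list entry (not defined by the repo)."""
    title: str
    venue: str                      # e.g. "NeurIPS", or "arXiv" if not yet accepted
    year: int
    abstract_url: str
    code_url: Optional[str] = None  # omitted from the entry when None
    abstract_label: str = "arXiv"   # assumed label for the abstract-page link


def format_entry(p: Paper) -> str:
    """Render '{paper-name}, {conference} {year} [[abstract], [code]]'."""
    links = [f"[{p.abstract_label}]({p.abstract_url})"]
    if p.code_url:
        links.append(f"[code]({p.code_url})")
    return f"{p.title}, {p.venue} {p.year} [{', '.join(links)}]"


def sort_papers(papers: List[Paper]) -> List[Paper]:
    """Most recent year first; within a year, accepted papers before arXiv-only ones."""
    # False sorts before True, so non-arXiv (accepted) venues come first.
    return sorted(papers, key=lambda p: (-p.year, p.venue.lower() == "arxiv"))


if __name__ == "__main__":
    # Placeholder titles and URLs, for illustration only.
    demo = [
        Paper("Paper A", "arXiv", 2024, "https://example.org/a"),
        Paper("Paper B", "ICLR", 2023, "https://example.org/b", "https://example.org/b-code"),
        Paper("Paper C", "NeurIPS", 2024, "https://example.org/c"),
    ]
    for paper in sort_papers(demo):
        print("- " + format_entry(paper))
```

Keeping the sort key as a `(year, accepted)` tuple means the ordering rule lives in one place; how ties within the same venue and year are broken is left to the input order, since the contribution note does not specify it.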
Table of Contents
- Introductory Materials
- Topic areas
Introductory Materials
Getting started with Diffusion Language Models, 2024.
Diffusion Language Models, 2023 [URL]
My notes on discrete denoising diffusion models (D3PMs), 2022 [URL]
Papers
Discrete Diffusion with Discrete Noise
- Simple and Effective Masked Diffusion Language Models, NeurIPS 2024 [arXiv, code]
- Simplified and Generalized Masked Diffusion for Discrete Data, NeurIPS 2024 [arXiv]
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution, ICML 2024 [arXiv, code]
- Think While You Generate: Discrete Diffusion with Planned Denoising, arXiv 2024 [arXiv, code]
- Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data, arXiv 2024 [arXiv, code]
- Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning, ICLR 2023 [arXiv, code]
- DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models, ICLR 2023 [arXiv, code]
- FiLM: Fill-in Language Models for Any-Order Generation, arXiv 2023 [arXiv, code]
- A Continuous Time Framework for Discrete Denoising Models, NeurIPS 2022 [arXiv, code]
- Autoregressive Diffusion Models, ICLR 2022 [arXiv]
- EdiT5: Semi-Autoregressive Text Editing with T5 Warm-Start, arXiv 2022 [arXiv, code]
- Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions, NeurIPS 2021 [arXiv, code]
- Structured Denoising Diffusion Models in Discrete State-Spaces, NeurIPS 2021 [arXiv, code]
Discrete Diffusion with Gaussian Noise
- SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control, ACL 2023 [arXiv, code]
- Diffusion-LM Improves Controllable Text Generation, NeurIPS 2022 [arXiv, code]
- Self-conditioned Embedding Diffusion for Text Generation, NeurIPS 2022 [arXiv]
- Continuous Diffusion for Categorical Data, arXiv 2022 [arXiv]
Inference Acceleration
- The Diffusion Duality, ICML 2025 [arXiv, code]
- Beyond Autoregression: Fast LLMs via Self-Distillation Through Time, ICLR 2025 [arXiv, code]
- Di[M]O: Distilling Masked Diffusion Models into One-step Generator, arXiv 2025 [arXiv, code]
Discrete Flows
- Discrete Flow Matching, NeurIPS 2024 [arXiv, code]
- Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design, ICML 2024 [arXiv]
Samplers
- Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling, arXiv 2024 [arXiv]
- Informed Correctors for Discrete Diffusion Models, arXiv 2024 [arXiv]
- Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models, arXiv 2024 [arXiv]
Guidance Mechanisms
- PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion, arXiv 2024 [arXiv]
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction, arXiv 2024 [arXiv]
- Unlocking Guidance for Discrete State-Space Diffusion and Flow Models, arXiv 2024 [arXiv]
- Protein Design with Guided Discrete Diffusion, NeurIPS 2023 [arXiv, code]
Custom Noise Processes
- Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion, CoRR 2024 [arXiv, code]
- DINOISER: Diffused Conditional Sequence Learning By Manipulating Noises, TACL 2024 [arXiv, code]
- DiffusER: Discrete Diffusion via Edit-based Reconstruction, ICLR 2023 [arXiv, code]
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise, EMNLP 2023 [arXiv, code]
- DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models, ACL 2023 [arXiv, code]
Theory
- Discrete Copula Diffusion, arXiv 2024 [arXiv]
- Formulating Discrete Probability Flow Through Optimal Transport, NeurIPS 2023 [arXiv, code]
- Categorical SDEs with Simplex Diffusion, arXiv 2022 [arXiv]
Applications
- Diffusion Language Models Are Versatile Protein Learners, ICML 2024 [arXiv, code]
- DPLM-2: A Multimodal Diffusion Protein Language Model, arXiv 2024 [arXiv]
- Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design, arXiv 2024 [arXiv, code]
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models, arXiv 2024 [arXiv]
- Scaling up Masked Diffusion Models on Text, arXiv 2024 [arXiv, code]
- Likelihood-Based Diffusion Language Models, NeurIPS 2023 [arXiv, code]
- HouseDiffusion: Vector Floorplan Generation via a Diffusion Model with Discrete and Continuous Denoising, CVPR 2023 [CVPR 2023, code]
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning, arXiv 2023 [arXiv, code]
Surveys
Owner
- Name: Kuleshov Group @ Cornell Tech
- Login: kuleshov-group
- Kind: organization
- Website: https://www.cs.cornell.edu/~kuleshov/
- Twitter: volokuleshov
- Repositories: 7
- Profile: https://github.com/kuleshov-group
Research group at Cornell focused on machine learning, generative models, AI for science
GitHub Events
Total
- Issues event: 1
- Watch event: 343
- Issue comment event: 1
- Member event: 1
- Push event: 54
- Pull request review comment event: 1
- Pull request review event: 3
- Pull request event: 13
- Fork event: 16
Last Year
- Issues event: 1
- Watch event: 343
- Issue comment event: 1
- Member event: 1
- Push event: 54
- Pull request review comment event: 1
- Pull request review event: 3
- Pull request event: 13
- Fork event: 16
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 4
- Average time to close issues: 19 minutes
- Average time to close pull requests: about 2 months
- Total issue authors: 1
- Total pull request authors: 3
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 4
- Average time to close issues: 19 minutes
- Average time to close pull requests: about 2 months
- Issue authors: 1
- Pull request authors: 3
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Chrisqcwx (1)
Pull Request Authors
- albertotono (3)
- yegcjs (2)
- Koziev (1)
- yuanzhi-zhu (1)