Recent Releases of ppdiffusers

ppdiffusers - v3.0.0-beta

2025.05.09: PaddleMIX 3.0.0-beta released

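Multimodal understanding

- New models: Qwen2VL/Qwen2.5VL series, DeepSeek-VL2, miniCPM-V 2.6, Janus series, LLaVA-Critic, LLaVA-DenseConnector, LLaVA-OneVision, GOT-OCR2.0, mPLUG-Owl3
- PP series models: released PP-DocBee, an in-house multimodal large model for document understanding, which reaches SOTA among models of the same parameter scale on authoritative academic benchmarks for English document understanding
- Toolchain upgrades: improved high-performance inference deployment, with new support for the Qwen2.5VL series; inference on A800 is 11.5% faster than vllm. Training and inference for LLaVA and InternVL2 are adapted to Ascend 910B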

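Multimodal generation

- New models: Open-MAGVIT2 and the text-to-video models CogVideoX and HunyuanVideo
- PP series models: released PP-VCtrl, an in-house controllable video model that supports video generation under a variety of control conditions
- Toolchain upgrades: released ppdiffusers 0.29.1, adding support for SD3 ControlNet and SD3.5. SD3 high-performance inference matches TensorRT. LoRA training and inference for SD3 and SDXL are adapted to Ascend 910B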

- Python
Published by jerrywgz about 1 year ago

ppdiffusers - v2.1.0

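Updates

Released PP-InsCapTagger, an in-house model that tags multimodal data by capability. It can be used to analyze and filter data; experimental cases show it can cut the data volume by 50% while preserving model quality, substantially improving training efficiency.
Added frontier models including Qwen2-VL, InternVL2, and Stable Diffusion 3 (SD3).
The multimodal large models InternVL2, LLaVA, SD3, and SDXL are adapted to Ascend 910B, providing training and inference on domestically produced compute chips.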

What's Changed

【pir 】modify dy2static Sd and 3. Grounding DINO model by @xiaoguoguo626807 in https://github.com/PaddlePaddle/PaddleMIX/pull/689
fix llava pretrain config by @pkhk-1 in https://github.com/PaddlePaddle/PaddleMIX/pull/685
Re-network the DIT, fix some parameters, and simplify the model networking code by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/632
update DIT doc by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/693
[NPU] Add llava npu doc by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/694
SD3 inference optimization: avoid synchronization by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/695
Reduce redundant copies and fix bugs by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/699
Add Qwen2-VL infer codes by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/698
[doc] Update requirements by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/703
Llava bug by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/704
Fix is inference mode by @zhoutianzi666 in https://github.com/PaddlePaddle/PaddleMIX/pull/711
update readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/705
update opensora video save method by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/712
Limit the installed version of paddlenlp and fix bugs of llava-next. by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/716
Optimize the SD3 transformer by @zhoutianzi666 in https://github.com/PaddlePaddle/PaddleMIX/pull/713
[wip] add mix scheme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/664
[NPU] InternVL2 supports npu training by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/714
Add SD3 DreamBooth by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/686
remove phi3 in internvl2 and refine format by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/715
add flash_atten for qw2vl by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/723
[NPU] sdxl support NPU training by @wangna11BD in https://github.com/PaddlePaddle/PaddleMIX/pull/719
[NPU] sdxl lora support NPU training by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/718
Adapt fa for npu by @LielinJiang in https://github.com/PaddlePaddle/PaddleMIX/pull/706
[NPU] fix readme doc for SDXL LoRA training by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/724
[npu]sd3 dreambooth adapt for npu by @LielinJiang in https://github.com/PaddlePaddle/PaddleMIX/pull/726
add pp-inscaptagger by @pkhk-1 in https://github.com/PaddlePaddle/PaddleMIX/pull/727
ADD SD3 batch_parallel by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/731
support auto parallel in dit and largedit by @jeff41404 in https://github.com/PaddlePaddle/PaddleMIX/pull/551
add env_run.sh and correct packages version by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/733
[NPU] Fix typo by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/696
paddlemix v2.1 readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/734
Fix paddlenlp develop version adaptation errors_10-11 by @Xiaobin-Lu in https://github.com/PaddlePaddle/PaddleMIX/pull/735
Fix qwen2vl video and image preprocessing by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/737
[wip] update v2.1 readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/736
fix internvl2 minimonkey dataset docs by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/741
fix tests of evaclip and internvl2 by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/746
image2textgeneration rm usefast by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/744
fix readme for llavanextinterleave by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/748
support Qwen2-VL sft training by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/739
fix dit training by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/752
fix tests by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/753
remove use_fast in AutoTokenizer by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/747
fix dit weights convert to ppdiffusers by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/759
[PPDiffusers]fix bugs and release 0.29.0 by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/742
autolabel fix nltk download by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/763
[NPU] fix npu llava infer by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/757
Add npu model list by @nepeplwu in https://github.com/PaddlePaddle/PaddleMIX/pull/758
Fix docs of by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/767
merge upstream readme by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/766
correct huggingface_hub version by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/771
[NPU] Refine doc by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/774

New Contributors

@xiaoguoguo626807 made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/689
@chang-wenbin made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/632
@wangna11BD made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/719
@LielinJiang made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/706
@jeff41404 made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/551
@Xiaobin-Lu made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/735
@nepeplwu made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/758

Full Changelog: https://github.com/PaddlePaddle/PaddleMIX/commits/v2.1.0

- Python
Published by lyuwenyu over 1 year ago

ppdiffusers - v2.0.0

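Multimodal understanding

New models: LLaVA (v1.5-7b, v1.5-13b, v1.6-7b), CogAgent, CogVLM, Qwen-VL, InternLM-XComposer2
Dataset enhancements: added the chatmldataset scheme for reading image-text conversation data, which can be adapted through a custom chattemplate file and supports mixed datasets
Toolchain upgrades: added the Auto module, which unifies the SFT training workflow and is compatible with full-parameter and LoRA training. Added the mixtoken training strategy, improving SFT throughput by 5.6x. Supports inference deployment for Qwen-VL and LLaVA, with a 2.38x inference speedup over torch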

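Multimodal generation

Video generation: supports Sora-related techniques, including training and inference for DiT, SiT, and UViT; adds the NaViT and MAGVIT-v2 models; adds the video generation models SVD and Open Sora, with fine-tuning and inference support; adds the pose-controllable video generation model AnimateAnyone, the plug-and-play video generation model AnimateDiff, and the GIF video generation model Hotshot-XL.
Text-to-image model library: added the fast-inference text-to-image model LCM, adapted for SD/SDXL training and inference.
Toolchain upgrades: released ppdiffusers 0.24.1, adding peft and accelerate backends; weight loading and saving were overhauled to support distributed setups, model sharding, safetensors, and similar scenarios.
Ecosystem compatibility: provides a ComfyUI plugin built on ppdiffusers that supports common tasks such as model loading and conversion, text-to-image, image-to-image, and local image editing. Added Stable Diffusion 1.5 series nodes and Stable Diffusion XL series nodes. Added 4 image generation workflow examples.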

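DataCopilot (multimodal data processing toolbox)

MMDataset, a multimodal dataset type that supports loading and exporting several storage formats such as Json, H5, and Jsonl, with built-in concurrent data processing interfaces (map, filter) and more
Multimodal data format tools, supporting custom data structures, data conversion, and offline format checks
Multimodal data analysis tools, supporting basic statistics, data visualization, and registration of custom functions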
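
To make the "concurrent map/filter plus Jsonl export" idea concrete, here is a minimal sketch in plain Python. It does not use or reproduce the real MMDataset API: the class `TinyDataset`, its method names, and the sample records are all hypothetical and exist only for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


class TinyDataset:
    """Hypothetical stand-in for a multimodal dataset; NOT the PaddleMIX API."""

    def __init__(self, items: Iterable[dict]):
        self.items: List[dict] = list(items)

    def map(self, fn: Callable[[dict], dict], max_workers: int = 4) -> "TinyDataset":
        # Executor.map preserves input order, so outputs line up with inputs.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return TinyDataset(pool.map(fn, self.items))

    def filter(self, pred: Callable[[dict], bool], max_workers: int = 4) -> "TinyDataset":
        # Evaluate the predicate concurrently, then keep the matching items.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            keep = list(pool.map(pred, self.items))
        return TinyDataset(item for item, ok in zip(self.items, keep) if ok)

    def to_jsonl(self) -> str:
        # One JSON object per line; ensure_ascii=False keeps non-ASCII text readable.
        return "\n".join(json.dumps(item, ensure_ascii=False) for item in self.items)

    @classmethod
    def from_jsonl(cls, text: str) -> "TinyDataset":
        return cls(json.loads(line) for line in text.splitlines() if line.strip())


# Illustrative image-caption records (made up for this sketch).
samples = TinyDataset([
    {"image": "a.jpg", "caption": "a cat on a sofa"},
    {"image": "b.jpg", "caption": ""},
    {"image": "c.jpg", "caption": "two dogs playing"},
])

# Drop records with empty captions, then annotate the rest with a word count.
cleaned = (
    samples
    .filter(lambda s: bool(s["caption"]))
    .map(lambda s: {**s, "num_words": len(s["caption"].split())})
)

# Export to Jsonl text and load it back.
roundtrip = TinyDataset.from_jsonl(cleaned.to_jsonl())
```

Returning a new dataset from each call keeps the steps chainable and leaves the input untouched; a real toolbox would likely add file I/O and other formats such as H5 on top of this shape.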

- Python
Published by jerrywgz almost 2 years ago
