Recent Releases of ppdiffusers

ppdiffusers - v3.0.0-beta

2025.05.09: PaddleMIX 3.0.0-beta released

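Multimodal understanding

  • New models: Qwen2VL/Qwen2.5VL series, DeepSeek-VL2, miniCPM-V 2.6, Janus series, LLaVA-Critic, LLaVA-DenseConnector, LLaVA-OneVision, GOT-OCR2.0, mPLUG-Owl3.
  • PP-series models: released PP-DocBee, an in-house multimodal large model for document understanding, which reaches SOTA among models of the same parameter scale on an authoritative academic benchmark for English document understanding.
  • Toolchain upgrades: improved high-performance inference deployment and added support for the Qwen2.5VL series; inference performance on A800 is 11.5% ahead of vLLM. Training and inference for the LLaVA and InternVL2 models are adapted to Ascend 910B.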

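Multimodal generation

  • New models: Open-MAGVIT2, and the text-to-video models CogVideoX and HunyuanVideo.
  • PP-series models: released PP-VCtrl, an in-house controllable video model that supports video generation under multiple control conditions.
  • Toolchain upgrades: released ppdiffusers 0.29.1, adding support for SD3 ControlNet and SD3.5. SD3 high-performance inference is on par with TensorRT. LoRA training and inference for the SD3 and SDXL models are adapted to Ascend 910B.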

- Python
Published by jerrywgz 10 months ago

ppdiffusers - v2.1.0

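Updates

  • Released PP-InsCapTagger, an in-house capability-tagging model for multimodal data. It can be used to analyze and filter data; experimental cases show that it can cut data volume by 50% while preserving model quality, substantially improving training efficiency.

  • Added cutting-edge models including Qwen2-VL, InternVL2, and Stable Diffusion 3 (SD3).

  • The multimodal large models InternVL2, LLaVA, SD3, and SDXL are adapted to Ascend 910B, providing training and inference on domestically produced compute chips.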

What's Changed

  • 【pir 】modify dy2static Sd and 3. Grounding DINO model by @xiaoguoguo626807 in https://github.com/PaddlePaddle/PaddleMIX/pull/689
  • fix llava pretrain config by @pkhk-1 in https://github.com/PaddlePaddle/PaddleMIX/pull/685
  • Re-network the DIT, fix some parameters, and simplify the model networking code by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/632
  • update DIT doc by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/693
  • [NPU] Add llava npu doc by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/694
  • SD3 inference optimization: avoid synchronization by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/695
  • Reduce redundant copies and fix a bug by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/699
  • Add Qwen2-VL infer codes by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/698
  • [doc] Update requirements by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/703
  • Llava bug by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/704
  • Fix is inference mode by @zhoutianzi666 in https://github.com/PaddlePaddle/PaddleMIX/pull/711
  • update readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/705
  • update opensora video save method by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/712
  • Limit the installed version of paddlenlp and fix bugs of llava-next. by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/716
  • Optimize the SD3 transformer by @zhoutianzi666 in https://github.com/PaddlePaddle/PaddleMIX/pull/713
  • [wip] add mix scheme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/664
  • [NPU] InternVL2 supports npu training by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/714
  • Add SD3 DreamBooth by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/686
  • remove phi3 in internvl2 and refine format by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/715
  • add flash_atten for qw2vl by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/723
  • [NPU] sdxl support NPU training by @wangna11BD in https://github.com/PaddlePaddle/PaddleMIX/pull/719
  • [NPU] sdxl lora support NPU training by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/718
  • Adapt fa for npu by @LielinJiang in https://github.com/PaddlePaddle/PaddleMIX/pull/706
  • [NPU] fix readme doc for SDXL LoRA training by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/724
  • [npu]sd3 dreambooth adapt for npu by @LielinJiang in https://github.com/PaddlePaddle/PaddleMIX/pull/726
  • add pp-inscaptagger by @pkhk-1 in https://github.com/PaddlePaddle/PaddleMIX/pull/727
  • ADD SD3 batch_parallel by @chang-wenbin in https://github.com/PaddlePaddle/PaddleMIX/pull/731
  • support auto parallel in dit and largedit by @jeff41404 in https://github.com/PaddlePaddle/PaddleMIX/pull/551
  • add env_run.sh and correct packages version by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/733
  • [NPU] Fix typo by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/696
  • paddlemix v2.1 readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/734
  • Fix adaptation errors with the paddlenlp develop version_10-11 by @Xiaobin-Lu in https://github.com/PaddlePaddle/PaddleMIX/pull/735
  • Fix qwen2vl video and image preprocessing by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/737
  • [wip] update v2.1 readme by @lyuwenyu in https://github.com/PaddlePaddle/PaddleMIX/pull/736
  • fix internvl2 minimonkey dataset docs by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/741
  • fix tests of evaclip and internvl2 by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/746
  • image2textgeneration rm usefast by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/744
  • fix readme for llavanextinterleave by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/748
  • support Qwen2-VL sft training by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/739
  • fix dit training by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/752
  • fix tests by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/753
  • remove use_fast in AutoTokenizer by @warrentdrew in https://github.com/PaddlePaddle/PaddleMIX/pull/747
  • fix dit weights convert to ppdiffusers by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/759
  • [PPDiffusers]fix bugs and release 0.29.0 by @westfish in https://github.com/PaddlePaddle/PaddleMIX/pull/742
  • autolabel fix nltk download by @LokeZhou in https://github.com/PaddlePaddle/PaddleMIX/pull/763
  • [NPU] fix npu llava infer by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/757
  • Add npu model list by @nepeplwu in https://github.com/PaddlePaddle/PaddleMIX/pull/758
  • Fix docs of by @nemonameless in https://github.com/PaddlePaddle/PaddleMIX/pull/767
  • merge upstream readme by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/766
  • correct huggingface_hub version by @luyao-cv in https://github.com/PaddlePaddle/PaddleMIX/pull/771
  • [NPU] Refine doc by @Birdylx in https://github.com/PaddlePaddle/PaddleMIX/pull/774

New Contributors

  • @xiaoguoguo626807 made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/689
  • @chang-wenbin made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/632
  • @wangna11BD made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/719
  • @LielinJiang made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/706
  • @jeff41404 made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/551
  • @Xiaobin-Lu made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/735
  • @nepeplwu made their first contribution in https://github.com/PaddlePaddle/PaddleMIX/pull/758

Full Changelog: https://github.com/PaddlePaddle/PaddleMIX/commits/v2.1.0

- Python
Published by lyuwenyu over 1 year ago

ppdiffusers - v2.0.0

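Multimodal understanding

  1. New models: LLaVA (v1.5-7b, v1.5-13b, v1.6-7b), CogAgent, CogVLM, Qwen-VL, InternLM-XComposer2.
  2. Dataset enhancements: added chatmldataset, a scheme for reading image-text dialogue data; it can be adapted through a custom chattemplate file and supports mixed datasets.
  3. Toolchain upgrades: added the Auto module, which unifies the SFT training workflow and is compatible with both full-parameter and lora training. Added the mixtoken training strategy, which improves SFT throughput by 5.6x. Added inference deployment for Qwen-VL and LLaVA, with a 2.38x inference performance improvement over torch.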

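Multimodal generation

  1. Video generation: supports Sora-related techniques, with training and inference for DiT, SiT, and UViT, plus the new NaViT and MAGVIT-v2 models. Added the video generation models SVD and Open Sora, with support for fine-tuning and inference. Added the pose-controllable video generation model AnimateAnyone, the plug-and-play video generation model AnimateDiff, and the GIF video generation model Hotshot-XL.
  2. Text-to-image model library: added LCM, a text-to-image model with high-speed inference, adapted for SD/SDXL training and inference.
  3. Toolchain upgrades: released ppdiffusers 0.24.1, adding peft and accelerate backends. Weight loading and saving are fully upgraded, with support for distributed, model-sharding, and safetensors scenarios.
  4. Ecosystem compatibility: provides a ComfyUI plugin built on ppdiffusers that supports common tasks such as model loading and conversion, text-to-image, image-to-image, and local image editing. Added Stable Diffusion 1.5 series nodes and Stable Diffusion XL series nodes. Added 4 image generation workflow examples.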

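DataCopilot (multimodal data processing toolbox)

  1. MMDataset, a multimodal dataset type that supports loading and exporting multiple storage formats such as Json, H5, and Jsonl, with built-in concurrent (map, filter) data processing interfaces, among other features.
  2. Multimodal data format tools, supporting custom data structures, data conversion, and offline format checking.
  3. Multimodal data analysis tools, supporting basic statistics, data visualization, and registration of custom functions.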
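
To make the dataset-type idea above concrete, here is a minimal standard-library sketch of a container that loads and exports Jsonl and exposes concurrent map and filter. It is an illustration only: the class name `TinyDataset` and all of its methods are hypothetical and are not the PaddleMIX MMDataset API, whose actual interface is not described in these notes.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class TinyDataset:
    """Hypothetical stand-in for a dataset type; NOT the real MMDataset API."""

    def __init__(self, items: List[Record]):
        self.items = list(items)

    @classmethod
    def from_jsonl(cls, text: str) -> "TinyDataset":
        # One JSON object per non-empty line.
        return cls([json.loads(line) for line in text.splitlines() if line.strip()])

    def to_jsonl(self) -> str:
        return "\n".join(json.dumps(item, ensure_ascii=False) for item in self.items)

    def map(self, fn: Callable[[Record], Record], max_workers: int = 4) -> "TinyDataset":
        # Apply fn to every record on a thread pool; input order is preserved.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return TinyDataset(list(pool.map(fn, self.items)))

    def filter(self, pred: Callable[[Record], bool], max_workers: int = 4) -> "TinyDataset":
        # Evaluate the predicate concurrently, then keep the matching records.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            keep = list(pool.map(pred, self.items))
        return TinyDataset([item for item, ok in zip(self.items, keep) if ok])


raw = '{"image": "a.jpg", "caption": "a cat"}\n{"image": "b.jpg", "caption": ""}\n'
ds = TinyDataset.from_jsonl(raw)

# Drop records with empty captions, then normalize the remaining captions.
cleaned = ds.filter(lambda r: bool(r["caption"])).map(
    lambda r: {**r, "caption": r["caption"].strip().capitalize()}
)
print(cleaned.to_jsonl())
```

Both operations return a new dataset rather than mutating in place, so filtering and cleaning steps can be chained and intermediate results stay available for inspection.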

- Python
Published by jerrywgz over 1 year ago