Recent Releases of kvpress

kvpress - v0.2.10

What's Changed

  • Migration to uv by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/108

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.9...v0.2.10

- Python
Published by alessiodevoto 7 months ago

kvpress - v0.2.9

What's Changed

  • Refactor evaluation by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/96
  • Fix QFilters and DuoAttention when used with wrapper presses by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/97
  • Add HuggingFace leaderboard by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/98
  • Fix links in benchmarks directory by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/101
  • Add KVzipPress by @Janghyun1230 in https://github.com/NVIDIA/kvpress/pull/93
  • Test head-wise compression by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/103
  • Run backbone model only for prefill by @giulio98 in https://github.com/NVIDIA/kvpress/pull/100
  • Transformers compatibility + evaluation by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/105

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.8...v0.2.9

- Python
Published by alessiodevoto 7 months ago

kvpress - v0.2.8

What's Changed

🐛 Bug Fixes

  • Fix failing tests by @maxjeblick in https://github.com/NVIDIA/kvpress/pull/94 Reverts changes to CriticalKVPress performed in #90 that caused the press to initialize incorrectly. The PR also fixes some test logic.

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.7...v0.2.8

- Python
Published by maxjeblick 8 months ago

kvpress - v0.2.7

What's Changed

🐛 Bug Fixes

  • Fix FinchPress for Qwen models family by @alessiodevoto in #82 Resolved compatibility issues with Qwen model architecture in FinchPress compression

✨ New Features

  • Add KeyDiffPress and BlockPress by @figuremout in #86 Introduces new compression methods based on key difference analysis
  • Fix for Qwen with Yarn by @giulio98 in #85 Enable Yarn scaling in FinchPress and KeyRerotationPress

📚 Documentation & Maintenance

  • Improve documentation by @maxjeblick in #90 Add docstrings to all presses, with their corresponding parameters and paper reference.
  • Add @alessiodevoto to authors by @maxjeblick in #92 🚀

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.6...v0.2.7

- Python
Published by maxjeblick 8 months ago

kvpress - v0.2.6

  • Improve packaging, #71 by @emmanuel-ferdman, #77 by @fanqiNO1, SPDX headers by @maxjeblick
  • Add LagKVPress, #77 by @JoelSeniorLiang
  • Support Qwen3 and Gemma3, #81 by @alessiodevoto

- Python
Published by SimJeg 8 months ago

kvpress - v0.2.5

  • Add PyramidKVPress, #65 by @figuremout
  • Fix style errors, #68 by @maxjeblick
  • Add FinchPress, #64 and #69, by @giulio98, @miriam-16, @FaureElia and @SimJeg

- Python
Published by SimJeg 10 months ago

kvpress - v0.2.4

  • Add QFilterPress, #54 by @NathanGodey
  • Update copyright dates and add citation file, #60 by @SimJeg
  • Add ChunkKVPress, #51 by @Dominic789654

- Python
Published by SimJeg 11 months ago

kvpress - v0.2.3

  • Fix distributed inference for the ExpectedAttentionPress, #49 by @SimJeg
  • Add DuoAttentionPress, #50 by @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress - v0.2.2

  • Fix style check, #48 by @maxjeblick
  • Add CriticalKVPress, #46 by @FFY0
  • Add epsilon to ExpectedAttentionPress, #47 by @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress - v0.2.1

  • Add ChunkPress, #40 by @maxjeblick and @giulio98
  • Update README, including new huggingface space, #41 and #42 by @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress - v0.2.0

Transformers v4.48 introduced breaking changes handled in this release. The release also features AdaKVPress, the first press allowing head-wise compression by patching the attention functions registered in ALL_ATTENTION_FUNCTIONS since v4.48. When combined with ExpectedAttentionPress, AdaKVPress achieved the best results observed yet on the RULER benchmark (see this post).
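As a rough illustration of what "patching the attention functions registered in a registry" can look like, here is a toy sketch in plain Python. It is not the kvpress or transformers code: `TOY_ATTENTION_FUNCTIONS`, `softmax_attention`, `patch_with_head_masks`, and the per-head mask format are all hypothetical names chosen for this example, and masking with negative infinity is just one generic way to make each head ignore its own set of key positions.

```python
import math
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for a registry of attention implementations, keyed by name.
TOY_ATTENTION_FUNCTIONS: Dict[str, Callable] = {}


def softmax_attention(logits_per_head: List[List[float]]) -> List[List[float]]:
    """Return softmax attention weights for each head (one query, several keys)."""
    weights = []
    for logits in logits_per_head:
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


TOY_ATTENTION_FUNCTIONS["softmax"] = softmax_attention


def patch_with_head_masks(fn: Callable, masked: Dict[int, Set[int]]) -> Callable:
    """Wrap an attention function so each head ignores its own masked key positions."""

    def patched(logits_per_head: List[List[float]]) -> List[List[float]]:
        masked_logits = [
            [-math.inf if j in masked.get(h, set()) else x for j, x in enumerate(logits)]
            for h, logits in enumerate(logits_per_head)
        ]
        return fn(masked_logits)

    return patched


# Assumed per-head masks: head 0 drops key 2, head 1 drops keys 0 and 1.
masks = {0: {2}, 1: {0, 1}}

# Patch every registered implementation in place.
for name, fn in list(TOY_ATTENTION_FUNCTIONS.items()):
    TOY_ATTENTION_FUNCTIONS[name] = patch_with_head_masks(fn, masks)

weights = TOY_ATTENTION_FUNCTIONS["softmax"]([[0.0, 0.0, 5.0], [1.0, 2.0, 0.0]])
```

Because the wrapper is applied to every entry of the registry, callers that look up an implementation by name pick up the head-wise masking without any change on their side.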

  • Add AdaKVPress, #38 by @SimJeg and @FFY0
  • Handle transformers 4.48, #39 by @SimJeg
  • Add InfiniteBench results, #11 by @maxjeblick

- Python
Published by SimJeg about 1 year ago

kvpress - v0.1.1

What's Changed

  • https://github.com/NVIDIA/kvpress/pull/33 by @SimJeg fixes a small bug in the pipeline
  • https://github.com/NVIDIA/kvpress/pull/36 by @maxjeblick sets transformers <4.48 as a dependency

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.1.0...v0.1.1

- Python
Published by maxjeblick about 1 year ago

kvpress - v0.1.0

#24 by @maxjeblick and #29 by @SimJeg introduce a non-breaking refactoring:

  • a press no longer requires the compression_ratio input argument, as some presses do not explicitly need it (e.g. ThinKPress, SimLayerKVPress). However, every press must have a compression_ratio attribute after any forward pass (assertion added in tests) so that the average compression ratio can be measured on a benchmark
  • the core compression logic has been moved from BasePress.forward_hook to BasePress.compress. BasePress.forward_hook now only checks whether compress must be called (pre-filling vs. decoding), de-quantizes the cache before compress, and re-quantizes it afterwards
  • BasePress no longer implements a score method; it has been moved to ScorerPress, along with the associated ScorerPress.compress method
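The split of responsibilities described above can be pictured with a toy, framework-free sketch. The names `BasePress`, `ScorerPress`, `forward_hook`, `compress`, `score`, and `compression_ratio` come from the notes; everything else (the `is_prefill` flag, the list-based keys and values, and `ToyMagnitudePress` with its magnitude score) is a hypothetical stand-in and not the kvpress implementation. Cache quantization is left out entirely.

```python
from typing import List, Tuple

KV = Tuple[List[float], List[float]]  # toy (keys, values): one float per token


class BasePress:
    """Toy base class: decides when to compress, not how."""

    compression_ratio: float = 0.0

    def compress(self, keys: List[float], values: List[float]) -> KV:
        raise NotImplementedError

    def forward_hook(self, keys: List[float], values: List[float], is_prefill: bool) -> KV:
        # Assumption: only pre-filling is compressed; decoding passes through.
        if not is_prefill:
            return keys, values
        return self.compress(keys, values)


class ScorerPress(BasePress):
    """Toy scorer-based press: keeps the highest-scoring tokens."""

    def __init__(self, compression_ratio: float = 0.0):
        self.compression_ratio = compression_ratio

    def score(self, keys: List[float], values: List[float]) -> List[float]:
        raise NotImplementedError

    def compress(self, keys: List[float], values: List[float]) -> KV:
        n_kept = int(len(keys) * (1 - self.compression_ratio))
        scores = self.score(keys, values)
        # Pick the n_kept best-scoring positions, then restore token order.
        ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
        kept = sorted(ranked[:n_kept])
        return [keys[i] for i in kept], [values[i] for i in kept]


class ToyMagnitudePress(ScorerPress):
    """Hypothetical press: scores each token by the magnitude of its key."""

    def score(self, keys: List[float], values: List[float]) -> List[float]:
        return [abs(k) for k in keys]


press = ToyMagnitudePress(compression_ratio=0.5)
keys, values = [0.1, -2.0, 0.3, 1.5], [10.0, 20.0, 30.0, 40.0]
prefill_kv = press.forward_hook(keys, values, is_prefill=True)
decode_kv = press.forward_hook(keys, values, is_prefill=False)
```

In this sketch a new scorer-based press only has to override `score`, while a press that does not rank tokens could subclass `BasePress` directly and override `compress`. Either way, `compression_ratio` stays readable after the forward pass.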

Other features:

  • Add SimLayerKVPress, #28 by @SimJeg and @dame-cell
  • Add ComposedPress, #29 by @SimJeg
  • Add KeyReRotationPress, #31 by @maxjeblick and @giulio98
  • Fix QuantizedCache, #30 by @maxjeblick
  • Add new tests, including an integration test on a sample from RULER

- Python
Published by SimJeg about 1 year ago

kvpress - v0.0.4

  • Add ThinKPress, #20 by @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress -

  • Update speed and memory plots, #10 by @maxjeblick
  • Add TOVAPress, #12 by @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress - Release v0.0.2

  • Add support for QuantizedCache, #5 by @SimJeg
  • Add colab demo notebook, #6 by @maxjeblick

- Python
Published by SimJeg over 1 year ago

kvpress - Initial release

- Python
Published by SimJeg over 1 year ago