Recent Releases of kvpress

kvpress - v0.2.10

What's Changed

Migration to uv by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/108

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.9...v0.2.10

- Python
Published by alessiodevoto 10 months ago

kvpress - v0.2.9

What's Changed

Refactor evaluation by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/96
Fix QFilters and DuoAttention when used with wrapper presses by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/97
Add HuggingFace leaderboard by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/98
Fix links in benchmarks directory by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/101
Add KVzipPress by @Janghyun1230 in https://github.com/NVIDIA/kvpress/pull/93
Test head-wise compression by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/103
Run backbone model only for prefill by @giulio98 in https://github.com/NVIDIA/kvpress/pull/100
Transformers compatibility + evaluation by @alessiodevoto in https://github.com/NVIDIA/kvpress/pull/105

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.8...v0.2.9

- Python
Published by alessiodevoto 10 months ago

kvpress - v0.2.8

What's Changed

🐛 Bug Fixes

Fix failing tests by @maxjeblick in https://github.com/NVIDIA/kvpress/pull/94
Reverts changes to CriticalKVPress performed in #90 that caused the press to initialize incorrectly. The PR also fixes some test logic.

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.7...v0.2.8

- Python
Published by maxjeblick 11 months ago

kvpress - v0.2.7

What's Changed

🐛 Bug Fixes

- Fix FinchPress for Qwen models family by @alessiodevoto in #82
  Resolves compatibility issues with the Qwen model architecture in FinchPress compression.

✨ New Features

- Add KeyDiffPress and BlockPress by @figuremout in #86
  Introduces new compression methods based on key difference analysis.
- Fix for Qwen with Yarn by @giulio98 in #85
  Enables Yarn scaling in FinchPress and KeyRerotationPress.

📚 Documentation & Maintenance

- Improve documentation by @maxjeblick in #90
  Adds docstrings to all presses, with their corresponding parameters and paper reference.
- Add @alessiodevoto to authors by @maxjeblick in #92 🚀

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.2.6...v0.2.7

- Python
Published by maxjeblick 11 months ago

kvpress - v0.2.6

Improve packaging, #71 by @emmanuel-ferdman, #77 by @fanqiNO1, SPDX headers by @maxjeblick
Add LagKVPress, #77 by @JoelSeniorLiang
Support Qwen3 and Gemma3, #81 by @alessiodevoto

- Python
Published by SimJeg 12 months ago

kvpress - v0.2.5

Add PyramidKVPress, #65 by @figuremout
Fix style errors, #68 by @maxjeblick
Add FinchPress, #64 and #69, by @giulio98, @miriam-16, @FaureElia and @SimJeg

- Python
Published by SimJeg about 1 year ago

kvpress - v0.2.4

Add QFilterPress, #54 by @NathanGodey
Update copyright dates and add citation file, #60 by @SimJeg
Add ChunkKVPress, #51 by @Dominic789654

- Python
Published by SimJeg about 1 year ago

kvpress - v0.2.3

Fix distributed inference for the ExpectedAttentionPress, #49 by @SimJeg
Add DuoAttentionPress, #50 by @SimJeg

- Python
Published by SimJeg over 1 year ago

kvpress - v0.2.2

Fix style check, #48 by @maxjeblick
Add CriticalKVPress, #46 by @FFY0
Add epsilon to ExpectedAttentionPress, #47 by @SimJeg

- Python
Published by SimJeg over 1 year ago

kvpress - v0.2.1

Add ChunkPress, #40 by @maxjeblick and @giulio98
Update README, including new huggingface space, #41 and #42 by @SimJeg

- Python
Published by SimJeg over 1 year ago

kvpress - v0.2.0

Transformers v4.48 introduced breaking changes handled in this release. The release also features AdaKVPress, the first press allowing head-wise compression by patching the attention functions registered in ALL_ATTENTION_FUNCTIONS since v4.48. When combined with ExpectedAttentionPress, AdaKVPress achieved the best results observed yet on the RULER benchmark (see this post).

Add AdaKVPress, #38 by @SimJeg and @FFY0
Handle transformers 4.48, #39 by @SimJeg
Add InfiniteBench results, #11 by @maxjeblick
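
To make the notion of head-wise compression mentioned above more concrete, here is a minimal, self-contained sketch in plain Python. It is a toy illustration and not the AdaKVPress implementation: the function name `headwise_keep_mask` and the list-of-lists score layout are assumptions made for this example. It also omits any per-head safeguard and does not touch the transformers attention functions. The idea shown is only that the keep budget is shared across all heads of a layer, so heads with higher scores may retain more key-value pairs than others.

```python
from typing import List


def headwise_keep_mask(scores: List[List[float]], compression_ratio: float) -> List[List[bool]]:
    """Toy head-wise selection (hypothetical helper, not part of kvpress).

    scores[h][t] is an importance score for token t in head h.
    The budget is computed over all heads together, so each head
    may end up keeping a different number of tokens.
    """
    n_heads = len(scores)
    seq_len = len(scores[0])
    n_kept = int(n_heads * seq_len * (1 - compression_ratio))

    # Flatten (score, head, token) triples and rank them globally.
    flat = [(s, h, t) for h, row in enumerate(scores) for t, s in enumerate(row)]
    flat.sort(key=lambda item: item[0], reverse=True)

    mask = [[False] * seq_len for _ in range(n_heads)]
    for _, h, t in flat[:n_kept]:
        mask[h][t] = True
    return mask


# Two heads, four tokens each, half of all key-value pairs are kept.
example_scores = [[0.9, 0.8, 0.7, 0.1], [0.6, 0.2, 0.05, 0.3]]
example_mask = headwise_keep_mask(example_scores, compression_ratio=0.5)
kept_per_head = [sum(row) for row in example_mask]
print(kept_per_head)  # [3, 1]
```

With a uniform per-head budget each head would keep two tokens here. The shared budget instead keeps three in the first head and one in the second.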

- Python
Published by SimJeg over 1 year ago

kvpress - v0.1.1

What's Changed

https://github.com/NVIDIA/kvpress/pull/33 by @SimJeg fixes a small bug in the pipeline
https://github.com/NVIDIA/kvpress/pull/36 by @maxjeblick sets transformers <4.48 as a dependency

Full Changelog: https://github.com/NVIDIA/kvpress/compare/v0.1.0...v0.1.1

- Python
Published by maxjeblick over 1 year ago

kvpress - v0.1.0

#24 by @maxjeblick and #29 by @SimJeg introduce a non-breaking refactoring:

- A press no longer requires the compression_ratio input argument, since some presses do not explicitly need it (e.g. ThinKPress, SimLayerKVPress). However, every press must have a compression_ratio attribute after any forward pass (an assertion was added in the tests) so that the average compression ratio can be measured on a benchmark.
- The core compression logic has been moved from BasePress.forward_hook to BasePress.compress. BasePress.forward_hook now only checks whether compress must be called (pre-filling vs. decoding), de-quantizes the cache before compress, and re-quantizes it afterwards.
- BasePress no longer implements a score method; it has been moved to ScorerPress, together with the associated ScorerPress.compress method.
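
The split of responsibilities described above can be sketched with toy classes. This is an illustration under stated assumptions, not the library's code: `ToyBasePress`, `ToyScorerPress`, and `ToyNormPress` are hypothetical names, the cache is reduced to two plain Python lists, the `is_prefill` flag stands in for the pre-filling vs. decoding check, and the quantization steps are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


class ToyBasePress:
    """Hypothetical stand-in for a base press (not kvpress.BasePress)."""

    def compress(self, keys: List[float], values: List[float]) -> Tuple[List[float], List[float]]:
        raise NotImplementedError

    def forward_hook(self, keys: List[float], values: List[float], is_prefill: bool):
        # The hook only decides whether compression applies;
        # the actual logic lives in compress().
        if not is_prefill:
            return keys, values
        return self.compress(keys, values)


@dataclass
class ToyScorerPress(ToyBasePress):
    """Hypothetical score-based press: keeps the highest-scoring entries."""

    compression_ratio: float = 0.0

    def score(self, keys: List[float], values: List[float]) -> List[float]:
        raise NotImplementedError

    def compress(self, keys, values):
        scores = self.score(keys, values)
        n_kept = int(len(keys) * (1 - self.compression_ratio))
        ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
        kept = sorted(ranked[:n_kept])  # preserve the original token order
        return [keys[i] for i in kept], [values[i] for i in kept]


class ToyNormPress(ToyScorerPress):
    """Hypothetical scorer: importance is the absolute value of the key."""

    def score(self, keys, values):
        return [abs(k) for k in keys]


press = ToyNormPress(compression_ratio=0.5)
print(press.forward_hook([0.1, -3.0, 2.0, 0.5], [1.0, 2.0, 3.0, 4.0], is_prefill=True))
# ([-3.0, 2.0], [2.0, 3.0])
```

In this layout a new score-based method only needs to override `score`, while a press that does not rank entries at all can override `compress` directly.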

Other features:

- Add SimLayerKVPress, #28 by @SimJeg and @dame-cell
- Add ComposedPress, #29 by @SimJeg
- Add KeyRerotationPress, #31 by @maxjeblick and @giulio98
- Fix QuantizedCache, #30 by @maxjeblick
- Add new tests, including an integration test on a sample from RULER

- Python
Published by SimJeg over 1 year ago

kvpress - v0.0.4

Add ThinKPress, #20 by @SimJeg

- Python
Published by SimJeg over 1 year ago

kvpress -

Update speed and memory plots, #10 by @maxjeblick
Add TOVAPress, #12 by @SimJeg

- Python
Published by SimJeg over 1 year ago

kvpress - Release v0.0.2

Add support for QuantizedCache, #5 by @SimJeg
Add colab demo notebook, #6 by @maxjeblick

- Python
Published by SimJeg over 1 year ago

kvpress - Initial release

- Python
Published by SimJeg over 1 year ago
