Releases | Open Source Science

vidore-benchmark - v5.0.0: Package restructure

Description

This release is an extensive package restructure, making vidore-benchmark faster and easier to use. Among the main new features:

⚡ Faster eval with optimized data loading
💽 Support for BEIR format
🖥️ A cleaner, more intuitive CLI
🐍 Evaluate vision retrieval straight from the Python API

🚨 This release introduces many breaking changes and is NOT backward-compatible with previous versions.

Features

Added

Add CLI eval support for ColQwen2, DSEQwen2, Cohere, ColIdefics3 API embedding models
Add Pydantic models for storing the ViDoRe benchmark results and metadata (includes vidore-benchmark version)
Add option to create an EvalManager instance from ViDoReBenchmarkResults
Add num_workers argument when using dataloaders
Allow the creation of a VisionRetriever instance using a PyTorch model and a processor that implements the process_images and process_queries methods, similarly to the ColVision processors
Add dataloader_prebatch_query and dataloader_prebatch_passage arguments to avoid loading entire datasets in memory (this used to cause RAM spikes when loading large image datasets)
Add QA-to-BEIR dataset format conversion script
Add support for the BEIR dataset format with ViDoReEvaluatorBEIR
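
The processor requirement in the list above (an object exposing process_images and process_queries) can be pictured as a structural type. The sketch below is only an illustration of that idea: RetrievalProcessor, ToyProcessor, and supports_vision_retrieval are hypothetical names that do not come from vidore-benchmark, and real processors return model-specific batches rather than plain lists.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class RetrievalProcessor(Protocol):
    """Hypothetical structural type: any object with these two methods qualifies."""

    def process_images(self, images: List[Any]) -> Any: ...

    def process_queries(self, queries: List[str]) -> Any: ...


class ToyProcessor:
    """Stand-in processor used only for this illustration."""

    def process_images(self, images: List[Any]) -> List[str]:
        return [f"image:{i}" for i, _ in enumerate(images)]

    def process_queries(self, queries: List[str]) -> List[str]:
        return [q.strip().lower() for q in queries]


def supports_vision_retrieval(processor: object) -> bool:
    # runtime_checkable protocols only verify that the methods exist,
    # not that their signatures match.
    return isinstance(processor, RetrievalProcessor)
```

A check of this kind would accept any processor that provides both methods, whatever its class hierarchy, which is the duck-typing behavior the changelog entry describes.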

Changed

[Breaking] Change the CLI argument names
Add option to load a specific checkpoint for Hf models with pretrained_model_name_or_path
Improve soft dependency handling in retriever classes (the colpali-engine is now optional)
[Breaking] Change the get_scores signature
[Breaking] Rename forward_documents to forward_passages to match the literature and reduce confusion
[Breaking] Rename DSERetriever into DSEQwen2Retriever
[Breaking] Rename args in the CLI script
When available, use processor.get_scores instead of custom scoring snippet
[Breaking] Rename ColQwenRetriever to ColQwen2Retriever
[Breaking] Rename BiQwenRetriever to BiQwen2Retriever
[Breaking] Revamp the evaluate module. Evaluation is now handled by the ViDoReEvaluatorQA class
[Breaking] Rename ViDoReEvaluator into BaseViDoReEvaluator. The new ViDoReEvaluator class makes it possible to create retrievers using the Python API.
Set default num_workers to 0 in retrievers
Update default checkpoints for ColPali and ColQwen2 retrievers
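
When migrating code to this release, the identifier renames listed above can be located mechanically. The following is a minimal sketch, not a tool shipped with the package: the RENAMES table only repeats the four one-to-one renames from this changelog, and find_outdated_names is a hypothetical helper. The ViDoReEvaluator rename is left out on purpose because the old name is reused for a new class.

```python
import re
from typing import Dict, List, Tuple

# Old name -> new name, taken from the [Breaking] rename entries above.
RENAMES: Dict[str, str] = {
    "forward_documents": "forward_passages",
    "DSERetriever": "DSEQwen2Retriever",
    "ColQwenRetriever": "ColQwen2Retriever",
    "BiQwenRetriever": "BiQwen2Retriever",
}

# \b restricts matches to whole identifiers (e.g. not MyDSERetriever).
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def find_outdated_names(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for each outdated identifier."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits
```

The helper only reports occurrences; changed signatures such as get_scores and the renamed CLI arguments still need a manual review.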

Fixed

Fix evaluate_dataset when used with the BM25 retriever
Fix issue when pretrained_model_name_or_path = None in load_vision_retriever_from_registry
Fix DummyRetriever's get_scores method
Fix processor output not being sent to the correct device in ColQwen2Retriever
Fix bugs in BiQwen2Retriever
Fix try-catch block for soft dep check in SigLIPRetriever

Removed

Remove experimental quantization module
Remove the interpretability module. The interpretability code has been moved and improved as part of the colpali-engine==0.3.2 release.
[Breaking] Remove support for token pooling. This feature will be re-introduced in colpali-engine>0.3.9
Replace loguru with built-in logging module
Remove the retrieve_on_dataset and retrieve_on_pdfs entrypoint scripts
Remove the pdf_utils module
Remove the get_top_k method from the evaluate module
Remove the plot_utils and test_utils modules
Remove the experiments directory

Tests

Add tests for all built-in vision retrievers
Add fixtures in retriever tests to speed up testing
Add tests for ViDoReBenchmarkResults
Add tests for EvalManager
Add tests and E2E tests for cli command evaluate-retriever
Add tests for ViDoReEvaluatorBEIR

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v4.0.2...v5.0.0

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v4.0.2

Features

Deprecated

Deprecate the interpretability module

Build

Fix and update conflicts for package dependencies

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v4.0.1

Features

Changed

Rename ColPali model alias to match model name (use --model-name vidore/colpali instead of --model-name vidore/colpali-v1.2 with the vidore-benchmark evaluate-retriever CLI)
Use the ColPali model name to load ColPaliProcessor instead of the PaliGemma one

Fixed

Add missing model.eval() to all vision retrievers to make results deterministic

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v4.0.0...v4.0.1

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v4.0.0

Description

🚨 This release is breaking as it leverages the restructured colpali-engine package. In particular, the duplicate code logic was removed. Thus, it is NOT backward-compatible with previous versions.

Features

Added

Add "Ruff" and "Test" CI pipelines
Add upper bound for colpali-engine to prevent potential breaking changes (follow-up of https://github.com/illuin-tech/vidore-benchmark/pull/30)

Changed

Remove unused deps from pyproject
Clean pyproject
Bump colpali-engine to v0.3.0 and adapt code for the new API
Replace black with ruff linter
Add better ColPali model loading

Removed

Remove duplicate code with colpali-engine (e.g. remove ColPaliProcessor, ColPaliScorer...)

Fixed

Change typing to support Python 3.9
Fix the generate_similarity_maps CLI
Various fixes

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.2...v4.0.0

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.2

Features

Fixed

Fix typo when making model_name configurable in previous release
Fix wrong image processing for ColPali

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.1...v3.4.2

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.1

What's Changed

Make model_name configurable in ColPaliRetriever as optional arg
Tweak EvalManager
Tweak ColPaliProcessor
Improve tests

New Contributors

@ByeongkiJeong made their first contribution in https://github.com/illuin-tech/vidore-benchmark/pull/26

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.0...v3.4.1

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.0

Features

Bump ColPali to v1.2 [model card]

Build

Tweak gitignore
Add sentence-transformers to compulsory dependencies
Remove the baselines dependency group
Add the colpali-engine dependency

Style

Apply linter

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v3.3.0

Features

[retriever] Deprecate unused ColPaliWithPoolingRetriever
[compression] Add int8 quantization for embeddings (experimental)
[compression] Add tests for int8 quantization
[compression] Re-organize quantization modules
[build] Add dynamic versioning with hatch-vcs

Fixes

[retriever] Fix dtype conversion in ColPaliScorer
[compression] Fix int8 overflow when computing ColBERT score with int8 embeddings
[compression] Fix quantization tests
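
The int8 overflow mentioned above reflects a general pitfall: the product of two int8 values does not fit in int8, so a dot product has to be accumulated in a wider type. The pure-Python sketch below illustrates the effect and is not the code from the quantization module. All function names are hypothetical, ctypes.c_int8 is used only to emulate 8-bit wraparound, and maxsim_score assumes the usual ColBERT late-interaction definition (sum over query tokens of the maximum dot product over passage tokens).

```python
import ctypes
from typing import List

Vector = List[int]  # entries assumed to lie in the int8 range [-128, 127]


def dot_int8_accumulator(a: Vector, b: Vector) -> int:
    """Dot product whose running sum is stored as int8, so it wraps around."""
    acc = 0
    for x, y in zip(a, b):
        # ctypes.c_int8 performs no overflow check; it keeps the low 8 bits.
        acc = ctypes.c_int8(acc + x * y).value
    return acc


def dot_wide_accumulator(a: Vector, b: Vector) -> int:
    """Dot product accumulated in a Python int, which cannot overflow."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query: List[Vector], passage: List[Vector]) -> int:
    """ColBERT-style late interaction: sum over query tokens of the best match."""
    return sum(
        max(dot_wide_accumulator(q, p) for p in passage)
        for q in query
    )
```

With tensor libraries the same idea usually amounts to casting the int8 embeddings to a wider dtype before the matrix product.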

Documentation

Update README

Chores

Add loggers in modules (with loguru)

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v3.2.0

Features

Add experiments on token pooling
Add from_multiple_json method in EvalManager

Fixes

Fix error when using token pooling with bfloat16 tensors
Add missing L2 normalization in token pooling
Fix typos in the root README

Chores

Add GitHub Linguist to remove notebook from repo stats

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v3.1.0

Features

Add support for token pooling in document embeddings
Update README

Build

Add support for SkyPilot
Fix ruff settings

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v3.0.0

Features

[Breaking] All the CLI arguments now clearly show if the batch size is related to the query embedding computation, the doc embedding computation, or the scoring
Add barplot for score per token in the interpretability script
Update README and docstrings

Fixes

Fix DummyRetriever
Fix a few broken retriever tests

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v2.2.1

Fixes

Fix wrong dictionary key in the evaluate_retriever CLI script

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v2.2.0

Features

Currify vision retriever registry + fix wrong typing
Rename requirements.txt to requirements-dev.txt to avoid confusion

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v2.1.0

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v2.0.0

Breaking

For all instances of VisionRetriever, the forward_queries and forward_documents methods now require a batch_size argument

Features

Transfer repository ownership from tonywu71/vidore-benchmark to illuin-tech/vidore-benchmark
Update documentation

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v1.0.1

Features

Add missing references to ColPali: Efficient Document Retrieval with Vision Language Models [arXiv]
Fix typos and tweak README

Fixes

Fix a bug for the retrieve_on_pdfs entrypoint script

- Python
Published by tonywu71 almost 2 years ago

vidore-benchmark - v1.0.0

Features

Add evaluation scripts for the ViDoRe benchmark introduced in the "ColPali: Efficient Document Retrieval with Vision Language Models" paper.
Add off-the-shelf retriever implementations from the ColPali paper: BGE-M3, BM25, Jina-CLIP, Nomic-Vision, SigLIP, and ColPali.
Add code to generate similarity maps following the methodology introduced in the ColPali paper.
Add entrypoint scripts: vidore-benchmark (evaluation scripts) and generate-similarity-maps (interpretability).

- Python
Published by tonywu71 almost 2 years ago
