Recent Releases of vidore-benchmark

vidore-benchmark - v5.0.0: Package restructure

Description

This release is an extensive package restructure, making vidore-benchmark faster and easier to use. Among the main new features:

  • ⚡ Faster eval with optimized data loading
  • 💽 Support for BEIR format
  • 🖥️ A cleaner, more intuitive CLI
  • 🐍 Evaluate vision retrieval straight from the Python API

🚨 This release introduces many breaking changes and is NOT backward-compatible with previous versions.

Features

Added

  • Add CLI eval support for ColQwen2, DSEQwen2, Cohere, ColIdefics3 API embedding models
  • Add Pydantic models for storing the ViDoRe benchmark results and metadata (includes vidore-benchmark version)
  • Add option to create an EvalManager instance from ViDoReBenchmarkResults
  • Add num_workers argument when using dataloaders
  • Allow the creation of a VisionRetriever instance from a PyTorch model and a processor that implements the process_images and process_queries methods, similarly to the ColVision processors
  • Add dataloader_prebatch_query and dataloader_prebatch_passage arguments to avoid loading entire datasets into memory (this used to cause RAM spikes when loading large image datasets)
  • Add QA-to-BEIR dataset format conversion script
  • Add support for the BEIR dataset format with ViDoReEvaluatorBEIR
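
The memory concern behind the prebatch arguments above can be illustrated with a generic sketch. This is not the library's implementation, and how dataloader_prebatch_query and dataloader_prebatch_passage work internally is not described in these notes. The names iter_batches and lazy_passages are hypothetical and exist only for this example. The point is that a generator yields one small batch at a time, so the full dataset never has to sit in memory at once.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most `batch_size` items without materializing `items`."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def lazy_passages(count: int) -> Iterator[str]:
    """Hypothetical stand-in for a dataset that produces passages one by one."""
    for index in range(count):
        yield f"passage-{index}"


if __name__ == "__main__":
    for batch in iter_batches(lazy_passages(5), batch_size=2):
        print(batch)
```

Only one batch is alive at any time, which is the property that avoids RAM spikes on large image datasets.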

Changed

  • [Breaking] Change the CLI argument names
  • Add option to load a specific checkpoint for Hf models with pretrained_model_name_or_path
  • Improve soft dependency handling in retriever classes (the colpali-engine is now optional)
  • [Breaking] Change the get_scores signature
  • [Breaking] Rename forward_documents to forward_passages to match the literature and reduce confusion
  • [Breaking] Rename DSERetriever into DSEQwen2Retriever
  • [Breaking] Rename args in the CLI script
  • When available, use processor.get_scores instead of custom scoring snippet
  • [Breaking] Rename ColQwenRetriever to ColQwen2Retriever
  • [Breaking] Rename BiQwenRetriever to BiQwen2Retriever
  • [Breaking] Revamp the evaluate module. Evaluation is now handled by the ViDoReEvaluatorQA class
  • [Breaking] Rename ViDoReEvaluator to BaseViDoReEvaluator. The new ViDoReEvaluator class allows creating retrievers using the Python API.
  • Set default num_workers to 0 in retrievers
  • Update default checkpoints for ColPali and ColQwen2 retrievers
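
When migrating code written for v4, the renames listed above can be located mechanically. The following is a minimal sketch and not part of vidore-benchmark. The RENAMES table copies old and new names from this changelog, while find_outdated_names is a hypothetical helper written for this example. ViDoReEvaluator is deliberately left out of the table because that name still exists in v5.0.0 with a new meaning, so it needs manual review.

```python
import re
from typing import Dict, List, Tuple

# Old name -> new name, taken from the "[Breaking] Rename ..." entries above.
RENAMES: Dict[str, str] = {
    "forward_documents": "forward_passages",
    "DSERetriever": "DSEQwen2Retriever",
    "ColQwenRetriever": "ColQwen2Retriever",
    "BiQwenRetriever": "BiQwen2Retriever",
}


def find_outdated_names(source: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for each outdated identifier."""
    hits: List[Tuple[int, str, str]] = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for old, new in RENAMES.items():
            # Word boundaries avoid matching inside longer identifiers.
            if re.search(rf"\b{re.escape(old)}\b", line):
                hits.append((line_number, old, new))
    return hits


if __name__ == "__main__":
    sample = "retriever = DSERetriever()\nretriever.forward_documents(docs)"
    for line_number, old, new in find_outdated_names(sample):
        print(f"line {line_number}: replace {old} with {new}")
```

The helper only reports candidates. Signature changes such as the new get_scores arguments still have to be adapted by hand.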

Fixed

  • Fix evaluate_dataset when used with the BM25 retriever
  • Fix issue when pretrained_model_name_or_path = None in load_vision_retriever_from_registry
  • Fix DummyRetriever's get_scores method
  • Fix processor output not being sent to the correct device in ColQwen2Retriever
  • Fix bugs in BiQwen2Retriever
  • Fix try-catch block for soft dep check in SigLIPRetriever

Removed

  • Remove experimental quantization module
  • Remove the interpretability module. The interpretability code has been moved and improved as part of the colpali-engine==0.3.2 release.
  • [Breaking] Remove support for token pooling. This feature will be re-introduced in colpali-engine>0.3.9
  • Replace loguru with built-in logging module
  • Remove the retrieve_on_dataset and retrieve_on_pdfs entrypoint scripts
  • Remove the pdf_utils module
  • Remove the get_top_k method from the evaluate module
  • Remove the plot_utils and test_utils modules
  • Remove the experiments directory

Tests

  • Add tests for all built-in vision retrievers
  • Add fixtures in retriever tests to speed up testing
  • Add tests for ViDoReBenchmarkResults
  • Add tests for EvalManager
  • Add tests and E2E tests for cli command evaluate-retriever
  • Add tests for ViDoReEvaluatorBEIR

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v4.0.2...v5.0.0

- Python
Published by tonywu71 about 1 year ago

vidore-benchmark - v4.0.2

Features

Deprecated

  • Deprecate the interpretability module

Build

  • Fix and update conflicts for package dependencies

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v4.0.1

Features

Changed

  • Rename ColPali model alias to match model name (use --model-name vidore/colpali instead of --model-name vidore/colpali-v1.2 with the vidore-benchmark evaluate-retriever CLI)
  • Use the ColPali model name to load ColPaliProcessor instead of the PaliGemma one

Fixed

  • Add missing model.eval() to all vision retrievers to make results deterministic

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v4.0.0...v4.0.1

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v4.0.0

Description

🚨 This release is breaking as it leverages the restructured colpali-engine package. In particular, the duplicate code logic was removed. Thus, it is NOT backward-compatible with previous versions.

Features

Added

  • Add "Ruff" and "Test" CI pipelines
  • Add upper bound for colpali-engine to prevent potential breaking changes (follow-up to https://github.com/illuin-tech/vidore-benchmark/pull/30)

Changed

  • Remove unused deps from pyproject
  • Clean pyproject
  • Bump colpali-engine to v0.3.0 and adapt code for the new API
  • Replace black with ruff linter
  • Add better ColPali model loading

Removed

  • Remove duplicate code with colpali-engine (e.g. remove ColPaliProcessor, ColPaliScorer...)

Fixed

  • Change typing to support Python 3.9
  • Fix the generate_similarity_maps CLI
  • Various fixes
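
The notes do not say which annotations were changed for Python 3.9. One common cause of such a fix is the `X | None` union syntax, which is evaluated when the function is defined and only works from Python 3.10 onward, unless `from __future__ import annotations` is used. The sketch below shows the 3.9-compatible spelling with typing.Optional. The function first_non_empty is a hypothetical example and does not come from the package.

```python
from typing import List, Optional


def first_non_empty(candidates: List[Optional[str]]) -> Optional[str]:
    """Return the first candidate that is neither None nor empty.

    On Python 3.9, writing the annotation as `str | None` would raise a
    TypeError at definition time, so Optional[str] is used instead.
    """
    for candidate in candidates:
        if candidate:
            return candidate
    return None


if __name__ == "__main__":
    print(first_non_empty([None, "", "vidore/colpali"]))
```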

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.2...v4.0.0

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.2

Features

Fixed

  • Fix typo when making model_name configurable in previous release
  • Fix wrong image processing for ColPali

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.1...v3.4.2

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.1

What's Changed

  • Make model_name configurable in ColPaliRetriever as optional arg
  • Tweak EvalManager
  • Tweak ColPaliProcessor
  • Improve tests

New Contributors

  • @ByeongkiJeong made their first contribution in https://github.com/illuin-tech/vidore-benchmark/pull/26

Full Changelog: https://github.com/illuin-tech/vidore-benchmark/compare/v3.4.0...v3.4.1

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.4.0

Features

Build

  • Tweak gitignore
  • Add sentence-transformers to compulsory dependencies
  • Remove the baselines dependency group
  • Add the colpali-engine dependency

Style

  • Apply linter

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.3.0

Features

  • [retriever] Deprecate unused ColPaliWithPoolingRetriever
  • [compression] Add int8 quantization for embeddings (experimental)
  • [compression] Add tests for int8 quantization
  • [compression] Re-organize quantization modules
  • [build] Add dynamic versioning with hatch-vcs

Fixes

  • [retriever] Fix dtype conversion in ColPaliScorer
  • [compression] Fix int8 overflow when computing ColBERT score w/ int8 embeddings
  • [compression] Fix quantization tests
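
The int8 overflow mentioned above is easy to reproduce in isolation. The sketch below is an illustration under stated assumptions, not the code that was fixed. It simulates 8-bit wraparound in pure Python and compares a ColBERT-style late-interaction score (sum over query tokens of the maximum dot product over document tokens) computed with an int8 accumulator against one computed with a wide accumulator. All function names are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[int]


def wrap_int8(value: int) -> int:
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


def dot_int8(a: Vector, b: Vector) -> int:
    """Dot product where every product and partial sum is stored as int8."""
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_int8(acc + wrap_int8(x * y))
    return acc


def dot_wide(a: Vector, b: Vector) -> int:
    """Dot product accumulated in an unbounded (wide) integer."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(
    query: List[Vector], doc: List[Vector], dot: Callable[[Vector, Vector], int]
) -> int:
    """ColBERT-style score: for each query token, take the best document match."""
    return sum(max(dot(q, d) for d in doc) for q in query)


if __name__ == "__main__":
    query = [[100, 100]]
    doc = [[100, 100], [1, 1]]
    print("int8 accumulator:", maxsim_score(query, doc, dot_int8))
    print("wide accumulator:", maxsim_score(query, doc, dot_wide))
```

Widening the values before multiplying and accumulating keeps the int8 storage benefit while avoiding the wraparound.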

Documentation

  • Update README

Chores

  • Add loggers in modules (with loguru)

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.2.0

Features

  • Add experiments on token pooling
  • Add from_multiple_json method in EvalManager

Fixes

  • Fix error when using token pooling with bfloat16 tensors
  • Add missing L2 normalization in token pooling
  • Fix typos in the root README
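
Why the L2 normalization matters after pooling can be shown with a small sketch. The pooling strategy used here (mean over fixed-size groups of consecutive tokens) is an assumption chosen for illustration, since these notes do not describe the actual method. The helper names are hypothetical. Averaging unit-norm vectors generally produces a vector with norm below 1, so the pooled embeddings are re-normalized to keep dot-product scores comparable.

```python
import math
from typing import List

Vector = List[float]


def l2_normalize(vector: Vector) -> Vector:
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def pool_tokens(embeddings: List[Vector], group_size: int) -> List[Vector]:
    """Mean-pool consecutive groups of tokens, then L2-normalize each result."""
    pooled: List[Vector] = []
    for start in range(0, len(embeddings), group_size):
        group = embeddings[start:start + group_size]
        pooled.append(l2_normalize(mean_pool(group)))
    return pooled


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    for vector in pool_tokens(tokens, group_size=2):
        print(vector, math.sqrt(sum(x * x for x in vector)))
```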

Chores

  • Add GitHub Linguist to remove notebook from repo stats

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.1.0

Features

  • Add support for token pooling in document embeddings
  • Update README

Build

  • Add support for SkyPilot
  • Fix ruff settings

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v3.0.0

Features

  • [Breaking] All the CLI arguments now clearly show if the batch size is related to the query embedding computation, the doc embedding computation, or the scoring
  • Add barplot for score per token in the interpretability script
  • Update README and docstrings

Fixes

  • Fix DummyRetriever
  • Fix a few broken retriever tests

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v2.2.1

Fixes

  • Fix wrong dictionary key in the evaluate_retriever CLI script

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v2.2.0

Features

  • Currify vision retriever registry + fix wrong typing
  • Rename requirements.txt to requirements-dev.txt to avoid confusion

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v2.1.0

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v2.0.0

Breaking

  • For all instances of VisionRetriever, the forward_queries and forward_documents methods now require a batch_size argument

Features

  • Transfer repository ownership from tonywu71/vidore-benchmark to illuin-tech/vidore-benchmark
  • Update documentation

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v1.0.1

Features

  • Add missing references to ColPali: Efficient Document Retrieval with Vision Language Models [arXiv]
  • Fix typos and tweak README

Fixes

  • Fix a bug for the retrieve_on_pdfs entrypoint script

- Python
Published by tonywu71 over 1 year ago

vidore-benchmark - v1.0.0

Features

  • Add evaluation scripts for the ViDoRe benchmark introduced in the "ColPali: Efficient Document Retrieval with Vision Language Models" paper.
  • Add off-the-shelf retriever implementations from the ColPali paper: BGE-M3, BM25, Jina-CLIP, Nomic-Vision, SigLIP, and ColPali.
  • Add code to generate similarity maps following the methodology introduced in the ColPali paper.
  • Add entrypoint scripts: vidore-benchmark (evaluation scripts) and generate-similarity-maps (interpretability).

- Python
Published by tonywu71 over 1 year ago