Updated 9 months ago

vidore-benchmark • Rank 14.9 • Science 77%

Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.

Updated 9 months ago

colpali-engine • Rank 21.4 • Science 67%

The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.

Updated 9 months ago

https://github.com/altunenes/calcarine • Rank 1.4 • Science 26%

Desktop VLM: Real-time FastVLM analysis of video & textures with live compute shaders

Updated 9 months ago

urban-worm • Science 44%

Urban-Worm is a Python library that integrates remote sensing imagery, street view data, and multimodal models to assess environments and urban units

Updated 9 months ago

awesome-robotics-3d • Science 36%

A curated list of 3D vision papers related to the robotics domain in the era of large models, i.e. LLMs/VLMs, inspired by awesome-computer-vision, including papers, code, and related websites

Updated 9 months ago

spatialfusion-lm • Science 26%

SpatialFusion-LM is a real-time spatial reasoning framework that combines neural depth, 3D reconstruction, and language-driven scene understanding.