Updated 10 months ago

boxmot • Rank 22.4 • Science 77%

BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection and pose estimation models

Updated 10 months ago

clipseq • Rank 6.4 • Science 77%

CLIP sequencing analysis pipeline for QC, pre-mapping, genome mapping, UMI deduplication, and multiple peak-calling options.

Updated 10 months ago

uform • Rank 15.6 • Science 64%

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Updated 10 months ago

panoptic • Rank 10.3 • Science 54%

Explore and analyze large datasets of images

Updated 10 months ago

ppdiffusers • Rank 19.3 • Science 36%

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.

Updated 10 months ago

marqo-fashionclip • Rank 6.4 • Science 44%

State-of-the-art CLIP/SigLIP embedding models finetuned for the fashion domain. +57% increase in evaluation metrics vs FashionCLIP 2.0.

Updated 9 months ago

https://github.com/ai-forever/ru-clip • Rank 7.5 • Science 33%

CLIP implementation for Russian language

Updated 9 months ago

https://github.com/bentoml/clip-api-service • Rank 9.2 • Science 13%

CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification and reverse image search

Updated 9 months ago

https://github.com/capjamesg/sam-clip • Rank 3.4 • Science 13%

Use Grounding DINO, Segment Anything, and CLIP to label objects in images.

Updated 10 months ago

b-cosification • Science 54%

[NeurIPS 2024] Code for the paper: B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable.

Updated 10 months ago

geospatial-rag • Science 26%

AI Framework for Remote Sensing Image Analysis using RAG - 88%+ accuracy, multi-modal queries, ChatGPT-like interface

Updated 10 months ago

bayesvlm • Science 54%

Code for Post-hoc Probabilistic Vision-Language Models

Updated 9 months ago

https://github.com/924973292/mambapro • Science 23%

[AAAI 2025] MambaPro: Multi-Modal Object Re-Identification with Mamba Aggregation and Synergistic Prompt

Updated 10 months ago

spn4cir • Science 54%

[ACM MM 2024] Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

Updated 10 months ago

motherboard-dataset • Science 31%

[Kaggle Dataset] Motherboard production defect dataset for object detection. Currently available for YOLOv5, YOLOv7, YOLOv8 & CLIP. Also available on Kaggle.

Updated 9 months ago

https://github.com/ajaymin28/humanactionrecognition • Science 26%

CLIP-based human action recognition: alignment of text and images using prompt engineering.

Updated 9 months ago

https://github.com/capjamesg/webispy • Science 13%

I, Spy: A cool web guessing game 🧊

Updated 9 months ago

https://github.com/autodistill/autodistill-metaclip • Science 36%

MetaCLIP module for use with Autodistill.

Updated 10 months ago

uninfo • Science 54%

The official code for "Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption."

Updated 10 months ago

peka-eclip • Science 44%

Download and prepare ENCODE eCLIP raw fastq for processing with the nf-core/clipseq pipeline

Updated 10 months ago

bioclip • Science 85%

This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral, Best Student Paper].

Updated 9 months ago

understanding-clip-ood • Science 36%

Official code for the paper: "When and How Does CLIP Enable Domain and Compositional Generalization?" (ICML 2025 Spotlight)

Updated 10 months ago

bioclip-2 • Science 75%

Repository for the BioCLIP 2 model project.

Updated 9 months ago

https://github.com/chen-yang-liu/git-rsclip • Science 49%

Git-RSCLIP pre-trained on 10 million remote sensing image-text pairs