TorchGAN
TorchGAN: A Flexible Framework for GAN Training and Evaluation - Published in JOSS (2021)
wrenfold
wrenfold: Symbolic code generation for robotics - Published in JOSS (2025)
MapReader
MapReader: Open software for the visual analysis of maps - Published in JOSS (2024)
Manif
Manif: A micro Lie theory library for state estimation in robotics applications - Published in JOSS (2020)
Distant Viewing Toolkit
Distant Viewing Toolkit: A Python Package for the Analysis of Visual Culture - Published in JOSS (2020)
sahi
Framework agnostic sliced/tiled inference + interactive ui + error analysis plots
Annotate-Lab
Annotate-Lab: Simplifying Image Annotation - Published in JOSS (2024)
SIHR
SIHR: a MATLAB/GNU Octave toolbox for single image highlight removal - Published in JOSS (2020)
pfla
pfla: A Python Package for Dental Facial Analysis using Computer Vision and Statistical Shape Analysis - Published in JOSS (2018)
datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
BluVision Macro - a software for automated powdery mildew and rust disease quantification on detached leaves.
BluVision Macro - a software for automated powdery mildew and rust disease quantification on detached leaves. - Published in JOSS (2020)
org.pytorch:torchvision_ops
Datasets, Transforms and Models specific to Computer Vision
awesome-implicit-nerf-robotics
A comprehensive list of Implicit Representations and NeRF papers relating to Robotics/RL domain, including papers, codes, and related websites
ammico
AI-based Media and Misinformation Content Analysis Tool: Analyze text and images
rf-detr
RF-DETR is a real-time object detection model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning.
inference
Turn any computer or edge device into a command center for your computer vision projects.
catalyst
Accelerated deep learning R&D
pytrack
A Map-Matching-based Python Toolbox for Vehicle Trajectory Reconstruction
rastervision
An open source library and framework for deep learning on satellite and aerial imagery.
mmagic
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image generation, image/video restoration/enhancement, etc.
mani-skill
SAPIEN Manipulation Skill Framework, an open source GPU parallelized robotics simulator and benchmark, led by Hillbot, Inc.
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
grand-challenge.org
A platform for end-to-end development of machine learning solutions in biomedical imaging
labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
soccernet-calibration-sportlight
SoccerNet@CVPR | 1st place solution for Camera Calibration Challenge 2023
grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
geo-trax
🚀 Geo-trax is a comprehensive pipeline for extracting and analyzing high-accuracy georeferenced vehicle trajectories from quasi-stationary, bird’s-eye view drone footage. Using advanced computer vision and deep learning, it enables detailed urban traffic analysis and supports scalable, precise studies of vehicle dynamics.
openpifpaf
Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.
mulimgviewer
MulimgViewer is a multi-image viewer that can open multiple images in one interface, which is convenient for image comparison and image stitching.
elpv-dataset
A dataset of functional and defective solar cells extracted from EL images of solar modules
https://github.com/rerun-io/rerun
Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
gandetection
Detecting GAN generated Images using Convolutional Neural Networks
cam2bev
TensorFlow Implementation for Computing a Semantically Segmented Bird's Eye View (BEV) Image Given the Images of Multiple Vehicle-Mounted Cameras.
https://github.com/facebookresearch/habitat-lab
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
code_dcf
This repository contains the Firth bias reduction experiments on the few-shot distribution calibration method conducted in the ICLR 2022 spotlight paper "On the Importance of Firth Bias Reduction in Few-Shot Classification".
ethicml
Package for evaluating the performance of methods which aim to increase fairness, accountability and/or transparency
pylocron
PyTorch implementations of recent Computer Vision tricks (ReXNet, RepVGG, Unet3p, YOLOv4, CIoU loss, AdaBelief, PolyLoss, MobileOne). Other additions: AdEMAMix
https://github.com/lancedb/lance
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.
satellighte
📡 PyTorch Lightning Implementations of Recent Satellite Image Classification!
autodistill
Images to inference with no labeling (use foundation models to train supervised models).
cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
dlib
A toolkit for making real world machine learning and data analysis applications in C++
anylabeling
Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything (SAM+SAM2), MobileSAM!!
https://github.com/allenai/allenact
An open source framework for research in Embodied-AI from AI2.
augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
light-side
⚡️ PyTorch Lightning Implementations of Recent Low-Light Image Enhancement!
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
cvat
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://github.com/carla-simulator/carla
Open-source simulator for autonomous driving research.
rf100-vl
Code from the paper "Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models"
https://github.com/bluebrain/atlas-alignment
Blue Brain multi-modal registration and alignment toolbox
aeolus-ocean
An all-weather, day-and-night, collision avoidance simulator that can be implemented as a digital twin for the autonomous COLREG-compliant navigation of maritime vessels.
tyc-dataset
Official and maintained implementation of the dataset paper "The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures" [ICCVW 2023].
mexca
Multimodal Emotion eXpression Capture Amsterdam. Pipeline for capturing emotion expressions from multiple modalities (video, audio, text) in the wild.
code_firth
This repository contains the main ResNet backbone experiments conducted in the ICLR 2022 spotlight paper "On the Importance of Firth Bias Reduction in Few-Shot Classification".
megadetector
MegaDetector is an AI model that helps conservation folks spend less time doing boring things with camera trap images.
https://github.com/bchao1/fast-poisson-image-editing
Fast, scalable, and extensive implementations of Poisson image editing algorithms.
hugsvision
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
maaassistantarknights
An assistant for Arknights (《明日方舟》): a one-click tool for the daily tasks of Arknights, supporting all clients.
ktrain
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
CameraTraps
PyTorch Wildlife: a Collaborative Deep Learning Framework for Conservation.
cppe5
Code for our paper CPPE-5 (Medical Personal Protective Equipment), a new challenging object detection dataset
https://github.com/google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
yuzumarker.fontdetection
✨ First-ever CJK (Chinese, Japanese, Korean) font recognition and style extraction model; the font recognition model and its implementation, a side project of YuzuMarker
face-mask-detection
Face Mask Detection system based on computer vision and deep learning using OpenCV and Tensorflow/Keras
roboflow-python
The official Roboflow Python package. Manage your datasets, models, and deployments. Roboflow has everything you need to build a computer vision application.
computer-vision-in-action
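A computer vision closed-loop learning platform where code can be run interactively online. Closed-loop learning: the Chinese e-book "Computer Vision in Action: Algorithms and Applications", source code, and reader community (continuously updated ...). 📘 Online e-book: https://charmve.github.io/computer-vision-in-action/ 👇 Project homepage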