Updated 9 months ago
https://github.com/924973292/idea
[CVPR2025] IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification
Updated 9 months ago
https://github.com/aim-uofa/omni-r1
Official repository for Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Updated 9 months ago
https://github.com/aim-uofa/active-o3
ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
Updated 9 months ago
https://github.com/aim-uofa/segagent
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories