Updated 9 months ago

https://github.com/924973292/idea • Science 23%

[CVPR2025] IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Updated 9 months ago

https://github.com/aim-uofa/omni-r1 • Science 36%

Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Updated 9 months ago

https://github.com/aim-uofa/active-o3 • Science 36%

ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

Updated 9 months ago

https://github.com/aim-uofa/segagent • Science 23%

[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories