https://github.com/cvi-szu/mg-motionllm
[CVPR 2025] MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org, scholar.google
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.1%) to scientific vocabulary
Repository
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
(CVPR 2025) MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
[Bizhu Wu](https://scholar.google.com/citations?user=u7nZ3bgAAAAJ&hl=en) · [Jinheng Xie](https://scholar.google.com/citations?user=smbRMokAAAAJ&hl=en) · Keming Shen · [Zhe Kong](https://scholar.google.com/citations?user=4X3yLwsAAAAJ&hl=en) · [Jianfeng Ren*](https://scholar.google.com/citations?user=ZZ928OgAAAAJ&hl=en) · [Ruibin Bai](https://scholar.google.com/citations?user=oP6AThIAAAAJ&hl=en) · [Rong Qu](https://scholar.google.com/citations?user=ErszCRMAAAAJ&hl=en) · [Linlin Shen*](https://scholar.google.com/citations?user=AZ_y9HgAAAAJ&hl=en)
*Corresponding Authors
[arXiv:2504.02478](https://arxiv.org/abs/2504.02478)
Description
MG-MotionLLM can address diverse motion-relevant tasks at multiple granularities by giving different instructions in a unified manner.
- coarse-grained: e.g. text-to-motion and motion captioning (upper block)
- fine-grained: e.g. motion-to-detailed text and motion localization (bottom block).
To achieve this, we propose a multi-granularity training scheme with novel auxiliary tasks that captures motion-related features at different levels, improving understanding across a wide range of tasks. Specifically, we pretrain the model on a total of 28 distinct motion-relevant tasks: 12 existing classical coarse-grained tasks and 16 newly proposed fine-grained ones. Here, we display examples of prompt templates for some of the tasks used during training.
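As a rough illustration of instruction-driven multi-task training, the sketch below keeps a small registry of tasks tagged by granularity and fills a prompt template for each sampled task. It is not the authors' implementation: the `Task` class, the `build_prompt` and `sample_tasks` helpers, the `<motion_tokens>` placeholder, and the wording of every template are assumptions made for illustration. Only the four task names come from the description above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    granularity: str  # "coarse" or "fine"
    template: str     # hypothetical instruction template, not from the paper


# Hypothetical registry: task names follow the README, template wording is invented.
TASKS = [
    Task("text-to-motion", "coarse",
         "Generate a motion matching this description: {text}"),
    Task("motion captioning", "coarse",
         "Describe this motion: {motion}"),
    Task("motion-to-detailed text", "fine",
         "Describe the movement of each body part in this motion: {motion}"),
    Task("motion localization", "fine",
         "Find the time span in {motion} that matches: {text}"),
]


def build_prompt(task: Task, **fields: str) -> str:
    """Fill a task's template; fields the template does not use are ignored."""
    return task.template.format(**fields)


def sample_tasks(tasks: list, n: int, seed: int = 0) -> list:
    """Draw n tasks uniformly at random so each batch mixes granularities."""
    rng = random.Random(seed)
    return [rng.choice(tasks) for _ in range(n)]


if __name__ == "__main__":
    for task in sample_tasks(TASKS, 3):
        prompt = build_prompt(task, text="a person waves", motion="<motion_tokens>")
        print(f"[{task.granularity}] {task.name}: {prompt}")
```

Uniform sampling is only one possible choice here. A real training setup might weight tasks by dataset size or granularity, which this sketch does not attempt to model.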
Visualization
We display some novel applications of our MG-MotionLLM.
- text-driven fine-grained motion editing: Temporal Editing (left), Spatial Editing (middle), and Spatial-Temporal Editing (right).
- fine-grained captioning of both whole (top) and partial (bottom) motion sequences, and motion localization via fine-grained textual description (middle).
More Information (code, weights, etc)
For code, weights, etc, please see here.
Bibtex
If you use our code in your research, kindly cite our work:
@article{wu2025mg,
  title={MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities},
  author={Wu, Bizhu and Xie, Jinheng and Shen, Keming and Kong, Zhe and Ren, Jianfeng and Bai, Ruibin and Qu, Rong and Shen, Linlin},
  journal={arXiv preprint arXiv:2504.02478},
  year={2025}
}
Owner
- Name: Computer Vision Institute, SZU
- Login: CVI-SZU
- Kind: organization
- Location: Shenzhen University, Shenzhen, China
- Website: http://cv.szu.edu.cn/
- Repositories: 13
- Profile: https://github.com/CVI-SZU
Computer Vision Institute, Shenzhen University
GitHub Events
Total
- Issues event: 1
- Watch event: 26
- Push event: 4
- Create event: 2
Last Year
- Issues event: 1
- Watch event: 26
- Push event: 4
- Create event: 2