Updated 9 months ago
https://github.com/ai-forever/aggme
Aggregation framework for annotating datasets in computer vision tasks (detection, segmentation, video captioning, etc.).
Updated 9 months ago
https://github.com/bytedance/shot2story
Shot2Story, a new multi-shot video understanding benchmark with comprehensive video summaries and detailed shot-level captions.