fracal
[CVPR2025] Fractal calibration for long-tailed object detection
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.4%) to scientific vocabulary
Repository
[CVPR2025] Fractal calibration for long-tailed object detection
Basic Info
- Host: GitHub
- Owner: kostas1515
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 14.5 MB
Statistics
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
[CVPR2025] Fractal Calibration for long-tailed object detection
Abstract:
Real-world datasets follow an imbalanced distribution, which poses significant challenges in rare-category object detection. Recent studies tackle this problem by developing re-weighting and re-sampling methods that utilise the class frequencies of the dataset. However, these techniques focus solely on the frequency statistics and ignore the distribution of the classes in image space, missing important information. In contrast, we propose FRActal CALibration (FRACAL): a novel post-calibration method for long-tailed object detection. FRACAL devises a logit adjustment method that utilises the fractal dimension to estimate how uniformly classes are distributed in image space. During inference, it uses the fractal dimension to inversely downweight the probabilities of uniformly spaced class predictions, achieving balance in two axes: between frequent and rare categories, and between uniformly spaced and sparsely spaced classes. FRACAL is a post-processing method that does not require any training, and it can be combined with many off-the-shelf models such as one-stage sigmoid detectors and two-stage instance segmentation models. FRACAL boosts the rare class performance by up to 8.6% and surpasses all previous methods on the LVIS dataset, while showing good generalisation to other datasets such as COCO, V3Det and OpenImages.
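To make the two calibration axes in the abstract concrete, here is a minimal sketch of a post-hoc adjustment for a softmax classifier. It is an illustration under assumptions, not the repository's implementation: the function name `calibrate_scores`, the exponents `gamma` and `lam`, and the exact way the two terms are combined are all hypothetical, and sigmoid detectors would need a different formulation.

```python
import numpy as np

def calibrate_scores(logits, class_freq, fractal_dim, gamma=1.0, lam=1.0):
    """Hypothetical post-hoc calibration sketch (not the FRACAL code).

    logits:      raw class scores, shape (..., num_classes)
    class_freq:  per-class training counts, shape (num_classes,)
    fractal_dim: per-class fractal dimension, shape (num_classes,)
    gamma, lam:  assumed strength parameters for the two axes
    """
    logits = np.asarray(logits, dtype=float)
    prior = np.asarray(class_freq, dtype=float)
    prior = prior / prior.sum()

    # Frequency axis: subtract the scaled log-prior so that frequent
    # classes lose the advantage they gained from the training distribution.
    adjusted = logits - gamma * np.log(prior)

    # Numerically stable softmax over the class dimension.
    adjusted -= adjusted.max(axis=-1, keepdims=True)
    probs = np.exp(adjusted)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Space axis: classes with a higher fractal dimension (more uniformly
    # spread in image space) are downweighted, then scores are renormalised.
    probs = probs / np.asarray(fractal_dim, dtype=float) ** lam
    return probs / probs.sum(axis=-1, keepdims=True)
```

With equal logits, the rarest class ends up with the highest score after the frequency step, and a class with a larger fractal dimension is pushed down by the spatial step. No weights are retrained at any point, which matches the post-processing nature of the method.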
Progress
- [x] Training code.
- [x] Evaluation code.
- [x] Provide instance segmentation checkpoint models.
Getting Started
Create a virtual environment
conda create --name fracal python=3.11 -y
conda activate fracal
Install dependency packages
conda install pytorch torchvision -c pytorch
Install MMDetection
pip install -U openmim
mim install mmengine
mim install "mmcv==2.1.0"
git clone https://github.com/kostas1515/FRACAL.git
Create a data directory, download the COCO 2017 datasets from https://cocodataset.org/#download (2017 Train images [118K/18GB], 2017 Val images [5K/1GB], 2017 Train/Val annotations [241MB]) and extract the zip files:
```
mkdir data
cd data
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
# download and unzip LVIS annotations
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip
```
- Modify mmdetection/configs/_base_/datasets/lvis_v1_instance.py and make sure the data_root variable points to the above data directory, e.g.,
```
data_root = '<path/to/data>'
```
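The download step asks for the archives to be extracted but gives no command for it. A minimal sketch using only the Python standard library follows; the helper name `extract_archives` is hypothetical, and it simply unpacks every zip in place, so the resulting layout may still need to be arranged to match what the dataset config expects.

```python
import zipfile
from pathlib import Path

def extract_archives(data_dir):
    """Extract every .zip archive found directly inside data_dir, in place.

    Returns the names of the archives that were extracted.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Archive members keep their internal folder structure.
            zf.extractall(data_dir)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives("data")` from the directory that contains `data` would unpack the downloaded COCO and LVIS archives next to the zip files.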
Training
Train a baseline model on multiple GPUs using tools/dist_train.sh e.g.:
./tools/dist_train.sh ./configs/<folder>/<model.py> <#GPUs>
Inference with Baseline Model
To test the MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs run:
./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8
Inference with FRACAL
To test the FRACAL-MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs, run:
./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/fracal_r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8
Optional - Get Dataset statistics
We already provide the frequency and fractal dimension measures for the LVISv1 train set in the folder statfiles. To reproduce the fractal dimension measures:
- run get_statistics.py inside the folder ./statfiles/:
python get_statistics.py --dset_name lvis --path ../../../datasets/coco/annotations/lvis_v1_train.json --output ./lvis_v1_train_stats.csv
This will create a csv containing various bounding box statistics, such as the class, width, height and location.
- compute the fractal dimension based on those statistics:
python calculate_fractality.py --dset_name lvisv1 --path ./lvis_v1_train_stats.csv --output lvis_v1_train_fractal_dim.csv
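For readers unfamiliar with the fractal dimension, the sketch below estimates it with the standard box-counting technique on 2D points. It is an illustration only and not the logic of calculate_fractality.py: the function name, the grid sizes, and the assumption that object locations are box centres normalised to the unit square are all hypothetical choices.

```python
import numpy as np

def box_counting_dimension(points, grid_sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension of 2D points in [0, 1] x [0, 1].

    For each grid resolution g, count how many of the g x g cells contain
    at least one point, then fit the slope of log(count) against log(g).
    """
    # Keep points strictly below 1 so the cell index never reaches g.
    points = np.clip(np.asarray(points, dtype=float), 0.0, 1.0 - 1e-9)
    counts = []
    for g in grid_sizes:
        cells = np.floor(points * g).astype(int)
        counts.append(len(np.unique(cells, axis=0)))
    slope, _ = np.polyfit(np.log(grid_sizes), np.log(counts), 1)
    return slope
```

Under this estimate, a class whose objects cover the whole image approaches a dimension of 2, objects lying along a line approach 1, and objects clustered at one spot approach 0. Running it once per class on that class's locations gives one value per class.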
To generate the frequency weights run:
python get_frequency.py --path ../../../datasets/coco/annotations/lvis_v1_train.json --output freq_lvis_v1_train.csv
This will create a csv containing various frequency weights based on instance frequency or image frequency using various link functions. The lvis_v1_train_fractal_dim.csv and freq_lvis_v1_train.csv files are used inside the mmdet/models/roi_heads/bbox_heads/fracal_bbox_head.py script.
The statistics scripts support the COCO, LVISv1, LVISv05, V3Det and OpenImages datasets.
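The difference between instance frequency and image frequency can be sketched on a COCO/LVIS-style annotation dictionary. This is not get_frequency.py: the function names are hypothetical, and the logarithm is shown only as one possible link function.

```python
from collections import Counter
import math

def class_frequencies(dataset):
    """Count, per category, the annotated instances and the distinct images.

    dataset: dict with an "annotations" list whose entries carry
    "category_id" and "image_id", as in COCO/LVIS annotation files.
    """
    instance_count = Counter()
    images_per_class = {}
    for ann in dataset["annotations"]:
        cid = ann["category_id"]
        instance_count[cid] += 1
        images_per_class.setdefault(cid, set()).add(ann["image_id"])
    image_count = {cid: len(imgs) for cid, imgs in images_per_class.items()}
    return dict(instance_count), image_count

def log_prior(counts):
    """Example link function (an assumption): log of the normalised count."""
    total = sum(counts.values())
    return {cid: math.log(n / total) for cid, n in counts.items()}
```

A category that appears many times in a few images has a high instance frequency but a low image frequency, so the two counts can rank classes differently.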
Pretrained Models on LVIS
| Method | AP | APr | APc | APf | APb | Model |
|---|---|---|---|---|---|---|
| FRACAL-MaskRCNN-R50 | 28.5 | 23.0 | 28.1 | 31.5 | 28.4 | weights |
| FRACAL-MaskRCNN-R101 | 29.9 | 24.6 | 29.3 | 32.8 | 29.8 | weights |
| FRACAL-MaskRCNN-Swin-B | 38.5 | 35.5 | 39.5 | 38.7 | 39.4 | weights |
BibTeX
@article{alexandridis2024fractal,
title={Fractal Calibration for long-tailed object detection},
author={Alexandridis, Konstantinos Panagiotis and Elezi, Ismail and Deng, Jiankang and Nguyen, Anh and Luo, Shan},
journal={arXiv preprint arXiv:2410.11774},
year={2024}
}
Acknowledgements
This code uses PyTorch and the mmdet framework. Thank you for your wonderful work!
Owner
- Name: Alexandridis, Konstantinos Panagiotis
- Login: kostas1515
- Kind: user
- Location: London, UK
- Company: King's College London
- Website: https://kostas1515.github.io/
- Repositories: 32
- Profile: https://github.com/kostas1515
PhD Student • Electrical and Computer Engineer • Programming Enthusiast
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection Contributors"
title: "OpenMMLab Detection Toolbox and Benchmark"
date-released: 2018-08-22
url: "https://github.com/open-mmlab/mmdetection"
license: Apache-2.0
GitHub Events
Total
- Issues event: 3
- Watch event: 3
- Issue comment event: 5
- Push event: 6
- Fork event: 2
- Create event: 3
Last Year
- Issues event: 3
- Watch event: 3
- Issue comment event: 5
- Push event: 6
- Fork event: 2
- Create event: 3