fracal

[CVPR2025] Fractal calibration for long-tailed object detection

https://github.com/kostas1515/fracal

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: kostas1515
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 14.5 MB
Statistics
  • Stars: 3
  • Watchers: 1
  • Forks: 2
  • Open Issues: 1
  • Releases: 0
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

[CVPR2025] Fractal Calibration for long-tailed object detection

[![Static Badge](https://img.shields.io/badge/arxiv-2410.11774v2-blue)](https://arxiv.org/abs/2410.11774v2) [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/fractal-calibration-for-long-tailed-object/instance-segmentation-on-lvis-v1-0-val)](https://paperswithcode.com/sota/instance-segmentation-on-lvis-v1-0-val?p=fractal-calibration-for-long-tailed-object)

Fractal calibration method. Abstract:

Real-world datasets follow an imbalanced distribution, which poses significant challenges in rare-category object detection. Recent studies tackle this problem by developing re-weighting and re-sampling methods, that utilise the class frequencies of the dataset. However, these techniques focus solely on the frequency statistics and ignore the distribution of the classes in image space, missing important information. In contrast to them, we propose FRActal CALibration (FRACAL): a novel post-calibration method for long-tailed object detection. FRACAL devises a logit adjustment method that utilises the fractal dimension to estimate how uniformly classes are distributed in image space. During inference, it uses the fractal dimension to inversely downweight the probabilities of uniformly spaced class predictions achieving balance in two axes: between frequent and rare categories, and between uniformly spaced and sparsely spaced classes. FRACAL is a post-processing method and it does not require any training, also it can be combined with many off-the-shelf models such as one-stage sigmoid detectors and two-stage instance segmentation models. FRACAL boosts the rare class performance by up to 8.6% and surpasses all previous methods on LVIS dataset, while showing good generalisation to other datasets such as COCO, V3Det and OpenImages.
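As a rough illustration of the idea in the abstract, the sketch below applies a post-hoc logit adjustment that penalises both frequent classes and classes with a high fractal dimension. It is a minimal sketch and not the repository's implementation: the function name `calibrate`, the parameters `tau` and `lam`, and the exact form of the adjustment (subtracting the log frequency and the log fractal dimension before a softmax) are assumptions made for illustration only.

```python
import math


def calibrate(logits, class_freq, fractal_dim, tau=1.0, lam=1.0):
    """Hypothetical post-hoc calibration, assumed form (not taken from the repository).

    logits:      raw per-class scores of one prediction
    class_freq:  per-class frequency prior (positive values)
    fractal_dim: per-class fractal dimension (positive values)
    """
    # Dividing a probability by freq**tau and dim**lam corresponds to
    # subtracting tau*log(freq) and lam*log(dim) in logit space.
    adjusted = [
        z - tau * math.log(f) - lam * math.log(d)
        for z, f, d in zip(logits, class_freq, fractal_dim)
    ]
    # Numerically stable softmax over the adjusted logits.
    m = max(adjusted)
    exps = [math.exp(a - m) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: two classes with identical raw scores.
logits = [2.0, 2.0]
class_freq = [0.9, 0.1]    # class 0 is frequent, class 1 is rare
fractal_dim = [1.8, 1.2]   # class 0 is spread more uniformly in image space
probs = calibrate(logits, class_freq, fractal_dim)
```

In this toy case the rare, sparsely located class ends up with the higher calibrated probability even though both raw scores are equal. No training is involved; only the scores at inference time change.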

Progress

  • [x] Training code.
  • [x] Evaluation code.
  • [x] Provide instance segmentation checkpoint models.

Getting Started

Create a virtual environment

    conda create --name fracal python=3.11 -y
    conda activate fracal

  1. Install dependency packages:
     conda install pytorch torchvision -c pytorch

  2. Install MMDetection and clone this repository:
     pip install -U openmim
     mim install mmengine
     mim install "mmcv==2.1.0"
     git clone https://github.com/kostas1515/FRACAL.git

  3. Create data directory, download COCO 2017 datasets at https://cocodataset.org/#download (2017 Train images [118K/18GB], 2017 Val images [5K/1GB], 2017 Train/Val annotations [241MB]) and extract the zip files:

```
mkdir data
cd data
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip

# download and unzip LVIS annotations
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip
```

  4. Modify mmdetection/configs/_base_/datasets/lvis_v1_instance.py and make sure the data_root variable points to the above data directory, e.g., ```data_root = ''```

Training

Train a baseline model on multiple GPUs using tools/dist_train.sh e.g.:

./tools/dist_train.sh ./configs/<folder>/<model.py> <#GPUs>

Inference with Baseline Model

To test the MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs run:

./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8

Inference with FRACAL

To test the FRACAL-MaskRCNN ResNet50 RFS with Normalised Mask and Carafe on 8 GPUs run:

./tools/dist_test.sh ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/fracal_r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe.py ./experiments/r50_rfs_cos_lr_norm_4x4_2x_softmax_carafe/epoch_24.pth 8

Optional - Get Dataset statistics

We already provide the frequency and fractal dimension measures for the LVISv1 train set in the folder statfiles. To reproduce the fractal dimension measures:

  1. Run get_statistics.py inside the folder ./statfiles/:

python get_statistics.py --dset_name lvis --path ../../../datasets/coco/annotations/lvis_v1_train.json --output ./lvis_v1_train_stats.csv

This will create a CSV file containing bounding box statistics such as the class, width, height and location.

  2. Compute the fractal dimension based on those statistics:
     python calculate_fractality.py --dset_name lvisv1 --path ./lvis_v1_train_stats.csv --output lvis_v1_train_fractal_dim.csv
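For readers unfamiliar with fractal dimension, the sketch below estimates a box-counting dimension from normalised object centres. It is a minimal sketch under stated assumptions: that the dimension is estimated by box counting over object locations, and that the grid sizes shown are reasonable. The function name `box_counting_dimension` and its parameters are hypothetical and are not taken from calculate_fractality.py.

```python
import math


def box_counting_dimension(points, grid_sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension of 2D points in [0, 1) x [0, 1).

    points: non-empty list of (x, y) normalised object centres.
    Returns the least-squares slope of log N(g) against log g, where N(g)
    is the number of occupied cells on a g x g grid.
    """
    log_g, log_n = [], []
    for g in grid_sizes:
        # Map every point to its grid cell and count the distinct cells.
        occupied = {
            (min(int(x * g), g - 1), min(int(y * g), g - 1))
            for x, y in points
        }
        log_g.append(math.log(g))
        log_n.append(math.log(len(occupied)))

    # Ordinary least-squares slope.
    n = len(log_g)
    mean_x = sum(log_g) / n
    mean_y = sum(log_n) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_g, log_n))
    var = sum((a - mean_x) ** 2 for a in log_g)
    return cov / var


# Objects spread uniformly over the image: dimension close to 2.
grid = [((i + 0.5) / 64, (j + 0.5) / 64) for i in range(64) for j in range(64)]
# Objects concentrated along a diagonal line: dimension close to 1.
diagonal = [((i + 0.5) / 64, (i + 0.5) / 64) for i in range(64)]

dim_grid = box_counting_dimension(grid)
dim_diag = box_counting_dimension(diagonal)
```

A class whose objects appear everywhere in the image scores near 2, while a class confined to a line or a few spots scores lower, which is the kind of per-class location signal the statistics above are meant to capture.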

To generate the frequency weights run:

python get_frequency.py --path ../../../datasets/coco/annotations/lvis_v1_train.json --output freq_lvis_v1_train.csv

This will create a CSV file containing frequency weights based on instance frequency or image frequency, computed with several link functions. The lvis_v1_train_fractal_dim.csv and freq_lvis_v1_train.csv files are used inside the mmdet/models/roi_heads/bbox_heads/fracal_bbox_head.py script.
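The difference between instance frequency and image frequency can be sketched as follows. This is a minimal sketch, not the logic of get_frequency.py: the helper names `class_frequencies` and `log_link` are hypothetical, and the log link is only one assumed example of a link function. The `image_id` and `category_id` keys follow the COCO-style annotation format.

```python
import math
from collections import Counter


def class_frequencies(annotations):
    """Count instances and distinct images per class from COCO-style annotations."""
    instance_count = Counter()
    images_per_class = {}
    for ann in annotations:
        cid = ann["category_id"]
        instance_count[cid] += 1
        images_per_class.setdefault(cid, set()).add(ann["image_id"])
    image_count = {cid: len(imgs) for cid, imgs in images_per_class.items()}
    return dict(instance_count), image_count


def log_link(counts):
    """Assumed example link function: log of the normalised count."""
    total = sum(counts.values())
    return {cid: math.log(n / total) for cid, n in counts.items()}


# Toy annotations: class 1 appears three times across two images,
# class 2 appears once in one image.
annotations = [
    {"image_id": 10, "category_id": 1},
    {"image_id": 10, "category_id": 1},
    {"image_id": 11, "category_id": 1},
    {"image_id": 11, "category_id": 2},
]
instance_count, image_count = class_frequencies(annotations)
instance_weights = log_link(instance_count)
```

Instance frequency counts every annotated object, whereas image frequency counts each image at most once per class, so the two can rank classes differently when a class tends to appear many times in the same image.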

The statistics scripts support the COCO, LVISv1, LVISv05, V3Det and OpenImages datasets.

Pretrained Models on LVIS

| Method | AP | APr | APc | APf | APb | Model |
|---|---|---|---|---|---|---|
| FRACAL-MaskRCNN-R50 | 28.5 | 23.0 | 28.1 | 31.5 | 28.4 | weights |
| FRACAL-MaskRCNN-R101 | 29.9 | 24.6 | 29.3 | 32.8 | 29.8 | weights |
| FRACAL-MaskRCNN-Swin-B | 38.5 | 35.5 | 39.5 | 38.7 | 39.4 | weights |

BibTeX

    @article{alexandridis2024fractal,
      title={Fractal Calibration for long-tailed object detection},
      author={Alexandridis, Konstantinos Panagiotis and Elezi, Ismail and Deng, Jiankang and Nguyen, Anh and Luo, Shan},
      journal={arXiv preprint arXiv:2410.11774},
      year={2024}
    }

Acknowledgements

This code uses PyTorch and the mmdet framework. Thank you for your wonderful work!

Owner

  • Name: Alexandridis, Konstantinos Panagiotis
  • Login: kostas1515
  • Kind: user
  • Location: London, UK
  • Company: King's College London

PhD Student • Electrical and Computer Engineer • Programming Enthusiast

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection Contributors"
title: "OpenMMLab Detection Toolbox and Benchmark"
date-released: 2018-08-22
url: "https://github.com/open-mmlab/mmdetection"
license: Apache-2.0

GitHub Events

Total
  • Issues event: 3
  • Watch event: 3
  • Issue comment event: 5
  • Push event: 6
  • Fork event: 2
  • Create event: 3
Last Year
  • Issues event: 3
  • Watch event: 3
  • Issue comment event: 5
  • Push event: 6
  • Fork event: 2
  • Create event: 3