uninfo
The official code for "Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption."
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to: arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary
Repository
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption
The official code for "Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption."
[arXiv](https://arxiv.org/abs/2505.12912)
Abstract
Pre-trained vision-language models such as contrastive language-image pre-training (CLIP) have demonstrated remarkable generalizability, which has enabled a wide range of applications such as zero-shot classification. However, vision-language models still suffer when they face datasets that differ greatly from the training data, i.e., distribution shifts. We found that CLIP is especially vulnerable to sensor degradation, a type of realistic distribution shift caused by sensor conditions such as weather, light, or noise. Collecting a new dataset from a test distribution for fine-tuning is costly, since sensor degradation occurs unexpectedly and takes many forms. Thus, we investigate test-time adaptation (TTA) of zero-shot classification, which enables on-the-fly adaptation to the test distribution with unlabeled test data. Existing TTA methods for CLIP mainly focus on modifying image and text embeddings or predictions to address distribution shifts. Although these methods can adapt to domain shifts, such as fine-grained label spaces or different renditions in input images, they fail to adapt to distribution shifts caused by sensor degradation. We found that this is because image embeddings are "corrupted" in terms of uniformity, a measure related to the amount of information. To make models robust to sensor degradation, we propose a novel method called uniformity-aware information-balanced TTA (UnInfo). To address the corruption of image embeddings, we introduce uniformity-aware confidence maximization, information-aware loss balancing, and knowledge distillation from the exponential moving average (EMA) teacher. The uniformity-aware confidence maximization induces image embeddings to distribute uniformly on the unit hypersphere, retaining input information while maximizing the confidence of predictions. The loss balancing adaptively assigns weights to the uniformity and confidence losses on the basis of the current classification performance. The knowledge distillation from the EMA teacher stabilizes adaptation and avoids catastrophic forgetting. Through experiments, we demonstrate that UnInfo improves accuracy under sensor degradation by retaining information in terms of uniformity.
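The abstract names three ingredients: a uniformity term on embeddings, a confidence (entropy) term on predictions, and an EMA teacher. The sketch below illustrates each quantity in plain Python. It is not the repository's implementation: the README does not give the exact losses, so the uniformity measure is assumed to be the pairwise Gaussian-potential metric of Wang and Isola (2020) with temperature `t=2`, and the function names `uniformity`, `entropy`, and `ema_update` as well as the momentum value are hypothetical.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def uniformity(embeddings, t=2.0):
    """Log of the mean pairwise Gaussian potential (lower = more uniform).

    Assumption: Wang & Isola (2020) uniformity metric, not necessarily
    the exact loss used by UnInfo.
    """
    points = [l2_normalize(e) for e in embeddings]
    potentials = [
        math.exp(-t * sum((a - b) ** 2 for a, b in zip(p, q)))
        for p, q in combinations(points, 2)
    ]
    return math.log(sum(potentials) / len(potentials))


def entropy(probs):
    """Shannon entropy of a prediction; minimizing it maximizes confidence."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def ema_update(teacher, student, momentum=0.99):
    """Move teacher weights toward the student with an exponential moving average."""
    return [momentum * w_t + (1.0 - momentum) * w_s
            for w_t, w_s in zip(teacher, student)]


# Embeddings spread around the circle vs. collapsed toward one direction.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.05], [1.0, -0.05], [1.0, 0.1]]
print(uniformity(spread) < uniformity(collapsed))  # True
```

In this toy example the collapsed embeddings score a higher (worse) uniformity value than the spread ones, which matches the abstract's description of corrupted embeddings losing uniformity.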
Environment
- Prepare the datasets (ImageNet-C, ImageNet-C-bar) and write their paths in `dataset/dataset_config.py`.
- Install dependencies or build the docker image according to `docker/Dockerfile`.
```bash
$ docker build -t tta_uninfo docker --no-cache
```
TTA
```bash
$ python3 main.py -c imagenet-c.yaml -o result

# running with the docker image
$ docker run -it --rm -v $(pwd):$(pwd) -w $(pwd) --gpus device=0 tta_uninfo python3 main.py -c imagenet-c.yaml -o result
```
Citation
If our work assists your research, please cite our paper:
@article{adachi2025uninfo,
  title={Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption},
  author={Kazuki Adachi and Shin'ya Yamaguchi and Tomoki Hamagami},
  journal={arXiv preprint arXiv:2505.12912},
  year={2025}
}
Owner
- Name: Kazuki Adachi
- Login: kzkadc
- Kind: user
- Location: Japan
- Company: NTT
- Website: https://kzkadc.github.io/
- Repositories: 4
- Profile: https://github.com/kzkadc
Citation (CITATION.cff)
cff-version: 1.2.0
title: "Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption"
message: "If our work assists your research, please cite our paper."
authors: &authors
  - family-names: Adachi
    given-names: Kazuki
  - family-names: Yamaguchi
    given-names: Shin'ya
  - family-names: Hamagami
    given-names: Tomoki
preferred-citation:
  type: article
  authors: *authors
  title: "Uniformity First: Uniformity-aware Test-time Adaptation of Vision-language Models against Image Corruption"
  year: 2025
  journal: "arXiv preprint arXiv:2505.12912"
  url: https://arxiv.org/abs/2505.12912
GitHub Events
Total
- Push event: 5
- Create event: 2
Last Year
- Push event: 5
- Create event: 2
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Kazuki Adachi | k****y@g****m | 9 |
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0