222-an-end-to-end-transformer-model-for-crowd-localization

https://github.com/szu-advtech-2024/222-an-end-to-end-transformer-model-for-crowd-localization

# CLTR (Crowd Localization TRansformer)

[[Project page](https://dk-liang.github.io/CLTR/)] [[paper](https://arxiv.org/abs/2202.13065)]

An official implementation of "An End-to-End Transformer Model for Crowd Localization" (accepted by ECCV 2022).

* *The code in this version is not yet well organized and may contain some obscure comments.*

## Environment
- python == 3.6
- pytorch == 1.8.0
- opencv-python
- scipy
- h5py
- pillow
- imageio
- nni
- mmcv
- tensorboard

## Datasets
- Download JHU-CROWD++ dataset from [here](http://www.crowd-counting.com/)
- Download NWPU-Crowd dataset (resized) from [Baidu](https://pan.baidu.com/s/1aqiLFU6lo3F_HqeT6wbEjg), password: 04i4 or [Onedrive](https://1drv.ms/u/s!Ak_WZsh5Fl0lhF0V7sxTVv1Vs0Aq?e=drd48k)

## Prepare data
### Generate point map
```bash
cd CLTR/data
```
For the JHU-Crowd++ dataset:
```bash
python prepare_jhu.py --data_path /xxx/xxx/jhu_crowd_v2.0
```
For the NWPU-Crowd dataset:
```bash
python prepare_nwpu.py --data_path /xxx/xxx/NWPU_CLTR
```
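
As a rough illustration of what a point map is, the sketch below rasterizes head annotations into a binary map with one non-zero pixel per annotated head. It is not the code of `prepare_jhu.py` or `prepare_nwpu.py`; the function name `points_to_map` and the `(x, y)` annotation format are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def points_to_map(points: Sequence[Tuple[float, float]],
                  height: int, width: int) -> List[List[int]]:
    """Rasterize (x, y) head annotations into a height x width binary map.

    Hypothetical helper: the repo's prepare scripts may use a different
    annotation layout and storage format.
    """
    point_map = [[0] * width for _ in range(height)]
    for x, y in points:
        col, row = math.floor(x), math.floor(y)
        # Skip annotations that fall outside the image bounds.
        if 0 <= row < height and 0 <= col < width:
            point_map[row][col] = 1
    return point_map


# Three annotations, one of which lies outside a 3 x 4 image.
demo_map = points_to_map([(1.7, 0.2), (3.0, 2.9), (9.0, 9.0)], height=3, width=4)
head_count = sum(map(sum, demo_map))
```

Summing the map gives the number of in-bounds heads, which is why a point map can serve as both a localization target and a count. Two annotations that fall on the same pixel would be merged by this simple version.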

### Generate image list
```bash
cd CLTR
python make_npydata.py --jhu_path /xxx/xxx/jhu_crowd_v2.0 --nwpu_path /xxx/xxx/NWPU_CLTR
```
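
For readers who want to see the idea behind an image list, here is a minimal sketch that gathers image paths under a dataset root in a stable order. It is an assumption-laden stand-in, not the logic of `make_npydata.py`: the helper name `collect_images`, the accepted extensions, and the directory layout are all hypothetical.

```python
import os
import tempfile
from typing import List, Tuple


def collect_images(root: str,
                   extensions: Tuple[str, ...] = (".jpg", ".jpeg", ".png")) -> List[str]:
    """Return a sorted list of image paths found anywhere under root.

    Hypothetical helper: the real script may split lists per dataset
    and per train/val/test subset.
    """
    found = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                found.append(os.path.join(folder, name))
    # Sorting keeps the list identical across runs and machines.
    return sorted(found)


# Tiny self-check on a throwaway directory (not real dataset paths).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "train"))
    for name in ("b.jpg", "a.jpg", "notes.txt"):
        open(os.path.join(root, "train", name), "w").close()
    demo_names = [os.path.basename(p) for p in collect_images(root)]
```

A list built this way could then be written to disk in whatever format the training code expects.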

## Training
Example (some hyper-parameters may be different from the original paper):
```bash
cd CLTR
sh experiments/jhu.sh
```
or
```bash
sh experiments/nwpu.sh
```

- If you do not have enough GPUs, change `nproc_per_node` and `gpu_id` in `jhu.sh`/`nwpu.sh`.
- We have fixed all random seeds, so different runs report the same results under the same settings.
- The model will be saved in `CLTR/save_file/log_file`.
- Note that using FPN will improve performance, but it is not included in this version.
- Tuning some hyper-parameters (e.g., the image size, crop size, and number of queries) will also bring improvement.
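
The note on fixed random seeds can be illustrated with a small sketch of how seeds are commonly fixed in a PyTorch project. This is not taken from the repo's training code; the function name `fix_seeds` and the seed value are assumptions, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def fix_seeds(seed: int = 1) -> None:
    """Seed every RNG that is available in the current environment.

    Hypothetical helper; the seed value 1 is an arbitrary choice.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade some speed for repeatable cuDNN kernel selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


# Re-seeding reproduces the same random sequence.
fix_seeds(1)
first = [random.random() for _ in range(3)]
fix_seeds(1)
second = [random.random() for _ in range(3)]
```

Even with all seeds fixed, bit-identical results generally also require the same hardware, library versions, and number of GPUs.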

The following table compares the results of this repo with those reported in the original paper:

| NWPU-Crowd (val set) | MAE   | MSE   |
| -------------------- | ----- | ----- |
| Original paper       | 61.9  | 246.3 |
| This repo ([training log](./images/NWPU.log)) | 51.3  | 116.7 |
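
The README does not define the two metrics. The sketch below assumes the usual crowd-counting convention, in which MAE is the mean absolute error of the per-image counts and "MSE" denotes the root of the mean squared error. The function name `counting_errors` and the sample counts are made up for illustration.

```python
import math
from typing import Sequence, Tuple


def counting_errors(pred: Sequence[float], gt: Sequence[float]) -> Tuple[float, float]:
    """Return (MAE, MSE) over per-image counts.

    Assumption: "MSE" is the root of the mean squared error, as is
    common in crowd-counting papers.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    diffs = [p - g for p, g in zip(pred, gt)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, mse


# Made-up predicted and ground-truth counts for three images.
mae, mse = counting_errors([10, 20, 33], [12, 20, 30])
```

Because the squared term weights large errors more heavily, the second metric is more sensitive to a few badly mispredicted images than MAE is.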

## Testing
Example:
```bash
python test.py --dataset jhu --pre model.pth --gpu_id 2,3
```
or
```bash
python test.py --dataset nwpu --pre model.pth --gpu_id 0,1
```

- `model.pth` is the checkpoint produced by the training phase.

## Video Demo
Example:
```bash
python video_demo.py --video_path ./video_demo/demo.mp4 --num_queries 700 --pre video_model.pth
```

- The `video_model.pth` checkpoint (trained on the NWPU-Crowd training set) can be downloaded from [Baidu disk](https://pan.baidu.com/s/1ifubiFbj8u63pX3qt3F5rQ), password: rw6b, or [Google Drive](https://drive.google.com/file/d/1bccQIMeYBrEsgLAbWgxFE2sOsEhE2EKC/view?usp=sharing).
- The generated video will be named `out_video.avi`.

![avatar](./images/intro.jpeg)

Visit [bilibili](https://www.bilibili.com/video/BV1sS4y147YT/) or [YouTube](https://youtu.be/fqFNGMnveVQ) to watch the video demo.

## Acknowledgement
Thanks to the authors of the following great works:

```bibtex
@inproceedings{carion2020end,
  title={End-to-end object detection with transformers},
  booktitle={European conference on computer vision},
  pages={213--229},
  year={2020},
  organization={Springer}
}
```

```bibtex
@inproceedings{meng2021conditional,
  title={Conditional detr for fast training convergence},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={3651--3660},
  year={2021}
}
```

## Reference
If you find this project useful, please cite:
```bibtex
@article{liang2022end,
  title={An end-to-end transformer model for crowd localization},
  journal={European Conference on Computer Vision},
  year={2022}
}
```
