222-an-end-to-end-transformer-model-for-crowd-localization
https://github.com/szu-advtech-2024/222-an-end-to-end-transformer-model-for-crowd-localization
# CLTR (Crowd Localization TRansformer)
[[Project page](https://dk-liang.github.io/CLTR/)] [[paper](https://arxiv.org/abs/2202.13065)]
An official implementation of "An End-to-End Transformer Model for Crowd Localization" (accepted by ECCV 2022).
* *The code in this version is not yet well organized and may contain some obscure comments.*
## Environment
- python == 3.6
- pytorch == 1.8.0
- opencv-python
- scipy
- h5py
- pillow
- imageio
- nni
- mmcv
- tensorboard
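Before running anything, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the `REQUIRED` mapping and the `find_missing` helper are hypothetical names, and the mapping assumes the usual import names (`torch` for pytorch, `cv2` for opencv-python, `PIL` for pillow). It does not check versions.

```python
import importlib.util

# Hypothetical mapping from the package names listed above to the
# module names they are normally imported under.
REQUIRED = {
    "pytorch": "torch",
    "opencv-python": "cv2",
    "scipy": "scipy",
    "h5py": "h5py",
    "pillow": "PIL",
    "imageio": "imageio",
    "nni": "nni",
    "mmcv": "mmcv",
    "tensorboard": "tensorboard",
}


def find_missing(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```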
## Datasets
- Download the JHU-Crowd++ dataset from [here](http://www.crowd-counting.com/).
- Download the NWPU-Crowd dataset (resized) from [Baidu](https://pan.baidu.com/s/1aqiLFU6lo3F_HqeT6wbEjg) (password: 04i4) or [OneDrive](https://1drv.ms/u/s!Ak_WZsh5Fl0lhF0V7sxTVv1Vs0Aq?e=drd48k).
## Prepare data
### Generate point map
```bash
cd CLTR/data
```
For the JHU-Crowd++ dataset:
```bash
python prepare_jhu.py --data_path /xxx/xxx/jhu_crowd_v2.0
```
For the NWPU-Crowd dataset:
```bash
python prepare_nwpu.py --data_path /xxx/xxx/NWPU_CLTR
```
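The preparation scripts themselves are not reproduced here. As a rough illustration of what a point map is, the sketch below turns a list of head annotations into a binary map with a 1 at each annotated position. The function name `points_to_map`, the `(x, y)` pixel-coordinate input, and the choice to drop points outside the image are all assumptions for illustration; the actual scripts may store the data differently.

```python
import math


def points_to_map(points, height, width):
    """Build a height x width binary map with 1 at each annotated head.

    `points` is an iterable of (x, y) pixel coordinates (assumed format).
    Points that fall outside the image are ignored (assumed behavior).
    """
    point_map = [[0] * width for _ in range(height)]
    for x, y in points:
        col, row = math.floor(x), math.floor(y)
        if 0 <= row < height and 0 <= col < width:
            point_map[row][col] = 1
    return point_map


# One head at x=1.2, y=0.7 in a 2 x 3 image.
print(points_to_map([(1.2, 0.7)], 2, 3))  # [[0, 1, 0], [0, 0, 0]]
```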
### Generate image list
```bash
cd CLTR
python make_npydata.py --jhu_path /xxx/xxx/jhu_crowd_v2.0 --nwpu_path /xxx/xxx/NWPU_CLTR
```
## Training
Example (some hyper-parameters may be different from the original paper):
```bash
cd CLTR
sh experiments/jhu.sh
```
or
```bash
sh experiments/nwpu.sh
```
- If you do not have enough GPUs, change `nproc_per_node` and `gpu_id` in `jhu.sh`/`nwpu.sh`.
- All random seeds are fixed, so different runs report the same results under the same settings.
- The model will be saved in `CLTR/save_file/log_file`.
- Using FPN would improve performance, but it is not included in this version.
- Tuning some hyper-parameters (e.g., the image size, crop size, or number of queries) can also bring improvements.
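For readers unfamiliar with seed fixing, the following is a minimal sketch of how random seeds are commonly fixed in a PyTorch project. It is not taken from this repository: the helper name `set_seed` and the default seed value are assumptions, and the NumPy and PyTorch parts are skipped when those libraries are not installed.

```python
import random


def set_seed(seed=1):
    """Seed every random number generator that is available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # No effect without CUDA.
        # Prefer deterministic cuDNN kernels over auto-tuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


set_seed(7)
first = random.random()
set_seed(7)
print(first == random.random())  # True
```

Even with fixed seeds, some GPU operations can remain non-deterministic, so bit-identical results are most likely on the same hardware and library versions.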
Comparison with the results reported in the original paper:
| NWPU-Crowd (val set) | MAE | MSE |
| -------------------- | ----- | ----- |
| Original paper | 61.9 | 246.3 |
| This repo ([training log](./images/NWPU.log)) | 51.3 | 116.7 |
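The two metrics in the table can be computed from per-image counts. The sketch below assumes the convention common in crowd-counting work, where MAE is the mean absolute count error and "MSE" denotes the square root of the mean squared count error. The function name `mae_mse` is hypothetical, and the repository's own evaluation code is not shown here and may differ.

```python
import math


def mae_mse(pred_counts, gt_counts):
    """Return (MAE, MSE) over per-image predicted and ground-truth counts.

    "MSE" here is the root of the mean squared error, following the
    usual crowd-counting convention (an assumption about this repo).
    """
    if len(pred_counts) != len(gt_counts) or not gt_counts:
        raise ValueError("count lists must be non-empty and of equal length")
    errors = [p - g for p, g in zip(pred_counts, gt_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, mse


# Two images: predicted 10 and 20 people, ground truth 12 and 16.
mae, mse = mae_mse([10, 20], [12, 16])
print(mae)            # 3.0
print(round(mse, 4))  # 3.1623
```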
## Testing
Example:
```bash
python test.py --dataset jhu --pre model.pth --gpu_id 2,3
```
or
```bash
python test.py --dataset nwpu --pre model.pth --gpu_id 0,1
```
- `model.pth` is the checkpoint produced by the training phase.
## Video Demo
Example:
```bash
python video_demo.py --video_path ./video_demo/demo.mp4 --num_queries 700 --pre video_model.pth
```
- `video_model.pth` (trained on the NWPU-Crowd training set) can be downloaded from [Baidu disk](https://pan.baidu.com/s/1ifubiFbj8u63pX3qt3F5rQ) (password: rw6b) or [Google Drive](https://drive.google.com/file/d/1bccQIMeYBrEsgLAbWgxFE2sOsEhE2EKC/view?usp=sharing).
- The generated video will be named `out_video.avi`.

Visit [bilibili](https://www.bilibili.com/video/BV1sS4y147YT/) or [YouTube](https://youtu.be/fqFNGMnveVQ) to watch the video demo.
## Acknowledgement
Thanks to the following great works:
```bibtex
@inproceedings{carion2020end,
title={End-to-end object detection with transformers},
booktitle={European conference on computer vision},
pages={213--229},
year={2020},
organization={Springer}
}
```
```bibtex
@inproceedings{meng2021conditional,
title={Conditional detr for fast training convergence},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={3651--3660},
year={2021}
}
```
## Reference
If you find this project useful, please cite:
```bibtex
@inproceedings{liang2022end,
title={An end-to-end transformer model for crowd localization},
booktitle={European Conference on Computer Vision},
year={2022}
}
```