https://github.com/chen-yang-liu/rscama
[IEEE GRSL 2024 🔥] RSCaMa: Remote Sensing Image Change Captioning with State Space Model
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- CITATION.cff file
- codemeta.json file: Found codemeta.json file
- .zenodo.json file: Found .zenodo.json file
- DOI references: Found 1 DOI reference(s) in README
- Academic publication links: Links to arxiv.org, ieee.org
- Academic email domains
- Institutional organization owner
- JOSS paper metadata
- Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary
Statistics
- Stars: 69
- Watchers: 2
- Forks: 3
- Open Issues: 4
- Releases: 0
README.md
Give us a :star: if you're interested in this repo.
This repository contains the PyTorch implementation of "RSCaMa: Remote Sensing Image Change Captioning with State Space Model".
ESI Highly Cited Paper
News:
🔥 Remote Sensing Spatio-Temporal Vision-Language Models: A Comprehensive Survey: [Paper] [Github]
Installation and Dependencies
git clone https://github.com/Chen-Yang-Liu/RSCaMa.git
cd RSCaMa
conda create -n RSCaMa_env python=3.9
conda activate RSCaMa_env
pip install -r requirements.txt
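After installation, it can help to confirm that the environment matches a few of the pinned versions. The sketch below is not part of the repository: `EXPECTED_PINS` and `check_pins` are hypothetical names, and the pins are copied from the dependency list of this repository. It assumes that a local build suffix such as `+cu117` should be ignored when comparing versions.

```python
from importlib import metadata

# Pins copied from this repository's dependency list; extend as needed.
EXPECTED_PINS = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "mamba-ssm": "1.2.0.post1",
    "causal-conv1d": "1.2.0.post2",
    "transformers": "4.39.3",
}


def check_pins(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            # Drop a local build suffix such as "+cu117" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in check_pins(EXPECTED_PINS).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means every listed package is installed at the pinned version.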
Data Preparation
- Download the LEVIR-CC dataset: LEVIR-CC.
- The data structure of LEVIR-CC is organized as follows:
├─/root/Data/LEVIR_CC/
    ├─LevirCCcaptions.json
    ├─images
        ├─train
        │  ├─A
        │  ├─B
        ├─val
        │  ├─A
        │  ├─B
        ├─test
        │  ├─A
        │  ├─B
Here, folder A contains the pre-phase images and folder B contains the post-phase images.
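Before preprocessing, the layout described above can be checked with a short script. This is a minimal sketch, not code from the repository: `missing_entries` is a hypothetical helper, and the dataset root you pass in is whatever path you extracted the data to.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
PHASES = ("A", "B")  # A: pre-phase images, B: post-phase images


def missing_entries(root):
    """List the expected files and folders that are absent under `root`."""
    root = Path(root)
    expected = [root / "LevirCCcaptions.json"]
    expected += [root / "images" / split / phase
                 for split in SPLITS for phase in PHASES]
    return [str(p.relative_to(root)) for p in expected if not p.exists()]


if __name__ == "__main__":
    # Replace with the directory that holds LevirCCcaptions.json and images/.
    for entry in missing_entries("/root/Data/LEVIR_CC"):
        print("missing:", entry)
```

If nothing is printed, the captions file and all six image folders are in place.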
- Extract text files for the change descriptions of each image pair in LEVIR-CC:
python preprocess_data.py --input_captions_json /DATA_PATH/Levir-CC-dataset/LevirCCcaptions.json
!NOTE: When preparing the text token files, we suggest setting the word count threshold of LEVIR-CC to 5 and Dubai_CC to 0 for fair comparisons.
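To illustrate what a word count threshold does, the sketch below builds a vocabulary from tokenized captions and maps rare words to an unknown token. It is not the logic of `preprocess_data.py`: the function names, the special token names, and the choice to keep words that occur strictly more often than the threshold are all assumptions made for illustration. Under that assumption, a threshold of 0 keeps every word.

```python
from collections import Counter

# Hypothetical special tokens; the repository may use different names.
SPECIAL_TOKENS = ("<NULL>", "<UNK>", "<START>", "<END>")


def build_vocab(tokenized_captions, min_count):
    """Map words occurring more than `min_count` times to integer ids."""
    counts = Counter(w for caption in tokenized_captions for w in caption)
    kept = sorted(w for w, c in counts.items() if c > min_count)
    return {w: i for i, w in enumerate(list(SPECIAL_TOKENS) + kept)}


def encode(caption, vocab):
    """Replace each word by its id, falling back to the <UNK> id."""
    unk = vocab["<UNK>"]
    return [vocab.get(w, unk) for w in caption]


if __name__ == "__main__":
    captions = [["a", "road", "appears"], ["a", "house", "appears"], ["a", "road"]]
    vocab = build_vocab(captions, min_count=1)
    print(encode(["a", "house"], vocab))  # "house" occurs once, so it maps to <UNK>
```

A higher threshold shrinks the vocabulary and sends more words to the unknown token, which is why using the same threshold matters when comparing methods.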
NOTE
- Please modify the source code of the CLIP package: change CLIP.model.VisionTransformer.forward() as shown in [this].
- Mamba is only supported on Linux systems.
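Since Mamba is only supported on Linux, a quick check before training can save a failed run. This is a minimal sketch under assumptions: `mamba_environment_report` is a hypothetical helper, and it only tests that the platform is Linux and that the `mamba_ssm` module (installed by the `mamba-ssm` package) can be found, not that its CUDA kernels work.

```python
import importlib.util
import platform


def mamba_environment_report():
    """Report whether the basic prerequisites for Mamba appear to be met."""
    return {
        "is_linux": platform.system() == "Linux",
        "mamba_ssm_installed": importlib.util.find_spec("mamba_ssm") is not None,
    }


if __name__ == "__main__":
    for check, ok in mamba_environment_report().items():
        print(f"{check}: {'ok' if ok else 'NOT ok'}")
```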
Training
python train_CC.py --data_folder /DATA_PATH/Levir-CC-dataset/images
!NOTE: If the program fails with the error "'Meteor' object has no attribute 'lock'", we recommend installing Java with sudo apt install openjdk-11-jdk to resolve the issue.
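The suggested fix installs a JDK, which suggests that the METEOR scorer needs a `java` executable. The sketch below checks for one before training starts. `java_available` is a hypothetical helper, and the assumption that the scorer looks up `java` on the `PATH` is not confirmed by this README.

```python
import shutil


def java_available():
    """Return True if a `java` executable can be found on the PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    if java_available():
        print("java found at:", shutil.which("java"))
    else:
        print("java not found; consider: sudo apt install openjdk-11-jdk")
```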
Evaluate
python test.py --data_folder /DATA_PATH/Levir-CC-dataset/images --checkpoint xxxx.pth
Alternatively, you can download our pretrained model here: [Hugging face].
Citation:
```
@ARTICLE{liu2024rscama,
  author={Liu, Chenyang and Chen, Keyan and Chen, Bowen and Zhang, Haotian and Zou, Zhengxia and Shi, Zhenwei},
  journal={IEEE Geoscience and Remote Sensing Letters},
  title={RSCaMa: Remote Sensing Image Change Captioning With State Space Model},
  year={2024},
  volume={21},
  number={},
  pages={1-5},
  keywords={Decoding;Visualization;Transformers;Task analysis;Solid modeling;Remote sensing;Feature extraction;Change captioning;Mamba;spatial difference-guided SSM;state space model (SSM);temporal traveling SSM},
  doi={10.1109/LGRS.2024.3404604}
}
```
Owner
- Name: Liu Chenyang
- Login: Chen-Yang-Liu
- Kind: user
- Location: Beijing
- Website: https://Chen-Yang-Liu.github.io
- Repositories: 15
- Profile: https://github.com/Chen-Yang-Liu
GitHub Events
Total
- Issues event: 3
- Watch event: 22
- Issue comment event: 3
- Push event: 4
Last Year
- Issues event: 3
- Watch event: 22
- Issue comment event: 3
- Push event: 4
Issues and Pull Requests
Last synced: almost 2 years ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- rginjapan (2)
- keaill (2)
- luchenhao-luke (1)
- yangsunzhe (1)
Dependencies
- Brotli ==1.1.0
- Cython ==3.0.2
- DALL-E ==0.1
- GitPython ==3.1.37
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.3
- Pillow ==10.0.1
- PyYAML ==6.0.1
- Pygments ==2.16.1
- accelerate ==0.23.0
- aiohttp ==3.8.5
- aiosignal ==1.3.1
- antlr4-python3-runtime ==4.9.3
- appdirs ==1.4.4
- argcomplete ==3.3.0
- asttokens ==2.4.0
- async-timeout ==4.0.3
- attrs ==23.1.0
- axial-positional-embedding ==0.2.1
- backcall ==0.2.0
- bitsandbytes ==0.41.1
- black ==23.9.1
- blobfile ==2.0.2
- braceexpand ==0.1.7
- cachetools ==5.3.3
- causal-conv1d ==1.2.0.post2
- certifi ==2023.7.22
- charset-normalizer ==3.2.0
- click ==8.1.7
- cmake ==3.27.5
- coloredlogs ==15.0.1
- dalle-pytorch ==1.6.6
- datasets ==2.14.5
- decorator ==5.1.1
- deepspeed ==0.10.3
- dill ==0.3.7
- docker-pycreds ==0.4.0
- einops ==0.6.1
- exceptiongroup ==1.1.3
- executing ==1.2.0
- filelock ==3.12.4
- fire ==0.5.0
- frozenlist ==1.4.0
- fsspec ==2023.6.0
- ftfy ==6.1.1
- gitdb ==4.0.10
- hjson ==3.1.0
- huggingface-hub ==0.20.2
- humanfriendly ==10.0
- idna ==3.4
- imageio ==2.34.0
- inflate64 ==0.3.1
- iniconfig ==2.0.0
- ipython ==8.15.0
- jedi ==0.19.0
- lazy_loader ==0.3
- lightning-utilities ==0.9.0
- lit ==16.0.6
- loralib ==0.1.2
- lxml ==4.9.3
- mamba-ssm ==1.2.0.post1
- matplotlib-inline ==0.1.6
- mpmath ==1.3.0
- multidict ==6.0.4
- multiprocess ==0.70.15
- multivolumefile ==0.2.3
- mypy ==1.5.1
- mypy-extensions ==1.0.0
- networkx ==3.1
- ninja ==1.11.1
- numpy ==1.26.0
- nvidia-cublas-cu11 ==11.10.3.66
- nvidia-cuda-cupti-cu11 ==11.7.101
- nvidia-cuda-nvrtc-cu11 ==11.7.99
- nvidia-cuda-runtime-cu11 ==11.7.99
- nvidia-cudnn-cu11 ==8.5.0.96
- nvidia-cufft-cu11 ==10.9.0.58
- nvidia-curand-cu11 ==10.2.10.91
- nvidia-cusolver-cu11 ==11.4.0.1
- nvidia-cusparse-cu11 ==11.7.4.91
- nvidia-ml-py ==12.535.161
- nvidia-nccl-cu11 ==2.14.3
- nvidia-nvtx-cu11 ==11.7.91
- nvitop ==1.3.2
- omegaconf ==2.3.0
- openai-clip ==1.0.1
- opencv-python ==4.8.0.76
- optimum ==1.13.2
- packaging ==23.1
- pandas ==2.1.1
- parso ==0.8.3
- pathspec ==0.11.2
- pathtools ==0.1.2
- peft ==0.5.0
- pexpect ==4.8.0
- pickleshare ==0.7.5
- pip ==23.2.1
- pipx ==1.5.0
- platformdirs ==3.10.0
- pluggy ==1.3.0
- prompt-toolkit ==3.0.39
- protobuf ==4.24.3
- psutil ==5.9.8
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- py-cpuinfo ==9.0.0
- py7zr ==0.20.6
- pyarrow ==13.0.0
- pybcj ==1.0.1
- pycryptodomex ==3.19.0
- pydantic ==1.10.12
- pynvml ==11.5.0
- pyppmd ==1.0.0
- pytest ==7.4.2
- pytest-mock ==3.11.1
- python-dateutil ==2.8.2
- pytorch-lightning ==2.0.9
- pytz ==2023.3.post1
- pyzstd ==0.15.9
- regex ==2023.8.8
- requests ==2.31.0
- rotary-embedding-torch ==0.3.0
- safetensors ==0.4.2
- scikit-image ==0.22.0
- scipy ==1.11.2
- sentencepiece ==0.1.99
- sentry-sdk ==1.31.0
- setproctitle ==1.3.2
- setuptools ==68.0.0
- six ==1.16.0
- smmap ==5.0.1
- stack-data ==0.6.2
- sympy ==1.12
- taming-transformers-rom1504 ==0.0.6
- termcolor ==2.4.0
- texttable ==1.6.7
- tifffile ==2024.2.12
- tokenize-rt ==5.2.0
- tokenizers ==0.15.0
- tomli ==2.0.1
- torch ==2.0.1
- torchmetrics ==1.2.0
- torchvision ==0.15.2
- tqdm ==4.66.1
- traitlets ==5.10.0
- transformers ==4.39.3
- triton ==2.0.0
- typing_extensions ==4.8.0
- tzdata ==2023.3
- urllib3 ==2.0.5
- userpath ==1.9.2
- wandb ==0.16.2
- wcwidth ==0.2.6
- webdataset ==0.2.57
- wheel ==0.38.4
- xxhash ==3.3.0
- yarl ==1.9.2
- youtokentome ==1.0.6