https://github.com/chen-yang-liu/rscama

[IEEE GRSL 2024 🔥] RSCaMa: Remote Sensing Image Change Captioning with State Space Model


Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • ○
    CITATION.cff file
  • ✓
    codemeta.json file
    Found codemeta.json file
  • ✓
    .zenodo.json file
    Found .zenodo.json file
  • ✓
    DOI references
    Found 1 DOI reference(s) in README
  • ✓
    Academic publication links
    Links to: arxiv.org, ieee.org
  • ○
    Academic email domains
  • ○
    Institutional organization owner
  • ○
    JOSS paper metadata
  • ○
    Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary

Keywords

change-captioning change-detection mamba remote-sensing
Last synced: 6 months ago · JSON representation

Repository

[IEEE GRSL 2024 🔥] RSCaMa: Remote Sensing Image Change Captioning with State Space Model

Basic Info
  • Host: GitHub
  • Owner: Chen-Yang-Liu
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 64.9 MB
Statistics
  • Stars: 69
  • Watchers: 2
  • Forks: 3
  • Open Issues: 4
  • Releases: 0
Topics
change-captioning change-detection mamba remote-sensing
Created almost 2 years ago · Last pushed 7 months ago
Metadata Files
Readme

README.md

RSCaMa: Remote Sensing Image Change Captioning with State Space Model


🏆 ESI Highly Cited Paper


Give us a ⭐ if you're interested in this repo.

✅ This repository contains the PyTorch implementation of "RSCaMa: Remote Sensing Image Change Captioning with State Space Model".


News:

🔥 Remote Sensing Spatio-Temporal Vision-Language Models: A Comprehensive Survey: [Paper] [Github]

Installation and Dependencies

git clone https://github.com/Chen-Yang-Liu/RSCaMa.git
cd RSCaMa
conda create -n RSCaMa_env python=3.9
conda activate RSCaMa_env
pip install -r requirements.txt

Data Preparation

  • Download the LEVIR_CC dataset: LEVIR-CC.
  • The data structure of LEVIR-CC is organized as follows:

├─/root/Data/LEVIR_CC/
        ├─LevirCCcaptions.json
        ├─images
             ├─train
             │  ├─A
             │  ├─B
             ├─val
             │  ├─A
             │  ├─B
             ├─test
             │  ├─A
             │  ├─B

where folder A contains the pre-phase images and folder B contains the post-phase images.
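Before running the preprocessing step, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch, not part of the repository: the function name `missing_levir_cc_paths` is hypothetical, and the root path is simply the one shown in the tree.

```python
from pathlib import Path

# Expected sub-folders, taken from the layout described above.
SPLITS = ("train", "val", "test")
PHASES = ("A", "B")  # A: pre-phase images, B: post-phase images


def missing_levir_cc_paths(root):
    """Return the expected LEVIR-CC paths that do not exist under `root`."""
    root = Path(root)
    expected = [root / "LevirCCcaptions.json"]
    expected += [root / "images" / s / p for s in SPLITS for p in PHASES]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Adjust the root to wherever the dataset was extracted.
    for path in missing_levir_cc_paths("/root/Data/LEVIR_CC"):
        print("missing:", path)
```

An empty result means the captions file and all six image folders were found.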

  • Extract text files for the change descriptions of each image pair in LEVIR-CC:

python preprocess_data.py --input_captions_json /DATA_PATH/Levir-CC-dataset/LevirCCcaptions.json

NOTE: When preparing the text token files, we suggest setting the word count threshold to 5 for LEVIR-CC and to 0 for Dubai_CC for fair comparisons.
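To illustrate what a word count threshold does, here is a small sketch under stated assumptions. It is not the code of preprocess_data.py: the helper names, the special tokens, and the choice to keep words seen at least `threshold` times are all assumptions made for this example.

```python
from collections import Counter


def build_vocab(captions, threshold):
    """Map words seen at least `threshold` times to integer ids.

    `captions` is a list of token lists. The special tokens below are
    hypothetical names chosen for this sketch, not the repository's.
    """
    counts = Counter(token for caption in captions for token in caption)
    kept = sorted(word for word, n in counts.items() if n >= threshold)
    specials = ["<NULL>", "<UNK>", "<START>", "<END>"]
    return {word: idx for idx, word in enumerate(specials + kept)}


def encode(caption, vocab):
    """Replace out-of-vocabulary tokens with the <UNK> id."""
    unk = vocab["<UNK>"]
    return [vocab.get(token, unk) for token in caption]


if __name__ == "__main__":
    toy = [["a", "road", "appears"], ["a", "house", "appears"]]
    vocab = build_vocab(toy, threshold=2)
    # "road" occurs only once, so it is encoded as <UNK>.
    print(encode(["a", "road", "appears"], vocab))
```

With a threshold of 0 every word is kept, which suits a small dataset; a higher threshold trades rare words for a smaller vocabulary.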

NOTE

  • Please modify the source code of the CLIP package: change CLIP.model.VisionTransformer.forward() as shown in [this].
  • Mamba is only supported on Linux systems.
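A quick environment check can catch both points before training starts. This is a minimal sketch using only the standard library; the function name is hypothetical, and the import names `mamba_ssm` and `clip` are the ones used by the `mamba-ssm` and `openai-clip` packages listed in the dependencies.

```python
import importlib.util
import platform


def mamba_environment_report():
    """Report whether this machine looks ready for the Mamba dependency."""
    return {
        "is_linux": platform.system() == "Linux",
        # find_spec returns None when a top-level package is not installed.
        "mamba_ssm_installed": importlib.util.find_spec("mamba_ssm") is not None,
        "clip_installed": importlib.util.find_spec("clip") is not None,
    }


if __name__ == "__main__":
    for key, ok in mamba_environment_report().items():
        print(f"{key}: {ok}")
```

The check only confirms that the packages are importable. It cannot tell whether the CLIP source has been modified as required.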

Training

python train_CC.py --data_folder /DATA_PATH/Levir-CC-dataset/images

NOTE: If the program fails with the error "'Meteor' object has no attribute 'lock'", we recommend installing Java with sudo apt install openjdk-11-jdk to resolve the issue.
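Since the suggested fix is to install a JDK, one way to check up front whether Java is reachable is sketched below. The helper name `java_version` is hypothetical, and the sketch assumes the error is caused by a missing `java` executable on PATH.

```python
import shutil
import subprocess


def java_version():
    """Return the `java -version` banner, or None if Java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` prints its banner to stderr, not stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    banner = java_version()
    print(banner if banner else "Java not found; install openjdk-11-jdk first.")
```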

Evaluate

python test.py --data_folder /DATA_PATH/Levir-CC-dataset/images --checkpoint xxxx.pth

Alternatively, you can download our pretrained model here: [Hugging Face].

Experiment:






Citation:

```
@ARTICLE{liu2024rscama,
  author={Liu, Chenyang and Chen, Keyan and Chen, Bowen and Zhang, Haotian and Zou, Zhengxia and Shi, Zhenwei},
  journal={IEEE Geoscience and Remote Sensing Letters},
  title={RSCaMa: Remote Sensing Image Change Captioning With State Space Model},
  year={2024},
  volume={21},
  number={},
  pages={1-5},
  keywords={Decoding;Visualization;Transformers;Task analysis;Solid modeling;Remote sensing;Feature extraction;Change captioning;Mamba;spatial difference-guided SSM;state space model (SSM);temporal traveling SSM},
  doi={10.1109/LGRS.2024.3404604}
}
```

Owner

  • Name: Liu Chenyang
  • Login: Chen-Yang-Liu
  • Kind: user
  • Location: Beijing


GitHub Events

Total
  • Issues event: 3
  • Watch event: 22
  • Issue comment event: 3
  • Push event: 4
Last Year
  • Issues event: 3
  • Watch event: 22
  • Issue comment event: 3
  • Push event: 4

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rginjapan (2)
  • keaill (2)
  • luchenhao-luke (1)
  • yangsunzhe (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirement.txt pypi
  • Brotli ==1.1.0
  • Cython ==3.0.2
  • DALL-E ==0.1
  • GitPython ==3.1.37
  • Jinja2 ==3.1.2
  • MarkupSafe ==2.1.3
  • Pillow ==10.0.1
  • PyYAML ==6.0.1
  • Pygments ==2.16.1
  • accelerate ==0.23.0
  • aiohttp ==3.8.5
  • aiosignal ==1.3.1
  • antlr4-python3-runtime ==4.9.3
  • appdirs ==1.4.4
  • argcomplete ==3.3.0
  • asttokens ==2.4.0
  • async-timeout ==4.0.3
  • attrs ==23.1.0
  • axial-positional-embedding ==0.2.1
  • backcall ==0.2.0
  • bitsandbytes ==0.41.1
  • black ==23.9.1
  • blobfile ==2.0.2
  • braceexpand ==0.1.7
  • cachetools ==5.3.3
  • causal-conv1d ==1.2.0.post2
  • certifi ==2023.7.22
  • charset-normalizer ==3.2.0
  • click ==8.1.7
  • cmake ==3.27.5
  • coloredlogs ==15.0.1
  • dalle-pytorch ==1.6.6
  • datasets ==2.14.5
  • decorator ==5.1.1
  • deepspeed ==0.10.3
  • dill ==0.3.7
  • docker-pycreds ==0.4.0
  • einops ==0.6.1
  • exceptiongroup ==1.1.3
  • executing ==1.2.0
  • filelock ==3.12.4
  • fire ==0.5.0
  • frozenlist ==1.4.0
  • fsspec ==2023.6.0
  • ftfy ==6.1.1
  • gitdb ==4.0.10
  • hjson ==3.1.0
  • huggingface-hub ==0.20.2
  • humanfriendly ==10.0
  • idna ==3.4
  • imageio ==2.34.0
  • inflate64 ==0.3.1
  • iniconfig ==2.0.0
  • ipython ==8.15.0
  • jedi ==0.19.0
  • lazy_loader ==0.3
  • lightning-utilities ==0.9.0
  • lit ==16.0.6
  • loralib ==0.1.2
  • lxml ==4.9.3
  • mamba-ssm ==1.2.0.post1
  • matplotlib-inline ==0.1.6
  • mpmath ==1.3.0
  • multidict ==6.0.4
  • multiprocess ==0.70.15
  • multivolumefile ==0.2.3
  • mypy ==1.5.1
  • mypy-extensions ==1.0.0
  • networkx ==3.1
  • ninja ==1.11.1
  • numpy ==1.26.0
  • nvidia-cublas-cu11 ==11.10.3.66
  • nvidia-cuda-cupti-cu11 ==11.7.101
  • nvidia-cuda-nvrtc-cu11 ==11.7.99
  • nvidia-cuda-runtime-cu11 ==11.7.99
  • nvidia-cudnn-cu11 ==8.5.0.96
  • nvidia-cufft-cu11 ==10.9.0.58
  • nvidia-curand-cu11 ==10.2.10.91
  • nvidia-cusolver-cu11 ==11.4.0.1
  • nvidia-cusparse-cu11 ==11.7.4.91
  • nvidia-ml-py ==12.535.161
  • nvidia-nccl-cu11 ==2.14.3
  • nvidia-nvtx-cu11 ==11.7.91
  • nvitop ==1.3.2
  • omegaconf ==2.3.0
  • openai-clip ==1.0.1
  • opencv-python ==4.8.0.76
  • optimum ==1.13.2
  • packaging ==23.1
  • pandas ==2.1.1
  • parso ==0.8.3
  • pathspec ==0.11.2
  • pathtools ==0.1.2
  • peft ==0.5.0
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • pip ==23.2.1
  • pipx ==1.5.0
  • platformdirs ==3.10.0
  • pluggy ==1.3.0
  • prompt-toolkit ==3.0.39
  • protobuf ==4.24.3
  • psutil ==5.9.8
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • py-cpuinfo ==9.0.0
  • py7zr ==0.20.6
  • pyarrow ==13.0.0
  • pybcj ==1.0.1
  • pycryptodomex ==3.19.0
  • pydantic ==1.10.12
  • pynvml ==11.5.0
  • pyppmd ==1.0.0
  • pytest ==7.4.2
  • pytest-mock ==3.11.1
  • python-dateutil ==2.8.2
  • pytorch-lightning ==2.0.9
  • pytz ==2023.3.post1
  • pyzstd ==0.15.9
  • regex ==2023.8.8
  • requests ==2.31.0
  • rotary-embedding-torch ==0.3.0
  • safetensors ==0.4.2
  • scikit-image ==0.22.0
  • scipy ==1.11.2
  • sentencepiece ==0.1.99
  • sentry-sdk ==1.31.0
  • setproctitle ==1.3.2
  • setuptools ==68.0.0
  • six ==1.16.0
  • smmap ==5.0.1
  • stack-data ==0.6.2
  • sympy ==1.12
  • taming-transformers-rom1504 ==0.0.6
  • termcolor ==2.4.0
  • texttable ==1.6.7
  • tifffile ==2024.2.12
  • tokenize-rt ==5.2.0
  • tokenizers ==0.15.0
  • tomli ==2.0.1
  • torch ==2.0.1
  • torchmetrics ==1.2.0
  • torchvision ==0.15.2
  • tqdm ==4.66.1
  • traitlets ==5.10.0
  • transformers ==4.39.3
  • triton ==2.0.0
  • typing_extensions ==4.8.0
  • tzdata ==2023.3
  • urllib3 ==2.0.5
  • userpath ==1.9.2
  • wandb ==0.16.2
  • wcwidth ==0.2.6
  • webdataset ==0.2.57
  • wheel ==0.38.4
  • xxhash ==3.3.0
  • yarl ==1.9.2
  • youtokentome ==1.0.6
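Because every dependency above is pinned to an exact version, comparing the active environment against the pins can help diagnose installation problems. The following is a minimal sketch with hypothetical helper names. It assumes pins are written as `name==version`, with optional spaces around `==`.

```python
from importlib import metadata


def parse_pins(lines):
    """Parse `name==version` lines (spaces optional) into a dict."""
    pins = {}
    for line in lines:
        line = line.strip()
        if "==" not in line or line.startswith("#"):
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def mismatched(pins):
    """Return {name: (wanted, installed_or_None)} for pins that differ."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    # A few pins copied from the list above.
    sample = ["torch ==2.0.1", "mamba-ssm ==1.2.0.post1", "causal-conv1d ==1.2.0.post2"]
    for name, (wanted, installed) in mismatched(parse_pins(sample)).items():
        print(f"{name}: wanted {wanted}, found {installed}")
```

An installed value of `None` means the package is not present in the active environment.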