whispers2024_lumir
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: jeongho-min
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 22.8 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Gated-MCBAM: Cross-Modal Block Attention Module with Gating Mechanism for Remote Sensing Segmentation
This repository contains the implementation of Gated-MCBAM, a dual-stream framework that combines cross-modal attention with a gating mechanism for multi-source remote sensing segmentation. The model fuses SAR and optical remote sensing data through cross-modal attention between the two streams. The implementation is based on the MMSegmentation framework.
Model Architecture
Our model features:
- Dual-stream architecture for processing SAR and MSI data
- Cross-modal attention mechanism for feature interaction
- Gating mechanism for adaptive feature selection
- Multi-scale feature fusion
- Integration of Swin Transformer and ResNet backbones
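To make the idea of a gating mechanism concrete, here is a minimal NumPy sketch of a generic sigmoid-gated fusion of two feature maps. It is an illustration only and not the repository's module: the function name `gated_fusion`, the 1x1-convolution-style weight matrix, and the tensor shapes are all assumptions chosen for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(feat_a, feat_b, weight, bias):
    """Fuse two (C, H, W) feature maps with a per-pixel, per-channel gate.

    weight has shape (C, 2C) and acts like a 1x1 convolution over the
    concatenated features; bias has shape (C,). Both are hypothetical
    parameters for this sketch.
    """
    stacked = np.concatenate([feat_a, feat_b], axis=0)  # (2C, H, W)
    logits = np.einsum("oc,chw->ohw", weight, stacked) + bias[:, None, None]
    gate = sigmoid(logits)  # values in (0, 1)
    # The gate decides, per location, how much of each stream to keep.
    return gate * feat_a + (1.0 - gate) * feat_b


rng = np.random.default_rng(0)
sar_feat = rng.normal(size=(4, 8, 8))  # stand-in for SAR-stream features
msi_feat = rng.normal(size=(4, 8, 8))  # stand-in for MSI-stream features
w = rng.normal(size=(4, 8)) * 0.1
b = np.zeros(4)

fused = gated_fusion(sar_feat, msi_feat, w, b)
print(fused.shape)  # (4, 8, 8)
```

With all-zero weights and bias the gate is 0.5 everywhere, so the output reduces to the plain average of the two streams; learned weights let the gate favor one stream per location.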
Preprocessing
1. 12-Channel to 10-Channel Conversion
- We use both 12-channel and 10-channel data for ensemble predictions
- The original 12-channel data from YREB-dataset is converted to 10-channel format
- Conversion script: `tools/dataset_converters/12ch-10ch.py`
- Output is stored in the `multisen` folder
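The conversion amounts to selecting a subset of bands from each tile. The sketch below shows that operation on an in-memory array. The README does not state which two bands are dropped, so `KEEP_INDICES` is a purely hypothetical placeholder, and `select_channels` is an illustrative name rather than a function from `12ch-10ch.py`.

```python
import numpy as np

# Hypothetical selection: the README does not say which 10 of the 12 bands
# are kept. Replace with the indices used by the actual conversion script.
KEEP_INDICES = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)


def select_channels(image, keep=KEEP_INDICES):
    """Return a (len(keep), H, W) array taken from a (C, H, W) array."""
    if image.ndim != 3:
        raise ValueError("expected an array shaped (channels, height, width)")
    if max(keep) >= image.shape[0]:
        raise ValueError("a requested channel index is out of range")
    return image[list(keep), :, :]


tile = np.arange(12 * 4 * 4, dtype=np.float32).reshape(12, 4, 4)
reduced = select_channels(tile)
print(reduced.shape)  # (10, 4, 4)
```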
2. SAR Data Processing
- Process VV and VH channels into 3-channel format
- Create a new directory called 'SARAVGTIF' containing:
- Channel 1: VV
- Channel 2: VH
- Channel 3: (VV+VH)/2
- Processing script: `tools/dataset_converters/new_channel_yreb.py`
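The three-channel layout described above can be sketched as follows. This is a minimal illustration working on in-memory arrays, not the contents of `new_channel_yreb.py`; the function name `build_sar_three_channel` and the float32 cast are assumptions.

```python
import numpy as np


def build_sar_three_channel(vv, vh):
    """Stack VV, VH and their per-pixel mean into a (3, H, W) float32 array."""
    vv = np.asarray(vv, dtype=np.float32)
    vh = np.asarray(vh, dtype=np.float32)
    if vv.shape != vh.shape:
        raise ValueError("VV and VH must have the same shape")
    # Channel 1: VV, channel 2: VH, channel 3: (VV + VH) / 2
    return np.stack([vv, vh, (vv + vh) / 2.0], axis=0)


vv = np.array([[1.0, 3.0], [5.0, 7.0]])
vh = np.array([[3.0, 5.0], [7.0, 9.0]])
sar = build_sar_three_channel(vv, vh)
print(sar.shape)  # (3, 2, 2)
print(sar[2])     # per-pixel mean of VV and VH
```

Reading the source bands and writing the result to the SARAVGTIF directory would need a GeoTIFF library such as rasterio (listed in the dependencies); that I/O is left out here to keep the sketch self-contained.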
Training
Prerequisites
- Preprocessed dataset (multisen folder and SARAVGTIF)
- Config files located in `workdir/whisper/` for different methods
Training on a Single GPU
Basic usage:
```bash
python tools/train.py ${CONFIG_FILE} [optional arguments]
```
Example:
```bash
python tools/train.py ./workdir/whisper/Gcbamr50_swin_weight_256x256_upernet_last_v3/config.py
```
Optional Arguments
- `--work-dir ${WORK_DIR}`: Override the working directory
- `--amp`: Enable auto mixed precision training
- `--resume`: Resume from the latest checkpoint in the work_dir automatically
- `--cfg-options ${OVERRIDE_CONFIGS}`: Override config settings. For example:
```bash
python tools/train.py ${CONFIG_FILE} --cfg-options model.encoder.in_channels=6
```
Pretrained Models
The model uses ImageNet pretrained weights for both streams:
- EO Stream: ResNet backbone initialized with ImageNet pretrained weights
- MSI Stream: ConvNeXt/Swin Transformer backbone initialized with ImageNet pretrained weights
- The pretrained weights are downloaded automatically on the first training run
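To clarify what a `--cfg-options` override such as `model.encoder.in_channels=6` does, the sketch below applies a dotted-key assignment to a nested dictionary. It is a conceptual illustration only and not MMEngine's implementation; `apply_override`, `parse_value`, and the sample config contents are hypothetical.

```python
import ast


def parse_value(text):
    """Interpret '6' as int, 'True' as bool, and so on; fall back to str."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, assignment):
    """Apply one 'a.b.c=value' assignment to a nested dict in place."""
    key_path, raw_value = assignment.split("=", 1)
    keys = key_path.split(".")
    node = config
    for key in keys[:-1]:
        # Create intermediate levels if the path does not exist yet.
        node = node.setdefault(key, {})
    node[keys[-1]] = parse_value(raw_value)
    return config


# Hypothetical config contents, used only to show the effect.
cfg = {"model": {"encoder": {"in_channels": 3, "depth": 50}}}
apply_override(cfg, "model.encoder.in_channels=6")
print(cfg["model"]["encoder"])  # {'in_channels': 6, 'depth': 50}
```

Only the addressed leaf changes; sibling keys such as `depth` keep their values from the config file.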
Config Files
Configuration files for different methods can be found in their respective folders under:
workdir/whisper/
└── method_name/
    └── config.py
For detailed training configurations and options, please refer to the MMSegmentation official documentation.
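Given the `method_name/config.py` layout shown above, a short helper can list every available method together with its training command. This is a convenience sketch, not part of the repository; `find_method_configs` is a hypothetical name.

```python
from pathlib import Path


def find_method_configs(root):
    """Map each method folder name under root to its config.py path."""
    root = Path(root)
    return {path.parent.name: path for path in sorted(root.glob("*/config.py"))}


if __name__ == "__main__":
    # Prints nothing if workdir/whisper does not exist or holds no configs.
    for name, path in find_method_configs("workdir/whisper").items():
        print(f"{name}: python tools/train.py {path}")
```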
Model Weights
Pre-trained model weights can be downloaded from: Google Drive Link
Testing
Individual Model Testing
You can test individual models using:
```bash
python tools/test.py \
    --config path/to/config.py \
    --checkpoint path/to/weights.pth
```
Ensemble Testing
- Download the model weights from our Google Drive
- Save the weights in your local directory
- Navigate to the `workdir` folder
- Run `ensemble.py` with appropriate config files and weight paths
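The README does not describe how `ensemble.py` combines the individual predictions. As one common possibility, the sketch below shows soft voting: per-class probability maps from several models are averaged and the highest-scoring class is taken per pixel. The function name `soft_vote`, the optional weights, and the toy inputs are assumptions; the actual script may work differently.

```python
import numpy as np


def soft_vote(prob_maps, weights=None):
    """Average (C, H, W) class-probability maps and return an (H, W) label map."""
    stacked = np.stack(prob_maps, axis=0)  # (M, C, H, W) for M models
    if weights is None:
        weights = np.ones(len(prob_maps))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    mean_prob = np.tensordot(weights, stacked, axes=1)  # (C, H, W)
    return mean_prob.argmax(axis=0)


# Two toy models, 2 classes, a 1x2 image.
model_a = np.array([[[0.9, 0.4]], [[0.1, 0.6]]])
model_b = np.array([[[0.6, 0.1]], [[0.4, 0.9]]])

labels = soft_vote([model_a, model_b])
print(labels)  # [[0 1]]
```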
Directory Structure
├── tools
│   ├── test.py
│   ├── train.py
│   └── dataset_converters
│       ├── new_channel_yreb.py
│       └── 12ch-10ch.py
└── workdir
    └── ensemble.py
Contact
For any questions or issues, please contact:
- Email: jeongho.min@unist.ac.kr
License
[License information to be added]
Citation
If you find this work useful in your research, please consider citing:
[Citation information to be added]
Owner
- Login: jeongho-min
- Kind: user
- Company: Lumir
- Repositories: 1
- Profile: https://github.com/jeongho-min
Email: jeongho.min@unist.ac.kr
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMSegmentation Contributors"
title: "OpenMMLab Semantic Segmentation Toolbox and Benchmark"
date-released: 2020-07-10
url: "https://github.com/open-mmlab/mmsegmentation"
license: Apache-2.0
GitHub Events
Total
- Watch event: 2
- Public event: 1
- Push event: 16
Last Year
- Watch event: 2
- Public event: 1
- Push event: 16
Dependencies
- Markdown ==3.7
- PyYAML ==6.0.2
- Pygments ==2.18.0
- Werkzeug ==3.0.4
- absl-py ==2.1.0
- addict ==2.4.0
- affine ==2.4.0
- aliyun-python-sdk-core ==2.15.2
- aliyun-python-sdk-kms ==2.16.5
- attrs ==24.2.0
- cachetools ==5.5.0
- cffi ==1.17.1
- click ==8.1.7
- click-plugins ==1.1.1
- cligj ==0.7.2
- colorama ==0.4.6
- contourpy ==1.1.1
- crcmod ==1.7
- cryptography ==43.0.1
- cycler ==0.12.1
- einops ==0.8.0
- filelock ==3.14.0
- fonttools ==4.54.1
- fsspec ==2024.9.0
- ftfy ==6.2.3
- future ==1.0.0
- google-auth ==2.35.0
- google-auth-oauthlib ==1.0.0
- grpcio ==1.66.2
- huggingface-hub ==0.25.1
- idna ==3.10
- imagecodecs ==2023.3.16
- imageio ==2.35.1
- importlib_metadata ==8.5.0
- importlib_resources ==6.4.5
- jmespath ==0.10.0
- joblib ==1.4.2
- kiwisolver ==1.4.7
- markdown-it-py ==3.0.0
- matplotlib ==3.7.5
- mdurl ==0.1.2
- mkl-service ==2.4.0
- mmcv ==2.0.0rc4
- mmengine ==0.10.5
- model-index ==0.1.11
- oauthlib ==3.2.2
- opencv-python ==4.10.0.84
- opendatalab ==0.0.10
- openmim ==0.3.9
- openxlab ==0.1.1
- ordered-set ==4.1.0
- oss2 ==2.17.0
- packaging ==24.1
- pandas ==2.0.3
- platformdirs ==4.3.6
- prettytable ==3.11.0
- protobuf ==5.28.2
- pyasn1 ==0.6.1
- pyasn1_modules ==0.4.1
- pycparser ==2.22
- pycryptodome ==3.20.0
- pyparsing ==3.1.4
- python-dateutil ==2.9.0.post0
- pytz ==2023.4
- rasterio ==1.3.11
- regex ==2024.9.11
- requests ==2.28.2
- requests-oauthlib ==2.0.0
- rich ==13.4.2
- rsa ==4.9
- safetensors ==0.4.5
- scikit-learn ==1.3.2
- scipy ==1.10.1
- six ==1.16.0
- snuggs ==1.4.7
- tabulate ==0.9.0
- tensorboard ==2.14.0
- tensorboard-data-server ==0.7.2
- termcolor ==2.4.0
- threadpoolctl ==3.5.0
- tifffile ==2023.7.10
- timm ==1.0.9
- tomli ==2.0.1
- torch ==1.12.1
- torchaudio ==0.12.1
- torchvision ==0.13.1
- tqdm ==4.65.2
- triton ==2.0.0
- tzdata ==2024.2
- urllib3 ==1.26.20
- wcwidth ==0.2.13
- yapf ==0.40.2
- zipp ==3.20.2
- _libgcc_mutex 0.1
- _openmp_mutex 5.1
- binutils 2.36.1
- binutils_impl_linux-64 2.36.1
- binutils_linux-64 2.36
- blas 1.0
- brotli-python 1.0.9
- bzip2 1.0.8
- c-compiler 1.1.2
- ca-certificates 2024.8.30
- certifi 2024.8.30
- charset-normalizer 3.3.2
- cuda-cudart 11.7.99
- cuda-cupti 11.7.101
- cuda-libraries 11.7.1
- cuda-nvrtc 11.7.99
- cuda-nvtx 11.7.91
- cuda-runtime 11.7.1
- cuda-version 12.6
- cudatoolkit-dev 11.7.0
- cxx-compiler 1.1.2
- ffmpeg 4.3
- freetype 2.12.1
- gcc_impl_linux-64 7.5.0
- gcc_linux-64 7.5.0
- gmp 6.2.1
- gmpy2 2.1.2
- gnutls 3.6.15
- gxx_impl_linux-64 7.5.0
- gxx_linux-64 7.5.0
- intel-openmp 2023.1.0
- jinja2 3.1.4
- jpeg 9e
- kernel-headers_linux-64 3.10.0
- lame 3.100
- lcms2 2.12
- ld_impl_linux-64 2.36.1
- lerc 3.0
- libcublas 11.10.3.66
- libcufft 10.7.2.124
- libcufile 1.11.1.6
- libcurand 10.3.7.68
- libcusolver 11.4.0.1
- libcusparse 11.7.4.91
- libdeflate 1.17
- libffi 3.4.4
- libgcc-ng 11.2.0
- libgomp 11.2.0
- libiconv 1.16
- libidn2 2.3.4
- libnpp 11.7.4.75
- libnvjpeg 11.8.0.2
- libpng 1.6.39
- libstdcxx-ng 11.2.0
- libtasn1 4.19.0
- libtiff 4.5.1
- libunistring 0.9.10
- libwebp-base 1.3.2
- libxcb 1.12
- libxcrypt 4.4.28
- lz4-c 1.9.4
- markupsafe 2.1.3
- mkl 2023.1.0
- mkl-service 2.4.0
- mkl_fft 1.3.8
- mkl_random 1.2.4
- mpc 1.1.0
- mpfr 4.0.2
- mpmath 1.3.0
- ncurses 6.4
- nettle 3.7.3
- networkx 3.1
- ninja 1.11.0
- numpy-base 1.24.3
- openh264 2.1.1
- openjpeg 2.5.2
- openssl 3.0.15
- pillow 10.4.0
- pip 24.2
- pysocks 1.7.1
- python 3.8.19
- pytorch-cuda 11.7
- pytorch-mutex 1.0
- readline 8.2
- sqlite 3.45.3
- sympy 1.13.2
- sysroot_linux-64 2.17
- tbb 2021.8.0
- tk 8.6.14
- torchtriton 2.0.0
- typing_extensions 4.11.0
- wheel 0.44.0
- xorg-kbproto 1.0.7
- xorg-libx11 1.7.2
- xorg-libxext 1.3.4
- xorg-xextproto 7.3.0
- xorg-xproto 7.0.31
- xz 5.4.6
- zlib 1.2.13
- zstd 1.5.5