Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.6%) to scientific vocabulary
Repository
WWW25@CV-ISLR
Basic Info
Statistics
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
WWW25@CV-ISLR
This repository contains our implementation for the Cross-View Isolated Sign Language Recognition (CV-ISLR) task submitted to the WWW 2025 competition. Our approach combines Ensemble Learning and Video Swin Transformer (VST) modules to address the challenges of cross-view sign language recognition. The framework is built on top of the MMAction2 v1.2.0 library.
Main Contributions
- Ensemble Learning Integration: We integrate Ensemble Learning into the CV-ISLR framework, enhancing robustness and generalization to handle viewpoint variability effectively.
- Multi-Dimensional VST Blocks: We use VST blocks of varying sizes (Small, Base, Large) for both RGB and Depth videos, capturing features at multiple levels of granularity to improve recognition accuracy.
Installation
To set up the environment, follow these steps:
- Clone the repository:
```bash
git clone https://github.com/Jiafei127/CV-ISLR.git
cd CV-ISLR
```
- Install dependencies:
```bash
conda create -n cvislr python=3.8 -y
conda activate cvislr
conda install pytorch torchvision -c pytorch  # Installs the latest PyTorch and CUDA toolkit; check that they match your environment.
pip install -U openmim
mim install mmengine
mim install mmcv
mim install mmdet
mim install mmpose
```
- Install MMAction2 v1.2.0:
```bash
pip install -v -e .
```
Training
To train the models for RGB and Depth inputs:
- Prepare the dataset: Download and preprocess the MM-WLAuslan dataset following the instructions in dataset/README.md.
- Train the backbone models:
```bash
python tools/train.py configs/recognition/swin/swin-<file_name>_rgb.py
python tools/train.py configs/recognition/swin/swin-<file_name>_depth.py
```
- Save model checkpoints: After training, checkpoints are saved in the work_dirs/ folder.
- Inference:
```bash
PORT=29500 bash tools/dist_test.sh configs/recognition/swin/swin-<file_name>_rgb.py ./work_dirs/swin-<checkpoint_name>.pth --dump result.pkl
```
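The dumped result.pkl can be inspected offline with a short script. Below is a minimal sketch, assuming the dump is a pickled list of per-sample records each carrying a class-score vector; the exact schema depends on the MMAction2 version, and the field name `pred_score` here is an assumption, not taken from the repository:

```python
import pickle
import numpy as np

# Hypothetical records mimicking a dumped result.pkl; the real schema
# depends on the MMAction2 version in use.
records = [
    {"pred_score": np.array([0.1, 0.7, 0.2])},
    {"pred_score": np.array([0.5, 0.3, 0.2])},
]
with open("result.pkl", "wb") as f:
    pickle.dump(records, f)

# Load the dump and take the arg-max class per sample (the top-1 prediction).
with open("result.pkl", "rb") as f:
    results = pickle.load(f)
top1 = [int(np.argmax(r["pred_score"])) for r in results]
print(top1)  # [1, 0]
```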
Ensemble Learning
After training the individual models, apply the ensemble strategy (our model checkpoints can be downloaded from Hugging Face):
- Merge predictions from multiple backbones:
```bash
cd ./ENSEMBLE
python ensemble.py
```
- Submit the zip file answer.zip to CodaLab.
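The merging performed by ensemble.py is not shown in this README, but score-level fusion of this kind is typically a (possibly weighted) average of the per-class scores from each backbone. Below is a minimal sketch of that idea, assuming each model produces a softmax score matrix of shape (num_samples, num_classes); the function name, weights, and toy scores are illustrative, not taken from the repository:

```python
import numpy as np

def ensemble_scores(score_mats, weights=None):
    """Weighted average of per-model class-score matrices.

    score_mats: list of (num_samples, num_classes) arrays, one per model.
    weights: optional per-model weights; defaults to a uniform average.
    """
    stacked = np.stack(score_mats)                 # (num_models, N, C)
    if weights is None:
        weights = np.ones(len(score_mats))
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()                       # normalize weights to sum to 1
    return np.tensordot(weights, stacked, axes=1)  # (N, C)

# Toy scores from three hypothetical backbones (e.g. VST-Small/Base/Large).
small = np.array([[0.6, 0.4], [0.2, 0.8]])
base  = np.array([[0.5, 0.5], [0.3, 0.7]])
large = np.array([[0.7, 0.3], [0.1, 0.9]])

fused = ensemble_scores([small, base, large])
pred = fused.argmax(axis=1)
print(pred)  # [0 1]
```

Averaging scores rather than hard labels lets a confident model outvote two uncertain ones, which is one common reason score-level fusion is preferred for this kind of ensemble.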
Performance
Top-1 Accuracy Results
| Team | RGB Acc@1 | RGB-D Acc@1 |
|---------------|-----------|-------------|
| VIPL-SLR | 56.87% | 57.97% |
| tonicemerald | 40.30% | 33.97% |
| gkdx2 (Ours) | 20.29% | 24.53% |
Table 1: The top-3 results for CV-ISLR on RGB and RGB-D tracks.
| Backbone | RGB-based | Depth-based | RGB-D-based |
|-------------|-----------|-------------|-------------|
| VST-Small | 14.84% | 14.01% | - |
| VST-Base | 17.51% | 16.46% | - |
| VST-Large | 17.04% | 17.58% | - |
| Ensemble | 20.29% | - | 24.53% |
Table 2: Experimental results for RGB and RGB-D tracks on different backbones.
Acknowledgements
This project is built on the MMAction2 framework and utilizes the MM-WLAuslan dataset. We thank the developers for their contributions to open-source tools and datasets.
Owner
- Name: Fei Wang
- Login: Jiafei127
- Kind: user
- Repositories: 1
- Profile: https://github.com/Jiafei127
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMAction2 Contributors"
title: "OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark"
date-released: 2020-07-21
url: "https://github.com/open-mmlab/mmaction2"
license: Apache-2.0
GitHub Events
Total
- Watch event: 5
- Push event: 11
- Pull request event: 2
- Fork event: 1
Last Year
- Watch event: 5
- Push event: 11
- Pull request event: 2
- Fork event: 1
Dependencies
- Pillow *
- decord >=0.4.1
- einops *
- matplotlib *
- numpy *
- opencv-contrib-python *
- scipy *
- torch >=1.3
- docutils ==0.18.1
- einops *
- modelindex *
- myst-parser *
- opencv-python *
- scipy *
- sphinx ==6.1.3
- sphinx-notfound-page *
- sphinx-tabs *
- sphinx_copybutton *
- sphinx_markdown_tables *
- sphinxcontrib-jquery *
- tabulate *
- mmcv >=2.0.0rc4,<2.2.0
- mmengine >=0.7.1,<1.0.0
- transformers >=4.28.0
- PyTurboJPEG *
- av >=9.0
- future *
- imgaug *
- librosa *
- lmdb *
- moviepy *
- openai-clip *
- packaging *
- pims *
- soundfile *
- tensorboard *
- wandb *
- mmcv *
- titlecase *
- torch *
- torchvision *
- coverage * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- parameterized * test
- pytest * test
- pytest-runner * test
- xdoctest >=0.10.0 test
- yapf * test
- ca-certificates 2020.1.1.*
- certifi 2020.4.5.1.*
- ffmpeg 2.8.6.*
- libcxx 10.0.0.*
- libedit 3.1.20181209.*
- libffi 3.3.*
- ncurses 6.2.*
- openssl 1.1.1g.*
- pip 20.0.2.*
- python 3.7.7.*
- readline 8.0.*
- setuptools 46.4.0.*
- sqlite 3.31.1.*
- tk 8.6.8.*
- wheel 0.34.2.*
- xz 5.2.5.*
- zlib 1.2.11.*