sign-language-recognition
Implementation of isolated sign language recognition using a CNN
Statistics
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Sign Language Recognition
Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation
Overview
This repository contains the code for the paper:
Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).
We propose a novel approach to Sign Language Recognition (SLR) using skeleton image representations, achieving state-of-the-art results on Brazilian Sign Language datasets.
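To make the idea of a skeleton image concrete, here is a minimal sketch. It is not the encoding used in the paper. It assumes each frame is a list of (x, y) landmark coordinates already normalized to [0, 1], and it lays the sequence out as a landmarks × frames grid whose two channels hold the scaled x and y values. The name `skeleton_to_image` and the channel layout are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Tuple[float, float]]  # one (x, y) pair per landmark


def skeleton_to_image(frames: Sequence[Frame]) -> List[List[Tuple[int, int]]]:
    """Encode a landmark sequence as a landmarks x frames grid of (x, y) pixels.

    Hypothetical layout: row = landmark index, column = frame index,
    channels = coordinates scaled from [0, 1] to [0, 255].
    """
    if not frames:
        return []
    n_landmarks = len(frames[0])
    if any(len(frame) != n_landmarks for frame in frames):
        raise ValueError("every frame must have the same number of landmarks")

    def to_pixel(value: float) -> int:
        # Clamp to [0, 1] before scaling so out-of-range detections stay valid.
        return round(min(max(value, 0.0), 1.0) * 255)

    return [
        [(to_pixel(frame[j][0]), to_pixel(frame[j][1])) for frame in frames]
        for j in range(n_landmarks)
    ]


# Two frames with three landmarks each (toy values, not real data).
toy = [
    [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)],
    [(0.1, 0.2), (0.5, 0.6), (1.2, -0.1)],
]
image = skeleton_to_image(toy)
print(len(image), len(image[0]))  # rows = landmarks, columns = frames
```

A grid like this can be resized and fed to an image classifier such as a CNN, which is the general idea behind treating skeleton sequences as images.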
Table of Contents
- Overview
- Disclaimer
- Requirements & Installation
- Dataset
- Directory Structure
- Preprocessing
- Training
- Results
- Citation
- Contributing
- License
Disclaimer
This repository is being refactored. The code was originally written for experiments and is being made more user-friendly.
If you only want to train the model, use the sign-language-recognition-model repository for a simpler, quick-start version.
For questions, please open an issue.
Requirements & Installation
- Python 3.7
- Required libraries are listed in requirements.txt.
Install dependencies:
bash
pip install -r requirements.txt
Dataset
We use two Brazilian Sign Language (Libras) datasets:
MINDS-Libras
- 20 signs, 12 signers, 5 repetitions per sign.
- Dataset on Zenodo
- Paper
Libras-UFOP
- 56 signs, 5 signers, 8–16 repetitions per sign.
- Paper
Preprocessed Datasets
For convenience, we provide preprocessed versions of the datasets with extracted landmarks:
MINDS-Libras: Download preprocessed data
- Place at: 00_datasets/dataset_output/libras_minds/libras_minds_openpose.csv
Include-50: Download preprocessed data
- Place at: 00_datasets/dataset_output/include50/include50_openpose.csv
KSL (Korean Sign Language): Download preprocessed data
- Place at: 00_datasets/dataset_output/KSL/ksl_openpose.csv
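After downloading, a quick check that each file sits where the code expects it can save a failed training run. The sketch below uses only the standard library and the three paths listed above. It assumes the first row of each CSV is a header, and it makes no assumption about column names. The helper names `read_header` and `check_datasets` are hypothetical.

```python
import csv
from pathlib import Path
from typing import Dict, List, Optional

# Paths as listed above, relative to the repository root.
EXPECTED_FILES = {
    "MINDS-Libras": "00_datasets/dataset_output/libras_minds/libras_minds_openpose.csv",
    "Include-50": "00_datasets/dataset_output/include50/include50_openpose.csv",
    "KSL": "00_datasets/dataset_output/KSL/ksl_openpose.csv",
}


def read_header(path: Path) -> Optional[List[str]]:
    """Return the first CSV row (assumed to be a header), or None if the file is missing."""
    if not path.is_file():
        return None
    with path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def check_datasets(root: Path = Path(".")) -> Dict[str, Optional[List[str]]]:
    """Map each dataset name to its CSV header, or None when the file is absent."""
    return {name: read_header(root / rel) for name, rel in EXPECTED_FILES.items()}


# Run from the repository root to report which files are in place.
for name, header in check_datasets().items():
    status = "missing" if header is None else f"found, {len(header)} columns"
    print(f"{name}: {status}")
```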
Directory Structure
- 01_landmarks_extraction/ – Extract landmark points from videos using OpenPose.
- 02_data_processing/ – Join CSV files into a single dataset.
- 03_model_training/ – Model training scripts and model definitions.
- 04_result_analysis/ – Notebooks and scripts for analyzing results.
- 00_data_exploration/ – Data exploration scripts and notebooks.
- 99_model_output/ – Model outputs.
- 99_old/ – Legacy scripts and notebooks.
- 99_others/ – Miscellaneous scripts and data.
- 99_skeleton_explore/ – Skeleton-based experiments.
Preprocessing
- Extract Landmarks:
  - Use scripts in 01_landmarks_extraction/ to extract landmark points from videos (e.g., with OpenPose).
- Join CSVs:
  - Use scripts in 02_data_processing/ to merge CSVs into a single dataset file.
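The join step can be pictured with the following sketch. It is not the repository's script. It assumes the extraction step wrote one CSV per video and that all files share the same header row. The function name `join_csvs` is hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable


def join_csvs(inputs: Iterable[Path], output: Path) -> int:
    """Concatenate CSV files that share one header; return the number of data rows written."""
    rows_written = 0
    header = None
    with output.open("w", newline="", encoding="utf-8") as out_handle:
        writer = csv.writer(out_handle)
        for path in sorted(inputs):
            with path.open(newline="", encoding="utf-8") as in_handle:
                reader = csv.reader(in_handle)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first non-empty file defines the header of the output.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

It could be called as `join_csvs(Path("per_video_csvs").glob("*.csv"), Path("dataset.csv"))`, where both paths are placeholders. Write the output outside the input directory so a later run does not pick it up as an input.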
Training
- Training scripts are in 03_model_training/.
- Most hyperparameters are parameterized, but some may need adjustment per dataset.
- See 03_model_training/README.md for details.
Example:
bash
python 03_model_training/model_training.py --config your_config.yaml
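When training on several datasets, the command above can be wrapped in a small driver. The sketch below reuses only the script path and the `--config` flag from the example. The config file names are hypothetical placeholders, and `dry_run=True` prints the commands without launching anything.

```python
import subprocess
import sys
from typing import List, Sequence

TRAIN_SCRIPT = "03_model_training/model_training.py"  # path from the example above


def build_command(config_path: str) -> List[str]:
    """Build the training command for one config file."""
    return [sys.executable, TRAIN_SCRIPT, "--config", config_path]


def run_all(configs: Sequence[str], dry_run: bool = True) -> List[List[str]]:
    """Run (or, with dry_run, only print) one training command per config."""
    commands = [build_command(config) for config in configs]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failing run
    return commands


# Hypothetical config names; replace them with your own files.
run_all(["minds_libras.yaml", "libras_ufop.yaml"], dry_run=True)
```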
Results
Our model achieves strong performance across multiple sign language datasets:
Brazilian Sign Language Datasets
- MINDS-Libras:
  - Accuracy: 0.93
  - +2 percentage points in accuracy and +3 in F1-score over the previous SOTA
- Libras-UFOP:
  - Accuracy: 0.82
  - +8 percentage points in accuracy and +9 in F1-score over the previous SOTA
Additional Datasets
- Include-50:
  - Accuracy: 0.97 (ResNet18 + Skeleton-DML)
  - Excellent performance on this larger dataset
- KSL (Korean Sign Language):
  - Accuracy: 0.63 (ResNet18 + Skeleton-DML)
  - Best performance with the Skeleton-DML representation
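For readers who want to compute comparable numbers on their own predictions, the sketch below implements accuracy and an F1-score in plain Python. The README does not say which F1 averaging the reported results use. Macro averaging is an assumption here, and the labels are toy values rather than results from the paper.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 (macro averaging, assumed here)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    scores = []
    for label in set(y_true) | set(y_pred):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only.
y_true = ["a", "a", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "c"]
print(f"accuracy={accuracy(y_true, y_pred):.2f} macro_f1={macro_f1(y_true, y_pred):.2f}")
```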
Citation
If you use this code for your research, please cite our paper:
Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).
BibTeX:
bibtex
@article{alves2024enhancing,
title={Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation},
author={Alves, Carlos Eduardo GR and Boldt, Francisco de Assis and Paix{\~a}o, Thiago M},
journal={arXiv preprint arXiv:2404.19148},
year={2024}
}
Contributing
Contributions are welcome! Please open an issue or submit a pull request if you have suggestions, bug fixes, or improvements.
License
This project is licensed under the terms of the MIT License. See the LICENSE file for details.
Owner
- Name: Carlos Eduardo
- Login: Dudu197
- Kind: user
- Location: Volta Redonda, RJ, Brasil
- Company: Dudollar
- Website: https://dudollar.com.br
- Repositories: 6
- Profile: https://github.com/Dudu197
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Alves
    given-names: Carlos Eduardo Gomes Reddo
  - family-names: M Paixão
    given-names: Thiago
title: "Enhancing Brazilian Sign Language Recognition Through Skeleton Image Representation"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.1109/SIBGRAPI62404.2024.10716301
date-released: 2024-10-18
GitHub Events
Total
- Watch event: 1
- Push event: 3
Last Year
- Watch event: 1
- Push event: 3
Dependencies
- mediapipe *
- @tensorflow/tfjs 4.11.0
- @tensorflow/tfjs-backend-cpu 4.11.0
- @tensorflow/tfjs-backend-webgl 4.11.0
- @tensorflow/tfjs-converter 4.11.0
- @tensorflow/tfjs-core 4.11.0
- @tensorflow/tfjs-data 4.11.0
- @tensorflow/tfjs-layers 4.11.0
- @types/long 4.0.2
- @types/node 20.6.0
- @types/node-fetch 2.6.4
- @types/offscreencanvas 2019.7.1
- @types/offscreencanvas 2019.3.0
- @types/seedrandom 2.4.30
- @webgpu/types 0.1.30
- ansi-regex 5.0.1
- ansi-styles 4.3.0
- argparse 1.0.10
- asynckit 0.4.0
- chalk 4.1.2
- cliui 7.0.4
- color-convert 2.0.1
- color-name 1.1.4
- combined-stream 1.0.8
- core-js 3.29.1
- delayed-stream 1.0.0
- emoji-regex 8.0.0
- escalade 3.1.1
- form-data 3.0.1
- get-caller-file 2.0.5
- has-flag 4.0.0
- is-fullwidth-code-point 3.0.0
- long 4.0.0
- mime-db 1.52.0
- mime-types 2.1.35
- node-fetch 2.6.13
- regenerator-runtime 0.13.11
- require-directory 2.1.1
- safe-buffer 5.2.1
- seedrandom 3.0.5
- sprintf-js 1.0.3
- string-width 4.2.3
- string_decoder 1.3.0
- strip-ansi 6.0.1
- supports-color 7.2.0
- tr46 0.0.3
- webidl-conversions 3.0.1
- whatwg-url 5.0.0
- wrap-ansi 7.0.0
- y18n 5.0.8
- yargs 16.2.0
- yargs-parser 20.2.9
- @tensorflow/tfjs ^4.11.0