sign-language-recognition

Implementation of isolated sign language recognition using a CNN

https://github.com/dudu197/sign-language-recognition

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in README
✓ Academic publication links: Links to sciencedirect.com, springer.com, zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (12.0%) to scientific vocabulary

Keywords

cnn computer-vision pytorch

Last synced: 11 months ago

Repository

Implementation of isolated sign language recognition using a CNN

Basic Info

Host: GitHub
Owner: Dudu197
License: apache-2.0
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 38.3 MB

Statistics

Stars: 2
Watchers: 3
Forks: 1
Open Issues: 0
Releases: 0

Topics

cnn computer-vision pytorch

Created almost 3 years ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

Sign Language Recognition

Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation

Overview

This repository contains the code for the paper:

Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).

We propose a novel approach to Sign Language Recognition (SLR) using skeleton image representations, achieving state-of-the-art results on Brazilian Sign Language datasets.
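
To make the idea of a skeleton image concrete, the sketch below shows one common way to turn a sequence of 2D landmarks into an image: each joint becomes a row, each frame becomes a column, and the normalized x and y coordinates fill the red and green channels. This is an illustrative assumption, not necessarily the encoding used in the paper; the function name skeleton_to_image and the channel layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Tuple[float, float]]  # one (x, y) pair per joint
Pixel = Tuple[int, int, int]           # (R, G, B), each in 0..255


def skeleton_to_image(frames: Sequence[Frame]) -> List[List[Pixel]]:
    """Encode a landmark sequence as an image (rows = joints, columns = frames).

    Hypothetical layout: R holds the normalized x coordinate, G the
    normalized y coordinate, and B is left at zero.
    """
    if not frames:
        return []

    # Min-max normalize over the whole sequence so values map to 0..255.
    xs = [x for frame in frames for x, _ in frame]
    ys = [y for frame in frames for _, y in frame]
    x_min, x_span = min(xs), (max(xs) - min(xs)) or 1.0
    y_min, y_span = min(ys), (max(ys) - min(ys)) or 1.0

    image: List[List[Pixel]] = []
    for joint in range(len(frames[0])):
        row: List[Pixel] = []
        for frame in frames:
            x, y = frame[joint]
            red = round(255 * (x - x_min) / x_span)
            green = round(255 * (y - y_min) / y_span)
            row.append((red, green, 0))
        image.append(row)
    return image
```

An image built this way can be fed to a standard image classifier such as a CNN, which is what lets skeleton data reuse existing computer-vision architectures.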

Overview
Disclaimer
Requirements & Installation
Dataset
Directory Structure
Preprocessing
Training
Results
Citation
Contributing
License

Disclaimer

This repository is being refactored. The code was originally written for experiments and is being made more user-friendly.

If you only want to train the model, use the sign-language-recognition-model repository for a simpler, quick-start version.

For questions, please open an issue.

Requirements & Installation

Python 3.7
Required libraries are listed in requirements.txt.

Install dependencies:

pip install -r requirements.txt

Dataset

We use two Brazilian Sign Language (Libras) datasets:

MINDS-Libras

20 signs, 12 signers, 5 repetitions per sign.
Dataset on Zenodo
Paper

Libras-UFOP

56 signs, 5 signers, 8–16 repetitions per sign.
Paper

Preprocessed Datasets

For convenience, we provide preprocessed versions of the datasets with extracted landmarks:

MINDS-Libras: Download preprocessed data
- Place at: 00_datasets/dataset_output/libras_minds/libras_minds_openpose.csv
Include-50: Download preprocessed data
- Place at: 00_datasets/dataset_output/include50/include50_openpose.csv
KSL (Korean Sign Language): Download preprocessed data
- Place at: 00_datasets/dataset_output/KSL/ksl_openpose.csv
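
Before training, it can help to confirm that the downloaded files sit where the scripts expect them. The sketch below checks the three paths listed above; the helper name missing_datasets and the PREPROCESSED_FILES mapping are hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Dict, List

# Locations taken from the list above, relative to the repository root.
PREPROCESSED_FILES: Dict[str, str] = {
    "MINDS-Libras": "00_datasets/dataset_output/libras_minds/libras_minds_openpose.csv",
    "Include-50": "00_datasets/dataset_output/include50/include50_openpose.csv",
    "KSL": "00_datasets/dataset_output/KSL/ksl_openpose.csv",
}


def missing_datasets(repo_root: str = ".") -> List[str]:
    """Return the names of datasets whose CSV is not at the expected path."""
    root = Path(repo_root)
    return [
        name
        for name, relative_path in PREPROCESSED_FILES.items()
        if not (root / relative_path).is_file()
    ]
```

Calling missing_datasets() from the repository root returns an empty list when every file is in place.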

Directory Structure

01_landmarks_extraction/ – Extract landmark points from videos using OpenPose.
02_data_processing/ – Join CSV files into a single dataset.
03_model_training/ – Model training scripts and model definitions.
04_result_analysis/ – Notebooks and scripts for analyzing results.
00_data_exploration/ – Data exploration scripts and notebooks.
99_model_output/ – Model outputs.
99_old/ – Legacy scripts and notebooks.
99_others/ – Miscellaneous scripts and data.
99_skeleton_explore/ – Skeleton-based experiments.

Preprocessing

Extract Landmarks:
- Use scripts in 01_landmarks_extraction/ to extract landmark points from videos (e.g., with OpenPose).
Join CSVs:
- Use scripts in 02_data_processing/ to merge CSVs into a single dataset file.
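
The join step can be pictured with the minimal sketch below, which concatenates per-video CSV files into one dataset file. It assumes every input file shares the same header row; the function name merge_csvs is hypothetical, and the actual scripts in 02_data_processing/ may work differently.

```python
import csv
from pathlib import Path


def merge_csvs(input_dir: str, output_path: str) -> int:
    """Concatenate every *.csv in input_dir into output_path.

    Assumes all files share one header row. Returns the number of
    data rows written (the header is not counted).
    """
    output = Path(output_path).resolve()
    # Skip the output file itself in case it lives inside input_dir.
    files = [p for p in sorted(Path(input_dir).glob("*.csv")) if p.resolve() != output]

    header = None
    rows_written = 0
    with open(output, "w", newline="") as out:
        writer = csv.writer(out)
        for path in files:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty file, nothing to copy
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path.name} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Raising on a header mismatch is a deliberate choice here: silently merging files with different columns would corrupt the dataset without any visible error.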

Training

Training scripts are in 03_model_training/.
Most hyperparameters are parameterized, but some may need adjustment per dataset.
See 03_model_training/README.md for details.

Example:

python 03_model_training/model_training.py --config your_config.yaml

Results

Our model achieves strong performance across multiple sign language datasets:

Brazilian Sign Language Datasets

MINDS-Libras:
- Accuracy: 0.93
- +2 percentage points in accuracy and +3 points in F1-score over the previous state of the art (SOTA)
Libras-UFOP:
- Accuracy: 0.82
- +8 percentage points in accuracy and +9 points in F1-score over the previous state of the art (SOTA)

Additional Datasets

Include-50:
- Accuracy: 0.97 (ResNet18 + Skeleton-DML)
- Excellent performance on this larger dataset
KSL (Korean Sign Language):
- Accuracy: 0.63 (ResNet18 + Skeleton-DML)
- Best performance with Skeleton-DML representation

Citation

If you use this code for your research, please cite our paper:

Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).

BibTeX:

@article{alves2024enhancing,
  title={Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation},
  author={Alves, Carlos Eduardo GR and Boldt, Francisco de Assis and Paix{\~a}o, Thiago M},
  journal={arXiv preprint arXiv:2404.19148},
  year={2024}
}

Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have suggestions, bug fixes, or improvements.

License

This project is licensed under the terms of the Apache License 2.0. See the LICENSE file for details.

Owner

Name: Carlos Eduardo
Login: Dudu197
Kind: user
Location: Volta Redonda, RJ, Brasil
Company: Dudollar

Website: https://dudollar.com.br
Repositories: 6
Profile: https://github.com/Dudu197

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Alves
    given-names: Carlos Eduardo Gomes Reddo
  - family-names: M Paixão
    given-names: Thiago
title: "Enhancing Brazilian Sign Language Recognition Through Skeleton Image Representation"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.1109/SIBGRAPI62404.2024.10716301
date-released: 2024-10-18

GitHub Events

Total

Watch event: 1
Push event: 3

Last Year

Watch event: 1
Push event: 3

Dependencies

requirements.txt pypi

mediapipe *

99_old/web/package-lock.json npm

@tensorflow/tfjs 4.11.0
@tensorflow/tfjs-backend-cpu 4.11.0
@tensorflow/tfjs-backend-webgl 4.11.0
@tensorflow/tfjs-converter 4.11.0
@tensorflow/tfjs-core 4.11.0
@tensorflow/tfjs-data 4.11.0
@tensorflow/tfjs-layers 4.11.0
@types/long 4.0.2
@types/node 20.6.0
@types/node-fetch 2.6.4
@types/offscreencanvas 2019.7.1
@types/offscreencanvas 2019.3.0
@types/seedrandom 2.4.30
@webgpu/types 0.1.30
ansi-regex 5.0.1
ansi-styles 4.3.0
argparse 1.0.10
asynckit 0.4.0
chalk 4.1.2
cliui 7.0.4
color-convert 2.0.1
color-name 1.1.4
combined-stream 1.0.8
core-js 3.29.1
delayed-stream 1.0.0
emoji-regex 8.0.0
escalade 3.1.1
form-data 3.0.1
get-caller-file 2.0.5
has-flag 4.0.0
is-fullwidth-code-point 3.0.0
long 4.0.0
mime-db 1.52.0
mime-types 2.1.35
node-fetch 2.6.13
regenerator-runtime 0.13.11
require-directory 2.1.1
safe-buffer 5.2.1
seedrandom 3.0.5
sprintf-js 1.0.3
string-width 4.2.3
string_decoder 1.3.0
strip-ansi 6.0.1
supports-color 7.2.0
tr46 0.0.3
webidl-conversions 3.0.1
whatwg-url 5.0.0
wrap-ansi 7.0.0
y18n 5.0.8
yargs 16.2.0
yargs-parser 20.2.9

99_old/web/package.json npm

@tensorflow/tfjs ^4.11.0
