sign-language-recognition

Implementation of isolated sign language recognition using a CNN

https://github.com/dudu197/sign-language-recognition

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: sciencedirect.com, springer.com, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

cnn computer-vision pytorch
Last synced: 6 months ago

Repository

Implementation of isolated sign language recognition using a CNN

Basic Info
  • Host: GitHub
  • Owner: Dudu197
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 38.3 MB
Statistics
  • Stars: 2
  • Watchers: 3
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Topics
cnn computer-vision pytorch
Created over 2 years ago · Last pushed 8 months ago
Metadata Files
Readme License Citation

README.md

Sign Language Recognition

Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation


Overview

This repository contains the code for the paper:

Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).

We propose a novel approach to Sign Language Recognition (SLR) using skeleton image representations, achieving state-of-the-art results on Brazilian Sign Language datasets.
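
As a rough illustration of what a skeleton image representation can look like, the sketch below maps a landmark sequence onto a grid with one row per landmark, one column per frame, and the x, y, z coordinates as three colour channels. This is a generic encoding chosen for illustration and is not necessarily the one used in the paper; the function name `skeleton_to_image` and the per-clip min-max scaling are assumptions.

```python
from typing import List, Sequence, Tuple

# One (x, y, z) triple per landmark in a single video frame.
Frame = Sequence[Tuple[float, float, float]]


def skeleton_to_image(frames: Sequence[Frame]) -> List[List[Tuple[int, ...]]]:
    """Encode a landmark sequence as an RGB-like grid (hypothetical helper).

    Rows index landmarks, columns index frames, and the three channels hold
    the x, y, z coordinates min-max scaled to 0..255 per channel.
    """
    if not frames:
        return []
    n_landmarks = len(frames[0])
    # Per-channel minimum and maximum over the whole clip, used for scaling.
    lows = [min(p[c] for f in frames for p in f) for c in range(3)]
    highs = [max(p[c] for f in frames for p in f) for c in range(3)]

    def scale(value: float, channel: int) -> int:
        span = highs[channel] - lows[channel]
        if span == 0:
            return 0  # constant channel: nothing to encode
        return round(255 * (value - lows[channel]) / span)

    return [
        [tuple(scale(frames[t][j][c], c) for c in range(3)) for t in range(len(frames))]
        for j in range(n_landmarks)
    ]
```

The resulting grid can be saved as an image and passed to a standard image classifier, which is the general idea behind feeding skeleton data to a CNN.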


Disclaimer

This repository is under refactoring. The code was originally designed for experiments and is being improved for user-friendliness.

If you only want to train the model, use the sign-language-recognition-model repository for a simpler, quick-start version.

For questions, please open an issue.


Requirements & Installation

  • Python 3.7
  • Required libraries are listed in requirements.txt.

Install dependencies:

    pip install -r requirements.txt


Dataset

We use two Brazilian Sign Language (Libras) datasets:

MINDS-Libras

Libras-UFOP

  • 56 signs, 5 signers, 8–16 repetitions per sign.
  • Paper

Preprocessed Datasets

For convenience, we provide preprocessed versions of the datasets with extracted landmarks:


Directory Structure

  • 00_data_exploration/ – Data exploration scripts and notebooks.
  • 01_landmarks_extraction/ – Extract landmark points from videos using OpenPose.
  • 02_data_processing/ – Join CSV files into a single dataset.
  • 03_model_training/ – Model training scripts and model definitions.
  • 04_result_analysis/ – Notebooks and scripts for analyzing results.
  • 99_model_output/ – Model outputs.
  • 99_old/ – Legacy scripts and notebooks.
  • 99_others/ – Miscellaneous scripts and data.
  • 99_skeleton_explore/ – Skeleton-based experiments.

Preprocessing

  1. Extract Landmarks:
    • Use scripts in 01_landmarks_extraction/ to extract landmark points from videos (e.g., with OpenPose).
  2. Join CSVs:
    • Use scripts in 02_data_processing/ to merge CSVs into a single dataset file.
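
The join step above can be sketched with the standard library alone. This is a minimal illustration, not the repository's actual script: the function name `join_csvs` is hypothetical, and it assumes every per-video CSV shares the same header row.

```python
import csv
from pathlib import Path
from typing import Iterable


def join_csvs(csv_paths: Iterable[Path], output_path: Path) -> int:
    """Concatenate CSV files that share a header; return the number of data rows written."""
    rows_written = 0
    header = None
    with open(output_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in csv_paths:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # The first non-empty file defines the output header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `join_csvs(sorted(Path("landmarks").glob("*.csv")), Path("dataset.csv"))` would merge every CSV in a folder; both paths here are placeholders.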

Training

  • Training scripts are in 03_model_training/.
  • Most hyperparameters are parameterized, but some may need adjustment per dataset.
  • See 03_model_training/README.md for details.

Example:

    python 03_model_training/model_training.py --config your_config.yaml


Results

Our model achieves strong performance across multiple sign language datasets:

Brazilian Sign Language Datasets

  • MINDS-Libras:
    • Accuracy: 0.93
    • +2 percentage points in accuracy and +3 points in F1-score over the previous state of the art
  • Libras-UFOP:
    • Accuracy: 0.82
    • +8 percentage points in accuracy and +9 points in F1-score over the previous state of the art

Additional Datasets

  • Include-50:
    • Accuracy: 0.97 (ResNet18 + Skeleton-DML)
    • Excellent performance on this larger dataset
  • KSL (Korean Sign Language):
    • Accuracy: 0.63 (ResNet18 + Skeleton-DML)
    • Best performance with Skeleton-DML representation
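
For readers who want to compute the same kind of metrics on their own predictions, the sketch below implements accuracy and F1-score in plain Python. The README does not say how F1 is averaged across classes, so the macro (unweighted) average used here is an assumption, and the function names are illustrative.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    scores = []
    for label in set(y_true):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because
        # every label taken from y_true has at least one true sample.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)
```

Both functions take parallel sequences of true and predicted sign labels, for example `accuracy(["a", "b"], ["a", "a"])`.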

Citation

If you use this code for your research, please cite our paper:

Alves, Carlos Eduardo GR, Francisco de Assis Boldt, and Thiago M. Paixão. "Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation." arXiv preprint arXiv:2404.19148 (2024).

BibTeX:

    @article{alves2024enhancing,
      title={Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation},
      author={Alves, Carlos Eduardo GR and Boldt, Francisco de Assis and Paix{\~a}o, Thiago M},
      journal={arXiv preprint arXiv:2404.19148},
      year={2024}
    }


Contributing

Contributions are welcome! Please open an issue or submit a pull request if you have suggestions, bug fixes, or improvements.


License

This project is licensed under the terms of the Apache License 2.0. See the LICENSE file for details.

Owner

  • Name: Carlos Eduardo
  • Login: Dudu197
  • Kind: user
  • Location: Volta Redonda, RJ, Brasil
  • Company: Dudollar

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Alves
    given-names: Carlos Eduardo Gomes Reddo
  - family-names: M Paixão
    given-names: Thiago
title: "Enhancing Brazilian Sign Language Recognition Through Skeleton Image Representation"
version: 1.0.0
identifiers:
  - type: doi
    value: 10.1109/SIBGRAPI62404.2024.10716301
date-released: 2024-10-18

GitHub Events

Total
  • Watch event: 1
  • Push event: 3
Last Year
  • Watch event: 1
  • Push event: 3

Dependencies

requirements.txt pypi
  • mediapipe *
99_old/web/package-lock.json npm
  • @tensorflow/tfjs 4.11.0
  • @tensorflow/tfjs-backend-cpu 4.11.0
  • @tensorflow/tfjs-backend-webgl 4.11.0
  • @tensorflow/tfjs-converter 4.11.0
  • @tensorflow/tfjs-core 4.11.0
  • @tensorflow/tfjs-data 4.11.0
  • @tensorflow/tfjs-layers 4.11.0
  • @types/long 4.0.2
  • @types/node 20.6.0
  • @types/node-fetch 2.6.4
  • @types/offscreencanvas 2019.7.1
  • @types/offscreencanvas 2019.3.0
  • @types/seedrandom 2.4.30
  • @webgpu/types 0.1.30
  • ansi-regex 5.0.1
  • ansi-styles 4.3.0
  • argparse 1.0.10
  • asynckit 0.4.0
  • chalk 4.1.2
  • cliui 7.0.4
  • color-convert 2.0.1
  • color-name 1.1.4
  • combined-stream 1.0.8
  • core-js 3.29.1
  • delayed-stream 1.0.0
  • emoji-regex 8.0.0
  • escalade 3.1.1
  • form-data 3.0.1
  • get-caller-file 2.0.5
  • has-flag 4.0.0
  • is-fullwidth-code-point 3.0.0
  • long 4.0.0
  • mime-db 1.52.0
  • mime-types 2.1.35
  • node-fetch 2.6.13
  • regenerator-runtime 0.13.11
  • require-directory 2.1.1
  • safe-buffer 5.2.1
  • seedrandom 3.0.5
  • sprintf-js 1.0.3
  • string-width 4.2.3
  • string_decoder 1.3.0
  • strip-ansi 6.0.1
  • supports-color 7.2.0
  • tr46 0.0.3
  • webidl-conversions 3.0.1
  • whatwg-url 5.0.0
  • wrap-ansi 7.0.0
  • y18n 5.0.8
  • yargs 16.2.0
  • yargs-parser 20.2.9
99_old/web/package.json npm
  • @tensorflow/tfjs ^4.11.0