imu2text

IMU2Text: A hybrid CNN+GNN pipeline for handwriting recognition and trajectory prediction using IMU data with state-of-the-art accuracy (99.74%).

https://github.com/vahinitech/imu2text

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.5%) to scientific vocabulary

Keywords

cnn cnn-classification deeplearning gnn graph-neural-networks handwriting-recognition imu imu-data meachine-learning multi-task-learning sensor-simulation

Last synced: 4 months ago

Repository

Basic Info

Host: GitHub
Owner: vahinitech
License: other
Language: Python
Default Branch: main
Homepage:
Size: 48.8 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

IMU Character Recognition with CNN + GNN

This project implements a state-of-the-art handwriting recognition pipeline that combines Convolutional Neural Networks (CNN) and Graph Neural Networks (GNN) in a Multi-Task Learning (MTL) framework. The model jointly optimizes two tasks: character classification and trajectory regression. The architecture builds on the paper "Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach" (Ott et al., WACV 2022).

The current model reaches a classification accuracy of 99.74% on the OnHW Dataset, 4.94 percentage points above the baseline CNN model (94.8%). The trajectory prediction component is also reported to produce smoother and more accurate trajectories than previous approaches.

Features

1. Hybrid CNN-GNN Architecture

CNN Component:
- Extracts temporal and spatial features from the IMU sensor data.
GNN Component:
- Captures relational structures and interdependencies in multi-channel sensor data.
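As a rough illustration of what "relational structure" between sensor channels can mean, the sketch below builds a channel graph and applies one mean-aggregation graph convolution to per-channel feature vectors. It is not taken from `cnn_gnn.py`: the function names, the block-diagonal adjacency, and the grouping of the 13 channels into four three-axis sensors plus one force channel are all assumptions made for the example.

```python
def sensor_adjacency(group_sizes):
    """Connect channels that belong to the same sensor (hypothetical grouping)."""
    n = sum(group_sizes)
    adj = [[0] * n for _ in range(n)]
    start = 0
    for size in group_sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                if i != j:
                    adj[i][j] = 1
        start += size
    return adj


def graph_conv(features, adjacency, weight):
    """One graph-convolution layer: ReLU(mean over {node + neighbours} @ W).

    features:  one feature vector per node (channel), e.g. CNN outputs
    adjacency: n x n 0/1 matrix
    weight:    in_dim x out_dim matrix (learned in a real model)
    """
    n = len(features)
    out = []
    for i in range(n):
        neigh = [j for j in range(n) if adjacency[i][j] or j == i]
        agg = [sum(features[j][k] for j in neigh) / len(neigh)
               for k in range(len(features[i]))]
        row = [max(0.0, sum(a * w for a, w in zip(agg, col)))
               for col in zip(*weight)]
        out.append(row)
    return out


# Assumed layout: 4 three-axis sensors + 1 force channel = 13 nodes.
adj = sensor_adjacency([3, 3, 3, 3, 1])
```

In a hybrid model of this kind, the per-node outputs would typically be pooled and concatenated with the CNN features before the task heads.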

2. Multi-Task Learning (MTL)

Simultaneously solves two tasks:
- Character Classification: Maps IMU features to character labels.
- Trajectory Regression: Reconstructs handwriting trajectories for visualization and analysis.

3. Symbol and Equation Recognition

Extended pipeline supports classification of mathematical symbols and multi-character equations.

4. Advanced Loss Strategies

Combines:
- Cross-Entropy Loss for classification.
- Distance, Spatio-Temporal, and Distribution-Based Losses for regression.
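A minimal sketch of how a classification loss and a regression loss can be combined into one multi-task objective is shown below. It uses cross-entropy plus a mean squared point distance as the simplest distance-based term; the weights `w_cls` and `w_reg` and all function names are hypothetical, and the repository may weight or combine its losses differently.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample from raw class scores (log-sum-exp form)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def trajectory_mse(pred, true):
    """Mean squared Euclidean distance between predicted and true (x, y) points."""
    sq = [(px - tx) ** 2 + (py - ty) ** 2
          for (px, py), (tx, ty) in zip(pred, true)]
    return sum(sq) / len(sq)


def multitask_loss(logits, target, traj_pred, traj_true, w_cls=1.0, w_reg=1.0):
    """Weighted sum of the classification and regression terms (weights assumed)."""
    return (w_cls * cross_entropy(logits, target)
            + w_reg * trajectory_mse(traj_pred, traj_true))
```

Swapping `trajectory_mse` for a Huber, correlation-based, or Wasserstein term changes only the regression part of the sum.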

Dataset

The OnHW Dataset includes:

- 13 IMU Channels: Data from two accelerometers, a gyroscope, a magnetometer, and a force sensor.
- Variable-Length Sequences: Handwriting samples of varying lengths.
- Ground Truth Labels: Characters, symbols, and trajectories.

The dataset is preprocessed to normalize IMU data and interpolate sequences to a uniform length.
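The preprocessing step can be sketched as follows. This is an illustration only: it assumes linear interpolation, per-sample z-score normalization of each channel, and a target length of 64, none of which is confirmed by the repository; the function names are hypothetical.

```python
from statistics import mean, pstdev


def resample_linear(values, target_len):
    """Linearly interpolate a 1-D sequence to exactly target_len samples."""
    if target_len < 1 or not values:
        raise ValueError("need a non-empty sequence and target_len >= 1")
    if len(values) == 1 or target_len == 1:
        return [float(values[0])] * target_len
    step = (len(values) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), len(values) - 2)  # left neighbour index
        frac = pos - lo
        out.append(values[lo] + frac * (values[lo + 1] - values[lo]))
    return out


def zscore(values):
    """Scale a channel to zero mean and unit variance (constant channels map to 0)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def preprocess(sample, target_len=64):
    """sample: list of channels, each a list of floats (channels x time)."""
    return [zscore(resample_linear(ch, target_len)) for ch in sample]
```

Resampling before normalizing keeps the statistics tied to the fixed-length signal the network actually sees.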

How to Use

1. Install Dependencies

    pip install -r requirements.txt

2. Preprocess the Data and Train the Model

    python cnn_gnn.py

Analysis

Accuracy Comparison

| Model        | Classification Accuracy (%) |
|--------------|-----------------------------|
| Baseline CNN | 94.8                        |
| CNN + GNN    | 99.74                       |

MTL Loss Combination Insights

The table below compares different combinations of loss functions for trajectory regression and character classification:

| Loss Combination         | Classification Accuracy (%) | Regression Error (RMSE) |
|--------------------------|-----------------------------|-------------------------|
| MSE + CrossEntropy       | 86.69                       | 0.1169                  |
| Huber + CrossEntropy     | 88.43                       | 0.1533                  |
| MSE + PearsonCorrelation | 87.03                       | 0.1490                  |
| MSE + Wasserstein        | 88.15                       | 0.1530                  |
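For reference, the two metrics in the table can be computed along the following lines. This is a generic sketch with hypothetical function names; the README does not state how the repository scales or averages its RMSE, so the values here are not guaranteed to be comparable with the table.

```python
import math


def accuracy_percent(pred_labels, true_labels):
    """Share of exactly matching labels, in percent."""
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return 100.0 * correct / len(true_labels)


def rmse(pred, true):
    """Root mean squared error over flat lists of trajectory coordinates."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))
```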

Error Analysis

Mismatched predictions primarily occur with visually similar characters. For example:

- P is often mistaken for D.
- g is mistaken for y.

To reduce errors, the following strategies are recommended:

- Data augmentation to increase diversity.
- Weighted losses focusing on difficult samples.
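One common way to weight a loss toward difficult samples is the focal loss, which scales cross-entropy by `(1 - p_t) ** gamma` so that confidently correct predictions contribute less. The sketch below is one possible choice, not the method used in this repository; `gamma=2.0` is a conventional default and an assumption here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over raw class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Cross-entropy down-weighted for easy samples; gamma=0 gives plain cross-entropy."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With this weighting, a sample from a frequently confused pair such as P/D keeps most of its loss, while an unambiguous sample is largely ignored.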

Loss and Accuracy Plots

Suggestions for Further Improvements

1. Advanced Architectures

- Implement Graph Attention Networks (GAT) for improved relational modeling.
- Incorporate Transformer-based Models for enhanced sequence-to-sequence learning.

2. Data Augmentation

Introduce simulated IMU data with noise, shifts, and rotations to improve generalization.
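The three augmentations mentioned above could look roughly like this. The function names, the edge-padding behaviour of the shift, and the choice of rotating three-axis samples about the z-axis are assumptions for illustration; suitable noise levels and rotation ranges would need to be tuned on the actual data.

```python
import math
import random


def add_noise(seq, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to one channel."""
    return [v + rng.gauss(0.0, sigma) for v in seq]


def time_shift(seq, offset):
    """Shift a non-empty channel by `offset` samples, padding with the edge value."""
    n = len(seq)
    if offset >= 0:
        return [seq[0]] * min(offset, n) + seq[:max(n - offset, 0)]
    return seq[-offset:] + [seq[-1]] * min(-offset, n)


def rotate_z(samples, angle_rad):
    """Rotate three-axis samples (x, y, z) about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in samples]


# Usage: a seeded generator keeps augmented batches reproducible.
rng = random.Random(0)
noisy = add_noise([0.1, 0.2, 0.3], sigma=0.01, rng=rng)
```

Rotations only make physical sense when applied to all three axes of the same sensor together, which is why `rotate_z` takes triples rather than single channels.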

3. Real-Time Optimization

Convert models to lightweight formats (e.g., TensorFlow Lite or ONNX) for deployment on resource-constrained devices.

Visualization of Architecture

Training Process

    %%{init: {'theme': 'forest'}}%%
    flowchart TD
        A[IMU Data - Inertial MTS] -->|Extract Features| B[CNN Trunk]
        C[Relational Data - GNN Features] -->|Model Relationships| D[GNN Trunk]
        B --> E[Feature Concatenation]
        D --> E
        E -->|Forward Pass| F[Loss Calculation]
        F --> G[Gradient Update]
        G --> H[Updated Model Weights]

Evaluation Process

    %%{init: {'theme': 'forest'}}%%
    flowchart TD
        A[Test IMU Data] -->|Inference| B[CNN Trunk]
        C[Test Relational Data] -->|Inference| D[GNN Trunk]
        B --> E[Feature Concatenation]
        D --> E
        E -->|Predictions| F[Classification Results]
        E -->|Predictions| G[Regression Results]

Results Overview

    %%{init: {'theme': 'forest'}}%%
    flowchart TD
        A[Classification Results] -->|Accuracy| B[Final Character Prediction]
        C[Regression Results] -->|RMSE| D[Reconstructed Trajectory]
        B --> E[Performance Metrics]
        D --> E
        E --> F[Final Evaluation Report]

Paper and References

This project builds upon the methodology presented in:

Paper: Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach.
Citation:

    @inproceedings{ott2022joint,
      title={Joint Classification and Trajectory Regression of Online Handwriting Using a Multi-Task Learning Approach},
      author={Ott, Felix and Rügamer, David and Heublein, Lucas and Bischl, Bernd and Mutschler, Christopher},
      booktitle={IEEE Winter Conference on Applications of Computer Vision (WACV)},
      pages={266--276},
      year={2022}
    }

Resources

Dataset: OnHW Dataset
Paper: PDF

License

This project is licensed under the InkShare IMU2Text License, granted by Vahi Software Solutions. The license allows academic, personal, and non-commercial use. Commercial use requires prior authorization.

For details, see the LICENSE file.

For inquiries about commercial use or licensing, please contact: vkosuri@inkshare.in.

Owner

Name: Vahini Technologies
Login: vahinitech
Kind: organization
Email: info@vahinitech.com
Location: India

Website: https://vahinitech.com
Repositories: 1
Profile: https://github.com/vahinitech

Citation (CITATION.cff)

cff-version: 1.2.0
message: |
  This repository provides resources for handwriting recognition.
  Please cite the relevant sections depending on your use case:
  - For character recognition: Cite Reference A.
  - For symbol and equation recognition: Cite Reference B.

title: "imu2text: Handwriting Recognition Framework"
authors:
  - family-names: "Ott"
    given-names: "Felix"
  - family-names: "Rügamer"
    given-names: "David"
references:
  - type: article
    authors:
      - family-names: "Ott"
        given-names: "Felix"
      - family-names: "Rügamer"
        given-names: "David"
    title: "Joint Classification and Trajectory Regression of Online Handwriting"
    journal: "IEEE Winter Conf. on Applications of Computer Vision"
    year: 2022
    doi: "10.1109/WACV51458.2022.00131"
  - type: article
    authors:
      - family-names: "Ott"
        given-names: "Felix"
      - family-names: "Rügamer"
        given-names: "David"
    title: "Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition"
    journal: "International Journal on Document Analysis and Recognition"
    year: 2022
    doi: "10.1007/s10032-022-00415-6"

GitHub Events

Total

Watch event: 4

Last Year

Watch event: 4

Dependencies

requirements.txt pypi

ax-platform >=0.1.20
botorch >=0.4.0
matplotlib ==3.4.2
numpy ==1.24.3
pandas ==2.0.3
scikit-learn ==1.3.2
scipy ==1.10.1
tensorflow ==2.13.0
tensorflow-estimator ==2.13.0
tensorflow-macos ==2.13.0
torch ==2.4.1
torch >=1.9.1
torchvision >=0.10.1
