asaca-automatic-speech-analysis-for-cognitive-assessment

An automatic system that extracts Praat-like speech features from raw speech WAV files and, at the same time, produces high-quality transcriptions with a low word error rate (WER < 10 %).

https://github.com/rhysonyang-2030/asaca-automatic-speech-analysis-for-cognitive-assessment

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (19.2%) to scientific vocabulary

Keywords

ai classification deep-learning feature-engineering feature-extraction machine-learning multimodal natural-language-processing praat python python-script shap speech speech-analysis speech-and-language-processing speech-to-text training wav2vec2 wav2vec2ctc
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: RhysonYang-2030
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 1.99 MB
Statistics
  • Stars: 3
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created 7 months ago · Last pushed 4 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation Security

README.md

ASACA – Automatic Speech Analysis for Cognitive Assessments

ASACA is an end-to-end toolkit that transforms raw speech into multimodal biomarkers (lexical, prosodic and pause-based) and returns an interpretable prediction (HC / MCI / AD) together with transcriptions at a low word error rate (WER < 0.1).
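
To make the WER figure concrete, here is a minimal, dependency-free sketch of how word error rate is conventionally computed: the word-level edit distance between a reference transcript and the ASR hypothesis, divided by the number of reference words. It is not taken from the ASACA code base (the project lists `jiwer` among its dependencies and may rely on that instead); the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference word missing)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution in eight reference words.
reference = "the boy is reaching for the cookie jar"
hypothesis = "the boy is reaching for a cookie jar"
print(f"WER = {word_error_rate(reference, hypothesis):.3f}")  # WER = 0.125
```

A WER below 0.1 therefore means fewer than one word-level error per ten reference words on the evaluated test set.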


✨ Key Features

| Capability | Detail |
|------------|--------|
| Single-command inference | `asaca run audio.wav` outputs JSON + PDF report |
| Fine-tuned wav2vec 2.0 ASR | < 10 % WER on in-domain test set |
| Explainability | SHAP plots per classification |
| Rich feature set | word-error rate, syllable rate, pause stats, spectral cues |
| Offline-ready | Model weights stored under `Models/` via Git LFS |
| PEP 517/621 packaging | `pip install asaca` or editable mode |
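
As an illustration of what pause statistics in the "Rich feature set" row could look like, the sketch below derives a few pause measures from word-level timestamps. It is not ASACA's implementation: the input format (a list of `(word, start_s, end_s)` tuples), the function name `pause_stats`, the 0.25 s threshold and the returned keys are all assumptions chosen for the example.

```python
from statistics import mean


def pause_stats(words, min_pause=0.25):
    """Summarise silent gaps between consecutive words.

    words: list of (word, start_s, end_s) tuples sorted by start time
           (hypothetical format, e.g. from a forced aligner).
    min_pause: gaps shorter than this (in seconds) are not counted as pauses;
               the 0.25 s default is an assumed value.
    """
    if len(words) < 2:
        return {"pause_count": 0, "mean_pause_s": 0.0, "pause_ratio": 0.0}

    # Gap between the end of each word and the start of the next one.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]

    # Total span from the first word onset to the last word offset.
    total = words[-1][2] - words[0][1]
    return {
        "pause_count": len(pauses),
        "mean_pause_s": mean(pauses) if pauses else 0.0,
        "pause_ratio": sum(pauses) / total if total > 0 else 0.0,
    }


# Made-up timestamps for demonstration only.
words = [
    ("the", 0.0, 0.2),
    ("boy", 0.25, 0.6),
    ("is", 1.1, 1.3),
    ("reaching", 1.35, 1.9),
    ("um", 2.9, 3.1),
]
print(pause_stats(words))
```

With these timestamps two gaps exceed the threshold (roughly 0.5 s and 1.0 s), so the sketch reports two pauses covering a little under half of the utterance.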


🚀 Quick start

Install the package from PyPI and run inference on a WAV file:

```bash
pip install asaca
asaca-cli gui
```

Alternatively install in editable mode for development:

```bash
git clone https://github.com/RhysonYang-2030/ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment.git
cd ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment
pip install -e .[dev]
pip install numpy==1.24.4
```
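
Because the editable install re-pins NumPy as a final step, it can be useful to confirm that the environment actually ended up with the pinned version. The following is a small sketch using only the standard library; the helper name `check_pins` and the `PINNED` mapping are illustrative assumptions and not part of ASACA.

```python
from importlib.metadata import PackageNotFoundError, version

# Pin taken from the install commands; extend the mapping as needed.
PINNED = {"numpy": "1.24.4"}


def check_pins(pins, get_version=version):
    """Return {package: installed_version_or_None} for every mismatched pin."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


mismatches = check_pins(PINNED)
if mismatches:
    print("Version pins not satisfied:", mismatches)
else:
    print("All pinned packages match.")
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment, for example in a unit test.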

The CLI outputs recognised text along with a PDF report and JSON file in the specified output directory.

Usage

Pipeline

```text
asaca/
├── src/        # library code
├── tests/      # unit tests
├── docs/       # MkDocs documentation
├── examples/   # example notebooks and data
└── notebooks/  # tutorial notebooks
```

Run `asaca-cli --help` to see all commands, including feature extraction.

Documentation

Full API reference and user guide live in the docs/ directory and on Read the Docs.

License

Released under the Apache-2.0 license.

Citation

If you use ASACA in your research, please cite the project using the CITATION.cff file.

Contact

Maintainer: Xinbo Yang

Owner

  • Name: Xinbo Yang
  • Login: RhysonYang-2030
  • Kind: user
  • Location: Dublin
  • Company: Trinity College Dublin

Early-career researcher at the intersection of artificial intelligence, speech & EEG analytics, and cognitive science.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use ASACA, please cite the MSc thesis below."
title: "ASACA — Automatic Speech Analysis for Cognition Assessment"
authors:
  - family-names: Yang
    given-names: Xinbo
    affiliation: Trinity College Dublin
date-released: "2025-09-01"
version: "0.1.0"
type: thesis
thesis:
  type: master's
  institution: Trinity College Dublin
repository-code: "https://github.com/RhysonYang-2030/ASACA-Automatic-Speech-Analysis-for-Cognitive-Assessment"

GitHub Events

Total
  • Release event: 9
  • Watch event: 2
  • Delete event: 11
  • Push event: 88
  • Pull request event: 12
  • Create event: 15
Last Year
  • Release event: 9
  • Watch event: 2
  • Delete event: 11
  • Push event: 88
  • Pull request event: 12
  • Create event: 15

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
Dockerfile docker
  • pytorch/pytorch 2.1.0-cuda12.1-cudnn8-runtime build
pyproject.toml pypi
  • PyYAML >=6.0
  • ctc-decoder @ git+https://github.com/githubharald/CTCDecoder.git@6b5c3dd
  • ctc-segmentation >=1.7
  • datasets >=2.18
  • evaluate >=0.4
  • jiwer >=3.0
  • joblib >=1.3
  • librosa >=0.10
  • matplotlib >=3.7
  • numpy ==1.24.4
  • openpyxl >=3.1
  • pandas >=1.5
  • pillow >=10.2
  • praat-parselmouth >=0.4
  • pronouncing >=0.2
  • psutil >=5.9
  • pyannote.audio >=3.1
  • pyannote.core >=5.0
  • pyctcdecode >=0.5
  • resampy >=0.4
  • safetensors >=0.4
  • scikit-learn >=1.4
  • scipy >=1.10
  • shap >=0.44
  • soundfile >=0.12
  • sympy >=1.12
  • torch >=2.0
  • torchaudio >=2.0
  • tqdm >=4.66
  • transformers >=4.38
  • webrtcvad >=2.0
requirements.txt pypi
  • PyQt5 >=5.15
  • PyYAML >=6.0
  • ctc-segmentation >=1.7
  • datasets >=2.18
  • evaluate >=0.4
  • jiwer >=3.0
  • joblib >=1.3
  • librosa >=0.10
  • matplotlib >=3.7
  • numpy ==1.24.4
  • openpyxl >=3.1
  • pandas >=1.5
  • pillow >=10.2
  • praat-parselmouth >=0.4
  • pronouncing >=0.2
  • psutil >=5.9
  • pyannote.audio >=3.1
  • pyannote.core >=5.0
  • pyctcdecode >=0.5
  • pyqtgraph >=0.13
  • reportlab >=4.0
  • resampy >=0.4
  • safetensors >=0.4
  • scikit-learn >=1.4
  • scipy >=1.10
  • shap >=0.44
  • soundfile >=0.12
  • sympy >=1.12
  • torch >=2.0
  • torchaudio >=2.0
  • tqdm >=4.66
  • transformers >=4.38
  • webrtcvad >=2.0
.github/workflows/publish-pypi.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • pypa/gh-action-pypi-publish release/v1 composite
requirements-dev.txt pypi
  • black * development
  • mypy * development
  • pre-commit * development
  • pytest-cov * development
  • ruff * development