speechml-toolkit

A toolkit for managing speech data in machine learning projects. It covers audio preprocessing, feature extraction (MFCCs, spectrograms), and model training, and suits tasks such as speech recognition, speaker identification, and emotion detection. It also includes noise reduction and normalization tools.

https://github.com/aakashvats11/speechml-toolkit
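
The description names normalization and spectrogram extraction but the listing does not show the toolkit's API, so the following is only a minimal sketch of those two steps using the Python standard library. The function names `peak_normalize` and `magnitude_spectrogram`, the frame length, and the hop size are all assumptions made for illustration; they are not taken from the repository.

```python
import cmath
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def magnitude_spectrogram(samples, frame_len=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_len // 2) per frame."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT, O(frame_len^2): fine for a sketch, use an FFT in practice.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical input: a quiet sine completing 8 cycles per 64-sample frame.
signal = [0.25 * math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
normalized = peak_normalize(signal)
spec = magnitude_spectrogram(normalized)
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

MFCCs would add a mel filterbank, a logarithm, and a discrete cosine transform on top of these magnitudes. In a real pipeline the per-frame transform would normally come from `numpy` or `torch`, both of which appear in the dependency list.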

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic links in README
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: aakashvats11
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 25.4 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
  • Readme
  • License
  • Citation

Owner

  • Login: aakashvats11
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Żelasko"
  given-names: "Piotr"
  orcid: "https://orcid.org/0000-0002-8245-0413"
- family-names: "Povey"
  given-names: "Daniel"
  orcid: "https://orcid.org/0000-0002-0611-3634"
- family-names: "Trmal"
  given-names: "Jan"
- family-names: "Khudanpur"
  given-names: "Sanjeev"
license: Apache-2.0
title: "Lhotse: a speech data representation library for the modern deep learning ecosystem"
date-released: 2020-04-24
url: "https://github.com/lhotse-speech/lhotse"
preferred-citation:
  type: proceedings
  authors:
  - family-names: "Żelasko"
    given-names: "Piotr"
    orcid: "https://orcid.org/0000-0002-8245-0413"
  - family-names: "Povey"
    given-names: "Daniel"
    orcid: "https://orcid.org/0000-0002-0611-3634"
  - family-names: "Trmal"
    given-names: "Jan"
  - family-names: "Khudanpur"
    given-names: "Sanjeev"
  conference:
    name: "NeurIPS Data-Centric AI Workshop"
  title: "Lhotse: a speech data representation library for the modern deep learning ecosystem"
  url: "https://arxiv.org/abs/2110.12561"
  year: 2021
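
In the Citation File Format, a `preferred-citation` block tells readers to cite that work in place of the top-level software entry. The sketch below shows one way to apply that rule. The dictionary mirrors fields from the file above, while the `format_reference` helper and its output style are assumptions for illustration, not a standard citation format.

```python
# Fields copied from the CITATION.cff above, trimmed to what the formatter uses.
AUTHORS = [
    {"family-names": "Żelasko", "given-names": "Piotr"},
    {"family-names": "Povey", "given-names": "Daniel"},
    {"family-names": "Trmal", "given-names": "Jan"},
    {"family-names": "Khudanpur", "given-names": "Sanjeev"},
]
TITLE = ("Lhotse: a speech data representation library "
         "for the modern deep learning ecosystem")

cff = {
    "title": TITLE,
    "authors": AUTHORS,
    "date-released": "2020-04-24",
    "url": "https://github.com/lhotse-speech/lhotse",
    "preferred-citation": {
        "type": "proceedings",
        "authors": AUTHORS,
        "conference": {"name": "NeurIPS Data-Centric AI Workshop"},
        "title": TITLE,
        "url": "https://arxiv.org/abs/2110.12561",
        "year": 2021,
    },
}


def format_reference(record):
    """Build a one-line reference (hypothetical style, not an official one)."""
    # Use the preferred citation when present, else the top-level entry.
    entry = record.get("preferred-citation", record)
    names = [f"{a['given-names']} {a['family-names']}" for a in entry["authors"]]
    if len(names) > 1:
        author_text = ", ".join(names[:-1]) + " and " + names[-1]
    else:
        author_text = names[0]
    # Top-level entries carry a release date in place of a year.
    year = entry.get("year") or str(entry.get("date-released", ""))[:4]
    parts = [f"{author_text} ({year}).", f"{entry['title']}."]
    venue = entry.get("conference", {}).get("name")
    if venue:
        parts.append(f"{venue}.")
    if entry.get("url"):
        parts.append(entry["url"])
    return " ".join(parts)


reference = format_reference(cff)
```

Dropping the `preferred-citation` key makes the helper fall back to the software entry, with the year taken from `date-released`.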

GitHub Events

Total
  • Push event: 3
  • Create event: 2
Last Year
  • Push event: 3
  • Create event: 2

Dependencies

pyproject.toml pypi
setup.py pypi
  • SoundFile >=0.10
  • audioread >=2.1.9
  • click >=7.1.1
  • cytoolz >=0.10.1
  • intervaltree >=
  • numpy >=1.18.1
  • packaging *
  • pyyaml >=5.3.1
  • tabulate >=0.8.1
  • torch *
  • tqdm *
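
The minimum versions above can be checked against the current environment. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the numeric comparison is a deliberate simplification: full version semantics (pre-releases, epochs) are better handled by the `packaging` library listed above. Entries marked `*` and the `intervaltree` entry, whose bound is missing in the listing, are treated as having no minimum.

```python
import re
from importlib import metadata

# Minimums copied from the dependency list; None means no usable lower bound.
REQUIREMENTS = {
    "SoundFile": "0.10",
    "audioread": "2.1.9",
    "click": "7.1.1",
    "cytoolz": "0.10.1",
    "intervaltree": None,
    "numpy": "1.18.1",
    "packaging": None,
    "pyyaml": "5.3.1",
    "tabulate": "0.8.1",
    "torch": None,
    "tqdm": None,
}


def version_key(version):
    """Turn '1.18.1rc1' into (1, 18, 1), keeping leading digits of each part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first part with no leading number
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numerically, padding with zeros so '0.10' equals '0.10.0'."""
    a, b = version_key(installed), version_key(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check(requirements):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is None or meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


report = check(REQUIREMENTS)
```

Printing `report` gives a quick per-package status before installing or running the toolkit.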