speech

https://github.com/nabin2004/speech

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.5%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: nabin2004
License: apache-2.0
Language: Python
Default Branch: nemo-v2
Size: 300 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Speech-to-Text Fine-Tuning

Dataset Preparation Scripts

During the dataset preparation process, various Python scripts were used to format and preprocess the data. Most scripts can be found in the PythonScripts/ directory.

Dataset Format

The dataset manifest is stored as JSON, with one object per audio sample. Each object holds the path to the audio file together with its transcription and metadata. Below is an example:

{"audio_filepath": "./input/wav_format/106__.wav", "text": " ", "duration": 1.296, "language_ids": "1", "sample_ids": "0001", "lang": "ne"}
{"audio_filepath": "./input/wav_format/107__.wav", "text": " ", "duration": 1.152, "language_ids": "1", "sample_ids": "0002", "lang": "ne"}
{"audio_filepath": "./input/wav_format/108_.wav", "text": " ", "duration": 0.936, "language_ids": "1", "sample_ids": "0003", "lang": "ne"}
{"audio_filepath": "./input/wav_format/109_Kumaripati.wav", "text": " Kumaripati ", "duration": 1.08, "language_ids": "1", "sample_ids": "0004", "lang": "ne"}
{"audio_filepath": "./input/wav_format/110_Pulchowk.wav", "text": " Pulchowk ", "duration": 0.96, "language_ids": "1", "sample_ids": "0005", "lang": "ne"}
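As an illustration of how entries like these could be produced, the following is a minimal sketch that measures a WAV file's duration with Python's standard `wave` module and writes one JSON object per line. It is not taken from the repository's PythonScripts/ directory; the function names (`wav_duration_seconds`, `build_entry`, `write_manifest`) and the default values for `lang` and `language_id` are assumptions chosen to match the example above.

```python
import json
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_entry(path, text, index, lang="ne", language_id="1"):
    """Build one manifest entry with the same keys as the example entries.

    The defaults for `lang` and `language_id` are assumptions for illustration.
    """
    return {
        "audio_filepath": str(path),
        "text": text,
        "duration": round(wav_duration_seconds(path), 3),
        "language_ids": language_id,
        "sample_ids": f"{index:04d}",  # zero-padded, e.g. 5 -> "0005"
        "lang": lang,
    }


def write_manifest(entries, manifest_path):
    """Write entries one JSON object per line, keeping non-ASCII text readable."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

Passing `ensure_ascii=False` keeps Nepali transcriptions as readable Unicode in the manifest instead of `\u` escape sequences.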

Fields Explanation

audio_filepath: Path to the audio file.
text: Transcription of the audio.
duration: Length of the audio in seconds.
language_ids: Language identifier.
sample_ids: Unique identifier for the sample.
lang: Language code (e.g., "ne" for Nepali).
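The field list above can be turned into a quick sanity check before training. The sketch below assumes the manifest holds one JSON object per line and that the field types match the example entries (strings everywhere except a numeric `duration`). The names `REQUIRED_FIELDS`, `validate_entry`, and `validate_manifest_lines` are hypothetical helpers, not part of the repository.

```python
import json

# Expected type for each field, inferred from the example entries (an assumption).
REQUIRED_FIELDS = {
    "audio_filepath": str,
    "text": str,
    "duration": (int, float),
    "language_ids": str,
    "sample_ids": str,
    "lang": str,
}


def validate_entry(entry):
    """Return a list of problems found in one manifest entry (empty if none)."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in entry:
            problems.append(f"missing field: {name}")
        elif not isinstance(entry[name], expected):
            problems.append(f"wrong type for {name}: {type(entry[name]).__name__}")
    duration = entry.get("duration")
    if isinstance(duration, (int, float)) and duration <= 0:
        problems.append("duration must be positive")
    return problems


def validate_manifest_lines(lines):
    """Yield (line_number, problems) for every non-blank line that has problems."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            yield number, [f"invalid JSON: {exc.msg}"]
            continue
        problems = validate_entry(entry)
        if problems:
            yield number, problems
```

A possible usage is `for number, problems in validate_manifest_lines(open(path, encoding="utf-8")): print(number, problems)`, which prints nothing when every entry is well formed.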

Configuration File

The fine-tuning process is managed using a configuration file:

examples/asr/conf/asr_finetune/speech_to_text_finetune.yaml

This file contains hyperparameters and training settings for fine-tuning the ASR model.
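If the fine-tuning script is driven by Hydra, as the upstream NeMo examples are, individual settings can be overridden on the command line with `dotted.key=value` arguments instead of editing the YAML. The sketch below only builds such argument strings. The helper `to_overrides` is hypothetical, and the key names in the usage example (`model.train_ds.manifest_filepath`, `model.validation_ds.manifest_filepath`, `trainer.max_epochs`) are assumptions based on the upstream NeMo config, so they should be checked against the YAML file in this repository.

```python
def to_overrides(settings):
    """Flatten a nested dict into Hydra-style `dotted.key=value` strings."""
    overrides = []

    def walk(prefix, node):
        for key, value in node.items():
            dotted = f"{prefix}.{key}" if prefix else key
            if isinstance(value, dict):
                walk(dotted, value)  # descend into nested sections
            else:
                overrides.append(f"{dotted}={value}")

    walk("", settings)
    return overrides


# Assumed key names and placeholder values, for illustration only.
example_settings = {
    "model": {
        "train_ds": {"manifest_filepath": "train_manifest.json"},
        "validation_ds": {"manifest_filepath": "val_manifest.json"},
    },
    "trainer": {"max_epochs": 50},
}

example_overrides = to_overrides(example_settings)
```

The resulting strings could be appended to the training command inside finetune_asr.sh, assuming that script forwards extra arguments to a Hydra entry point.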

Running the Fine-Tuning Script

To start the fine-tuning process, execute the following bash script:

bash ./finetune_asr.sh

Requirements

The required dependencies are listed in the requirements/ directory. To install them, you can use the provided script:

bash reinstall.sh

Owner

Name: nabin2004
Login: nabin2004
Kind: user

Twitter: nabinstwt
Repositories: 5
Profile: https://github.com/nabin2004

GitHub Events

Total

Watch event: 1
Push event: 2
Public event: 1
Create event: 1

Last Year

Watch event: 1
Push event: 2
Public event: 1
Create event: 1

Dependencies

.github/actions/cancel-workflow/action.yml actions

.github/workflows/blossom-ci.yml actions

NVIDIA/blossom-action main composite
actions/checkout v2 composite

.github/workflows/changelog-build.yml actions

actions/checkout v2 composite
mikepenz/release-changelog-builder-action v3.3.1 composite

.github/workflows/cherry-pick-release-commit.yml actions

actions/checkout v3 composite
carloscastrojumo/github-cherry-pick-action bb0869df47c27be4ae4c7a2d93d22827aa5a0054 composite

.github/workflows/cicd-main.yml actions

NVIDIA/NeMo/.github/actions/cancel-workflow main composite
actions/checkout v2 composite
azure/docker-login v1 composite

.github/workflows/close-inactive-issue-pr.yml actions

actions/stale v6 composite

.github/workflows/codeql.yml actions

actions/checkout v3 composite
github/codeql-action/analyze v2 composite
github/codeql-action/autobuild v2 composite
github/codeql-action/init v2 composite

.github/workflows/config/codeql.yml actions

.github/workflows/gh-docs.yml actions

actions/checkout v3 composite

.github/workflows/import-test.yml actions

actions/checkout v2 composite

.github/workflows/labeler.yml actions

actions/labeler v4 composite

Dockerfile docker

${BASE_IMAGE} latest build
nemo-deps latest build
scratch latest build
