Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: nabin2004
- License: apache-2.0
- Language: Python
- Default Branch: nemo-v2
- Size: 300 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Speech-to-Text Fine-Tuning
Dataset Preparation Scripts
The dataset was formatted and preprocessed with a set of Python scripts, most of which are in the PythonScripts/ directory.
Dataset Format
The dataset is stored as a JSON Lines manifest: each line is one JSON object describing a single audio sample and its metadata. Below is an example:
```json
{"audio_filepath": "./input/wav_format/106__.wav", "text": " ", "duration": 1.296, "language_ids": "1", "sample_ids": "0001", "lang": "ne"}
{"audio_filepath": "./input/wav_format/107__.wav", "text": " ", "duration": 1.152, "language_ids": "1", "sample_ids": "0002", "lang": "ne"}
{"audio_filepath": "./input/wav_format/108_.wav", "text": " ", "duration": 0.936, "language_ids": "1", "sample_ids": "0003", "lang": "ne"}
{"audio_filepath": "./input/wav_format/109_Kumaripati.wav", "text": " Kumaripati ", "duration": 1.08, "language_ids": "1", "sample_ids": "0004", "lang": "ne"}
{"audio_filepath": "./input/wav_format/110_Pulchowk.wav", "text": " Pulchowk ", "duration": 0.96, "language_ids": "1", "sample_ids": "0005", "lang": "ne"}
```
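Entries like those above could be generated with a short script. The following is a minimal sketch, not one of the scripts in PythonScripts/: the function names (`wav_duration_seconds`, `make_entry`, `write_manifest`) and the defaults for `lang` and `language_id` are assumptions chosen to match the example, and it assumes uncompressed PCM WAV input readable by Python's standard `wave` module.

```python
import json
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def make_entry(wav_path, text, sample_index, lang="ne", language_id="1"):
    """Build one manifest entry with the same keys as the example above.

    The zero-padded sample ID and the default language values are assumptions
    inferred from the example entries, not a documented convention.
    """
    return {
        "audio_filepath": str(wav_path),
        "text": text,
        "duration": round(wav_duration_seconds(wav_path), 3),
        "language_ids": language_id,
        "sample_ids": f"{sample_index:04d}",
        "lang": lang,
    }


def write_manifest(entries, manifest_path):
    """Write entries as JSON Lines: one object per line, UTF-8 encoded."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for entry in entries:
            # ensure_ascii=False keeps non-Latin transcriptions human-readable.
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

Writing with `ensure_ascii=False` is a design choice rather than a requirement: it keeps Devanagari transcriptions legible in the manifest instead of escaping them as `\uXXXX` sequences.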
Fields Explanation
- audio_filepath: Path to the audio file.
- text: Transcription of the audio.
- duration: Length of the audio in seconds.
- language_ids: Language identifier.
- sample_ids: Unique identifier for the sample.
- lang: Language code (e.g., "ne" for Nepali).
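Before training, it can help to check that every manifest line carries the fields listed above. The sketch below is one possible check, not part of the repository: `REQUIRED_FIELDS` and `load_manifest` are hypothetical names, and treating all six fields as mandatory is an assumption based on the example entries.

```python
import json

# Assumption: all six fields from the example are treated as mandatory.
REQUIRED_FIELDS = (
    "audio_filepath", "text", "duration",
    "language_ids", "sample_ids", "lang",
)


def load_manifest(lines):
    """Parse JSON Lines manifest text and return (entries, problems).

    `lines` is any iterable of strings, such as an open file. Lines that
    fail a check are reported in `problems` and left out of `entries`.
    """
    entries, problems = [], []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {line_number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(entry, dict):
            problems.append(f"line {line_number}: expected a JSON object")
            continue
        missing = [name for name in REQUIRED_FIELDS if name not in entry]
        if missing:
            problems.append(f"line {line_number}: missing {', '.join(missing)}")
            continue
        duration = entry["duration"]
        if not isinstance(duration, (int, float)) or duration <= 0:
            problems.append(f"line {line_number}: duration must be a positive number")
            continue
        entries.append(entry)
    return entries, problems
```

Passing an open file handle, for example `load_manifest(open(path, encoding="utf-8"))` with `path` pointing at your manifest, returns the valid entries along with a list of human-readable problems; summing `entry["duration"]` over the result gives the total audio length in seconds.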
Configuration File
The fine-tuning process is managed using a configuration file:
examples/asr/conf/asr_finetune/speech_to_text_finetune.yaml
This file contains hyperparameters and training settings for fine-tuning the ASR model.
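Settings in this file are typically changed either by editing the YAML or by passing dotted `key=value` overrides on the command line. The sketch below only builds such override strings; it does not call NeMo. The key names (`model.train_ds.manifest_filepath`, `model.validation_ds.manifest_filepath`, `trainer.max_epochs`) are assumptions based on the layout of the upstream NeMo fine-tuning config and should be checked against the copy in this repository. `build_overrides` and the manifest paths in the usage comment are hypothetical.

```python
def build_overrides(train_manifest, val_manifest, max_epochs=None):
    """Return dotted key=value override strings for the fine-tuning config.

    Assumption: the key names follow the upstream NeMo config layout and
    may differ in this repository's copy of the YAML file.
    """
    overrides = [
        f"model.train_ds.manifest_filepath={train_manifest}",
        f"model.validation_ds.manifest_filepath={val_manifest}",
    ]
    if max_epochs is not None:
        overrides.append(f"trainer.max_epochs={max_epochs}")
    return overrides


# Hypothetical usage:
# build_overrides("train_manifest.json", "val_manifest.json", max_epochs=50)
```

Keeping overrides as a plain list of strings makes them easy to append to whatever command the training script runs, without modifying the YAML file itself.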
Running the Fine-Tuning Script
To start the fine-tuning process, execute the following bash script:
```bash
./finetune_asr.sh
```
Requirements
The required dependencies are listed in the requirements/ directory. To install them, you can use the provided script:
```bash
bash reinstall.sh
```
Owner
- Name: nabin2004
- Login: nabin2004
- Kind: user
- Twitter: nabinstwt
- Repositories: 5
- Profile: https://github.com/nabin2004
GitHub Events
Total
- Watch event: 1
- Push event: 2
- Public event: 1
- Create event: 1
Last Year
- Watch event: 1
- Push event: 2
- Public event: 1
- Create event: 1
Dependencies
- NVIDIA/blossom-action main composite
- actions/checkout v2 composite
- actions/checkout v2 composite
- mikepenz/release-changelog-builder-action v3.3.1 composite
- actions/checkout v3 composite
- carloscastrojumo/github-cherry-pick-action bb0869df47c27be4ae4c7a2d93d22827aa5a0054 composite
- NVIDIA/NeMo/.github/actions/cancel-workflow main composite
- actions/checkout v2 composite
- azure/docker-login v1 composite
- actions/stale v6 composite
- actions/checkout v3 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- actions/checkout v3 composite
- actions/checkout v2 composite
- actions/labeler v4 composite
- ${BASE_IMAGE} latest build
- nemo-deps latest build
- scratch latest build