transcript
Using Whisper to transcribe English, French, or Spanish, from mp3/m4a files, or from live microphone input.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ghamzak
- Language: Python
- Default Branch: main
- Size: 8.79 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Multilingual NLP: Transcription and Translation
Setup Notes
Important: For audio processing (mp3/m4a conversion and transcription), use Python 3.12 or lower. Python 3.13+ is not supported due to removal of the audioop module, which is required by pydub and other audio libraries.
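The version constraint above can be checked before any audio library is imported. The following is a minimal sketch, not part of the repository: the names `FIRST_UNSUPPORTED`, `audio_stack_supported`, and `require_supported_python` are hypothetical, and the only fact it relies on is that `audioop` is absent from the standard library starting with Python 3.13.

```python
import sys

# First Python release that no longer ships the stdlib audioop module.
FIRST_UNSUPPORTED = (3, 13)


def audio_stack_supported(version=None) -> bool:
    """Return True if the given (or running) interpreter predates 3.13."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor < FIRST_UNSUPPORTED


def require_supported_python() -> None:
    """Exit with a readable message instead of a late ImportError."""
    if not audio_stack_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise SystemExit(
            f"Python {found} detected: audioop is unavailable on 3.13+, "
            "so pydub-based audio processing will fail. Use Python 3.12 or lower."
        )
```

Calling `require_supported_python()` at the top of an app's entry script would surface the problem at startup rather than at the first audio conversion.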
Recommended: Install Python 3.12 and create your virtual environment with it.
Gen AI Mini Projects
Note for macOS Users
If you encounter errors when installing PyAudio (such as portaudio.h file not found), you need to install the PortAudio library using Homebrew:
```sh
brew install portaudio
```
After installing PortAudio, re-run:
```sh
pip install -r requirements.txt
```
This will allow PyAudio to build and install successfully on macOS.
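Once the install succeeds, one way to confirm that PyAudio can see a microphone is to list the devices that report input channels. This is a hedged sketch rather than code from the repository: `input_devices` and `list_microphones` are hypothetical helper names, while `PyAudio()`, `get_device_count()`, `get_device_info_by_index()`, and `terminate()` are standard PyAudio calls.

```python
def input_devices(device_infos):
    """Keep only devices that can record (at least one input channel)."""
    return [d for d in device_infos if d.get("maxInputChannels", 0) > 0]


def list_microphones():
    """Ask PortAudio (through PyAudio) for the available input devices."""
    import pyaudio  # imported lazily so input_devices works without PyAudio

    pa = pyaudio.PyAudio()
    try:
        infos = [
            pa.get_device_info_by_index(i)
            for i in range(pa.get_device_count())
        ]
    finally:
        pa.terminate()  # always release the PortAudio handle
    return input_devices(infos)
```

Printing `dev["index"]` and `dev["name"]` for each entry returned by `list_microphones()` shows which devices the live transcription app could record from; an empty list suggests a missing device or a permissions problem rather than a build problem.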
Running the Apps
Note: When running the Streamlit app for the first time, it may take a while to start because it needs to download about 200MB of data before loading. Please be patient during the initial startup.
How to Run the Apps
- To run the live transcription app (using your microphone):
```sh
cd Live_audio2text_app
streamlit run app.py
```
- To run the audio-to-text app for MP3/M4A files:
```sh
cd audio2text_from_mp3
streamlit run streamlit_app.py
```
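For readers who want to see roughly what file transcription involves outside Streamlit, here is a minimal sketch built on the listed dependencies (`pydub` and `transformers`). It is not the repository's code. The helper names, the 16 kHz mono WAV intermediate, and the `openai/whisper-small` checkpoint are all assumptions made for illustration; the apps may use a different model and a different flow. Both heavy libraries are imported lazily, and `pydub` needs ffmpeg on the `PATH`.

```python
from pathlib import Path

# Languages named in the project description, mapped to Whisper language names.
SUPPORTED_LANGUAGES = {"en": "english", "fr": "french", "es": "spanish"}


def whisper_generate_kwargs(lang_code: str) -> dict:
    """Build generation options that pin Whisper to one language."""
    try:
        language = SUPPORTED_LANGUAGES[lang_code.lower()]
    except KeyError:
        raise ValueError(f"unsupported language code: {lang_code!r}") from None
    return {"language": language, "task": "transcribe"}


def to_wav_16k_mono(src: str) -> str:
    """Convert an mp3/m4a file to a 16 kHz mono WAV next to the source."""
    from pydub import AudioSegment  # requires ffmpeg and Python <= 3.12

    dst = str(Path(src).with_suffix(".wav"))
    audio = AudioSegment.from_file(src)  # format inferred from the file
    audio.set_frame_rate(16000).set_channels(1).export(dst, format="wav")
    return dst


def transcribe(src: str, lang_code: str = "en",
               model: str = "openai/whisper-small") -> str:
    """Transcribe one audio file; the model name is an assumed default."""
    from transformers import pipeline  # downloads weights on first use

    asr = pipeline(
        "automatic-speech-recognition",
        model=model,
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    result = asr(
        to_wav_16k_mono(src),
        generate_kwargs=whisper_generate_kwargs(lang_code),
    )
    return result["text"]
```

A call such as `transcribe("interview.m4a", "fr")` (the file name is a placeholder) would return the French transcript as a string. Pinning the language skips Whisper's automatic language identification, which can help on short or noisy clips.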
Owner
- Name: Ghazaleh Kazeminejad
- Login: ghamzak
- Kind: user
- Location: Boulder, Colorado
- Repositories: 1
- Profile: https://github.com/ghamzak
Computational Linguist
Citation (citations.md)
## Whisper
[Whisper](https://github.com/openai/whisper): Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
GitHub Events
Total
- Push event: 3
- Create event: 2
Last Year
- Push event: 3
- Create event: 2
Dependencies
- PyAudio *
- numpy *
- pydub *
- scipy *
- streamlit *
- torch *
- transformers *