transcript

Uses Whisper to transcribe English, French, or Spanish speech from MP3/M4A files or from live microphone input.

https://github.com/ghamzak/transcript

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: ghamzak
Language: Python
Default Branch: main
Size: 8.79 KB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Created 11 months ago · Last pushed 11 months ago

Metadata Files

Readme Citation

Multilingual NLP: Transcription and Translation

Setup Notes

Important: For audio processing (mp3/m4a conversion and transcription), use Python 3.12 or lower. Python 3.13+ is not supported because the audioop module, which pydub and other audio libraries require, was removed from the standard library in that release.
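
The version constraint above can be checked before any audio library is imported. The following is a minimal sketch, not code from this repository; the function names `audioop_available` and `require_audio_capable_python` are hypothetical.

```python
import sys


def audioop_available(version_info=sys.version_info) -> bool:
    """Return True if this Python version still ships the stdlib audioop module.

    audioop was removed from the standard library in Python 3.13,
    so anything below 3.13 is treated as compatible.
    """
    return tuple(version_info[:2]) < (3, 13)


def require_audio_capable_python() -> None:
    """Raise a clear error early instead of failing later inside pydub."""
    if not audioop_available():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise RuntimeError(
            f"Python {found} detected: audio processing here needs Python 3.12 or lower."
        )
```

Calling `require_audio_capable_python()` at the top of an entry script would turn an obscure import failure into an explicit message.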

Recommended: Install Python 3.12 and create your virtual environment with it.

Gen AI Mini Projects

Note for macOS Users

If you encounter errors when installing PyAudio (such as portaudio.h file not found), you need to install the PortAudio library using Homebrew:

```sh
brew install portaudio
```

After installing PortAudio, re-run:

```sh
pip install -r requirements.txt
```

This will allow PyAudio to build and install successfully on macOS.
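
To confirm that PyAudio and the other dependencies actually installed, one option is to look up their import names without importing them. This is a hedged sketch rather than part of the project. The helper name `missing_modules` is hypothetical, and the list assumes the usual import names for the packages in requirements.txt (PyAudio is imported as `pyaudio`).

```python
import importlib.util

# Import names assumed for the packages listed in requirements.txt.
REQUIRED_MODULES = [
    "pyaudio", "numpy", "pydub", "scipy", "streamlit", "torch", "transformers",
]


def missing_modules(module_names):
    """Return the subset of module_names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If `pyaudio` shows up as missing on macOS, the PortAudio step described above is the first thing to recheck.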

Running the Apps

Note: When running the Streamlit app for the first time, it may take a while to start because it needs to download about 200MB of data before loading. Please be patient during the initial startup.

How to Run the Apps

To run the live transcription app (using your microphone):

```sh
cd Live_audio2text_app
streamlit run app.py
```

To run the audio-to-text app for MP3/M4A files:

```sh
cd audio2text_from_mp3
streamlit run streamlit_app.py
```
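
For orientation, file transcription with Whisper through the `transformers` library can look roughly like the sketch below. It is an illustration under stated assumptions, not the code in this repository: the checkpoint `openai/whisper-base`, the language-code mapping, and the function names `whisper_generate_kwargs` and `transcribe_file` are all assumptions. Decoding an mp3/m4a file from a path also assumes that ffmpeg is installed.

```python
# Assumed mapping from short codes to the language names Whisper accepts.
SUPPORTED_LANGUAGES = {"en": "english", "fr": "french", "es": "spanish"}


def whisper_generate_kwargs(lang_code: str) -> dict:
    """Build generation options that force the language and the transcribe task."""
    try:
        language = SUPPORTED_LANGUAGES[lang_code.lower()]
    except KeyError:
        raise ValueError(f"Unsupported language code: {lang_code!r}") from None
    return {"language": language, "task": "transcribe"}


def transcribe_file(path: str, lang_code: str = "en",
                    model_name: str = "openai/whisper-base") -> str:
    """Transcribe one audio file; the model is downloaded on first use."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_name,       # assumed checkpoint, not confirmed by the README
        chunk_length_s=30,      # split long recordings into 30-second chunks
    )
    result = asr(path, generate_kwargs=whisper_generate_kwargs(lang_code))
    return result["text"].strip()
```

A call such as `transcribe_file("interview.m4a", "fr")` would return the French transcript as a string; the file name here is only a placeholder.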

Owner

Name: Ghazaleh Kazeminejad
Login: ghamzak
Kind: user
Location: Boulder, Colorado

Repositories: 1
Profile: https://github.com/ghamzak

Computational Linguist

Citation (citations.md)

## Whisper

[Whisper](https://github.com/openai/whisper): Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

GitHub Events

Total

Push event: 3
Create event: 2

Last Year

Push event: 3
Create event: 2

Dependencies

requirements.txt pypi

PyAudio *
numpy *
pydub *
scipy *
streamlit *
torch *
transformers *
