transcript

Using Whisper to transcribe English, French, or Spanish, from mp3/m4a files, or from live microphone input.

https://github.com/ghamzak/transcript

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ghamzak
  • Language: Python
  • Default Branch: main
  • Size: 8.79 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 7 months ago · Last pushed 7 months ago
Metadata Files
  • Readme
  • Citation

README.md

Multilingual NLP: Transcription and Translation

Setup Notes

Important: For audio processing (mp3/m4a conversion and transcription), use Python 3.12 or lower. Python 3.13 and later are not supported because the audioop module, which pydub and other audio libraries require, was removed from the standard library in 3.13.
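The version constraint above can be checked before installing anything. The following is a minimal sketch, not part of the repository; the function names `audio_stack_supported` and `has_audioop` are hypothetical helpers introduced here for illustration.

```python
import importlib.util
import sys


def audio_stack_supported(version_info=sys.version_info):
    """Return True when the interpreter is older than Python 3.13."""
    return tuple(version_info[:2]) < (3, 13)


def has_audioop():
    """Return True when an importable audioop module can be located."""
    # find_spec only looks the module up; it does not import it.
    return importlib.util.find_spec("audioop") is not None


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:3])
    if audio_stack_supported():
        print(f"Python {version}: OK for pydub-based audio processing.")
    else:
        print(f"Python {version}: audioop was removed; use Python 3.12 or lower.")
    print(f"audioop importable: {has_audioop()}")
```

Running this with the interpreter you plan to use for the virtual environment shows whether that interpreter is suitable.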

Recommended: Install Python 3.12 and create your virtual environment with it.

Gen AI Mini Projects

Note for macOS Users

If you encounter errors when installing PyAudio (such as portaudio.h file not found), you need to install the PortAudio library using Homebrew:

    brew install portaudio

After installing PortAudio, re-run:

    pip install -r requirements.txt

This will allow PyAudio to build and install successfully on macOS.

Running the Apps

Note: The first launch of the Streamlit app may be slow because about 200 MB of data must be downloaded before the app loads. Please be patient during the initial startup.

How to Run the Apps

  • To run the live transcription app (using your microphone):

    cd Live_audio2text_app
    streamlit run app.py

  • To run the audio-to-text app for MP3/M4A files:

    cd audio2text_from_mp3
    streamlit run streamlit_app.py
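If you prefer to start either app from Python instead of typing the commands above, a small launcher along these lines could work. This is a sketch under assumptions: the directory and script names come from the commands above, while the keys `"live"` and `"file"` and the functions `streamlit_command` and `launch` are hypothetical names that do not exist in the repository.

```python
import subprocess
from pathlib import Path

# Directory and script names are taken from the run commands in this README.
# The short keys ("live", "file") are made up for this example.
APPS = {
    "live": ("Live_audio2text_app", "app.py"),
    "file": ("audio2text_from_mp3", "streamlit_app.py"),
}


def streamlit_command(app, repo_root="."):
    """Return (argv, cwd) for launching the chosen app."""
    if app not in APPS:
        raise ValueError(f"unknown app {app!r}; expected one of {sorted(APPS)}")
    directory, script = APPS[app]
    return ["streamlit", "run", script], Path(repo_root) / directory


def launch(app, repo_root="."):
    """Run Streamlit inside the app's directory; blocks until the server exits."""
    argv, cwd = streamlit_command(app, repo_root)
    return subprocess.run(argv, cwd=cwd, check=True)
```

Calling `launch("live")` from the repository root is equivalent to changing into `Live_audio2text_app` and running `streamlit run app.py`.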

Owner

  • Name: Ghazaleh Kazeminejad
  • Login: ghamzak
  • Kind: user
  • Location: Boulder, Colorado

Computational Linguist

Citation (citations.md)

## Whisper

[Whisper](https://github.com/openai/whisper): Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
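For orientation, here is a hedged sketch of how a Whisper checkpoint can be driven through the `transformers` pipeline API, since `transformers` and `torch` appear in the requirements. It is not the repository's code. The checkpoint name `openai/whisper-base` is an assumption (the README does not say which model the apps load), and `whisper_generate_kwargs` and `transcribe` are hypothetical helper names.

```python
# Languages named in the project description.
SUPPORTED_LANGUAGES = ("english", "french", "spanish")


def whisper_generate_kwargs(language, translate_to_english=False):
    """Build the generation options passed to a Whisper checkpoint."""
    language = language.strip().lower()
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    task = "translate" if translate_to_english else "transcribe"
    return {"language": language, "task": task}


def transcribe(path, language, model_name="openai/whisper-base",
               translate_to_english=False):
    """Transcribe one audio file; the model is downloaded on first use."""
    # Imported lazily so the helper above works without transformers/torch.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_name,      # assumed checkpoint, not confirmed by the README
        chunk_length_s=30,     # split long recordings into 30-second chunks
    )
    result = asr(
        path,
        generate_kwargs=whisper_generate_kwargs(language, translate_to_english),
    )
    return result["text"]
```

A call such as `transcribe("sample.mp3", "french")` would return the French transcript, and `translate_to_english=True` asks Whisper for an English translation instead. Passing a file path to the pipeline relies on ffmpeg being installed to decode the audio.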

GitHub Events

Total
  • Push event: 3
  • Create event: 2
Last Year
  • Push event: 3
  • Create event: 2

Dependencies

requirements.txt pypi
  • PyAudio *
  • numpy *
  • pydub *
  • scipy *
  • streamlit *
  • torch *
  • transformers *
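
After running `pip install -r requirements.txt`, a quick way to confirm that every dependency listed above is importable is sketched below. The package list mirrors requirements.txt; `REQUIREMENTS` and `missing_packages` are hypothetical names introduced for this example.

```python
import importlib.util

# Distribution name -> import name. Only PyAudio differs: it imports as "pyaudio".
REQUIREMENTS = {
    "PyAudio": "pyaudio",
    "numpy": "numpy",
    "pydub": "pydub",
    "scipy": "scipy",
    "streamlit": "streamlit",
    "torch": "torch",
    "transformers": "transformers",
}


def missing_packages(requirements=REQUIREMENTS):
    """Return the distribution names whose modules cannot be located."""
    return [
        dist
        for dist, module in requirements.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```

If PyAudio shows up as missing on macOS, the PortAudio note earlier in the README is the likely fix.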