whisper_api
The objective of this repository is to provide a basic, functional speech-to-text API that is configurable and scalable. The API uses a powerful speech-to-text model, Whisper,
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.0%) to scientific vocabulary
Keywords
Repository
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
readme.md
Whisper API
Abstract
Whisper API is a functional and scalable speech-to-text API developed in Python, with Whisper as its base. The objective of this repository is to provide an easy-to-configure base API that can be integrated into specific use cases. Note that it uses a client-side pattern, which can be changed depending on the use case. A Docker container is also provided to simplify testing and deployment; see the packages section.
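As an illustration of the client side, a caller could upload an audio file as multipart form data (the dependency list includes `python-multipart`, which FastAPI uses for form and file uploads). The sketch below uses only the standard library. The `/transcribe` route and the `file` field name are hypothetical, since this README does not document the routes; the names `encode_multipart`, `build_upload_request`, and `transcribe` are likewise made up for this example.

```python
import os
import urllib.request
import uuid


def encode_multipart(field_name, filename, payload,
                     content_type="application/octet-stream"):
    """Encode a single file as a multipart/form-data body.

    Returns a tuple (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(url, audio_path, field_name="file"):
    """Build (but do not send) a POST request that uploads an audio file."""
    with open(audio_path, "rb") as fh:
        payload = fh.read()
    body, header = encode_multipart(
        field_name, os.path.basename(audio_path), payload, "audio/wav"
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )


def transcribe(url, audio_path, timeout=120.0):
    """Send the upload and return the raw response text.

    The response format is not specified in this README, so it is
    returned undecoded beyond UTF-8.
    """
    request = build_upload_request(url, audio_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, a call might look like `transcribe("http://localhost:8000/transcribe", "sample.wav")`, after replacing the hypothetical route and field name with the ones the application actually defines.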
Table Of Contents
Setup
Download dependencies
```bash
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg libasound-dev libportaudio2 libportaudiocpp0 portaudio19-dev
# on Arch Linux (package names differ from the Debian ones)
sudo pacman -S ffmpeg alsa-lib portaudio
# on MacOS using Homebrew (https://brew.sh/)
brew install ffmpeg pyaudio
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg pyaudio
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg pyaudio
```
Create a Python environment / enable the environment
conda create --name whisper python=3.10 -y
conda activate whisper
pip install -r requirements.txt
Run
uvicorn app:app --reload
Docker
docker run -d --name WhisperAPI -p 8000:80 ghcr.io/danielsarmiento04/custom_whisper_api:latest
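Once the server is started, either with `uvicorn` (which listens on port 8000 by default) or through the Docker command, which maps host port 8000 to the container, a quick reachability check can confirm it is serving requests. The sketch below assumes the application keeps FastAPI's default interactive documentation page at `/docs`; if the app disables or moves it, the path needs adjusting. The helper names `docs_url` and `is_up` are made up for this example.

```python
import urllib.error
import urllib.request


def docs_url(host="localhost", port=8000):
    """Return the URL of FastAPI's default documentation page."""
    return f"http://{host}:{port}/docs"


def is_up(url, timeout=5.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `is_up(docs_url())` should return `True` shortly after startup; the first start may take longer if the Whisper model weights still need to be loaded.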
Next Steps
- Websocket service, idea
License
This repository is licensed under the Apache 2.0 License.
References
[1] Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2022). Robust Speech Recognition via Large-Scale Weak Supervision. doi:10.48550/ARXIV.2212.04356
[2] Ramírez, S. FastAPI [Computer software]. https://github.com/tiangolo/fastapi
Owner
- Name: José Daniel Sarmiento
- Login: DanielSarmiento04
- Kind: user
- Location: Santander, Colombia
- Company: Axede S.A
- Repositories: 7
- Profile: https://github.com/DanielSarmiento04
Programmer, mechanical engineer, and entrepreneur. My goal is to improve people's quality of life; technology is the tool I use.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Whisper_api
message: 'fast, scalable and easy to use speech to text api'
type: software
authors:
  - given-names: Daniel
    family-names: Sarmiento
    email: josedanielsarmiento219@gmail.com
repository-code: 'https://github.com/DanielSarmiento04/whisper_api'
abstract: >-
  Whisper API is a functional ans scalable speech to text
  API developed using Python and whisper as base. The
  objective of this repository is give some easy to
  configure base api to integrate in some special case, for
  this purpose is necessary to take that we use a
  client-side pattern (it's possible to change depending of
  the case). Also, we give the docker container to simplify
  the test and the deployment, check de package zone.
keywords:
  - whisper
  - python
  - machine learning
license: Apache-2.0
version: '1.0'
date-released: '2024-01-04'
Dependencies
- python 3.10 build
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.3
- annotated-types ==0.5.0
- anyio ==3.7.1
- certifi ==2023.7.22
- charset-normalizer ==3.2.0
- click ==8.1.6
- colorama ==0.4.6
- exceptiongroup ==1.1.2
- fastapi ==0.101.0
- ffmpeg-python ==0.2.0
- filelock ==3.12.2
- future ==0.18.3
- h11 ==0.14.0
- idna ==3.4
- llvmlite ==0.40.1
- more-itertools ==10.1.0
- mpmath ==1.3.0
- networkx ==3.1
- numba ==0.57.1
- numpy ==1.24.4
- openai-whisper ==20230314
- pydantic ==2.1.1
- pydantic_core ==2.4.0
- pydub ==0.25.1
- python-multipart ==0.0.6
- regex ==2023.6.3
- requests ==2.31.0
- semantic-version ==2.10.0
- setuptools-rust ==1.6.0
- sniffio ==1.3.0
- starlette ==0.27.0
- sympy ==1.12
- tiktoken ==0.3.1
- torch ==2.0.1
- tqdm ==4.65.0
- typing_extensions ==4.7.1
- urllib3 ==2.0.4
- uvicorn ==0.23.2
- actions/checkout v3 composite
- docker/build-push-action 0565240e2d4ab88bba5387d719585280857ece09 composite
- docker/login-action 343f7c4344506bcbf9b4de18042ae17996df046d composite
- docker/metadata-action 96383f45573cb7f253c731d3b3ab81c87ef81934 composite
- docker/setup-buildx-action f95db51fddba0c2d1ec667646a06c2ce06100226 composite
- sigstore/cosign-installer 6e04d228eb30da1757ee4e1dd75a0ec73a653e06 composite