whisper_api

The objective of this repository is to provide a basic, functional speech-to-text API that is configurable and scalable. The API uses a powerful speech-to-text model, Whisper.

https://github.com/danielsarmiento04/whisper_api

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.0%) to scientific vocabulary

Keywords

api python speech-to-text whisper

Repository

Basic Info
  • Host: GitHub
  • Owner: DanielSarmiento04
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 52.7 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
api python speech-to-text whisper
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

readme.md

Whisper API

Abstract

Whisper API is a functional and scalable speech-to-text API developed in Python on top of Whisper. The goal of this repository is to provide an easy-to-configure base API that can be integrated into specific use cases. Note that it follows a client-side pattern, which can be changed depending on the case. A Docker container is also provided to simplify testing and deployment; see the repository's Packages section.
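
As a rough illustration of what "configurable" can look like for a Whisper-backed service, the sketch below reads the model settings from environment variables and runs a transcription with the `openai-whisper` package. The `WHISPER_MODEL`, `WHISPER_DEVICE`, and `WHISPER_LANGUAGE` variable names and the `WhisperSettings` class are hypothetical and not taken from this repository; only `whisper.load_model` and `model.transcribe` are the library's own calls.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WhisperSettings:
    """Hypothetical settings object; field names are illustrative."""
    model_name: str = "base"         # any model name accepted by whisper.load_model
    device: Optional[str] = None     # None lets Whisper pick; e.g. "cpu" or "cuda"
    language: Optional[str] = None   # None lets Whisper detect the language


def settings_from_env(env=os.environ) -> WhisperSettings:
    """Build settings from the (assumed) WHISPER_* environment variables."""
    return WhisperSettings(
        model_name=env.get("WHISPER_MODEL", "base"),
        device=env.get("WHISPER_DEVICE") or None,      # empty string counts as unset
        language=env.get("WHISPER_LANGUAGE") or None,
    )


def transcribe_file(path: str, settings: WhisperSettings) -> str:
    """Transcribe one audio file with openai-whisper and return the text."""
    import whisper  # imported lazily because it pulls in torch

    model = whisper.load_model(settings.model_name, device=settings.device)
    result = model.transcribe(path, language=settings.language)
    return result["text"]
```

A real service would likely load the model once at startup rather than on every call, since loading the weights is the expensive step.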

Setup

  1. Install system dependencies

    ```bash
    # on Ubuntu or Debian
    sudo apt update && sudo apt install ffmpeg libasound-dev libportaudio2 libportaudiocpp0 portaudio19-dev

    # on Arch Linux (PortAudio and ALSA are packaged as portaudio and alsa-lib)
    sudo pacman -S ffmpeg portaudio alsa-lib

    # on macOS using Homebrew (https://brew.sh/)
    brew install ffmpeg portaudio

    # on Windows using Chocolatey (https://chocolatey.org/)
    choco install ffmpeg pyaudio

    # on Windows using Scoop (https://scoop.sh/)
    scoop install ffmpeg pyaudio
    ```

  2. Create and activate a Python environment, then install the requirements

    conda create --name whisper python=3.10 -y
    conda activate whisper
    pip install -r requirements.txt

  3. Run

    uvicorn app:app --reload

  4. Docker

    docker run -d --name WhisperAPI -p8000:80 ghcr.io/danielsarmiento04/custom_whisper_api:latest
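
With the server from step 3 or the container from step 4 listening on port 8000, a client could upload an audio file roughly as sketched below. This is a minimal sketch using only the Python standard library. The `/transcribe` route and the `file` form field are assumptions, since this README does not list the API routes; the interactive docs that FastAPI serves at `/docs` by default should show the real names.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumptions, not taken from the repository: the route and the form field name.
API_URL = "http://localhost:8000/transcribe"  # port 8000 matches the commands above
FILE_FIELD = "file"


def build_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, url: str = API_URL) -> str:
    """POST an audio file to the API and return the raw response body as text."""
    audio = Path(path)
    body, content_type = build_multipart(FILE_FIELD, audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # Transcription can be slow on CPU, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=300) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("sample.wav")` against a running instance returns whatever the server sends back; the response format depends on how the route is implemented.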

Next Steps

  1. WebSocket service (idea)

License

This repository is licensed under the Apache 2.0 License.

References

[1] Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2022). Robust Speech Recognition via Large-Scale Weak Supervision. doi:10.48550/ARXIV.2212.04356

[2] Ramírez, S. FastAPI [Computer software]. https://github.com/tiangolo/fastapi

Owner

  • Name: José Daniel Sarmiento
  • Login: DanielSarmiento04
  • Kind: user
  • Location: Santander, Colombia
  • Company: Axede S.A

Programmer, mechanical engineer, and entrepreneur. My goal is to improve people's quality of life; technology is the tool I use.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Whisper_api
message: 'fast, scalable and easy to use speech to text api'
type: software
authors:
  - given-names: Daniel
    family-names: Sarmiento
    email: josedanielsarmiento219@gmail.com
repository-code: 'https://github.com/DanielSarmiento04/whisper_api'
abstract: >-
  Whisper API is a functional and scalable speech-to-text
  API developed in Python on top of Whisper. The goal of
  this repository is to provide an easy-to-configure base
  API that can be integrated into specific use cases. Note
  that it follows a client-side pattern, which can be
  changed depending on the case. A Docker container is
  also provided to simplify testing and deployment; see
  the repository's Packages section.
keywords:
  - whisper
  - python
  - machine learning
license: Apache-2.0
version: '1.0'
date-released: '2024-01-04'

Dependencies

Dockerfile docker
  • python 3.10 build
requirements.txt pypi
  • Jinja2 ==3.1.2
  • MarkupSafe ==2.1.3
  • annotated-types ==0.5.0
  • anyio ==3.7.1
  • certifi ==2023.7.22
  • charset-normalizer ==3.2.0
  • click ==8.1.6
  • colorama ==0.4.6
  • exceptiongroup ==1.1.2
  • fastapi ==0.101.0
  • ffmpeg-python ==0.2.0
  • filelock ==3.12.2
  • future ==0.18.3
  • h11 ==0.14.0
  • idna ==3.4
  • llvmlite ==0.40.1
  • more-itertools ==10.1.0
  • mpmath ==1.3.0
  • networkx ==3.1
  • numba ==0.57.1
  • numpy ==1.24.4
  • openai-whisper ==20230314
  • pydantic ==2.1.1
  • pydantic_core ==2.4.0
  • pydub ==0.25.1
  • python-multipart ==0.0.6
  • regex ==2023.6.3
  • requests ==2.31.0
  • semantic-version ==2.10.0
  • setuptools-rust ==1.6.0
  • sniffio ==1.3.0
  • starlette ==0.27.0
  • sympy ==1.12
  • tiktoken ==0.3.1
  • torch ==2.0.1
  • tqdm ==4.65.0
  • typing_extensions ==4.7.1
  • urllib3 ==2.0.4
  • uvicorn ==0.23.2
.github/workflows/docker-publish.yml actions
  • actions/checkout v3 composite
  • docker/build-push-action 0565240e2d4ab88bba5387d719585280857ece09 composite
  • docker/login-action 343f7c4344506bcbf9b4de18042ae17996df046d composite
  • docker/metadata-action 96383f45573cb7f253c731d3b3ab81c87ef81934 composite
  • docker/setup-buildx-action f95db51fddba0c2d1ec667646a06c2ce06100226 composite
  • sigstore/cosign-installer 6e04d228eb30da1757ee4e1dd75a0ec73a653e06 composite
docker-compose.yml docker
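
Because every entry in requirements.txt is pinned to an exact version, it can be useful to confirm that an environment actually matches those pins. The sketch below is one way to do that with the standard library; `parse_pins` and `find_mismatches` are illustrative helper names, not part of this repository.

```python
import re
from importlib import metadata
from typing import Dict, Optional, Tuple

# Matches "name==version", tolerating spaces around "==".
PIN = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*==\s*(\S+)\s*$")


def parse_pins(text: str) -> Dict[str, str]:
    """Return {package: version} for each 'name==version' line; skip other lines."""
    pins = {}
    for line in text.splitlines():
        match = PIN.match(line.split("#", 1)[0])  # drop trailing comments
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (pinned, installed)."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Passing the contents of requirements.txt through `parse_pins` and then `find_mismatches` yields an empty dictionary when the environment matches the pinned versions.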