https://github.com/aarnphm/whispercpp

Pybind11 bindings for Whisper.cpp

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary

Keywords

audio-transcription bazel bentoml mlops-workflow nix pybind11 python3 whisper whisper-cpp

Keywords from Contributors

hack transformers autograder standardization sequences interpretability projections interaction meshing report
Last synced: 5 months ago

Repository

Pybind11 bindings for Whisper.cpp

Basic Info
  • Host: GitHub
  • Owner: aarnphm
  • License: apache-2.0
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 2.1 MB
Statistics
  • Stars: 339
  • Watchers: 5
  • Forks: 67
  • Open Issues: 35
  • Releases: 14
Topics
audio-transcription bazel bentoml mlops-workflow nix pybind11 python3 whisper whisper-cpp
Created almost 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Codeowners Security

README.md

whispercpp

Pybind11 bindings for whisper.cpp

Quickstart

Install with pip:

```bash
pip install whispercpp
```

NOTE: On platforms that don't have a prebuilt wheel, the install sets up a hermetic toolchain to build the package (which means you don't have to set up anything yourself to install the Python package), so installation will take a bit longer. Pass -vv to pip to see the progress.

To use the latest version, install from source:

```bash
pip install git+https://github.com/aarnphm/whispercpp.git -vv
```

For local setup, initialize all submodules:

```bash
git submodule update --init --recursive
```

Build the wheel:

```bash
# Option 1: using pypa/build
python3 -m build -w

# Option 2: using bazel
./tools/bazel build //:whispercpp_wheel
```

Install the wheel:

```bash
# Option 1: via pypa/build
pip install dist/*.whl

# Option 2: using bazel
pip install $(./tools/bazel info bazel-bin)/*.whl
```

The binding provides a Whisper class:

```python
from whispercpp import Whisper

w = Whisper.from_pretrained("tiny.en")
```

Currently, the inference API is provided via transcribe:

```python
w.transcribe(np.ones((1, 16000)))
```

You can use any of your favorite audio libraries (ffmpeg or librosa, or whispercpp.api.load_wav_file) to load audio files into a Numpy array, then pass it to transcribe:

```python
import ffmpeg
import numpy as np

# whisper.cpp operates on 16 kHz mono audio
sample_rate = 16000

try:
    # Decode to raw 16-bit little-endian PCM on stdout, downmixed to mono
    y, _ = (
        ffmpeg.input("/path/to/audio.wav", threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sample_rate)
        .run(
            cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
        )
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

# Scale int16 samples to float32 in [-1.0, 1.0)
arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

w.transcribe(arr)
```
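When the input is already a 16-bit mono WAV file, the same int16-to-float scaling can be done without ffmpeg. The following is a minimal sketch using only the Python standard library; `load_wav_pcm16` is a hypothetical helper name (not part of whispercpp), and it assumes the file is already sampled at the rate the model expects, since no resampling is performed.

```python
import array
import sys
import wave


def load_wav_pcm16(path):
    """Read a 16-bit mono PCM WAV file and return (samples, sample_rate).

    Samples are floats in [-1.0, 1.0), scaled as int16 / 32768.0,
    the same scaling used in the ffmpeg snippet above.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        if wf.getnchannels() != 1:
            raise ValueError("expected mono audio")
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    return [s / 32768.0 for s in pcm], sample_rate
```

The returned list can be turned into an array with `np.asarray(samples, dtype=np.float32)` before being passed to transcribe; checking that the returned sample rate is 16000 first is left to the caller.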

You can also use the model's transcribe_from_file method for convenience:

```python
w.transcribe_from_file("/path/to/audio.wav")
```

The Pybind11 bindings support all of the features of whisper.cpp, with an API that takes inspiration from whisper-rs.

The binding can also be used via api:

```python
from whispercpp import api

# Binding directly from whisper.cpp
```

Development

See DEVELOPMENT.md

APIs

Whisper

  1. Whisper.from_pretrained(model_name: str) -> Whisper

Load a pre-trained model from the local cache or download and cache if needed. Supports loading a custom ggml model from a local path passed as model_name.

```python
w = Whisper.from_pretrained("tiny.en")
w = Whisper.from_pretrained("/path/to/model.bin")
```

The model will be saved to $XDG_DATA_HOME/whispercpp or ~/.local/share/whispercpp if the environment variable is not set.
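The lookup rule described above can be sketched as follows. This is an illustration of the rule, not the library's actual code: `default_model_dir` is a hypothetical name, and treating an empty `XDG_DATA_HOME` the same as an unset one is an assumption taken from the XDG Base Directory Specification.

```python
import os
from pathlib import Path


def default_model_dir(env=None):
    """Return the directory where downloaded models would be cached.

    Hypothetical helper: uses $XDG_DATA_HOME/whispercpp when the variable
    is set (and non-empty), otherwise ~/.local/share/whispercpp.
    """
    env = os.environ if env is None else env
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else Path.home() / ".local" / "share"
    return base / "whispercpp"


print(default_model_dir())
```

Passing a plain dict as `env` makes the function easy to check without touching the real environment.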

  2. Whisper.transcribe(arr: NDArray[np.float32], num_proc: int = 1)

Runs transcription on a given NumPy array. This calls full from whisper.cpp; if num_proc is greater than 1, it uses full_parallel instead.

```python
w.transcribe(np.ones((1, 16000)))
```

To transcribe from a WAV file use transcribe_from_file:

```python
w.transcribe_from_file("/path/to/audio.wav")
```

  3. Whisper.stream_transcribe(*, length_ms: int=..., device_id: int=..., num_proc: int=...) -> Iterator[str]

[EXPERIMENTAL] Streaming transcription. This calls stream_ from whisper.cpp. The transcription will be yielded as soon as it's available. See stream.py for an example.

Note: The device_id is the index of the audio device. You can use whispercpp.api.available_audio_devices to get the list of available audio devices.
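Because stream_transcribe is declared as returning an Iterator[str], its output can be consumed with ordinary generator tooling. The sketch below stops collecting once a spoken stop phrase appears. `take_until` is a hypothetical helper, and the stream is replaced with a stand-in iterator so the example runs without a microphone or a model; it assumes each yielded item is one chunk of transcribed text.

```python
from typing import Iterable, Iterator


def take_until(segments: Iterable[str], stop_phrase: str) -> Iterator[str]:
    """Yield transcribed segments until one contains stop_phrase."""
    needle = stop_phrase.lower()
    for text in segments:
        if needle in text.lower():
            return  # stop consuming the stream
        yield text


# Stand-in for the iterator returned by w.stream_transcribe(...)
fake_stream = iter(["hello there", "this is a test", "Stop recording", "ignored"])
collected = list(take_until(fake_stream, "stop"))
print(collected)
```

With a real model, the stand-in would be replaced by the iterator returned from stream_transcribe, using the keyword arguments listed in the signature above.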

api

api is a direct binding to whisper.cpp, with an API similar to whisper-rs.

  1. api.Context

This class is a wrapper around whisper_context

```python
from whispercpp import api

ctx = api.Context.from_file("/path/to/saved_weight.bin")
```

Note: The context can also be accessed from the Whisper class via w.context

  2. api.Params

This class is a wrapper around whisper_params

```python
from whispercpp import api

params = api.Params()
```

Note: The params can also be accessed from the Whisper class via w.params

Why not?

  • whispercpp.py. There are a few key differences here:

    • They provide Cython bindings. From a UX standpoint, this achieves the same goal as whispercpp; the difference is that whispercpp uses Pybind11 instead. Feel free to use it if you prefer Cython over Pybind11. Note that whispercpp.py and whispercpp are mutually exclusive, as both use the whispercpp namespace.
    • whispercpp provides APIs similar to whisper-rs, which makes for a nicer UX. Only two APIs (from_pretrained and transcribe) are needed to quickly use whisper.cpp in Python.
    • whispercpp doesn't pollute your $HOME directory, rather it follows the XDG Base Directory Specification for saved weights.
  • Using cdll and ctypes and be done with it?

    • This is also valid, but it requires a lot of hacking and is pretty slow compared to Cython and Pybind11.

Examples

See examples for more information

Owner

  • Name: Aaron Pham
  • Login: aarnphm
  • Kind: user
  • Location: Toronto, Canada

GitHub Events

Total
  • Issues event: 2
  • Watch event: 21
  • Issue comment event: 7
  • Push event: 18
  • Pull request event: 2
  • Fork event: 4
Last Year
  • Issues event: 2
  • Watch event: 21
  • Issue comment event: 7
  • Push event: 18
  • Pull request event: 2
  • Fork event: 4

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 384
  • Total Committers: 12
  • Avg Commits per committer: 32.0
  • Development Distribution Score (DDS): 0.305
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Aaron Pham [bot] 2****m 267
dependabot[bot] 4****] 54
github-actions[bot] 4****] 47
Robin Heinemann r****n@g****m 5
pajowu p****u@p****e 4
mmyjona j****e@g****m 1
Yi Zhang a****y@a****t 1
Simon 8****r 1
Maximilian Blazek 6****v 1
Kyle Leaders r****e 1
Ian Tan i****x 1
Hay Kranen h****r@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 51
  • Total pull requests: 160
  • Average time to close issues: 4 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 37
  • Total pull request authors: 15
  • Average comments per issue: 2.41
  • Average comments per pull request: 0.16
  • Merged pull requests: 145
  • Bot issues: 0
  • Bot pull requests: 110
Past Year
  • Issues: 4
  • Pull requests: 1
  • Average time to close issues: 36 minutes
  • Average time to close pull requests: 2 minutes
  • Issue authors: 4
  • Pull request authors: 1
  • Average comments per issue: 1.25
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • aarnphm (12)
  • chrisspen (3)
  • AdithyanI (1)
  • dgtlntv (1)
  • regstuff (1)
  • osilverstein (1)
  • zhqu1148980644 (1)
  • farmer00317558 (1)
  • dependabot[bot] (1)
  • sudochia (1)
  • DE-ZIX (1)
  • sadath-12 (1)
  • loukylor (1)
  • sorgfresser (1)
  • juanmc2005 (1)
Pull Request Authors
  • dependabot[bot] (66)
  • github-actions[bot] (51)
  • aarnphm (29)
  • rroohhh (6)
  • pajowu (4)
  • Lovemma (2)
  • githubnemo (2)
  • hay (1)
  • iantanwx (1)
  • andermatt64 (1)
  • remkade (1)
  • mmyjona (1)
  • sorgfresser (1)
  • asxzy (1)
  • dgtlntv (1)
Top Labels
Issue Labels
bug (29) enhancement (9) triage/perf (1) triage/ci (1) dependencies (1) github_actions (1)
Pull Request Labels
dependencies (66) nix/dependencies (51) github_actions (44) python (17)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 1,450 last-month
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 4
    (may contain duplicates)
  • Total versions: 32
  • Total maintainers: 1
pypi.org: whispercpp
  • Versions: 15
  • Dependent Packages: 2
  • Dependent Repositories: 4
  • Downloads: 1,450 Last month
  • Docker Downloads: 0
Rankings
Docker downloads count: 1.5%
Dependent packages count: 4.7%
Stargazers count: 4.8%
Average: 5.8%
Forks count: 7.2%
Dependent repos count: 7.5%
Downloads: 8.7%
Maintainers (1)
Last synced: 6 months ago
proxy.golang.org: github.com/aarnphm/whispercpp
  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.5%
Average: 6.7%
Dependent repos count: 7.0%
Last synced: 6 months ago

Dependencies

pyproject.toml pypi
setup.py pypi
.github/actions/setup-repo/action.yml actions
  • actions/cache v3 composite
  • actions/setup-python v4 composite
.github/workflows/ci.yml actions
  • ./.github/actions/setup-repo * composite
  • FedericoCarboni/setup-ffmpeg v2 composite
  • actions/checkout v3 composite
.github/workflows/codeql.yml actions
  • ./.github/actions/setup-repo * composite
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/scheduled-jobs.yml actions
  • actions/checkout v3 composite
  • cachix/install-nix-action v21 composite
  • crazy-max/ghaction-import-gpg v5 composite
.github/workflows/style.yml actions
  • ./.github/actions/setup-repo * composite
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • cachix/install-nix-action v21 composite
.github/workflows/update-nixpkgs.yml actions
  • knl/niv-updater-action v12 composite
.github/workflows/wheels.yml actions
  • ./.github/actions/setup-repo * composite
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/upload-artifact v3 composite
  • docker/setup-qemu-action v2 composite
  • pypa/cibuildwheel v2.13.1 composite
  • pypa/gh-action-pypi-publish v1.8.7 composite
package.json npm
  • pyright ^1.1.296
yarn.lock npm
  • pyright 1.1.296
examples/bentoml/requirements.txt pypi
  • bentoml >=1.0.15
  • locust *
  • whispercpp *
requirements/pypi.txt pypi
  • bazel-runfiles ==0.19.0
  • black *
  • build *
  • ffmpeg-python ==0.2.0
  • isort *
  • numpy *
  • pytest *
  • pytest-asyncio *
  • pytest-cov *
  • pytest-xdist *
  • ruff *
  • twine *
  • virtualenv *
requirements/release/requirements.in pypi
  • twine *
requirements/release/requirements.txt pypi
  • bleach ==6.0.0
  • certifi ==2022.12.7
  • cffi ==1.15.1
  • charset-normalizer ==3.0.1
  • cryptography ==39.0.2
  • docutils ==0.19
  • idna ==3.4
  • importlib-metadata ==6.0.0
  • importlib-resources ==5.12.0
  • jaraco-classes ==3.2.3
  • jeepney ==0.8.0
  • keyring ==23.13.1
  • markdown-it-py ==2.1.0
  • mdurl ==0.1.2
  • more-itertools ==9.0.0
  • pkginfo ==1.9.6
  • pycparser ==2.21
  • pygments ==2.14.0
  • readme-renderer ==37.3
  • requests ==2.28.2
  • requests-toolbelt ==0.10.1
  • rfc3986 ==2.0.0
  • rich ==13.2.0
  • secretstorage ==3.3.3
  • six ==1.16.0
  • twine ==4.0.2
  • typing-extensions ==4.5.0
  • urllib3 ==1.26.14
  • webencodings ==0.5.1
  • zipp ==3.11.0
requirements/release/requirements_darwin.txt pypi
  • bleach ==6.0.0
  • certifi ==2022.12.7
  • charset-normalizer ==3.0.1
  • docutils ==0.19
  • idna ==3.4
  • importlib-metadata ==6.0.0
  • importlib-resources ==5.12.0
  • jaraco-classes ==3.2.3
  • keyring ==23.13.1
  • markdown-it-py ==2.2.0
  • mdurl ==0.1.2
  • more-itertools ==9.0.0
  • pkginfo ==1.9.6
  • pygments ==2.14.0
  • readme-renderer ==37.3
  • requests ==2.28.2
  • requests-toolbelt ==0.10.1
  • rfc3986 ==2.0.0
  • rich ==13.2.0
  • six ==1.16.0
  • twine ==4.0.2
  • typing-extensions ==4.5.0
  • urllib3 ==1.26.14
  • webencodings ==0.5.1
  • zipp ==3.11.0
requirements/release/requirements_windows.txt pypi
  • bleach ==6.0.0
  • certifi ==2022.12.7
  • charset-normalizer ==3.0.1
  • docutils ==0.19
  • idna ==3.4
  • importlib-metadata ==6.0.0
  • importlib-resources ==5.12.0
  • jaraco-classes ==3.2.3
  • keyring ==23.13.1
  • markdown-it-py ==2.2.0
  • mdurl ==0.1.2
  • more-itertools ==9.0.0
  • pkginfo ==1.9.6
  • pygments ==2.14.0
  • readme-renderer ==37.3
  • requests ==2.28.2
  • requests-toolbelt ==0.10.1
  • rfc3986 ==2.0.0
  • rich ==13.2.0
  • six ==1.16.0
  • twine ==4.0.2
  • typing-extensions ==4.5.0
  • urllib3 ==1.26.14
  • webencodings ==0.5.1
  • zipp ==3.11.0