https://github.com/areid987/voice-ai-moonshine

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.7%) to scientific vocabulary

Last synced: 9 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: AReid987
License: mit
Language: Python
Default Branch: main
Size: 0 Bytes

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 3
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License

Moonshine

[Blog] [Paper] [Model Card] [Podcast]

Moonshine is a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices. It is well-suited to real-time, on-device applications like live transcription and voice command recognition. Moonshine obtains word-error rates (WER) better than similarly-sized tiny.en and base.en Whisper models from OpenAI on the datasets used in the OpenASR leaderboard maintained by HuggingFace:

Tiny

| WER        | Moonshine | Whisper |
| ---------- | --------- | ------- |
| Average    | 12.66     | 12.81   |
| AMI        | 22.77     | 24.24   |
| Earnings22 | 21.25     | 19.12   |
| Gigaspeech | 14.41     | 14.08   |
| LS Clean   | 4.52      | 5.66    |
| LS Other   | 11.71     | 15.45   |
| SPGISpeech | 7.70      | 5.93    |
| Tedlium    | 5.64      | 5.97    |
| Voxpopuli  | 13.27     | 12.00   |

Base

| WER        | Moonshine | Whisper |
| ---------- | --------- | ------- |
| Average    | 10.07     | 10.32   |
| AMI        | 17.79     | 21.13   |
| Earnings22 | 17.65     | 15.09   |
| Gigaspeech | 12.19     | 12.83   |
| LS Clean   | 3.23      | 4.25    |
| LS Other   | 8.18      | 10.35   |
| SPGISpeech | 5.46      | 4.26    |
| Tedlium    | 5.22      | 4.87    |
| Voxpopuli  | 10.81     | 9.76    |
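For readers new to the metric, here is a minimal sketch of how a word-error rate can be computed: the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. It is not the leaderboard's scoring code. Benchmark pipelines normalize text before scoring, whereas this sketch only lowercases and splits on whitespace. The function name `word_error_rate` and the example sentences are made up for illustration. The last lines check that the Average row for the tiny Moonshine model matches an unweighted mean of the eight dataset scores.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sentences, not taken from any benchmark: one substitution in four words.
wer = word_error_rate("fail again fail better", "fail again fail butter")
print(f"{100 * wer:.2f}")  # 25.00

# Per-dataset WERs for tiny Moonshine, copied from the table above.
tiny_moonshine = [22.77, 21.25, 14.41, 4.52, 11.71, 7.70, 5.64, 13.27]
print(round(sum(tiny_moonshine) / len(tiny_moonshine), 2))  # 12.66
```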

Moonshine's compute requirements scale with the length of input audio. This means that shorter input audio is processed faster, unlike existing Whisper models that process everything as 30-second chunks. To give you an idea of the benefits: Moonshine processes 10-second audio segments 5x faster than Whisper while maintaining the same (or better!) WER.
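To make the scaling argument concrete, the sketch below counts how many input samples a model consumes for one clip, with and without a fixed chunk length. It assumes 16 kHz audio (the rate the Transformers example later in this README resamples to) and the 30-second chunking described above. The function `samples_processed` is a hypothetical helper, and the result is a count of input samples only, not a latency model. The 5x figure above is the README's own measurement and is not derived here.

```python
import math
from typing import Optional

SAMPLE_RATE_HZ = 16_000   # assumed input rate
FIXED_CHUNK_SECONDS = 30  # chunk length the text above attributes to Whisper


def samples_processed(audio_seconds: float,
                      fixed_chunk_seconds: Optional[float] = None) -> int:
    """Return the number of input samples consumed for one clip.

    With a fixed chunk length the clip is padded up to whole chunks;
    without one, the input is just the clip itself.
    """
    if fixed_chunk_seconds is None:
        return round(audio_seconds * SAMPLE_RATE_HZ)
    chunks = max(1, math.ceil(audio_seconds / fixed_chunk_seconds))
    return round(chunks * fixed_chunk_seconds * SAMPLE_RATE_HZ)


variable = samples_processed(10)                   # 160000 samples
fixed = samples_processed(10, FIXED_CHUNK_SECONDS)  # 480000 samples
print(fixed / variable)  # 3.0
```

For a 10-second clip, fixed 30-second chunking feeds the model three times as many samples as variable-length input does.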

Moonshine Base is approximately 400MB, while Tiny is around 190MB. Both publicly-released models currently support English only.

This repo hosts inference code and demos for Moonshine.

Installation
Examples
TODO
Citation

Installation

We currently offer two options for installing Moonshine:

useful-moonshine, which uses Keras (with support for Torch, TensorFlow, and JAX backends)
useful-moonshine-onnx, which uses the ONNX runtime

These instructions apply to both options; follow along to get started.

Note: We like uv for managing Python environments, so we use it here. If you don't want to use it, simply skip the uv installation and leave uv off of your shell commands.

1. Create a virtual environment

First, install uv for Python environment management.

Then create and activate a virtual environment:

    uv venv env_moonshine
    source env_moonshine/bin/activate

2a. Install the `useful-moonshine` package to use Moonshine with Torch, TensorFlow, or JAX

The useful-moonshine inference code is written in Keras and can run with each of the backends that Keras supports: Torch, TensorFlow, and JAX. The backend you choose will determine which flavor of the useful-moonshine package to install. If you're just getting started, we suggest installing the (default) Torch backend:

    uv pip install useful-moonshine@git+https://github.com/usefulsensors/moonshine.git

To run the provided inference code, you have to instruct Keras to use the PyTorch backend by setting an environment variable:

    export KERAS_BACKEND=torch

To run with the TensorFlow backend, run the following to install Moonshine and set the environment variable:

    uv pip install useful-moonshine[tensorflow]@git+https://github.com/usefulsensors/moonshine.git
    export KERAS_BACKEND=tensorflow

To run with the JAX backend, run the following:

```shell
uv pip install useful-moonshine[jax]@git+https://github.com/usefulsensors/moonshine.git
export KERAS_BACKEND=jax
# Use useful-moonshine[jax-cuda] for jax on GPU
```
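If exporting a shell variable is inconvenient (for example, in a notebook), the backend can also be chosen from Python. Keras 3 reads `KERAS_BACKEND` when it is first imported, so the variable has to be set before that import. The sketch below is one way to do this. `select_keras_backend` is a hypothetical helper, not part of the Moonshine package, and the list of names simply mirrors the backends mentioned above.

```python
import os

SUPPORTED_BACKENDS = ("torch", "tensorflow", "jax")  # backends named in this README


def select_keras_backend(name: str) -> str:
    """Set KERAS_BACKEND for the current process and return the chosen name."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unsupported backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("torch")
# Import keras (or moonshine) only after this point so the setting is picked up.
```

The matching backend package still needs to be installed as described above; this only selects which one Keras loads.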

2b. Install the `useful-moonshine-onnx` package to use Moonshine with ONNX

Using Moonshine with the ONNX runtime is preferable if you want to run the models on single-board computers (SBCs) like the Raspberry Pi. We've prepared a separate version of the package with minimal dependencies to support these use cases. To use it, run the following:

    uv pip install useful-moonshine-onnx@git+https://git@github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx

3. Try it out

You can test whichever type of Moonshine you installed by transcribing the provided example audio file with the .transcribe function:

```shell
python
>>> import moonshine  # or import moonshine_onnx
>>> moonshine.transcribe(moonshine.ASSETS_DIR / 'beckett.wav', 'moonshine/tiny')  # or moonshine_onnx.transcribe(...)
['Ever tried ever failed, no matter try again, fail again, fail better.']
```

The first argument is a path to an audio file and the second is the name of a Moonshine model. moonshine/tiny and moonshine/base are the currently available models.
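To transcribe several files, the call above can be wrapped in a small loop. The sketch below is illustrative only. `transcribe_files` is a hypothetical helper, and it takes the transcription function as a parameter so the example runs without the package installed. It assumes, based on the example output above, that the function returns a list of strings.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def transcribe_files(
    paths: Iterable[Path],
    transcribe: Callable[[Path, str], List[str]],
    model: str = "moonshine/tiny",
) -> Dict[str, str]:
    """Run `transcribe` on each path and join the returned strings per file."""
    results = {}
    for path in paths:
        pieces = transcribe(path, model)
        results[str(path)] = " ".join(pieces)
    return results


# Stand-in for moonshine.transcribe so the sketch runs without the package.
def fake_transcribe(path, model):
    return [f"transcript of {Path(path).name} from {model}"]


print(transcribe_files([Path("a.wav")], fake_transcribe))
# {'a.wav': 'transcript of a.wav from moonshine/tiny'}
```

With the package installed, pass `moonshine.transcribe` (or `moonshine_onnx.transcribe`) in place of `fake_transcribe`.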

Examples

Since the Moonshine models can be used with a variety of different runtimes and applications, we've included code samples showing how to use them in different situations. The demo folder in this repository also has more information on many of them.

Live Captions

You can try the Moonshine ONNX models with live input from a microphone with the live captions demo.

Running in the Browser

You can try out the Moonshine ONNX models locally in a web browser with our HuggingFace space. We've included the source for this demo in this repository; this is a great starting place for those wishing to build web-based applications with Moonshine.

CTranslate2

The files for the CTranslate2 versions of Moonshine are available at huggingface.co/UsefulSensors/moonshine/tree/main/ctranslate2, but they require a pull request to be merged before they can be used with the mainline version of the framework. Until then, you should be able to try them with our branch, with this example script.

HuggingFace Transformers

Both models are also available on the HuggingFace hub and can be used with the transformers library, as follows:

```python
from transformers import AutoModelForSpeechSeq2Seq, AutoConfig, PreTrainedTokenizerFast

import torchaudio
import sys

# Load the audio file given on the command line and resample to 16 kHz if needed.
audio, sr = torchaudio.load(sys.argv[1])
if sr != 16000:
    audio = torchaudio.functional.resample(audio, sr, 16000)

# 'usefulsensors/moonshine-base' for the base model
model = AutoModelForSpeechSeq2Seq.from_pretrained('usefulsensors/moonshine-tiny', trust_remote_code=True)
tokenizer = PreTrainedTokenizerFast.from_pretrained('usefulsensors/moonshine-tiny')

tokens = model(audio)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```

TODO

[x] Live transcription demo
[x] ONNX model
[x] HF transformers support
[x] Demo Moonshine running in the browser
[ ] CTranslate2 support (complete but awaiting a merge)
[ ] MLX support
[ ] Fine-tuning code
[ ] HF transformers.js support
[ ] Long-form transcription demo

Known Issues

UserWarning: You are using a softmax over axis 3 of a tensor of shape torch.Size([1, 8, 1, 1])

This is a benign warning arising from Keras. For the first token in the decoding loop, the attention score matrix's shape is 1x1, which triggers this warning. You can safely ignore it, or run with python -W ignore to suppress the warning.
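If you would rather silence only this warning instead of everything `-W ignore` covers, Python's standard `warnings` module can filter by message. The sketch below assumes the warning text begins exactly as quoted above and is raised as a `UserWarning`. `ignore_softmax_shape_warning` is a hypothetical helper name.

```python
import warnings


def ignore_softmax_shape_warning() -> None:
    """Suppress only the softmax-shape UserWarning, leaving other warnings visible."""
    warnings.filterwarnings(
        "ignore",
        message=r"You are using a softmax over axis",  # matched at the start of the text
        category=UserWarning,
    )


# Call this once before running inference.
ignore_softmax_shape_warning()
```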

Citation

If you benefit from our work, please cite us:

@misc{jeffries2024moonshinespeechrecognitionlive,
      title={Moonshine: Speech Recognition for Live Transcription and Voice Commands},
      author={Nat Jeffries and Evan King and Manjunath Kudlur and Guy Nicholson and James Wang and Pete Warden},
      year={2024},
      eprint={2410.15608},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2410.15608},
}

Owner

Name: Antonio Reid
Login: AReid987
Kind: user
Location: Austin, Texas

Repositories: 9
Profile: https://github.com/AReid987

GitHub Events

Total

Issues event: 1
Push event: 4
Pull request event: 1
Create event: 5

Last Year

Issues event: 1
Push event: 4
Pull request event: 1
Create event: 5

Dependencies

.github/workflows/main.yml actions

actions/checkout v4 composite
actions/setup-python v5 composite
stefanzweifel/git-auto-commit-action v5 composite

demo/moonshine-web/package-lock.json npm

@esbuild/aix-ppc64 0.21.5 development
@esbuild/android-arm 0.21.5 development
@esbuild/android-arm64 0.21.5 development
@esbuild/android-x64 0.21.5 development
@esbuild/darwin-arm64 0.21.5 development
@esbuild/darwin-x64 0.21.5 development
@esbuild/freebsd-arm64 0.21.5 development
@esbuild/freebsd-x64 0.21.5 development
@esbuild/linux-arm 0.21.5 development
@esbuild/linux-arm64 0.21.5 development
@esbuild/linux-ia32 0.21.5 development
@esbuild/linux-loong64 0.21.5 development
@esbuild/linux-mips64el 0.21.5 development
@esbuild/linux-ppc64 0.21.5 development
@esbuild/linux-riscv64 0.21.5 development
@esbuild/linux-s390x 0.21.5 development
@esbuild/linux-x64 0.21.5 development
@esbuild/netbsd-x64 0.21.5 development
@esbuild/openbsd-x64 0.21.5 development
@esbuild/sunos-x64 0.21.5 development
@esbuild/win32-arm64 0.21.5 development
@esbuild/win32-ia32 0.21.5 development
@esbuild/win32-x64 0.21.5 development
@huggingface/hub 0.19.0 development
@huggingface/tasks 0.12.30 development
@rollup/rollup-android-arm-eabi 4.26.0 development
@rollup/rollup-android-arm64 4.26.0 development
@rollup/rollup-darwin-arm64 4.26.0 development
@rollup/rollup-darwin-x64 4.26.0 development
@rollup/rollup-freebsd-arm64 4.26.0 development
@rollup/rollup-freebsd-x64 4.26.0 development
@rollup/rollup-linux-arm-gnueabihf 4.26.0 development
@rollup/rollup-linux-arm-musleabihf 4.26.0 development
@rollup/rollup-linux-arm64-gnu 4.26.0 development
@rollup/rollup-linux-arm64-musl 4.26.0 development
@rollup/rollup-linux-powerpc64le-gnu 4.26.0 development
@rollup/rollup-linux-riscv64-gnu 4.26.0 development
@rollup/rollup-linux-s390x-gnu 4.26.0 development
@rollup/rollup-linux-x64-gnu 4.26.0 development
@rollup/rollup-linux-x64-musl 4.26.0 development
@rollup/rollup-win32-arm64-msvc 4.26.0 development
@rollup/rollup-win32-ia32-msvc 4.26.0 development
@rollup/rollup-win32-x64-msvc 4.26.0 development
@types/estree 1.0.6 development
esbuild 0.21.5 development
fsevents 2.3.3 development
nanoid 3.3.7 development
picocolors 1.1.1 development
postcss 8.4.49 development
rollup 4.26.0 development
source-map-js 1.2.1 development
vite 5.4.11 development
@protobufjs/aspromise 1.1.2
@protobufjs/base64 1.1.2
@protobufjs/codegen 2.0.4
@protobufjs/eventemitter 1.1.0
@protobufjs/fetch 1.1.0
@protobufjs/float 1.0.2
@protobufjs/inquire 1.1.0
@protobufjs/path 1.1.2
@protobufjs/pool 1.1.0
@protobufjs/utf8 1.1.0
@types/node 22.9.0
flatbuffers 1.12.0
guid-typescript 1.0.9
llama-tokenizer-js 1.2.2
long 5.2.3
onnxruntime-common 1.20.0
onnxruntime-web 1.20.0
platform 1.3.6
protobufjs 7.4.0
undici-types 6.19.8

demo/moonshine-web/package.json npm

@huggingface/hub ^0.19.0 development
vite ^5.4.11 development
llama-tokenizer-js ^1.2.2
onnxruntime-web ^1.20.0

demo/moonshine-onnx/requirements.txt pypi

silero_vad *
sounddevice *

moonshine-onnx/requirements.txt pypi

huggingface_hub *
librosa *
onnxruntime *
tokenizers >=0.19.0

requirements.txt pypi

einops ==0.8.0
keras ==3.6.0
librosa >=0.9.0
numba *
tokenizers >=0.19.0
torch >=2.4.0
