silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Keywords

onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition

Keywords from Contributors

cryptocurrency transformer

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: snakers4
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 102 MB

Statistics

Stars: 6,633
Watchers: 58
Forks: 613
Open Issues: 21
Releases: 8

Topics

onnx onnx-runtime onnxruntime pytorch speech speech-processing vad voice-activity-detection voice-commands voice-control voice-detection voice-recognition

Created about 5 years ago · Last pushed 6 months ago

Metadata Files

Readme License Code of conduct Citation

Silero VAD

Silero VAD - pre-trained enterprise-grade Voice Activity Detector (also see our STT models).

Real Time Example

https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-9be7-004c891dd481.mp4

Please note that the video loads only if you are logged in to your GitHub account.

Fast start

Dependencies

System requirements to run python examples on `x86-64` systems:

- `python 3.8+`;
- 1G+ RAM;
- A modern CPU with AVX, AVX2, AVX-512 or AMX instruction sets.

Dependencies:

- `torch>=1.12.0`;
- `torchaudio>=0.12.0` (for I/O only);
- `onnxruntime>=1.16.1` (for ONNX model usage).

Silero VAD uses the torchaudio library for audio I/O (`torchaudio.info`, `torchaudio.load`, and `torchaudio.save`), so a proper audio backend is required:

- Option №1 - [**FFmpeg**](https://www.ffmpeg.org/) backend. `conda install -c conda-forge 'ffmpeg<7'`;
- Option №2 - [**sox_io**](https://pypi.org/project/sox/) backend. `apt-get install sox`, TorchAudio is tested on libsox 14.4.2;
- Option №3 - [**soundfile**](https://pypi.org/project/soundfile/) backend. `pip install soundfile`.

If you are planning to run the VAD using solely the `onnx-runtime`, it will run on any other system architecture where onnx-runtime is [supported](https://onnxruntime.ai/getting-started). In this case please note that:

- You will have to implement the I/O;
- You will have to adapt the existing wrappers / examples / post-processing for your use-case.
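For the ONNX-only route, where the audio I/O has to be implemented by hand, the sketch below shows one possible way to read a file using only the Python standard library. It is not part of Silero VAD: the function name `read_wav_mono_16bit` is hypothetical, and it assumes the input is a mono, 16-bit PCM WAV file. Other formats would need a different reader or a conversion step first.

```python
import array
import sys
import wave


def read_wav_mono_16bit(path):
    """Read a mono 16-bit PCM WAV file (assumed format, see the text above).

    Returns (samples, sampling_rate), where samples is a list of floats
    scaled to the range [-1.0, 1.0).
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getnchannels() != 1:
            raise ValueError("expected a mono file")
        if wav_file.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        sampling_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is stored little-endian
    return [sample / 32768.0 for sample in pcm], sampling_rate
```

The returned list can then be converted to whatever tensor or array type the chosen runtime expects.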

Using pip: `pip install silero-vad`

```python3
from silero_vad import load_silero_vad, read_audio, get_speech_timestamps

model = load_silero_vad()
wav = read_audio('path_to_audio_file')
speech_timestamps = get_speech_timestamps(
    wav,
    model,
    return_seconds=True,  # Return speech timestamps in seconds (default is samples)
)
```

Using torch.hub:

```python3
import torch
torch.set_num_threads(1)

model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad', model='silero_vad')
(get_speech_timestamps, _, read_audio, _, _) = utils

wav = read_audio('path_to_audio_file')
speech_timestamps = get_speech_timestamps(
    wav,
    model,
    return_seconds=True,  # Return speech timestamps in seconds (default is samples)
)
```
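Once the timestamps are available, some light post-processing is often useful. The sketch below assumes the result is a list of dictionaries with `start` and `end` keys expressed in seconds, which is how it looks with `return_seconds=True`. The helper names, the `max_gap` parameter and the sample values are hypothetical and only serve as illustration.

```python
def total_speech_seconds(speech_timestamps):
    """Sum the duration of all detected speech segments."""
    return sum(seg["end"] - seg["start"] for seg in speech_timestamps)


def merge_close_segments(speech_timestamps, max_gap=0.3):
    """Merge segments separated by a pause of at most max_gap seconds."""
    merged = []
    for seg in sorted(speech_timestamps, key=lambda s: s["start"]):
        if merged and seg["start"] - merged[-1]["end"] <= max_gap:
            # Short pause: extend the previous segment instead of starting a new one.
            merged[-1]["end"] = max(merged[-1]["end"], seg["end"])
        else:
            merged.append(dict(seg))  # copy so the input is left untouched
    return merged


# Made-up values standing in for real detector output.
example = [
    {"start": 0.5, "end": 2.0},
    {"start": 2.25, "end": 3.0},
    {"start": 6.0, "end": 7.5},
]
print(total_speech_seconds(example))
print(merge_close_segments(example))
```

Merging short pauses is a design choice that suits use cases such as cutting audio into utterances. For precise silence statistics the raw segments are the better input.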

Key Features

Stellar accuracy

Silero VAD has excellent results on speech detection tasks.

Fast

One audio chunk (30+ ms) takes less than 1 ms to process on a single CPU thread. Batching or running on a GPU can also improve performance considerably. Under certain conditions ONNX may even run up to 4-5x faster.
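To put these figures in perspective, the arithmetic below turns them into a rough upper bound. It assumes exactly 30 ms per chunk and a full 1 ms of processing per chunk on one thread, and it ignores I/O and post-processing overhead, so it is an estimate rather than a benchmark. The function names are hypothetical.

```python
import math

CHUNK_MS = 30.0    # chunk duration taken from the text ("30+ ms")
PROCESS_MS = 1.0   # upper bound on per-chunk processing time from the text


def real_time_factor(process_ms=PROCESS_MS, chunk_ms=CHUNK_MS):
    """Processing time divided by audio duration (lower is faster)."""
    return process_ms / chunk_ms


def max_processing_seconds(audio_seconds, process_ms=PROCESS_MS, chunk_ms=CHUNK_MS):
    """Upper bound on single-thread processing time for a recording."""
    chunks = math.ceil(audio_seconds * 1000.0 / chunk_ms)
    return chunks * process_ms / 1000.0


print(round(real_time_factor(), 3))   # about 0.033
print(max_processing_seconds(3600))   # one hour of audio: at most 120.0 seconds
```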

Lightweight

The JIT model is around two megabytes in size.

General

Silero VAD was trained on huge corpora that include over 6000 languages, and it performs well on audio from different domains with various levels of background noise and recording quality.

Flexible sampling rate

Silero VAD supports 8000 Hz and 16000 Hz sampling rates.
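When feeding audio to the model chunk by chunk, it can help to reject unsupported rates early and to cut the signal into fixed-size windows. The following is a minimal sketch, not the library's own code: the helper names are hypothetical, and the default of 30 ms is only borrowed from the "30+ ms" figure mentioned above. The exact window length a given model version expects should be checked against the library's documentation.

```python
SUPPORTED_SAMPLING_RATES = (8000, 16000)  # rates named in the text above


def samples_per_chunk(sampling_rate, chunk_ms=30):
    """Number of samples in one chunk; chunk_ms=30 is an assumed default."""
    if sampling_rate not in SUPPORTED_SAMPLING_RATES:
        raise ValueError(
            f"unsupported sampling rate {sampling_rate}; "
            f"expected one of {SUPPORTED_SAMPLING_RATES}"
        )
    return sampling_rate * chunk_ms // 1000


def split_into_chunks(samples, sampling_rate, chunk_ms=30):
    """Split a sequence into full chunks, dropping a trailing partial chunk."""
    size = samples_per_chunk(sampling_rate, chunk_ms)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


print(samples_per_chunk(16000))  # 480 samples for 30 ms at 16000 Hz
print(samples_per_chunk(8000))   # 240 samples for 30 ms at 8000 Hz
```

Dropping the trailing partial chunk keeps every window the same length. Zero-padding the remainder is an equally reasonable alternative.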

Highly Portable

Silero VAD benefits from the rich ecosystems built around PyTorch and ONNX, and runs wherever these runtimes are available.

No Strings Attached

Published under a permissive license (MIT), Silero VAD has zero strings attached: no telemetry, no keys, no registration, no built-in expiration, and no vendor lock.

Typical Use Cases

Voice activity detection for IoT / edge / mobile use cases
Data cleaning and preparation, voice detection in general
Telephony and call-center automation, voice bots
Voice interfaces

Links

Get In Touch

Try our models, create an issue, start a discussion, join our Telegram chat, email us, read our news.

Please see our wiki for relevant information and email us directly.

Citations

@misc{Silero VAD,
  author = {Silero Team},
  title = {Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/snakers4/silero-vad}},
  commit = {insert_some_commit_here},
  email = {hello@silero.ai}
}

Examples and VAD-based Community Apps

Example of VAD ONNX Runtime model usage in C++
Voice activity detection for the browser using ONNX Runtime Web
Rust, Go, Java, C++, C# and other community examples

Owner

Name: Alexander Veysov
Login: snakers4
Kind: user

Repositories: 16
Profile: https://github.com/snakers4

It is by will alone I set my mind in motion.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Silero VAD"
authors:
  - family-names: "Silero Team"
    email: "hello@silero.ai"
type: software
repository-code: "https://github.com/snakers4/silero-vad"
license: MIT
abstract: "Pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier"
preferred-citation:
  type: software
  authors:
    - family-names: "Silero Team"
      email: "hello@silero.ai"
  title: "Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier"
  year: 2024
  publisher: "GitHub"
  journal: "GitHub repository"
  howpublished: "https://github.com/snakers4/silero-vad"

Committers

Last synced: 9 months ago

All Time

Total Commits: 292
Total Committers: 40
Avg Commits per committer: 7.3
Development Distribution Score (DDS): 0.562

Past Year

Commits: 72
Committers: 17
Avg Commits per committer: 4.235
Development Distribution Score (DDS): 0.556

Top Committers

Name	Email	Commits
adamnsandle	d**2@g**m	128
Alexander Veysov	a**v@g**m	79
yuGAN6	7****6	8
gianpaolo bontempo	b**x@h**t	7
Kai Karren	m**l@k**e	7
Nathan Lee	j**2@g**m	6
Ziyuan Wang	z**k@g**m	6
sontref	s**f@g**m	5
streamer45	c**1@g**m	3
bygreencn	b**n@g**m	3
Yair Lifshitz	y**r@l**o	3
Mohamed Bouaziz	m**z@z**i	3
Antonio Bevilacqua	b**y@g**m	2
EarningsCall	9****l	2
Gabriel Ziegler	g**3@g**m	2
Ojuro Yokoyama	o**a@g**m	2
Saenyakorn Siangsanoh	s**i@g**m	2
Stefan Miletic	s**c@g**m	2
Alexander Kalashnikov	a**v@o**u	1
Abin Thomas	a**e@g**m	1
きわみざむらい	2****i	1
yuguanqin	y**n@f**m	1
rumbleFTW	0**h@g**m	1
qwbarch	q**h@g**m	1
nick.ganju	n**u@g**m	1
mhThomsen	m**4@g**m	1
kh	c**3@g**m	1
kafan1986	d**6@g**m	1
jiqiang.fu	j**u@r**m	1
VvvvvGH	c**p@g**m	1
and 10 more...
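
The all-time figures above can be cross-checked with a few lines of arithmetic. The sketch assumes that the Development Distribution Score is defined as one minus the share of commits made by the most active committer. That definition is not stated on this page, but it reproduces the listed value from the numbers shown. The function name is hypothetical.

```python
def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 - (commits by the top committer / all commits)."""
    return 1.0 - top_committer_commits / total_commits


TOTAL_COMMITS = 292          # "Total Commits" (All Time)
TOTAL_COMMITTERS = 40        # "Total Committers" (All Time)
TOP_COMMITTER_COMMITS = 128  # adamnsandle, first row of Top Committers

print(round(TOTAL_COMMITS / TOTAL_COMMITTERS, 1))
print(round(development_distribution_score(TOP_COMMITTER_COMMITS, TOTAL_COMMITS), 3))
```

A score near 0 would mean that a single person wrote almost every commit, while values closer to 1 indicate that work is spread across many committers.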

Committer Domains (Top 20 + Academic)

qq.com: 1
viderahealth.com: 1
owlsome.tech: 1
163.com: 1
yahoo.com.sg: 1
neqindi.cz: 1
rokid.com: 1
foxmail.com: 1
optimalcity.ru: 1
zaion.ai: 1
lifshitz.io: 1
kaikarren.de: 1
hotmail.it: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 184
Total pull requests: 84
Average time to close issues: about 1 month
Average time to close pull requests: 7 days
Total issue authors: 161
Total pull request authors: 42
Average comments per issue: 2.37
Average comments per pull request: 0.71
Merged pull requests: 76
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 58
Pull requests: 28
Average time to close issues: 5 days
Average time to close pull requests: 1 day
Issue authors: 54
Pull request authors: 12
Average comments per issue: 1.05
Average comments per pull request: 0.43
Merged pull requests: 26
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

snakers4 (4)
NathanJHLee (4)
JJun-Guo (4)
Simon-chai (4)
jifashen (3)
EarningsCall (2)
wl-junlin (2)
TechInterMezzo (2)
mukundt (2)
zhuhao528 (2)
forthcoming (2)
eliran-fm (2)
hunzlausman (2)
jhdeov (2)
computervisionlearner (2)

Pull Request Authors

adamnsandle (49)
snakers4 (7)
streamer45 (5)
b3by (4)
yairl (2)
qwbarch (2)
akmitrich (2)
NathanJHLee (2)
gau-nernst (2)
bygreencn (2)
ZuoFuhong (2)
EarningsCall (2)
abinthomasonline (2)
Sontref (2)
sobomax (2)

Top Labels

Issue Labels

help wanted (107)
bug (43)
enhancement (23)
v5 (6)
documentation (1)
examples (1)

Pull Request Labels

Packages

Total packages: 2
Total downloads:
- pypi 319,765 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 0
(may contain duplicates)
Total versions: 20
Total maintainers: 2

proxy.golang.org: github.com/snakers4/silero-vad

Documentation: https://pkg.go.dev/github.com/snakers4/silero-vad#section-documentation
License: mit
Latest release: v5.1.2+incompatible
published over 1 year ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 6.1%

Average: 6.3%

Dependent repos count: 6.5%

Last synced: 6 months ago

pypi.org: silero-vad

Voice Activity Detector (VAD) by Silero

Homepage: https://github.com/snakers4/silero-vad
Documentation: https://silero-vad.readthedocs.io/
License: MIT License
Latest release: 5.1.2
published over 1 year ago

Versions: 18
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 319,765 Last month

Rankings

Dependent packages count: 9.5%

Average: 36.0%

Dependent repos count: 62.5%

Maintainers (2)

snakers4 adamnsandle

Last synced: 6 months ago

silero-vad

Science Score: 44.0%
