Recent Releases of silero-vad
silero-vad - Update the PIP package
A tag to upload the new pip package.
- Python
Published by snakers4 over 1 year ago
silero-vad - Minor fixes
What's Changed
- Fix Golang example by @streamer45 in https://github.com/snakers4/silero-vad/pull/496
- fix: rust example for v5 checkpoint by @rumbleFTW in https://github.com/snakers4/silero-vad/pull/497
- VadIterator first chunk bug fix by @adamnsandle in https://github.com/snakers4/silero-vad/pull/505
- Add java example for wav file & support V5 model by @yuguanqin in https://github.com/snakers4/silero-vad/pull/506
- add csharp example by @nganju98 in https://github.com/snakers4/silero-vad/pull/507
- downgrade onnxruntime dependency by @adamnsandle in https://github.com/snakers4/silero-vad/pull/521
- Code for tuning by @adamnsandle in https://github.com/snakers4/silero-vad/pull/526
- add neg_threshold parameter explicitly by @adamnsandle in https://github.com/snakers4/silero-vad/pull/528
- Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/529
- Fixed the pyaudio example can not run issue. by @gengyuchao in https://github.com/snakers4/silero-vad/pull/539
- Update README.md by @adamnsandle in https://github.com/snakers4/silero-vad/pull/540
- Update README.md by @adamnsandle in https://github.com/snakers4/silero-vad/pull/541
- Update README.md by @snakers4 in https://github.com/snakers4/silero-vad/pull/542
- Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/543
- Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/549
New Contributors
- @rumbleFTW made their first contribution in https://github.com/snakers4/silero-vad/pull/497
- @yuguanqin made their first contribution in https://github.com/snakers4/silero-vad/pull/506
- @nganju98 made their first contribution in https://github.com/snakers4/silero-vad/pull/507
- @gengyuchao made their first contribution in https://github.com/snakers4/silero-vad/pull/539
Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.1...v5.1.1
- Python
Published by snakers4 over 1 year ago
silero-vad - v5.1
Experimental PIP package release
- Experimental pip-package release;
- Community PRs to update the examples;
What's Changed
- Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/481
- Update microphoneandwebRTC_integration.py by @eltociear in https://github.com/snakers4/silero-vad/pull/475
- cpp example by @filtercodes in https://github.com/snakers4/silero-vad/pull/482
- Update Golang example to support model v5 by @streamer45 in https://github.com/snakers4/silero-vad/pull/489
- Create python-publish.yml by @adamnsandle in https://github.com/snakers4/silero-vad/pull/492
- Adamnsandle by @adamnsandle in https://github.com/snakers4/silero-vad/pull/493
New Contributors
- @eltociear made their first contribution in https://github.com/snakers4/silero-vad/pull/475
- @filtercodes made their first contribution in https://github.com/snakers4/silero-vad/pull/482
Full Changelog: https://github.com/snakers4/silero-vad/compare/v5.0...v5.1
- Python
Published by snakers4 almost 2 years ago
silero-vad - Finally, V5 is here, 3x faster, supporting 6000+ languages!
Performance and Model Size
- 3x faster inference for TorchScript, 10% faster inference for ONNX;
- Now TorchScript is as fast as ONNX;
- Model size is 2x larger, 2MB vs. 1MB;
Quality
- The VAD supports more than 6,000 languages now;
- Significantly more robust on noisy data;
- Overall 5-7% quality increase on clean data;
- Quality difference for 8 kHz and 16 kHz is negligible now;
- Quality difference for different window sizes is negligible => window size was deprecated;
- Added benchmarks on 9 unique datasets (2 private) and one holistic multi-domain dataset;
Changes and deprecations
- ONNX opset 16;
- window_size_samples is deprecated - now the VAD only works with fixed size window;
- VAD now works with 8 kHz and 16 kHz sample rates, only with fixed 256 and 512 sample windows respectively;
- Slightly changed internal logic, now some context (part of previous chunk) is passed along with the current chunk;
- Sample rates that are a multiple of 16 kHz are still supported;
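As a rough illustration of the fixed-window requirement above, the sketch below splits a sequence of samples into 256-sample windows at 8 kHz or 512-sample windows at 16 kHz. Only the window sizes come from these notes. The names `WINDOW_SAMPLES` and `fixed_windows` are hypothetical, and zero-padding the last window is an assumption made for this example rather than documented behavior.

```python
# Window sizes taken from the release notes: 256 samples at 8 kHz, 512 at 16 kHz.
# WINDOW_SAMPLES and fixed_windows are hypothetical names used for illustration.
WINDOW_SAMPLES = {8000: 256, 16000: 512}


def fixed_windows(samples, sampling_rate):
    """Yield consecutive fixed-size windows from a sequence of float samples."""
    if sampling_rate not in WINDOW_SAMPLES:
        raise ValueError(f"unsupported sampling rate: {sampling_rate}")
    size = WINDOW_SAMPLES[sampling_rate]
    for start in range(0, len(samples), size):
        window = list(samples[start:start + size])
        # Assumption: the final, shorter window is zero-padded to full size.
        window.extend([0.0] * (size - len(window)))
        yield window


one_second = [0.0] * 16000  # one second of silence at 16 kHz
windows = list(fixed_windows(one_second, 16000))
print(len(windows), len(windows[-1]))  # 32 512
```

Each window would then be passed to the model one at a time. The context handling mentioned above is internal to the VAD and is not reproduced here.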
- Python
Published by snakers4 almost 2 years ago
silero-vad - New V4 VAD Released
New V4 VAD Released
- Improved quality
- Improved performance
- Both 8k and 16k sampling rates are now supported by the ONNX model
- Batching is now supported by the ONNX model
- Added audio_forward method for one-line processing of one or several audio files without postprocessing
- Hotfix applied - wrong model was uploaded
- Minor hotfix re. PyTorch version
- Python
Published by snakers4 over 3 years ago
silero-vad - New V3 ONNX VAD Released
We finally were able to port a model to ONNX:
- Compact model (~100k params);
- Both PyTorch and ONNX models are not quantized;
- Same quality model as the latest best PyTorch release;
- Only 16 kHz is available for now (ONNX has some issues, with cryptic errors, around if-statements and / or tracing vs. scripting);
- In our tests, on short audios (chunks) ONNX is 2-3x faster than PyTorch (this is mitigated with larger batches or long audios);
- Audio examples and non-core models moved out of the repo to save space;
- Python
Published by snakers4 over 4 years ago
silero-vad - New V3 Silero VAD is Already Here
Main changes
- One VAD to rule them all! New model includes the functionality of the previous ones with improved quality and speed!
- Flexible sampling rate, 8000 Hz and 16000 Hz are supported;
- Flexible chunk size, minimum chunk size is just 30 milliseconds!
- 100k parameters;
- GPU and batching are supported;
- Radically simplified examples;
Migration
Please see the new examples.
The new get_speech_timestamps is a simplified, unified replacement for the deprecated get_speech_ts and get_speech_ts_adaptive methods.
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
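A minimal sketch of post-processing the result follows. It assumes the call returns a list of dicts with `start` and `end` keys measured in samples; that return shape is an assumption here, not something stated in these notes. The helper name `timestamps_to_seconds` is hypothetical, and the input values are made up for illustration rather than taken from real model output.

```python
def timestamps_to_seconds(speech_timestamps, sampling_rate=16000):
    """Convert sample-based speech segments to seconds (hypothetical helper)."""
    return [
        {
            "start": round(ts["start"] / sampling_rate, 3),
            "end": round(ts["end"] / sampling_rate, 3),
        }
        for ts in speech_timestamps
    ]


# Illustrative input only; these are not real detections.
example = [{"start": 8000, "end": 24000}, {"start": 40000, "end": 56000}]
print(timestamps_to_seconds(example))
# [{'start': 0.5, 'end': 1.5}, {'start': 2.5, 'end': 3.5}]
```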
The new VADIterator class serves as an example for streaming tasks, replacing the deprecated VADiterator and VADiteratorAdaptive.
```
vad_iterator = VADIterator(model)
window_size_samples = 1536

for i in range(0, len(wav), window_size_samples):
    speech_dict = vad_iterator(wav[i: i + window_size_samples], return_seconds=True)
    if speech_dict:
        print(speech_dict, end=' ')
vad_iterator.reset_states()  # reset model states after each audio
```
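To relate the 30-millisecond minimum chunk size to the 1536-sample window used in the streaming example, the conversion between milliseconds and samples can be sketched as below. The helper names `ms_to_samples` and `samples_to_ms` are hypothetical and not part of the package.

```python
def ms_to_samples(duration_ms, sampling_rate):
    """Number of whole samples covered by duration_ms at the given rate."""
    return duration_ms * sampling_rate // 1000


def samples_to_ms(num_samples, sampling_rate):
    """Duration in milliseconds of num_samples at the given rate."""
    return num_samples * 1000 / sampling_rate


print(ms_to_samples(30, 16000))    # 480
print(ms_to_samples(30, 8000))     # 240
print(samples_to_ms(1536, 16000))  # 96.0
```

Under this arithmetic, a 30 ms chunk is 480 samples at 16000 Hz, and a 1536-sample window at 16000 Hz covers 96 ms of audio.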
- Python
Published by snakers4 over 4 years ago
silero-vad - V2 Legacy Release for History
This is a technical tag so that users who do not want to use newer models can simply check out this tag.
- Python
Published by snakers4 over 4 years ago