werpy

🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.

https://github.com/analyticsinmotion/werpy

Keywords

asr asr-evaluation automatic-speech-recognition levenshtein-distance metrics nlp python python-package speech-to-text stt stt-benchmark wer werpy word-error-rate

Keywords from Contributors

interactive mesh interpretability profiles sequences generic projection standardization optim embedded

Last synced: 6 months ago

Repository


Basic Info

Host: GitHub
Owner: analyticsinmotion
License: bsd-3-clause
Language: Python
Default Branch: main
Homepage: https://werpy.readthedocs.io/en/latest/
Size: 548 KB

Statistics

Stars: 16
Watchers: 3
Forks: 4
Open Issues: 3
Releases: 17


Created almost 3 years ago · Last pushed 7 months ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation Security

Word Error Rate for Python


What is werpy?

werpy is an ultra-fast, lightweight Python package for calculating and analyzing Word Error Rate (WER) between two sets of text.

Built for flexibility and ease of use, it supports multiple input types such as strings, lists, and NumPy arrays. This makes it ideal for everything from quick experiments to large-scale evaluations.

With speed in mind at every scale, werpy harnesses the efficiency of C optimizations to accelerate processing, delivering ultra-fast results from small datasets to enterprise-level workloads.

It also comes packed with powerful features, including:
- 🔤 Built-in text normalization to handle data inconsistencies
- ⚙️ Customizable error penalties for insertions, deletions, and substitutions
- 📋 A detailed summary output for in-depth error analysis

werpy is a quality-focused package, built to production-grade standards for reliability and robustness.
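For readers new to the metric, WER is commonly defined as the word-level Levenshtein distance between reference and hypothesis, divided by the number of reference words. The following is a minimal plain-Python sketch of that definition. It is an illustration only, not werpy's implementation, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref_words)


# One deleted word out of four reference words
print(word_error_rate("i love cold pizza", "i love pizza"))  # 0.25
```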

Functions available in werpy

The following table provides an overview of the functions that can be used in werpy.

| Function | Description |
| ------------- | ------------- |
| normalize(text) | Preprocess input text by removing punctuation, duplicated spaces, and leading/trailing blanks, and converting all words to lowercase. |
| wer(reference, hypothesis) | Calculate the overall Word Error Rate for the entire reference and hypothesis texts. |
| wers(reference, hypothesis) | Calculate a list of Word Error Rates, one for each pair of reference and hypothesis texts. |
| werp(reference, hypothesis) | Calculate a weighted Word Error Rate for the entire reference and hypothesis texts. |
| werps(reference, hypothesis) | Calculate a list of weighted Word Error Rates, one for each pair of reference and hypothesis texts. |
| summary(reference, hypothesis) | Provide a comprehensive breakdown of the calculated results, including the WER, the Levenshtein distance, and all insertion, deletion, and substitution errors. |
| summaryp(reference, hypothesis) | Provide the same breakdown as summary (WER, Levenshtein distance, and insertion, deletion, and substitution errors), plus the weighted WER. |

Installation

You can install the latest werpy release with Python's pip package manager:

```shell
# Install werpy from PyPI
pip install werpy
```

Usage

Import the werpy package

Python Code:

```python
import werpy
```

Example 1 - Normalize a list of text

Python Code:

```python
input_data = ["It's very popular in Antarctica.", "The Sugar Bear character"]
reference = werpy.normalize(input_data)
print(reference)
```

Results Output: ['its very popular in antarctica', 'the sugar bear character']
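The normalization steps described for `normalize` (strip punctuation, collapse extra spaces, trim, lowercase) can be approximated in plain Python. This is a sketch under the assumption that only ASCII punctuation needs removing; werpy's own handling may differ, and `normalize_text` is a hypothetical name.

```python
import string


def normalize_text(text: str) -> str:
    # Remove ASCII punctuation characters (assumption: werpy may cover more).
    no_punct = text.translate(str.maketrans("", "", string.punctuation))
    # Lowercase, then split/join to trim the ends and collapse repeated spaces.
    return " ".join(no_punct.lower().split())


samples = ["It's very popular in Antarctica.", "The  Sugar Bear character "]
print([normalize_text(s) for s in samples])
# ['its very popular in antarctica', 'the sugar bear character']
```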

Example 2 - Calculate the overall Word Error Rate on a set of strings

Python Code:

```python
wer = werpy.wer('i love cold pizza', 'i love pizza')
print(wer)
```

Results Output: 0.25

Example 3 - Calculate the overall Word Error Rate on a set of lists

Python Code:

```python
ref = ['i love cold pizza', 'the sugar bear character was popular']
hyp = ['i love pizza', 'the sugar bare character was popular']
wer = werpy.wer(ref, hyp)
print(wer)
```

Results Output: 0.2
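The 0.2 result is consistent with pooling errors across all pairs (2 errors over 10 reference words) rather than averaging the per-pair rates. The sketch below reproduces both numbers in plain Python so the difference is visible. It is an illustration inferred from the output shown, not a statement about werpy's internals, and `word_edit_distance` is a hypothetical helper.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word lists, using two rolling rows."""
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


ref = ['i love cold pizza', 'the sugar bear character was popular']
hyp = ['i love pizza', 'the sugar bare character was popular']
pairs = [(r.split(), h.split()) for r, h in zip(ref, hyp)]

# Pooled: total errors divided by total reference words.
total_errors = sum(word_edit_distance(r, h) for r, h in pairs)
total_ref_words = sum(len(r) for r, _ in pairs)
pooled = total_errors / total_ref_words

# Alternative: average of the per-pair rates, which weights short texts more.
per_pair = [word_edit_distance(r, h) / len(r) for r, h in pairs]
mean_of_pairs = sum(per_pair) / len(per_pair)

print(pooled)         # 0.2
print(mean_of_pairs)  # slightly higher, since the short first pair counts equally
```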

Example 4 - Calculate the Word Error Rates for each set of texts

Python Code:

```python
ref = ['no one else could claim that', 'she cited multiple reasons why']
hyp = ['no one else could claim that', 'she sighted multiple reasons why']
wers = werpy.wers(ref, hyp)
print(wers)
```

Results Output: [0.0, 0.2]

Example 5 - Calculate the weighted Word Error Rates for the entire set of text

Python Code:

```python
ref = ['it was beautiful and sunny today']
hyp = ['it was a beautiful and sunny day']
werp = werpy.werp(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(werp)
```

Results Output: 0.25
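To see where 0.25 comes from: aligning the two sentences by hand gives one insertion ("a") and one substitution ("today" to "day") against six reference words. Assuming the weighted rate is the weighted sum of error counts divided by the reference word count (an assumption that matches the output shown, not a confirmed description of werpy), the arithmetic can be sketched as follows. `weighted_wer` is a hypothetical helper.

```python
def weighted_wer(insertions, deletions, substitutions, ref_word_count,
                 insertions_weight=1.0, deletions_weight=1.0,
                 substitutions_weight=1.0):
    """Weighted error count divided by the number of reference words."""
    weighted_errors = (insertions * insertions_weight
                       + deletions * deletions_weight
                       + substitutions * substitutions_weight)
    return weighted_errors / ref_word_count


# Hand-counted edits for Example 5: 1 insertion, 0 deletions, 1 substitution
result = weighted_wer(insertions=1, deletions=0, substitutions=1,
                      ref_word_count=6,
                      insertions_weight=0.5, deletions_weight=0.5,
                      substitutions_weight=1)
print(result)  # (0.5 * 1 + 0.5 * 0 + 1 * 1) / 6 = 0.25
```

With all three weights left at 1.0, the same function reduces to the unweighted rate (2 / 6 for this pair).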

Example 6 - Calculate a list of weighted Word Error Rates for each of the reference and hypothesis texts

Python Code:

```python
ref = ['it blocked sight lines of central park',
       'her father was an alderman in the city government']
hyp = ['it blocked sightlines of central park',
       'our father was an elder man in the city government']
werps = werpy.werps(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(werps)
```

Results Output: [0.21428571428571427, 0.2777777777777778]

Example 7 - Provide a complete breakdown of the Word Error Rate calculations for each of the reference and hypothesis texts

Python Code:

```python
ref = ['it is consumed domestically and exported to other countries',
       'rufino street in makati right inside the makati central business district',
       'its estuary is considered to have abnormally low rates of dissolved oxygen',
       'he later cited his first wife anita as the inspiration for the song',
       'no one else could claim that']
hyp = ['it is consumed domestically and exported to other countries',
       'rofino street in mccauti right inside the macasi central business district',
       'its estiary is considered to have a normally low rates of dissolved oxygen',
       'he later sighted his first wife anita as the inspiration for the song',
       'no one else could claim that']
summary = werpy.summary(ref, hyp)
print(summary)
```

Results Output:

[Image: example werpy.summary results showing the Word Error Rate breakdown]
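The kind of per-pair breakdown described for `summary` (WER, Levenshtein distance, and insertion, deletion, and substitution counts) can be illustrated with a plain-Python sketch that fills the full distance matrix and walks it backwards. The function name `count_edits` and the dictionary keys are hypothetical and do not reflect werpy's actual output columns. When several alignments are equally cheap, this sketch prefers substitutions, which is a design choice rather than a standard.

```python
def count_edits(reference: str, hypothesis: str) -> dict:
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    rows, cols = len(ref) + 1, len(hyp) + 1

    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution

    # Walk back from the bottom-right corner, labelling each step.
    counts = {"insertions": 0, "deletions": 0, "substitutions": 0}
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1                          # words match
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            counts["substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1

    counts["ld"] = dist[-1][-1]
    counts["wer"] = dist[-1][-1] / len(ref)
    return counts


breakdown = count_edits(
    'its estuary is considered to have abnormally low rates of dissolved oxygen',
    'its estiary is considered to have a normally low rates of dissolved oxygen')
print(breakdown)
```

For this pair the sketch reports two substitutions and one insertion against twelve reference words.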

Example 8 - Provide a complete breakdown of the Weighted Word Error Rate for each of the input texts

Python Code:

```python
ref = ['the tower caused minor discontent because it blocked sight lines of central park',
       'her father was an alderman in the city government',
       'he was commonly referred to as the blacksmith of ballinalee']
hyp = ['the tower caused minor discontent because it blocked sightlines of central park',
       'our father was an alderman in the city government',
       'he was commonly referred to as the blacksmith of balen alley']
weighted_summary = werpy.summaryp(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(weighted_summary)
```

Results Output:

[Image: example werpy.summaryp results showing the weighted Word Error Rate breakdown]

Dependencies

NumPy - Provides an assortment of routines for fast operations on arrays
Pandas - Powerful data structures for data analysis, time series, and statistics

Licensing

werpy is released under the terms of the BSD 3-Clause License. Please refer to the LICENSE file for full details.

This project uses standard scientific Python libraries including NumPy and Pandas. For license details, please refer to their official repositories:

NumPy - https://github.com/numpy/numpy
Pandas - https://github.com/pandas-dev/pandas

Owner

Name: Analytics in Motion
Login: analyticsinmotion
Kind: organization
Email: pi@analyticsinmotion.com

Website: https://www.analyticsinmotion.com
Twitter: analyticsmotion
Repositories: 3
Profile: https://github.com/analyticsinmotion

Analytics in Motion ❤️ Open Source Programming, Data Science & AI/ML Projects

Citation (CITATION.cff)

cff-version: 1.2.0
message: 'If you use this software, please cite it as below.'
authors:
- family-names: "Armstrong"
  given-names: "Ross"
title: 'werpy - Word Error Rate for Python'
abstract: "A powerful Python package that rapidly calculates and analyzes the Word Error Rate (WER)."
license: BSD-3-Clause
license-url: "https://github.com/analyticsinmotion/werpy/blob/main/LICENSE"
repository-code: "https://github.com/analyticsinmotion/werpy"
keywords:
  - word error rate
  - wer
  - levenshtein distance
  - speech recognition
  - speech-to-text
  - stt
  - metrics
  - natural language processing
  - data science
  - python
  - python package
type: software
url: "https://github.com/analyticsinmotion/werpy"

GitHub Events

Total

Release event: 4
Watch event: 5
Delete event: 6
Issue comment event: 10
Push event: 150
Pull request event: 13
Create event: 10

Last Year

Release event: 4
Watch event: 5
Delete event: 6
Issue comment event: 10
Push event: 150
Pull request event: 13
Create event: 10

Committers

Last synced: 9 months ago

All Time

Total Commits: 463
Total Committers: 3
Avg Commits per committer: 154.333
Development Distribution Score (DDS): 0.035

Past Year

Commits: 175
Committers: 3
Avg Commits per committer: 58.333
Development Distribution Score (DDS): 0.057

Top Committers

Name	Email	Commits
Ross Armstrong	5****g	447
dependabot[bot]	4****]	13
doubleinfinity	r**g@z**m	3

Committer Domains (Top 20 + Academic)

zeusdb.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 0
Total pull requests: 29
Average time to close issues: N/A
Average time to close pull requests: about 1 month
Total issue authors: 0
Total pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 1.28
Merged pull requests: 13
Bot issues: 0
Bot pull requests: 27

Past Year

Issues: 0
Pull requests: 17
Average time to close issues: N/A
Average time to close pull requests: about 2 months
Issue authors: 0
Pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 1.29
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 15


Top Authors

Issue Authors

Pull Request Authors

dependabot[bot] (51)
fossabot (2)
LouisJalouzot (2)

Top Labels

Issue Labels

Pull Request Labels

dependencies (51) python (15)

Packages

Total packages: 1
Total downloads:
- pypi 4,166 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 17
Total maintainers: 1

pypi.org: werpy

A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).

Documentation: https://werpy.readthedocs.io/
License: BSD 3-Clause License

Copyright (c) 2023-2025, Analytics in Motion

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Latest release: 3.1.0
published 10 months ago

Versions: 17
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 4,166 Last month

Rankings

Downloads: 5.6%

Dependent packages count: 10.1%

Average: 15.0%

Stargazers count: 18.5%

Forks count: 19.1%

Dependent repos count: 21.6%

Maintainers (1)

analyticsinmotion

Last synced: 6 months ago

werpy

Science Score: 44.0%
