werpy
🐍📦 Ultra-fast Python package for calculating and analyzing the Word Error Rate (WER). Built for the scalable evaluation of speech and transcription accuracy.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: analyticsinmotion
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://werpy.readthedocs.io/en/latest/
- Size: 548 KB
Statistics
- Stars: 16
- Watchers: 3
- Forks: 4
- Open Issues: 3
- Releases: 17
Metadata Files
README.md

Word Error Rate for Python
What is werpy?
werpy is an ultra-fast, lightweight Python package for calculating and analyzing Word Error Rate (WER) between two sets of text.
Built for flexibility and ease of use, it supports multiple input types such as strings, lists, and NumPy arrays. This makes it ideal for everything from quick experiments to large-scale evaluations.
werpy uses C optimizations to accelerate processing, so it stays fast from small datasets to enterprise-level workloads.
It also comes packed with powerful features, including:
- 🔤 Built-in text normalization to handle data inconsistencies
- ⚙️ Customizable error penalties for insertions, deletions, and substitutions
- 📋 A detailed summary output for in-depth error analysis
werpy is a quality-focused package, built to production-grade standards for reliability and robustness.
Functions available in werpy
The following table provides an overview of the functions that can be used in werpy.
| Function | Description |
| ------------- | ------------- |
| normalize(text) | Preprocess input text to remove punctuation, remove duplicated spaces, leading/trailing blanks and convert all words to lowercase. |
| wer(reference, hypothesis) | Calculate the overall Word Error Rate for the entire reference and hypothesis texts. |
| wers(reference, hypothesis) | Calculates a list of the Word Error Rates for each of the reference and hypothesis texts. |
| werp(reference, hypothesis) | Calculates a weighted Word Error Rate for the entire reference and hypothesis texts. |
| werps(reference, hypothesis) | Calculates a list of weighted Word Error Rates for each of the reference and hypothesis texts. |
| summary(reference, hypothesis) | Provides a comprehensive breakdown of the calculated results including the WER, Levenshtein Distance and all the insertion, deletion and substitution errors. |
| summaryp(reference, hypothesis) | Delivers an in-depth breakdown of the results, covering metrics like WER, Levenshtein Distance, and a detailed account of insertion, deletion, and substitution errors, inclusive of the weighted WER. |
Installation
You can install the latest werpy release with Python's pip package manager:
```bash
# Install werpy from PyPI
pip install werpy
```
Usage
Import the werpy package
Python Code:
python
import werpy
Example 1 - Normalize a list of text
Python Code:
python
input_data = ["It's very popular in Antarctica.","The Sugar Bear character"]
reference = werpy.normalize(input_data)
print(reference)
Results Output:
['its very popular in antarctica', 'the sugar bear character']
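To make the normalization steps concrete, here is a minimal standard-library sketch of the behaviour described for `normalize` (remove punctuation, collapse repeated spaces, trim blanks, lowercase). It is an illustration only, not werpy's implementation; the name `normalize_sketch` is hypothetical, and it assumes "punctuation" means the ASCII characters in `string.punctuation`.

```python
import re
import string


def normalize_sketch(texts):
    """Lowercase, strip ASCII punctuation, and collapse whitespace in each string."""
    # Translation table that deletes every ASCII punctuation character (assumption).
    drop_punctuation = str.maketrans("", "", string.punctuation)
    cleaned = []
    for text in texts:
        text = text.translate(drop_punctuation).lower()
        # Collapse runs of whitespace and trim leading/trailing blanks.
        cleaned.append(re.sub(r"\s+", " ", text).strip())
    return cleaned


print(normalize_sketch(["It's very popular in Antarctica.", "The  Sugar Bear character "]))
# ['its very popular in antarctica', 'the sugar bear character']
```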
Example 2 - Calculate the overall Word Error Rate on a set of strings
Python Code:
python
wer = werpy.wer('i love cold pizza', 'i love pizza')
print(wer)
Results Output:
0.25
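The 0.25 above is one missing word ("cold") out of four reference words. As a rough illustration of the standard definition (word-level Levenshtein distance divided by the number of reference words), the calculation can be sketched in plain Python. The function names are hypothetical and this is not werpy's C-optimized code.

```python
def word_edit_distance(ref_words, hyp_words):
    """Minimum number of word insertions, deletions, and substitutions."""
    # previous[j] is the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missing)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def wer_sketch(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    # Note: an empty reference would divide by zero; the sketch does not handle it.
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


print(wer_sketch("i love cold pizza", "i love pizza"))  # 0.25
```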
Example 3 - Calculate the overall Word Error Rate on a set of lists
Python Code:
python
ref = ['i love cold pizza','the sugar bear character was popular']
hyp = ['i love pizza','the sugar bare character was popular']
wer = werpy.wer(ref, hyp)
print(wer)
Results Output:
0.2
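The result of 0.2 is consistent with pooling the errors over all pairs (2 errors across 10 reference words) rather than averaging the per-pair rates. The sketch below shows the difference under that assumption; it is inferred from the output above, not taken from werpy's source, and `edit_distance` is a hypothetical helper.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance using a rolling row."""
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


ref = ['i love cold pizza', 'the sugar bear character was popular']
hyp = ['i love pizza', 'the sugar bare character was popular']

errors = [edit_distance(r.split(), h.split()) for r, h in zip(ref, hyp)]
lengths = [len(r.split()) for r in ref]

# Pooled rate: total errors over total reference words.
pooled = sum(errors) / sum(lengths)
# Mean of per-pair rates, shown only for contrast.
mean_of_rates = sum(e / n for e, n in zip(errors, lengths)) / len(ref)

print(errors, lengths)          # [1, 1] [4, 6]
print(pooled)                   # 0.2
print(round(mean_of_rates, 3))  # 0.208
```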
Example 4 - Calculate the Word Error Rates for each set of texts
Python Code:
python
ref = ['no one else could claim that','she cited multiple reasons why']
hyp = ['no one else could claim that','she sighted multiple reasons why']
wers = werpy.wers(ref, hyp)
print(wers)
Results Output:
[0.0, 0.2]
Example 5 - Calculate the weighted Word Error Rates for the entire set of text
Python Code:
python
ref = ['it was beautiful and sunny today']
hyp = ['it was a beautiful and sunny day']
werp = werpy.werp(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(werp)
Results Output:
0.25
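The arithmetic behind this result can be checked by hand. The hypothesis contains one insertion ("a") and one substitution ("today" became "day") against six reference words. Assuming the weighted rate is the weighted sum of error counts divided by the reference length, which matches the output above but is not confirmed from werpy's source, a sketch with a hypothetical helper looks like this:

```python
def weighted_wer_from_counts(insertions, deletions, substitutions, reference_length,
                             insertions_weight=1.0, deletions_weight=1.0,
                             substitutions_weight=1.0):
    """Weighted error total divided by the number of reference words (assumed formula)."""
    weighted_errors = (
        insertions * insertions_weight
        + deletions * deletions_weight
        + substitutions * substitutions_weight
    )
    return weighted_errors / reference_length


# Example 5 counts, identified by hand: 1 insertion, 0 deletions, 1 substitution, 6 words.
print(weighted_wer_from_counts(1, 0, 1, 6,
                               insertions_weight=0.5,
                               deletions_weight=0.5,
                               substitutions_weight=1))  # 0.25
```

With all three weights left at 1.0 the same counts give 2 / 6, the unweighted rate.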
Example 6 - Calculate a list of weighted Word Error Rates for each of the reference and hypothesis texts
Python Code:
python
ref = ['it blocked sight lines of central park', 'her father was an alderman in the city government']
hyp = ['it blocked sightlines of central park', 'our father was an elder man in the city government']
werps = werpy.werps(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(werps)
Results Output:
[0.21428571428571427, 0.2777777777777778]
Example 7 - Provide a complete breakdown of the Word Error Rate calculations for each of the reference and hypothesis texts
Python Code:
python
ref = ['it is consumed domestically and exported to other countries', 'rufino street in makati right inside the makati central business district', 'its estuary is considered to have abnormally low rates of dissolved oxygen', 'he later cited his first wife anita as the inspiration for the song', 'no one else could claim that']
hyp = ['it is consumed domestically and exported to other countries', 'rofino street in mccauti right inside the macasi central business district', 'its estiary is considered to have a normally low rates of dissolved oxygen', 'he later sighted his first wife anita as the inspiration for the song', 'no one else could claim that']
summary = werpy.summary(ref, hyp)
print(summary)
Results Output:
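For a single pair, the kind of breakdown that `summary` is described as providing (WER, Levenshtein distance, and the insertion, deletion, and substitution errors) can be sketched in plain Python by backtracking through the edit-distance table. This is an illustration only: the function name and dictionary keys are hypothetical, it does not reproduce werpy's actual output format, and when several minimal alignments exist it reports just one of them.

```python
def breakdown_sketch(reference, hypothesis):
    """Return WER, edit distance, and the words behind each error type."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    rows, cols = len(ref_words) + 1, len(hyp_words) + 1

    # Full edit-distance table so the alignment can be traced back afterwards.
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i
    for j in range(cols):
        dist[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1, dist[i][j - 1] + 1,
                             dist[i - 1][j - 1] + cost)

    inserted, deleted, substituted = [], [], []
    i, j = len(ref_words), len(hyp_words)
    while i > 0 or j > 0:
        on_diagonal = i > 0 and j > 0
        cost = 1
        if on_diagonal and ref_words[i - 1] == hyp_words[j - 1]:
            cost = 0
        if on_diagonal and dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost == 1:
                substituted.append((ref_words[i - 1], hyp_words[j - 1]))
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            inserted.append(hyp_words[j - 1])
            j -= 1
        else:
            deleted.append(ref_words[i - 1])
            i -= 1

    # The backtrace runs from the end, so reverse to restore sentence order.
    return {
        "wer": dist[-1][-1] / len(ref_words),
        "ld": dist[-1][-1],
        "inserted_words": inserted[::-1],
        "deleted_words": deleted[::-1],
        "substituted_words": substituted[::-1],
    }


result = breakdown_sketch(
    'its estuary is considered to have abnormally low rates of dissolved oxygen',
    'its estiary is considered to have a normally low rates of dissolved oxygen',
)
print(result["ld"], result["wer"])  # 3 0.25
print(result["inserted_words"])     # ['a']
```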

Example 8 - Provide a complete breakdown of the Weighted Word Error Rate for each of the input texts
Python Code:
python
ref = ['the tower caused minor discontent because it blocked sight lines of central park', 'her father was an alderman in the city government', 'he was commonly referred to as the blacksmith of ballinalee']
hyp = ['the tower caused minor discontent because it blocked sightlines of central park', 'our father was an alderman in the city government', 'he was commonly referred to as the blacksmith of balen alley']
weighted_summary = werpy.summaryp(ref, hyp, insertions_weight=0.5, deletions_weight=0.5, substitutions_weight=1)
print(weighted_summary)
Results Output:
Dependencies
- NumPy - Provides an assortment of routines for fast operations on arrays
- Pandas - Powerful data structures for data analysis, time series, and statistics
Licensing
werpy is released under the terms of the BSD 3-Clause License. Please refer to the LICENSE file for full details.
This project uses standard scientific Python libraries including NumPy and Pandas. For license details, please refer to their official repositories:
- NumPy - https://github.com/numpy/numpy
- Pandas - https://github.com/pandas-dev/pandas
Owner
- Name: Analytics in Motion
- Login: analyticsinmotion
- Kind: organization
- Email: pi@analyticsinmotion.com
- Website: https://www.analyticsinmotion.com
- Twitter: analyticsmotion
- Repositories: 3
- Profile: https://github.com/analyticsinmotion
Analytics in Motion ❤️ Open Source Programming, Data Science & AI/ML Projects
Citation (CITATION.cff)
cff-version: 1.2.0
message: 'If you use this software, please cite it as below.'
authors:
  - family-names: "Armstrong"
    given-names: "Ross"
title: 'werpy - Word Error Rate for Python'
abstract: "A powerful Python package that rapidly calculates and analyzes the Word Error Rate (WER)."
license: BSD-3-Clause
license-url: "https://github.com/analyticsinmotion/werpy/blob/main/LICENSE"
repository-code: "https://github.com/analyticsinmotion/werpy"
keywords:
  - word error rate
  - wer
  - levenshtein distance
  - speech recognition
  - speech-to-text
  - stt
  - metrics
  - natural language processing
  - data science
  - python
  - python package
type: software
url: "https://github.com/analyticsinmotion/werpy"
GitHub Events
Total
- Release event: 4
- Watch event: 5
- Delete event: 6
- Issue comment event: 10
- Push event: 150
- Pull request event: 13
- Create event: 10
Last Year
- Release event: 4
- Watch event: 5
- Delete event: 6
- Issue comment event: 10
- Push event: 150
- Pull request event: 13
- Create event: 10
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Ross Armstrong | 5****g | 447 |
| dependabot[bot] | 4****] | 13 |
| doubleinfinity | r****g@z****m | 3 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 29
- Average time to close issues: N/A
- Average time to close pull requests: about 1 month
- Total issue authors: 0
- Total pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 1.28
- Merged pull requests: 13
- Bot issues: 0
- Bot pull requests: 27
Past Year
- Issues: 0
- Pull requests: 17
- Average time to close issues: N/A
- Average time to close pull requests: about 2 months
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 1.29
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 15
Top Authors
Issue Authors
Pull Request Authors
- dependabot[bot] (51)
- fossabot (2)
- LouisJalouzot (2)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 4,166 last-month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 17
- Total maintainers: 1
pypi.org: werpy
A powerful yet lightweight Python package to calculate and analyze the Word Error Rate (WER).
- Documentation: https://werpy.readthedocs.io/
- License: BSD 3-Clause License Copyright (c) 2023-2025, Analytics in Motion Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Latest release: 3.1.0 (published 10 months ago)