SeqMetrics
SeqMetrics: a unified library for performance metrics calculation in Python - Published in JOSS (2024)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 5 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org, zenodo.org
- ✓ Committers with academic emails: 2 of 6 committers (33.3%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Keywords
Keywords from Contributors
Scientific Fields
Repository
Various errors for tabular/structured/time-series data
Basic Info
- Host: GitHub
- Owner: AtrCheema
- License: gpl-3.0
- Language: Python
- Default Branch: master
- Homepage: https://seqmetrics.readthedocs.io
- Size: 3.92 MB
Statistics
- Stars: 17
- Watchers: 1
- Forks: 4
- Open Issues: 2
- Releases: 3
Topics
Metadata Files
README.md
SeqMetrics: a unified library for performance metrics calculation in Python
The purpose of this repository is to collect, in one place, various classification and regression performance metrics (or errors) that can be calculated for time-series/sequential/tabular data. Currently only 1-dimensional data is supported.
How to Install
You can install SeqMetrics using pip
pip install SeqMetrics
or from the GitHub repository for the latest code
python -m pip install git+https://github.com/AtrCheema/SeqMetrics.git
or using the setup file; go to the folder where the repository is downloaded and run
python setup.py install
You can also install SeqMetrics with all of its optional dependencies by using the `all` option
pip install SeqMetrics[all]
This will install the scipy and easy_mpl libraries. The scipy library is used to calculate some additional metrics such as kendall_tau or mape_for_peaks, while easy_mpl is used for plotting purposes.
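As a quick sanity check (an illustrative sketch, not part of SeqMetrics itself), you can confirm that the optional dependencies pulled in by the `all` extra are importable:

```python
# Illustrative sketch (not part of SeqMetrics): check that the optional
# dependencies installed by the "all" extra can be imported.
import importlib.util

for pkg in ("scipy", "easy_mpl"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'installed' if found else 'missing'}")
```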
How to Use
SeqMetrics provides a uniform API for calculation of both regression and classification metrics. It has a functional API and a class based API.
Regression Metrics
The use of the functional API is as straightforward as calling the required function and providing it with true and predicted arrays or array-like objects (lists, tuples, DataFrames, Series, tensors).
```python
import numpy as np
from SeqMetrics import nse

true = np.random.random((20, 1))
pred = np.random.random((20, 1))

nse(true, pred)  # calculate Nash-Sutcliffe efficiency
```
The method for calling functions is consistent across all 100+ metrics.
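For illustration, here is a minimal sketch applying the same pattern to a few other metrics from the table below (rmse, mae, kge), assuming they are importable from the top-level SeqMetrics package just like nse:

```python
# Sketch: the same calling pattern applied to other metrics listed in the
# table below (rmse, mae, kge), assuming they are importable from the
# top-level package just like nse.
import numpy as np
from SeqMetrics import rmse, mae, kge

true = np.random.random((20, 1))
pred = np.random.random((20, 1))

print(rmse(true, pred))  # Root Mean Square Error
print(mae(true, pred))   # Mean Absolute Error
print(kge(true, pred))   # Kling-Gupta Efficiency
```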
Alternatively, the same outcome can be achieved using a class-based API.
```python
import numpy as np
from SeqMetrics import RegressionMetrics

true = np.random.random((20, 1))
pred = np.random.random((20, 1))

er = RegressionMetrics(true, pred)

# get names of all available methods
for m in er.all_methods:
    print("{:20}".format(m))

er.nse()  # calculate Nash-Sutcliffe efficiency

er.calculate_all(verbose=True)  # or calculate errors using all available methods
```
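As a rough sketch (assuming, as its use with plot_metrics below suggests, that calculate_all() returns a dictionary keyed by the metric names used in this repository), the result can be inspected directly:

```python
# Assumption: calculate_all() returns a dict-like mapping of metric name to
# value, as its use with plot_metrics below suggests; `er` is the
# RegressionMetrics instance created above.
all_errors = er.calculate_all()

print(len(all_errors))        # how many metrics were computed
print(all_errors.get("nse"))  # look up one metric by its repository name
```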
We can visualize the calculated performance metrics if the easy_mpl package is installed.

```python
import numpy as np
from SeqMetrics import RegressionMetrics, plot_metrics

np.random.seed(313)
true = np.random.random((20, 1))
pred = np.random.random((20, 1))

er = RegressionMetrics(true, pred)

plot_metrics(er.calculate_all(), color="Blues")
```
RegressionMetrics currently calculates the following performance metrics for regression; a short usage sketch follows the table.
| Name | Name in this repository |
| -------------------------- | ------------- |
| Absolute Percent Bias | abs_pbias |
| Agreement Index | agreement_index |
| Aitchison Distance | aitchison |
| Alpha decomposition of the NSE | nse_alpha |
| Anomaly correction coefficient | acc |
| Bias | bias |
| Beta decomposition of NSE | nse_beta |
| Bounded NSE | nse_bound |
| Bounded KGE | kge_bound |
| Brier Score | brier_score |
| Correlation Coefficient | corr_coeff |
| Coefficient of Determination | r2 |
| Centered Root Mean Square Deviation | centered_rms_dev |
| Covariances | covariance |
| Decomposed Mean Square Error | decomposed_mse |
| Explained variance score | exp_var_score |
| Euclid Distance | euclid_distance |
| Geometric Mean Difference | gmaen_diff |
| Geometric Mean Absolute Error | gmae |
| Geometric Mean Relative Absolute Error | gmrae |
| Inertial Root Squared Error | irmse |
| Integral Normalized Root Squared Error | inrse |
| Inter-percentile Normalized Root Mean Squared Error | nrmse_ipercentile |
| Jensen-Shannon divergence | JS |
| Kling-Gupta Efficiency | kge |
| Legate-McCabe Efficiency Index | lm_index |
| Logarithmic Nash-Sutcliffe Efficiency | log_nse |
| Logarithmic probability distribution | log_prob |
| maximum error | max_error |
| Mean Absolute Error | mae |
| Mean Absolute Percentage Deviation | mapd |
| Mean Absolute Percentage Error | mape |
| Mean Absolute Relative Error | mare |
| Mean Absolute Scaled Error | mase |
| Mean Arctangent Absolute Percentage Error | maape |
| Mean Bias Error | mean_bias_error |
| Mean Bounded relative Absolute Error | mbrae |
| Mean Errors | me |
| Mean Gamma Deviances | mean_gamma_deviance |
| Mean Log Error | mle |
| Mean Normalized Root Mean Square Error | nrmse_mean |
| Mean Percentage Error | mpe |
| Mean Poisson Deviance | mean_poisson_deviance |
| Mean Relative Absolute Error | mrae |
| Mean Square Error | mse |
| Mean Square Logarithmic Error | mean_square_log_error |
| Mean Variance | mean_var |
| Median Absolute Error | median_abs_error |
| Median Absolute Percentage Error | mdape |
| Median Dictionary Accuracy | |
| Median Error | mde |
| Median Relative Absolute Error | mdrae |
| Median Squared Error | med_seq_error |
| Mielke-Berry R | mb_r |
| Modified Index of Agreement | mod_agreement_index |
| Modified Kling-Gupta Efficiency | kge_mod |
| Modified Nash-Sutcliffe Efficiency | nse_mod |
| Nash-Sutcliffe Efficiency | nse |
| Non-parametric Kling-Gupta Efficiency | kge_np |
| Normalized Absolute Error | norm_ae |
| Normalized Absolute Percentage Error | norm_ape |
| Normalized Euclid Distance | norm_euclid_distance |
| Normalized Root Mean Square Error | nrmse |
| Peak flow bias of the flow duration curve | fdc_fhv |
| Pearson correlation coefficient | person_r |
| Percent Bias | pbias |
| Range Normalized root mean square | nrmse_range |
| Refined Index of Agreement | ref_agreement_index |
| Relative Index of Agreement | rel_agreement_index |
| Relative Absolute Error | rae |
| Relative Root Mean Squared Error | relative_rmse |
| Relative Nash-Sutcliffe Efficiency | nse_rel |
| Root Mean Square Errors | rmse |
| Root Mean Square Log Error | rmsle |
| Root Mean Square Percentage Error | rmspe |
| Root Mean Squared Scaled Error | rmsse |
| Root Median Squared Scaled Error | rmsse |
| Root Relative Squared Error | rrse |
| RSR | rsr |
| Spearman correlation coefficient | spearmann_corr |
| Skill Score of Murphy | skill_score_murphy |
| Spectral Angle | sa |
| Spectral Correlation | sc |
| Spectral Gradient Angle | sga |
| Spectral Information Divergence | sid |
| Symmetric Kullback-Leibler divergence | KLsym |
| Symmetric Mean Absolute Percentage Error | smape |
| Symmetric Median Absolute Percentage Error | smdape |
| sum of squared errors | sse |
| Volume Errors | volume_error |
| Volumetric Efficiency | ve |
| Unscaled Mean Bounded Relative Absolute Error | umbrae |
| Watterson's M | watt_m |
| Weighted Mean Absolute Percent Errors | wmape |
| Weighted Absolute Percentage Error | wape |
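For orientation, here is a minimal sketch of how the "Name in this repository" column maps to usage, under the assumption (consistent with the examples above) that each name is available both as a top-level function and as a RegressionMetrics method:

```python
# Sketch: mapping the "Name in this repository" column to usage, assuming
# (consistent with the examples above) that each name is importable as a
# function from SeqMetrics and also available as a RegressionMetrics method.
import numpy as np
from SeqMetrics import RegressionMetrics, pbias, kge_mod

true = np.random.random((20, 1))
pred = np.random.random((20, 1))

print(pbias(true, pred))    # Percent Bias via the functional API
print(kge_mod(true, pred))  # Modified Kling-Gupta Efficiency via the functional API

er = RegressionMetrics(true, pred)
print(er.pbias(), er.kge_mod())  # the same metrics via the class-based API
```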
Classification Metrics
The API is the same for classification performance metrics.
```python
import numpy as np
from SeqMetrics import ClassificationMetrics

# boolean array
t = np.array([True, False, False, False])
p = np.array([True, True, True, True])
metrics = ClassificationMetrics(t, p)
print(metrics.calculate_all())

# binary classification with numerical labels
true = np.array([1, 0, 0, 0])
pred = np.array([1, 1, 1, 1])
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# multiclass classification with numerical labels
true = np.random.randint(1, 4, 100)
pred = np.random.randint(1, 4, 100)
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# you can also provide logits instead of labels
predictions = np.array([[0.25, 0.25, 0.25, 0.25],
                        [0.01, 0.01, 0.01, 0.96]])
targets = np.array([[0, 0, 0, 1],
                    [0, 0, 0, 1]])
metrics = ClassificationMetrics(targets, predictions, multiclass=True)
print(metrics.calculate_all())

# working with categorical values is seamless
true = np.array(['a', 'b', 'b', 'b'])
pred = np.array(['a', 'a', 'a', 'a'])
metrics = ClassificationMetrics(true, pred)
print(metrics.calculate_all())

# the same goes for multiclass categorical labels
t = np.array(['car', 'truck', 'truck', 'car', 'bike', 'truck'])
p = np.array(['car', 'car', 'bike', 'car', 'bike', 'truck'])
metrics = ClassificationMetrics(t, p, multiclass=True)
print(metrics.calculate_all())
```
The SeqMetrics library currently calculates the following classification performance metrics; a short usage sketch follows the table.
| Name | Name in this repository |
| -------------------------- | ------------- |
| Accuracy | accuracy |
| Balanced Accuracy | balanced_accuracy |
| Error Rate | error_rate |
| Recall | recall |
| Precision | precision |
| F1 score | f1_score |
| F2 score | f2_score |
| Specificity | specificity |
| Cross Entropy | cross_entropy |
| False Positive Rate | false_positive_rate |
| False Negative Rate | false_negative_rate |
| False Discovery Rate | false_discovery_rate |
| False Omission Rate | false_omission_rate |
| Negative Predictive Value | negative_predictive_value |
| Positive Likelihood Ratio | positive_likelihood_ratio |
| Negative Likelihood Ratio | negative_likelihood_ratio |
| Prevalence Threshold | prevalence_threshold |
| Youden Index | youden_index |
| Confusion Matrix | confusion_matrix |
| Fowlkes Mallows Index | fowlkes_mallows_index |
| Matthews Correlation Coefficient | mathews_corr_coeff |
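Here is a minimal sketch of using a few of these metrics, under the assumption that each "Name in this repository" entry is exposed as a method on ClassificationMetrics, mirroring the RegressionMetrics pattern shown earlier:

```python
# Sketch under an assumption: each "Name in this repository" entry above is
# exposed as a method on ClassificationMetrics, mirroring the
# RegressionMetrics pattern shown earlier.
import numpy as np
from SeqMetrics import ClassificationMetrics

true = np.array([1, 0, 0, 1, 1, 0])
pred = np.array([1, 0, 1, 1, 0, 0])

metrics = ClassificationMetrics(true, pred)
for name in ("accuracy", "precision", "recall", "f1_score"):
    print(name, getattr(metrics, name)())
```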
Web App
The SeqMetrics library is also available as a web app, deployed with Streamlit at https://seqmetrics.streamlit.app/
You can also launch the app locally if you do not wish to use the web-based version. Follow the steps below:
git clone https://github.com/AtrCheema/SeqMetrics.git
cd SeqMetrics
pip install -r requirements.txt
pip install streamlit
streamlit run app.py
Using the Streamlit-based application involves 1) providing the true and predicted arrays, either by pasting the data into the boxes or by uploading a file, 2) selecting the relevant performance metric, and 3) calculating it. These steps are further illustrated below.
The method to provide data from an Excel/CSV file is illustrated in the image below.
Related
Owner
- Name: Ather Abbas
- Login: AtrCheema
- Kind: user
- Location: South Korea
- Company: Environmental Modeling and Monitoring Lab, UNIST
- Repositories: 7
- Profile: https://github.com/AtrCheema
JOSS Publication
SeqMetrics: a unified library for performance metrics calculation in Python
Authors
Tags
modeling errors, performance metrics, data analysis
GitHub Events
Total
- Issues event: 1
- Watch event: 3
- Push event: 8
Last Year
- Issues event: 1
- Watch event: 3
- Push event: 8
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| AtrCheema | a****6@y****m | 129 |
| FazilaRubab | f****3@g****m | 21 |
| Sara-Iftikhar | s****k@g****m | 20 |
| Marcel Stimberg | m****g@s****r | 7 |
| Daniel S. Katz | d****z@i****g | 2 |
| The Codacy Badger | b****r@c****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 2
- Total pull requests: 4
- Average time to close issues: 6 months
- Average time to close pull requests: about 3 hours
- Total issue authors: 2
- Total pull request authors: 3
- Average comments per issue: 0.5
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 3
- Average time to close issues: N/A
- Average time to close pull requests: about 3 hours
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- SkafteNicki (1)
- AtrCheema (1)
- saeedvzf (1)
Pull Request Authors
- mstimberg (4)
- danielskatz (2)
- codacy-badger (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 171 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 5
- Total maintainers: 1
pypi.org: seqmetrics
SeqMetrics: a unified library for performance metrics calculation in Python
- Homepage: https://github.com/AtrCheema/SeqMetrics
- Documentation: https://seqmetrics.readthedocs.io/
- License: gpl-3.0
- Latest release: 2.0.0 (published over 1 year ago)
Rankings
Maintainers (1)
Dependencies
- easy_mpl >=0.20.4 development
- numpy * development
- scikit-learn * development
- scipy * development
- numpy *
- scipy *
- sphinx *
- sphinx-prompt *
- sphinx_copybutton *
- sphinx_issues *
- sphinx_rtd_theme *
- sphinx_toggleprompt *
- numpy *
- scipy *
- numpy *
- scipy *
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
