permutations-stats

Statistical package for fast exact tests and simulations

https://github.com/trevismd/permutations-stats

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 3 DOI reference(s) in README
✓
Academic publication links
Links to: zenodo.org
○
Committers with academic emails
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (11.9%) to scientific vocabulary

Keywords

brunner-munzel-test exact-test friedman-test permutation-statistics permutation-test statistics

Last synced: 6 months ago

Repository

Statistical package for fast exact tests and simulations

Basic Info

Host: GitHub
Owner: trevismd
License: gpl-3.0
Language: Python
Default Branch: master
Homepage:
Size: 112 KB

Statistics

Stars: 6
Watchers: 1
Forks: 0
Open Issues: 3
Releases: 2

Topics

brunner-munzel-test exact-test friedman-test permutation-statistics permutation-test statistics

Created over 5 years ago · Last pushed over 3 years ago

Metadata Files

Readme License Citation

permutations-stats

Python-only permutation-based statistical tests, accelerated with numba.

Status

Statistical tests

The Brunner-Munzel [1], Mann-Whitney-Wilcoxon [2, 3], Wilcoxon signed-rank [3], and Friedman [4] tests are implemented.

Both exact tests (enumerating all combinations) and an approximate method (simulation on random permutations) are available.
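To illustrate the difference between the two modes, here is a minimal sketch in plain Python. It is not this package's implementation or API: the function names `exact_pvalue` and `simulated_pvalue` are hypothetical, and the absolute difference in means is used as a simple stand-in statistic instead of the rank-based tests listed above.

```python
import itertools
import random
from statistics import mean


def mean_diff(a, b):
    """Absolute difference in means: a simple stand-in test statistic."""
    return abs(mean(a) - mean(b))


def exact_pvalue(x, y):
    """Exact mode: enumerate every split of the pooled data into two groups."""
    pooled = list(x) + list(y)
    observed = mean_diff(x, y)
    indices = range(len(pooled))
    hits = total = 0
    for chosen in itertools.combinations(indices, len(x)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point ties.
        hits += mean_diff(a, b) >= observed - 1e-12
        total += 1
    return hits / total, total


def simulated_pvalue(x, y, iterations=10_000, seed=0):
    """Approximate mode: evaluate the statistic on random shuffles only."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    observed = mean_diff(x, y)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(pooled)
        a, b = pooled[:len(x)], pooled[len(x):]
        hits += mean_diff(a, b) >= observed - 1e-12
    return hits / iterations
```

With `x = [1, 2, 3]` and `y = [4, 5, 6]`, the exact version visits all 20 splits, and only the two most extreme ones reach the observed difference, so the p-value is 0.1. The simulated version approaches that value as the number of iterations grows.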

Functions for bootstrap-based confidence interval calculations for the mean, median, and standard deviation are also implemented (not documented yet).
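Since these functions are not documented yet, the following is only a generic sketch of a percentile bootstrap confidence interval in plain Python. The name `bootstrap_ci` and its parameters are assumptions made for illustration and do not reflect the package's actual functions.

```python
import random
from statistics import mean, median, stdev


def bootstrap_ci(data, stat=mean, n_resamples=5_000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for `stat` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1 - tail) * n_resamples) - 1]
    return lower, upper
```

Passing `stat=median` or `stat=stdev` gives intervals for the other two statistics mentioned above.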

Why this package?

This work aims to provide fast permutation-based statistical tests in Python. Some tests are not publicly available in an exact mode (computing all possible permutations) or with simulations. If certain assumptions about the data (such as normality) cannot be made, or if the sample is not large enough, the existing implementations should not be used.
For example, the Brunner-Munzel test [1] is implemented in scipy, but not with an exact calculation. The statistic can be obtained through the public API, but this is slow when run several thousand times (for example, the p-value is also calculated at each iteration).

This package reimplements the looping over permutations and the statistical tests with numba. The first function call takes a few seconds because the code is compiled on the fly. After that, the speedup is substantial, as shown in this output from tests comparing the Brunner-Munzel statistic calculation with scipy:

Output of tests/stat_tests.py:

| Tests | Data points | Permutations-stats | Scipy | Diff |
|---|---|---|---|---|
| 200 | 5 and 4 | 2.375s | 0.076s | -2.299s |
| 200 | 3 and 3 | 0.002s | 0.077s | 0.075s |
| 200 | 10 and 12 | 0.004s | 0.079s | 0.075s |
| 10000 | 18 and 10 | 0.220s | 3.806s | 3.586s |
| 20000 | 18 and 15 | 0.477s | 7.645s | 7.168s |
| 30000 | 18 and 19 | 0.769s | 11.468s | 10.699s |

Dependencies

numpy
numba

And for development and testing only:

scipy ≥1.5
pytest

Usage

Basic usage:

```python
import numpy as np
from permutations_stats.permutations import permutation_test

# Sample data
x = np.arange(9)
y = (np.arange(9) - 0.2) * 1.1

permutation_test(x, y, test="brunner_munzel")
# PermutationsResults(statistic=0.2776044311308564, pvalue=0.7475935828877005, permutations=24310, test='brunner_munzel', alternative='TWO_SIDED', method='exact')
```

More examples are in usage.ipynb, and a detailed demonstration is in doc/demo.ipynb.

Perspective

If sample sizes and/or the number of iterations are small, no speedup is expected from numba, and numpy alone should be the fastest option.
The thresholds for using numba still need to be determined more precisely, so that the package can choose which function to call without user intervention.

License

GNU General Public License v3.0 only.

Cite

If you find this software useful for your academic work, please cite as below:

Florian Charlier. (2022). permutations-stats (v0.2). Zenodo. https://doi.org/10.5281/zenodo.7213305

Acknowledgements

We would like to thank Marianne Paesmans, Lieveke Ameye, and Luigi Moretti at Institut Jules Bordet for their support during the development of this package.

References

[1] Brunner, E. and Munzel, U. (2000), The Nonparametric Behrens‐Fisher Problem: Asymptotic Theory and a Small‐Sample Approximation. Biom. J., 42: 17-25. doi:10.1002/(SICI)1521-4036(200001)42:1<17::AID-BIMJ17>3.0.CO;2-U

[2] Mann, H. B. and D. R. Whitney (1947). "On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other." Ann. Math. Statist. 18(1): 50-60.

[3] Wilcoxon, F. (1945). "Individual Comparisons by Ranking Methods." Biometrics Bulletin 1(6): 80-83.

[4] Friedman, M. (1937). "The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance." Journal of the American Statistical Association 32(200): 675-701.

Owner

Name: Florian Charlier
Login: trevismd
Kind: user
Location: Belgium

Repositories: 3
Profile: https://github.com/trevismd

Developing with passion

GitHub Events

Total

Watch event: 1

Last Year

Watch event: 1

Committers

Last synced: almost 3 years ago

All Time

Total Commits: 46
Total Committers: 4
Avg Commits per committer: 11.5
Development Distribution Score (DDS): 0.196

Top Committers

Name	Email	Commits
Florian Charlier	f**r@u**e	37
Florian Charlier	f**r@u**e	4
Florian Charlier	4**d@u**m	4
Florian Charlier	t**s@c**e	1

Committer Domains (Top 20 + Academic)

cascliniques.be: 1
ulb.be: 1
ulb.ac.be: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 3
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: about 1 year
Total issue authors: 2
Total pull request authors: 1
Average comments per issue: 0.67
Average comments per pull request: 1.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

trevismd (2)
mesamuels (1)

Pull Request Authors

trevismd (1)

Top Labels

Issue Labels

documentation (2)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 15 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

pypi.org: permutations-stats

Permutation-based statistical tests in Python

Homepage: https://github.com/trevismd/permutations-stats
Documentation: https://permutations-stats.readthedocs.io/
License: GPL-3.0-only
Latest release: 0.2
published over 3 years ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 15 Last month

Rankings

Dependent packages count: 6.6%

Stargazers count: 25.5%

Average: 26.1%

Forks count: 30.5%

Dependent repos count: 30.6%

Downloads: 37.3%

Maintainers (1)

trevis

Last synced: 6 months ago

Dependencies

requirements.txt pypi

numba *
numpy *

setup.py pypi

numpy *

.github/workflows/python-package.yml actions

actions/checkout v3 composite
actions/setup-python v4 composite
codecov/codecov-action v3 composite
