https://github.com/kukuster/ci_methods_analyser

Analyse efficacy of your own confidence interval (CI) methods


Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: pubmed.ncbi, ncbi.nlm.nih.gov
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords

accuracy analytical-solution ci-methods confidence-interval confidence-interval-plot confidence-intervals efficacy mathematics proportions rnd statistical-models statistical-significance statistics statistics-toolbox tool toolkit uncertainty wald-interval wilson-score z-test
Last synced: 6 months ago

Repository

Analyse efficacy of your own confidence interval (CI) methods

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
accuracy analytical-solution ci-methods confidence-interval confidence-interval-plot confidence-intervals efficacy mathematics proportions rnd statistical-models statistical-significance statistics statistics-toolbox tool toolkit uncertainty wald-interval wilson-score z-test
Created almost 5 years ago · Last pushed about 1 year ago
Metadata Files
Readme License

README.md


CI methods analyser

A toolkit for measuring the efficacy of various methods for calculating a confidence interval (CI). It currently covers CI methods for the following statistics:

  • proportion
  • the difference between two proportions

This library was mainly inspired by the article "Five Confidence Intervals for Proportions That You Should Know About" by Dr. Dennis Robert.

Dependencies

  • python >=3.8
  • python libs:
    • numpy
    • scipy
    • matplotlib
    • tqdm

Installation

The package is available on PyPI: https://pypi.org/project/CI-methods-analyser/ (e.g. `pip install CI-methods-analyser`)

Applications

Applied statistics and data science: compare multiple CI methods to select the most appropriate one for a specific scenario (by accuracy over a specific range of true population proportions, by computational performance, etc.)

Education in statistics and CIs: demonstrates how different CI methods perform under various conditions, and helps build an understanding of the CI concept by comparing the accuracy of different CI methods

Usage

Testing Wald Interval - a popular method for calculating a confidence interval for proportion

The Wald Interval is defined as follows:

$$ (w^-, w^+) = p\,\pm\,z\sqrt{\frac{p(1-p)}{n}} $$
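For reference, the formula translates directly into code. Here is a minimal sketch of the Wald Interval as a plain Python function (an illustration of the math, not the library's implementation; `scipy.stats.norm` is used to obtain the z-score):

```python
from scipy.stats import norm

def wald_interval_sketch(x: int, n: int, conf_level: float = 0.95):
    """Plain Wald interval for a proportion, straight from the formula above."""
    p = x / n
    # two-tailed z-score for the requested confidence level, e.g. ~1.96 for 95%
    z = norm.ppf(1 - (1 - conf_level) / 2)
    half_width = z * (p * (1 - p) / n) ** 0.5
    return (p - half_width, p + half_width)

print(wald_interval_sketch(40, 100))  # roughly (0.304, 0.496)
```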

How well does it approximate the confidence interval?

Let's assess the quality of the 95% CIs produced by this method by testing it on a range of true proportions. We'll take 100 true proportions with a 1% step: [0.001, 0.011, 0.021, ..., 0.991].
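The `proportions` argument used below, a tuple of three float strings `('0.001', '0.999', '0.01')`, is interpreted as (begin, end, step) and expands to that evenly spaced grid (see the NOTE in the longer example further down). A rough equivalent of the resulting grid, purely for illustration (not the library's internal code; endpoint handling may differ slightly):

```python
import numpy as np

# (begin, end, step) -> evenly spaced true proportions: 0.001, 0.011, ..., 0.991
proportions = np.arange(0.001, 0.999, 0.01)
print(len(proportions), proportions[:3], proportions[-1])  # 100 values
```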

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage"
)

input('press Enter to exit')
```

This outputs the image:

Wald Interval - real coverage

The plot indicates the overall poor performance of the method, with particularly poor performance for extreme proportions. While for some true proportions the calculated CI has a true confidence level of around 95%, most of the time the confidence is significantly lower. For true proportions of <0.05 and >0.95, the true confidence of the generated CI is generally below 90%, as indicated by the steep descent at the left-most and right-most parts of the plot.


You really might want to use a different method. Check out this wonderful medium.com article by Dr. Dennis Robert: *Five Confidence Intervals for Proportions That You Should Know About [code in R]*



The function calculate_coverage_and_show_plot that we just used is a shortcut. The code below does the same calculations and yields the same result. It relies on the public properties and methods, giving more control over parts of the calculation:

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

# take an already implemented method for calculating CI for proportions
wald_interval = methods_for_CI_for_proportion.wald_interval

# initialize the toolkit
wald_interval_test_toolkit = toolkit(
    method=wald_interval, method_name="Wald Interval")

# calculate the real coverage that the method produces
# for each case of a true population proportion (taken from the list `proportions`)
wald_interval_test_toolkit.calculate_coverage_analytically(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95)

# now you can access the calculated coverage and a few statistics:
wald_interval_test_toolkit.coverage  # 1-d array of 0-100, the same shape as the passed `proportions`
# NOTE: `proportions`, when passed as a tuple of 3 float strings, expands to a list of
# evenly spaced float values, where the #0 value is begin, #1 is end, #2 is step.
wald_interval_test_toolkit.average_coverage  # np.longdouble 0-100, avg of `coverage`
wald_interval_test_toolkit.average_deviation  # np.longdouble 0-100, avg abs diff w/ `confidence`

# plots the calculated coverage in a matplotlib.pyplot figure
wald_interval_test_toolkit.plot_coverage(
    plt_figure_title="Wald Interval coverage")
# you can access the figure here:
wald_interval_test_toolkit.figure

# shows the figure (non-blocking)
wald_interval_test_toolkit.show_plot()

# because show_plot() is non-blocking,
# you have to pause the execution in order for the figure to be rendered completely
input('press Enter to exit')
```
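The per-method statistics above also make it easy to compare several CI methods numerically rather than visually. A minimal sketch, assuming a second implemented method named `wilson_score_interval` exists in `methods_for_CI_for_proportion` (substitute any method that is actually available, or your own function):

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

# NOTE: `wilson_score_interval` is an assumption here; use any available or custom method.
candidates = {
    "Wald Interval": methods_for_CI_for_proportion.wald_interval,
    "Wilson Score Interval": methods_for_CI_for_proportion.wilson_score_interval,
}

for name, method in candidates.items():
    tk = toolkit(method=method, method_name=name)
    tk.calculate_coverage_analytically(
        sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95)
    # a lower average deviation from the requested 95% confidence is better
    print(f"{name}: average coverage = {tk.average_coverage:.2f}, "
          f"average deviation = {tk.average_deviation:.2f}")
```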

I expose some style/color settings used by matplotlib.

My preference goes to the night light-friendly styling:

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit, methods_for_CI_for_proportion

toolkit(
    method=methods_for_CI_for_proportion.wald_interval, method_name="Wald Interval"
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title="Wald Interval coverage",
    theme='dark_background', plot_color="green", line_color="orange"
)

input('press Enter to exit')
```

Wald Interval - real coverage (dark theme)


Testing a custom method for CI for proportion

You can implement your own methods and test them:

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache

# not a particularly good method for calculating CI for proportion
@lru_cache(100000)
def im_telling_ya_test(x: int, n: int, conflevel: float = 0.95):
    z = normal_z_score_two_tailed(conflevel)

    p = float(x)/n
    return (
        p - 0.02*z,
        p + 0.02*z
    )

toolkit(
    method=im_telling_ya_test, method_name='"I\'m telling ya" test'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"I\'m telling ya" coverage',
    theme='dark_background', plot_color="green", line_color="orange"
)

input('press Enter to exit')
```

"I'm telling ya" test - real coverage

This is the kind of test one would not trust. It shows very unreliable performance for the majority of true proportions, as indicated by the extremely large discrepancy between the requested confidence level of 95% and the true confidence of the CI ranges produced by this method. The output CIs are generally narrower than they should be, so there is less confidence that the true value lies within the CI range. One could say this method overestimates its ability to produce a confident range.

Let's try another custom method: "God is my witness" score

```python
from CI_methods_analyser import CImethodForProportion_efficacyToolkit as toolkit
from CI_methods_analyser.math_functions import normal_z_score_two_tailed
from functools import lru_cache

# you could say, this method is "too good"
@lru_cache(100000)
def God_is_my_witness_score(x: int, n: int, conflevel: float = 0.95):
    z = normal_z_score_two_tailed(conflevel)

    p = float(x)/n
    return (
        (0 + p)/2 - 0.005*z,
        (1 + p)/2 + 0.005*z
    )

toolkit(
    method=God_is_my_witness_score, method_name='"God is my witness" score'
).calculate_coverage_and_show_plot(
    sample_size=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='"God is my witness" score coverage',
    theme='dark_background'
)

input('press Enter to exit')
```

"God is my witness" score - real coverage

This method clearly overdoes the estimates. While one asks for a 95% CI, the output range is far less informative, as it allows for a very wide range of possibilities. In stats lingo, one would say that this method is way too conservative.

Testing methods for CI for the difference between two proportions

Let's use the implemented Pooled Z test:

$$ (\delta^-, \delta^+) = \hat{p}_T - \hat{p}_C \pm z_{\alpha}\sqrt{\bar{p}(1-\bar{p})(\frac{1}{n_T}+\frac{1}{n_C})} $$

where:

$$ \bar{p} = \frac{n_T\hat{p}_T + n_C\hat{p}_C}{n_T + n_C} $$
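As a reference for the formulas above, here is a minimal plain-Python sketch of the pooled Z test CI for the difference between two proportions (an illustration of the math, not the library's implementation; `scipy.stats.norm` is used for the z-score):

```python
from scipy.stats import norm

def z_test_pooled_ci(x_t: int, n_t: int, x_c: int, n_c: int, conf_level: float = 0.95):
    """Pooled Z test CI for the difference between two proportions (formulas above)."""
    p_t, p_c = x_t / n_t, x_c / n_c
    # pooled proportion p-bar
    p_bar = (n_t * p_t + n_c * p_c) / (n_t + n_c)
    # two-tailed z-score for the requested confidence level
    z = norm.ppf(1 - (1 - conf_level) / 2)
    half_width = z * (p_bar * (1 - p_bar) * (1 / n_t + 1 / n_c)) ** 0.5
    diff = p_t - p_c
    return (diff - half_width, diff + half_width)

print(z_test_pooled_ci(45, 100, 30, 100))  # roughly (0.016, 0.284)
```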

```python
from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods

toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
)

input('press Enter to exit')
```

Z test (pooled) - real coverage

As you can see, this test is generally precise for close proportions (along the y = x line) [WHITE], except where the proportions have extreme values, where the confidence of the output CIs is lower than expected [PURPLE].

Also, this test is extremely conservative for high and extreme differences between the two proportions, i.e. for proportions whose values are far apart [GREEN].


You may want to change the color palette (although I wouldn't):

```python
from CI_methods_analyser import CImethodForDiffBetwTwoProportions_efficacyToolkit as toolkit_d, methods_for_CI_for_diff_betw_two_proportions as methods

toolkit_d(
    method=methods.Z_test_pooled, method_name='Z test pooled'
).calculate_coverage_and_show_plot(
    sample_size1=100, sample_size2=100, proportions=('0.001', '0.999', '0.01'), confidence=0.95,
    plt_figure_title='Z test pooled', theme='dark_background',
    colors=("gray", "purple", "white", "orange", "#d62728")
)

input('press Enter to exit')
```

Z test (pooled) - real coverage



NOTES

Methods for measuring the efficacy of CI methods

Two ways can be used to calculate the efficacy of CI methods for a given confidence level and a true population proportion:

  • approximately, with random simulation (as implemented in R by Dr. Dennis Robert, see the link above). Here: calculate_coverage_randomly
  • precisely, with the analytical solution. Here: calculate_coverage_analytically

By default, always prefer the analytical solution.

Sampling the same binomial distribution n times, as is typically done (so-called "random experiments", or "simulations"), is inefficient, because the binomial distribution is already fully determined by the given true population proportion.

By relying on the binomial distribution from scipy, the analytical solution provides 100% accuracy for any method (defined as a python function), any confidence level, any true population proportion(s), any sample and population size(s).
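To illustrate the idea behind the analytical approach (a conceptual sketch, not the library's internal code, and assuming an effectively infinite population): for a fixed sample size n and true proportion p, every possible outcome x has probability binom.pmf(x, n, p), so the exact coverage of a CI method is the total probability of the outcomes whose CI contains p.

```python
from scipy.stats import binom, norm

def wald_interval_sketch(x: int, n: int, conf_level: float = 0.95):
    """Plain Wald interval, same sketch as earlier in this README."""
    p = x / n
    z = norm.ppf(1 - (1 - conf_level) / 2)
    half_width = z * (p * (1 - p) / n) ** 0.5
    return (p - half_width, p + half_width)

def exact_coverage(ci_method, n: int, true_p: float, conf_level: float = 0.95) -> float:
    """Exact coverage (%) of a CI method: the sum of binomial probabilities of all
    outcomes x whose CI contains the true proportion."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = ci_method(x, n, conf_level)
        if lo <= true_p <= hi:
            total += binom.pmf(x, n, true_p)
    return 100 * total

print(exact_coverage(wald_interval_sketch, n=100, true_p=0.02))  # well below the requested 95%
```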

Mathematical proof of the analytical solution:

Proof of the analytical solution

Both the "simulation" and the "analytical" methods are implemented for CIs for both statistics: the proportion, and the difference between two proportions. For the precise analytical solution, an optimization was made. Theoretically, it is lossy, but in practice the error is always negligible (as shown by test_z_precision_difference.py) and is less significant than the 64-bit floating-point error between the closest float representation and the true real value. The optimization is regulated with the parameter z_precision, which is estimated automatically by default.


Various links

1. Equivalence and Noninferiority Testing (as I understand, these are fancy terms for 2-sided and 1-sided p tests for the difference between two proportions)
   - https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/ConfidenceIntervalsfortheDifferenceBetweenTwo_Proportions.pdf
   - https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Non-InferiorityTestsfortheDifferenceBetweenTwo_Proportions.pdf
   - https://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/NCSS/TwoProportions-Non-Inferiority,Superiority,Equivalence,andTwo-SidedTestsvsa_Margin.pdf
   - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3019319/
   - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2701110/
   - https://pubmed.ncbi.nlm.nih.gov/9595617/
   - http://thescipub.com/pdf/10.3844/amjbsp.2010.23.31

2. Biostatistics course (Dr. Nicolas Padilla Raygoza, et al.)
   - https://docs.google.com/presentation/d/1t1DowyVDDRFYGHDlJgmYMRN4JCrvFl3q/edit#slide=id.p1
   - https://www.google.com/search?q=Dr.+Sc.+Nicolas+Padilla+Raygoza+Biostatistics+course+Part+10&oq=Dr.+Sc.+Nicolas+Padilla+Raygoza+Biostatistics+course+Part+10&aqs=chrome..69i57.3448j0j7&sourceid=chrome&ie=UTF-8
   - https://slideplayer.com/slide/9837395/

3. Using a z-test instead of a binomial test:
   - When you can use it: https://stats.stackexchange.com/questions/424446/when-can-we-use-a-z-test-instead-of-a-binomial-test
   - How to use it: https://cogsci.ucsd.edu/~dgroppe/STATZ/binomial_ztest.pdf

I accept donations!

Paypal


Cryptocurrency

You can add a transaction message with the name of a project or a custom message if your wallet and the blockchain support this

Preferred blockchains:

blockchain | address |
--- | --- | ---
Bitcoin | `bc1pjd2c4xcgq978979htc9admycue4nqqhda3vwsc38agked8yya50qz454xc` |
Ethereum | `0x176D1b6c3Fc1db5f7f967Fdc735f8267cCe741F3` | Tether supports USDT ERC-20
Tron | `TMuNqEgEeBQ2GseWsqgaSdbtqasnJi8ePw` | Tether supports USDT TRC-20

Alternative options (Ethereum L2, LN, EVM-compatible networks) all use the same address: `0x176D1b6c3Fc1db5f7f967Fdc735f8267cCe741F3`

Owner

  • Name: Mykyta Matushyn
  • Login: Kukuster
  • Kind: user
  • Location: Ukraine

GitHub Events

Total
  • Push event: 6
Last Year
  • Push event: 6

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 21
  • Total Committers: 1
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Kukuster K****P@g****m 21

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 1
pypi.org: ci-methods-analyser

Analyse efficacy of your own confidence interval (CI) methods

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 29.8%
Downloads: 33.9%
Average: 36.0%
Stargazers count: 38.8%
Dependent repos count: 67.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

setup.py pypi