https://github.com/cokelaer/fitter

Fit data to many distributions

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.2%) to scientific vocabulary

Keywords

distribution fit python statistics

Keywords from Contributors

cycles sequences shellcode genomics interactive parallel serializer autograding projection routing
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 398
  • Watchers: 11
  • Forks: 58
  • Open Issues: 34
  • Releases: 11
Topics
distribution fit python statistics
Created over 11 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.rst


#############################
FITTER documentation
#############################

.. image:: https://badge.fury.io/py/fitter.svg
    :target: https://pypi.python.org/pypi/fitter

.. image:: https://github.com/cokelaer/fitter/actions/workflows/main.yml/badge.svg?branch=main
    :target: https://github.com/cokelaer/fitter/actions/workflows/main.yml

.. image:: https://coveralls.io/repos/cokelaer/fitter/badge.png?branch=main
    :target: https://coveralls.io/r/cokelaer/fitter?branch=main

.. image:: http://readthedocs.org/projects/fitter/badge/?version=latest
    :target: http://fitter.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

.. image:: https://zenodo.org/badge/23078551.svg
   :target: https://zenodo.org/badge/latestdoi/23078551

Compatible with Python 3.7, 3.8, and 3.9


What is it?
################

The **fitter** package is a Python library for fitting probability distributions to data. It provides a straightforward and intuitive interface to estimate parameters for various types of distributions, both continuous and discrete. Using **fitter**, you can easily fit a range of distributions to your data and compare their fit, which helps in selecting the most suitable distribution. The package is designed to be user-friendly and requires minimal setup, making it a useful tool for data scientists and statisticians working with probability distributions.

Installation
###################

::

    pip install fitter

**fitter** is also available on **conda** (bioconda channel)::

     conda install fitter


Usage
##################

Standalone
===========

A very simple standalone application is also provided. It takes a CSV file as
input::

    fitter fitdist data.csv --column-number 1 --distributions gamma,normal

It creates an image file called fitter.png and a log file called fitter.log.
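
To try the command above without real data, a test file can be generated with the Python standard library. This is only a minimal sketch: the helper name ``write_gamma_csv`` is hypothetical, and the file layout (one value per row, no header line) is an assumption about what the standalone tool accepts, so check the tool's help output before relying on it.

```python
import csv
import random


def write_gamma_csv(path, n=1000, shape=2.0, scale=2.0, loc=1.5, seed=0):
    """Write n gamma-distributed samples to a one-column CSV file.

    Assumed layout (not confirmed by the README): one value per row,
    no header line.
    """
    rng = random.Random(seed)  # seeded so the file is reproducible
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(n):
            # random.gammavariate(alpha, beta): alpha is the shape,
            # beta is the scale; loc shifts the samples.
            writer.writerow([loc + rng.gammavariate(shape, scale)])


if __name__ == "__main__":
    write_gamma_csv("data.csv")
```

The resulting ``data.csv`` can then be passed to the ``fitter fitdist`` command shown above.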

From Python shell
==================

First, let us create a data sample of N = 10,000 points drawn from a gamma distribution::

    from scipy import stats
    data = stats.gamma.rvs(2, loc=1.5, scale=2, size=10000)

.. note:: Fitting is slow, so keep the ``size`` value reasonably small.

Now, without any knowledge about the distribution or its parameters, which distribution fits the data best? SciPy has 80 distributions, and the **Fitter** class will scan all of them, call the fit function for you (ignoring those that fail or run forever), and finally give you a summary of the best distributions in the sense of the sum of squared errors. An example shows this best::


    from fitter import Fitter
    f = Fitter(data)
    f.fit()
    # may take some time since, by default, all distributions are tried,
    # but you can manually provide a smaller set of distributions
    f.summary()


.. image:: http://pythonhosted.org/fitter/_images/index-1.png
    :target: http://pythonhosted.org/fitter/_images/index-1.png


See the `online <https://fitter.readthedocs.io/>`_ documentation for details.
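
To make the ranking criterion concrete, the sketch below reproduces the idea of "best in the sense of the sum of squared errors" using SciPy and NumPy only. It is an illustration under stated assumptions, not the package's own code: the function name ``rank_by_sse`` is hypothetical, and the error is assumed to be measured between each fitted density and a normalised histogram of the data, evaluated at the bin centres.

```python
import numpy as np
from scipy import stats


def rank_by_sse(data, names, bins=100):
    """Fit each named SciPy distribution and rank by sum of squared errors."""
    # Normalised histogram: an empirical estimate of the density.
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2.0

    errors = {}
    for name in names:
        dist = getattr(stats, name)
        params = dist.fit(data)  # maximum-likelihood parameter estimate
        pdf = dist.pdf(centers, *params)
        # Squared distance between fitted density and histogram.
        errors[name] = float(np.sum((pdf - density) ** 2))

    # Smallest error first, i.e. best candidate first.
    return sorted(errors.items(), key=lambda item: item[1])


data = stats.gamma.rvs(2, loc=1.5, scale=2, size=5000, random_state=0)
for name, sse in rank_by_sse(data, ["gamma", "norm", "expon"]):
    print(f"{name:8s} {sse:.5f}")
```

A smaller error means the fitted density follows the histogram more closely. The result depends on the number of bins, so two candidates with similar errors may swap places when ``bins`` changes.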


Contributors
=============


Setting up and maintaining Fitter has been possible thanks to users and contributors.
Thanks to all:

.. image:: https://contrib.rocks/image?repo=cokelaer/fitter
    :target: https://github.com/cokelaer/fitter/graphs/contributors




Changelog
~~~~~~~~~

========= ==========================================================================
Version   Description
========= ==========================================================================
1.7.1     * integrate PR github.com/cokelaer/fitter/pull/100 from @vitorandreazza
            to speed up multiprocessing runs.
1.7.0     * replace logging with loguru
          * main application update to add missing --output-image option and use
            rich_click
          * replace pkg_resources with importlib
1.6.0     * for developers: uses pyproject.toml instead of setup.py
          * Fix progress bar fixing https://github.com/cokelaer/fitter/pull/74
          * Fix BIC formula https://github.com/cokelaer/fitter/pull/77
1.5.2     * PR https://github.com/cokelaer/fitter/pull/74 to fix logger
1.5.1     * fixed regression putting back joblib
1.5.0     * removed easydev and replaced by tqdm for progress bar
          * progressbar from tqdm also allows replacement of joblib need
1.4.1     * Update timeout in docs from 10 to 30 seconds by @mpadge in
            https://github.com/cokelaer/fitter/pull/47
          * Add Kolmogorov-Smirnov goodness-of-fit statistic by @lahdjirayhan in
            https://github.com/cokelaer/fitter/pull/58
          * switch branch from master to main
1.4.0     * get_best function now returns the parameters as a dictionary
            of parameter names and their values rather than just a list of
            values (https://github.com/cokelaer/fitter/issues/23) thanks to
            contributor @kabirmdasraful
          * Accepting PR to fix progress bar issue reported in
            https://github.com/cokelaer/fitter/pull/37
1.3.0     * parallel process implemented https://github.com/cokelaer/fitter/pull/25
            thanks to @arsenyinfo
1.2.3     * remove verbose arguments in Fitter class. Using the logging module
            instead
          * Fitter.fit now has a progress bar
          * add a standalone application called … fitter (see the doc)
1.2.2     was not released
1.2.1     adding new class called histfit (see documentation)
1.2       * Fixed the version. Previous version switched from
            1.0.9 to 1.1.11. To start a fresh version, we increase to 1.2.0
          * Merged pull request required by bioconda
          * Merged pull request related to implementation of
            AIC/BIC/KL criteria (https://github.com/cokelaer/fitter/pull/19).
            This also fixes https://github.com/cokelaer/fitter/issues/9
          * Implement two functions to get all distributions, or a list of
            common distributions to help users decrease computational time
            (https://github.com/cokelaer/fitter/issues/20). Also added a FAQS
            section.
          * travis tested Python 3.6 and 3.7 (not 3.5 anymore)
1.1       * Fixed deprecated warning
          * fitter is now in readthedocs at fitter.readthedocs.io
1.0.9     * https://github.com/cokelaer/fitter/pull/8 and 11
            PR https://github.com/cokelaer/fitter/pull/8
1.0.6     * summary() now returns the dataframe (instead of printing it)
1.0.5      https://github.com/cokelaer/fitter/issues
1.0.2     add manifest to fix missing source in the pypi repository.
========= ==========================================================================

Owner

  • Name: Thomas Cokelaer
  • Login: cokelaer
  • Kind: user
  • Location: Paris, France
  • Company: Institut Pasteur

Bioinformatician, Scientific Software Developer, Python developer

GitHub Events

Total
  • Issues event: 1
  • Watch event: 34
  • Delete event: 2
  • Issue comment event: 6
  • Push event: 2
  • Pull request event: 5
  • Fork event: 2
Last Year
  • Issues event: 1
  • Watch event: 34
  • Delete event: 2
  • Issue comment event: 6
  • Push event: 2
  • Pull request event: 5
  • Fork event: 2

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 162
  • Total Committers: 18
  • Avg Commits per committer: 9.0
  • Development Distribution Score (DDS): 0.191
Past Year
  • Commits: 7
  • Committers: 3
  • Avg Commits per committer: 2.333
  • Development Distribution Score (DDS): 0.429
Top Committers

======================== ============= =======
Name                     Email         Commits
======================== ============= =======
Thomas Cokelaer          c****r@g****m 131
dependabot[bot]          4****]        9
Arseny Kravchenko        me@a****o     3
Nuclear03020704          5****n        3
msat59                   m****i@g****m 2
vitor                    v****a@h****m 2
Asraful Kabir            a****r@i****e 1
Ashutosh Varma           a****1@l****m 1
Brian                    b****w@g****m 1
Caio Stringari           c****i@g****m 1
Christian Brueffer       c****n@b****o 1
Eike Broda               g****t@e****e 1
Elmar Pruesse            e****e        1
Karthikeyan Singaravelan t****i@g****m 1
Stefano Alberto Russo    s****o@g****m 1
Zhengyi Li               l****u@g****m 1
mark padgham             m****m@e****m 1
negodfre                 n****y@g****m 1
======================== ============= =======


Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 72
  • Total pull requests: 36
  • Average time to close issues: 4 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 65
  • Total pull request authors: 21
  • Average comments per issue: 1.63
  • Average comments per pull request: 1.08
  • Merged pull requests: 29
  • Bot issues: 0
  • Bot pull requests: 10
Past Year
  • Issues: 1
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • cokelaer (3)
  • msat59 (2)
  • sarusso (2)
  • zunzun (2)
  • findyy99 (2)
  • AnthoineResea (1)
  • AlessandroMinervini (1)
  • dulti (1)
  • alekin2 (1)
  • Camil88 (1)
  • MadRabbit05 (1)
  • Linduri (1)
  • kharelg1 (1)
  • odsogunro (1)
  • helios1014 (1)
Pull Request Authors
  • dependabot[bot] (15)
  • cokelaer (4)
  • sarusso (2)
  • vitorandreazza (2)
  • tg12 (2)
  • lahdjirayhan (2)
  • ErnestDong (1)
  • ebroda (1)
  • H3cth0r (1)
  • ashutoshvarma (1)
  • cbrueffer (1)
  • arsenyinfo (1)
  • mpadge (1)
  • msat59 (1)
  • li-positive-one (1)
Top Labels
Issue Labels
  • question / problems (7)
  • enhancement (3)
  • bug (2)
  • Feature request (1)
Pull Request Labels
  • dependencies (15)
  • enhancement (2)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 24,931 last-month
  • Total docker downloads: 657
  • Total dependent packages: 6
    (may contain duplicates)
  • Total dependent repositories: 79
    (may contain duplicates)
  • Total versions: 29
  • Total maintainers: 2
pypi.org: fitter

A tool to fit data to many distributions and get the best one(s)

  • Versions: 26
  • Dependent Packages: 6
  • Dependent Repositories: 79
  • Downloads: 24,931 Last month
  • Docker Downloads: 657
Rankings
Downloads: 1.5%
Average: 1.6%
Dependent packages count: 1.6%
Docker downloads count: 1.6%
Dependent repos count: 1.7%
Maintainers (1)
Last synced: 6 months ago
spack.io: py-fitter

The fitter package provides a simple class to identify the distribution from which a data sample was generated. It uses 80 distributions from SciPy and allows you to plot the results to check which distribution is the most probable and what its best parameters are.

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Stargazers count: 13.6%
Forks count: 16.6%
Average: 21.9%
Dependent packages count: 57.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/main.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/pypi.yml actions
  • actions/checkout main composite
  • actions/setup-python v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
poetry.lock pypi
  • certifi 2023.7.22
  • charset-normalizer 3.2.0
  • click 8.1.6
  • colorama 0.4.6
  • contourpy 1.1.0
  • coverage 6.5.0
  • coveralls 3.3.1
  • cycler 0.11.0
  • docopt 0.6.2
  • exceptiongroup 1.1.2
  • execnet 2.0.2
  • fonttools 4.42.0
  • idna 3.4
  • importlib-resources 6.0.1
  • iniconfig 2.0.0
  • joblib 1.3.1
  • kiwisolver 1.4.4
  • matplotlib 3.7.2
  • numpy 1.24.4
  • packaging 23.1
  • pandas 2.0.3
  • pillow 10.0.1
  • pluggy 1.2.0
  • pyparsing 3.0.9
  • pytest 7.4.0
  • pytest-cov 4.1.0
  • pytest-mock 3.11.1
  • pytest-timeout 1.4.2
  • pytest-xdist 3.3.1
  • python-dateutil 2.8.2
  • pytz 2023.3
  • requests 2.31.0
  • scipy 1.9.3
  • six 1.16.0
  • tomli 2.0.1
  • tqdm 4.65.1
  • tzdata 2023.3
  • urllib3 2.0.7
  • zipp 3.16.2
pyproject.toml pypi
  • click ^8.1.6
  • joblib ^1.3.1
  • matplotlib >=3.7.2
  • numpy >=1.20
  • pandas ^2.0.3
  • python ^3.8
  • scipy >=0.18
  • tqdm ^4.65.1