statsmodels

Statsmodels: statistical modeling and econometrics in Python

https://github.com/statsmodels/statsmodels

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    43 of 461 committers (9.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

count-model data-analysis data-science econometrics forecasting generalized-linear-models hypothesis-testing prediction python regression-models robust-estimation statistics timeseries-analysis

Keywords from Contributors

closember ecog eeg meg neuroimaging neuroscience magnetoencephalography electroencephalography electrocorticography data-mining

Scientific Fields

Agricultural and Biological Sciences Life Sciences - 40% confidence
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 10,920
  • Watchers: 288
  • Forks: 3,284
  • Open Issues: 2,924
  • Releases: 30
Created over 14 years ago · Last pushed 5 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.rst

.. image:: docs/source/images/statsmodels-logo-v2-horizontal.svg
  :alt: Statsmodels logo

|PyPI Version| |Conda Version| |License| |Azure CI Build Status|
|Codecov Coverage| |Coveralls Coverage| |PyPI downloads| |Conda downloads|

About statsmodels
=================

statsmodels is a Python package that provides a complement to scipy for
statistical computations including descriptive statistics and estimation
and inference for statistical models.


Documentation
=============

The documentation for the latest release is at

https://www.statsmodels.org/stable/

The documentation for the development version is at

https://www.statsmodels.org/dev/

Recent improvements are highlighted in the release notes

https://www.statsmodels.org/stable/release/

Backups of documentation are available at https://statsmodels.github.io/stable/
and https://statsmodels.github.io/dev/.


Main Features
=============

* Linear regression models:

  - Ordinary least squares
  - Generalized least squares
  - Weighted least squares
  - Least squares with autoregressive errors
  - Quantile regression
  - Recursive least squares

* Mixed Linear Model with mixed effects and variance components
* GLM: Generalized linear models with support for all of the one-parameter
  exponential family distributions
* Bayesian Mixed GLM for Binomial and Poisson
* GEE: Generalized Estimating Equations for one-way clustered or longitudinal data
* Discrete models:

  - Logit and Probit
  - Multinomial logit (MNLogit)
  - Poisson and Generalized Poisson regression
  - Negative Binomial regression
  - Zero-Inflated Count models

* RLM: Robust linear models with support for several M-estimators.
* Time Series Analysis: models for time series analysis

  - Complete StateSpace modeling framework

    - Seasonal ARIMA and ARIMAX models
    - VARMA and VARMAX models
    - Dynamic Factor models
    - Unobserved Component models

  - Markov switching models (MSAR), also known as Hidden Markov Models (HMM)
  - Univariate time series analysis: AR, ARIMA
  - Vector autoregressive models, VAR and structural VAR
  - Vector error correction model, VECM
  - Exponential smoothing, Holt-Winters
  - Hypothesis tests for time series: unit root, cointegration and others
  - Descriptive statistics and process models for time series analysis

* Survival analysis:

  - Proportional hazards regression (Cox models)
  - Survivor function estimation (Kaplan-Meier)
  - Cumulative incidence function estimation

* Multivariate:

  - Principal Component Analysis with missing data
  - Factor Analysis with rotation
  - MANOVA
  - Canonical Correlation

* Nonparametric statistics: Univariate and multivariate kernel density estimators
* Datasets: Datasets used for examples and in testing
* Statistics: a wide range of statistical tests

  - diagnostics and specification tests
  - goodness-of-fit and normality tests
  - functions for multiple testing
  - various additional statistical tests

* Imputation with MICE, regression on order statistics, and Gaussian imputation
* Mediation analysis
* Graphics: plot functions for visual analysis of data and model results

* I/O

  - Tools for reading Stata .dta files (pandas provides a more recent reader)
  - Table output to ascii, latex, and html

* Miscellaneous models
* Sandbox: statsmodels contains a sandbox folder with code in various stages of
  development and testing which is not considered "production ready". This
  includes, among others:

  - Generalized method of moments (GMM) estimators
  - Kernel regression
  - Various extensions to scipy.stats.distributions
  - Panel data models
  - Information theoretic measures

How to get it
=============
The main branch on GitHub has the most up-to-date code

https://www.github.com/statsmodels/statsmodels

Source downloads of release tags are available on GitHub

https://github.com/statsmodels/statsmodels/tags

Binaries and source distributions are available from PyPI

https://pypi.org/project/statsmodels/

Binaries can be installed in Anaconda

.. code:: bash

   conda install statsmodels


Getting the latest code
=======================

Installing the most recent nightly wheel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The most recent nightly wheel can be installed using pip.

.. code:: bash

   python -m pip install -i https://pypi.anaconda.org/scientific-python-nightly-wheels/simple statsmodels --upgrade --use-deprecated=legacy-resolver
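
After installing, it can be useful to confirm which statsmodels version the current environment actually picks up. The sketch below uses only the standard library; ``installed_version`` is a hypothetical helper written for this example, not part of statsmodels.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what this environment sees; nothing is installed or changed.
version = installed_version("statsmodels")
if version is None:
    print("statsmodels is not installed in this environment")
else:
    print(f"statsmodels {version}")
```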

Installing from sources
~~~~~~~~~~~~~~~~~~~~~~~

See INSTALL.txt for requirements or see the documentation

https://statsmodels.github.io/dev/install.html

Contributing
============
Contributions in any form are welcome, including:

* Documentation improvements
* Additional tests
* New features to existing models
* New models

See

https://www.statsmodels.org/stable/dev/test_notes

for instructions on installing statsmodels in *editable* mode.

License
=======

Modified BSD (3-clause)

Discussion and Development
==========================

Discussions take place on the mailing list

https://groups.google.com/group/pystatsmodels

and in the issue tracker. We are very interested in feedback
about usability and suggestions for improvements.

Bug Reports
===========

Bug reports can be submitted to the issue tracker at

https://github.com/statsmodels/statsmodels/issues

.. |Azure CI Build Status| image:: https://dev.azure.com/statsmodels/statsmodels-testing/_apis/build/status/statsmodels.statsmodels?branchName=main
   :target: https://dev.azure.com/statsmodels/statsmodels-testing/_build/latest?definitionId=1&branchName=main
.. |Codecov Coverage| image:: https://codecov.io/gh/statsmodels/statsmodels/branch/main/graph/badge.svg
   :target: https://codecov.io/gh/statsmodels/statsmodels
.. |Coveralls Coverage| image:: https://coveralls.io/repos/github/statsmodels/statsmodels/badge.svg?branch=main
   :target: https://coveralls.io/github/statsmodels/statsmodels?branch=main
.. |PyPI downloads| image:: https://img.shields.io/pypi/dm/statsmodels?label=PyPI%20Downloads
   :alt: PyPI - Downloads
   :target: https://pypi.org/project/statsmodels/
.. |Conda downloads| image:: https://img.shields.io/conda/dn/conda-forge/statsmodels.svg?label=Conda%20downloads
   :target: https://anaconda.org/conda-forge/statsmodels/
.. |PyPI Version| image:: https://img.shields.io/pypi/v/statsmodels.svg
   :target: https://pypi.org/project/statsmodels/
.. |Conda Version| image:: https://anaconda.org/conda-forge/statsmodels/badges/version.svg
   :target: https://anaconda.org/conda-forge/statsmodels/
.. |License| image:: https://img.shields.io/pypi/l/statsmodels.svg
   :target: https://github.com/statsmodels/statsmodels/blob/main/LICENSE.txt

Owner

  • Name: statsmodels
  • Login: statsmodels
  • Kind: organization
  • Email: pystatsmodels@googlegroups.com

Citation (CITATION.cff)

cff-version: 1.2.0
title: statsmodels
message: >-
  Please use the following citation to cite statsmodels in
  scientific publications
type: software
authors:
  - given-names: Skipper
    family-names: Seabold
  - given-names: Josef
    family-names: Perktold
repository-code: 'https://github.com/statsmodels/statsmodels'
url: 'https://www.statsmodels.org/'
keywords:
  - python
  - data-science
  - statistics
  - prediction
  - econometrics
  - forecasting
  - data-analysis
  - regression-models
  - hypothesis-testing
  - generalized-linear-models
  - timeseries-analysis
  - robust-estimation
  - count-model
license: BSD-3-Clause
preferred-citation:
  type: article
  authors:
    - given-names: Skipper
      family-names: Seabold
    - given-names: Josef
      family-names: Perktold
  title: "statsmodels: Econometric and statistical modeling with python"
  journal: "9th Python in Science Conference"
  year: 2010

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 12,403
  • Total Committers: 461
  • Avg Commits per committer: 26.905
  • Development Distribution Score (DDS): 0.785
Past Year
  • Commits: 202
  • Committers: 31
  • Avg Commits per committer: 6.516
  • Development Distribution Score (DDS): 0.441
Top Committers
Name Email Commits
Josef Perktold j****d@g****m 2,668
Skipper Seabold j****d@g****m 2,411
Kevin Sheppard k****d@g****m 1,388
Chad Fulton c****d@c****m 1,223
Kerby Shedden k****n@u****u 962
Brock Mendel j****l@g****m 473
Justin Grana j****a@s****u 204
thequackdaddy p****k@g****m 161
Vincent Arel-Bundock v****l@u****u 141
langmore i****e@g****m 130
Jonathan Taylor j****o@m****u 119
tim.leslie 115
Ralf Gommers r****s@g****m 101
Bart Baker b****r@g****m 100
vegcev a****s@s****t 97
Wes McKinney w****n@g****m 93
Agriya Khetarpal 7****l 87
Christopher Burns c****s@b****u 82
Samuel Scherrer s****r@p****e 82
Evgeny Zhurko e****o@g****m 79
Kevin Sheppard k****d@g****m 64
Enrico Giampieri e****i@u****t 50
Yichuan Liu y****4@g****m 44
Paul Hobson p****n@g****m 43
Jarrod Millman j****n@g****m 42
Matthew Brett m****t@g****m 39
Pamphile ROY r****e@g****m 36
tvanzyl t****l@g****m 30
Vincent Davis v****t@v****t 29
Alex Griffing a****i@n****u 25
and 431 more...
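
The all-time figures above can be reproduced from the listed counts. The page does not define the Development Distribution Score, so the sketch below assumes a common definition: one minus the share of commits made by the single most active committer. Under that assumption the numbers agree with the reported values.

```python
# Figures copied from the "All Time" and "Top Committers" entries above.
total_commits = 12_403
total_committers = 461
top_committer_commits = 2_668  # Josef Perktold, the most active committer

# Mean number of commits per committer.
avg_commits = total_commits / total_committers

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds = 1 - top_committer_commits / total_commits

print(f"Avg commits per committer: {avg_commits:.3f}")  # 26.905
print(f"DDS: {dds:.3f}")  # 0.785
```

A score near 0 would mean one person wrote almost every commit; a value of 0.785 indicates that work is spread across many contributors.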

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 713
  • Total pull requests: 495
  • Average time to close issues: 8 months
  • Average time to close pull requests: 5 months
  • Total issue authors: 375
  • Total pull request authors: 114
  • Average comments per issue: 3.17
  • Average comments per pull request: 3.03
  • Merged pull requests: 296
  • Bot issues: 0
  • Bot pull requests: 18
Past Year
  • Issues: 155
  • Pull requests: 193
  • Average time to close issues: 7 days
  • Average time to close pull requests: 4 days
  • Issue authors: 94
  • Pull request authors: 37
  • Average comments per issue: 0.97
  • Average comments per pull request: 1.4
  • Merged pull requests: 119
  • Bot issues: 0
  • Bot pull requests: 10
Top Authors
Issue Authors
  • josef-pkt (285)
  • bashtage (12)
  • jseabold (5)
  • louisabraham (3)
  • kloczek (3)
  • quant12345 (3)
  • larsoner (3)
  • luke396 (3)
  • dblim (3)
  • celestinoxp (2)
  • EBoiSha (2)
  • marcdelabarrera (2)
  • amaranthjinn (2)
  • ahbon123 (2)
  • hoechenberger (2)
Pull Request Authors
  • bashtage (206)
  • josef-pkt (42)
  • dependabot[bot] (18)
  • jbrockmendel (12)
  • agriyakhetarpal (9)
  • luke396 (8)
  • boringbyte (6)
  • star1327p (5)
  • aglebov (5)
  • jseabold (5)
  • maxkuttner (4)
  • quant12345 (4)
  • EBoisseauSierra (4)
  • jalopezp (4)
  • mdruiter (4)
Top Labels
Issue Labels
type-enh (237) comp-stats (101) comp-discrete (72) comp-genmod (57) type-bug (52) comp-regression (51) comp-base (49) comp-robust (42) comp-docs (34) comp-tsa (31) topic-diagnostic (28) topic-post_estim (25) prio-elev (24) type-refactor (23) design (21) FAQ (20) topic-predict (14) question (13) comp-tsa-statespace (13) wishlist (12) topic-penalization (12) comp-multivariate (12) comp-causal (12) topic-covtype (11) comp-othermod (10) maintenance (10) prio-high (9) comp-tools (9) type-test (8) roadmap (7)
Pull Request Labels
type-enh (60) comp-stats (31) type-bug (30) maintenance (23) comp-docs (22) dependencies (18) comp-robust (16) comp-tsa (15) superseded (13) comp-genmod (13) prio-elev (13) type-refactor (9) comp-regression (8) comp-discrete (7) Documentation (7) comp-tsa-statespace (7) comp-graphics (6) comp-formula (5) github_actions (5) comp-io (3) needs discussion (3) prio-high (3) comp-base (3) comp-multivariate (3) backport maintenance/0.14.x (3) comp-distributions (2) corner-case (2) rejected (2) type-test (2) Performance (2)

Packages

  • Total packages: 6
  • Total downloads:
    • pypi 21,722,685 last-month
  • Total docker downloads: 384,168,899
  • Total dependent packages: 1,708
    (may contain duplicates)
  • Total dependent repositories: 25,954
    (may contain duplicates)
  • Total versions: 110
  • Total maintainers: 7
pypi.org: statsmodels

Statistical computations and models for Python

  • Versions: 40
  • Dependent Packages: 1,565
  • Dependent Repositories: 23,993
  • Downloads: 21,721,993 Last month
  • Docker Downloads: 384,168,899
Rankings
Dependent packages count: 0.0%
Dependent repos count: 0.0%
Downloads: 0.1%
Average: 0.2%
Docker downloads count: 0.3%
Forks count: 0.4%
Stargazers count: 0.6%
Last synced: 4 months ago
conda-forge.org: statsmodels
  • Versions: 16
  • Dependent Packages: 130
  • Dependent Repositories: 979
Rankings
Dependent packages count: 0.6%
Dependent repos count: 0.8%
Average: 1.7%
Forks count: 2.1%
Stargazers count: 3.3%
Last synced: 4 months ago
proxy.golang.org: github.com/statsmodels/statsmodels
  • Versions: 31
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Forks count: 0.1%
Stargazers count: 0.7%
Average: 3.5%
Dependent repos count: 4.8%
Dependent packages count: 8.5%
Last synced: 4 months ago
anaconda.org: statsmodels

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics are available for different types of data and each estimator. Researchers across fields may find that statsmodels fully meets their needs for statistical computing and data analysis in Python.

  • Versions: 16
  • Dependent Packages: 13
  • Dependent Repositories: 979
Rankings
Dependent packages count: 4.1%
Dependent repos count: 4.7%
Average: 6.0%
Forks count: 6.1%
Stargazers count: 8.9%
Last synced: 4 months ago
pypi.org: sm2

Bugfix Fork of statsmodels

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 605 Last month
Rankings
Dependent packages count: 10.0%
Average: 16.8%
Downloads: 18.6%
Dependent repos count: 21.7%
Maintainers (1)
Last synced: about 1 year ago
pypi.org: statsmodels-dq

Statistical computations and models for Python

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 87 Last month
Rankings
Dependent packages count: 9.8%
Downloads: 15.6%
Average: 20.4%
Dependent repos count: 21.8%
Forks count: 22.7%
Stargazers count: 32.0%
Maintainers (1)
Last synced: 4 months ago

Dependencies

.github/workflows/backport.yml actions
  • tibdex/backport v2 composite
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/generate-documentation.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • r-lib/actions/setup-pandoc v2 composite
  • ts-graphviz/setup-graphviz v1 composite
requirements-dev.txt pypi
  • colorama * development
  • cython >=0.29.28,<3.0.0 development
  • flake8 * development
  • isort * development
  • joblib * development
  • matplotlib >=3 development
  • oldest-supported-numpy >=2022.4.18 development
  • pytest * development
  • pytest-randomly * development
  • pytest-xdist * development
  • pywinpty * development
  • setuptools_scm * development
requirements-doc.txt pypi
  • arviz *
  • jinja2 ==3.0.3
  • jupyter *
  • nbconvert *
  • nbsphinx *
  • notebook *
  • numpydoc *
  • pandas-datareader *
  • pymc3 *
  • pyyaml *
  • seaborn *
  • simplegeneric *
  • sphinx ==5.3.0
  • sphinx-material *
  • theano-pymc *
requirements.txt pypi
  • numpy >=1.22.3
  • numpy >=1.18
  • packaging >=21.3
  • pandas >=1.0
  • patsy >=0.5.2
  • scipy >=1.4,
tools/R2nparray/DESCRIPTION cran
docs/source/_static/versions.json meteor
pyproject.toml pypi
setup.py pypi