Pyglmnet

Pyglmnet: Python implementation of elastic-net regularized generalized linear models - Published in JOSS (2020)

https://github.com/glm-tools/pyglmnet

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    3 of 29 committers (10.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

data-science elastic-net glm lasso machine-learning python

Scientific Fields

Mathematics Computer Science - 84% confidence
Last synced: 4 months ago

Repository

Python implementation of elastic-net regularized generalized linear models

Statistics
  • Stars: 284
  • Watchers: 12
  • Forks: 84
  • Open Issues: 39
  • Releases: 3
Topics
data-science elastic-net glm lasso machine-learning python
Created over 9 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Zenodo

README.rst

pyglmnet
========

A python implementation of elastic-net regularized generalized linear models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

|License| |Travis| |Codecov| |Circle| |Gitter| |DOI| |JOSS|

`[Documentation (stable version)]`_ `[Documentation (development version)]`_

.. image:: https://user-images.githubusercontent.com/15852194/67919367-70482600-fb76-11e9-9b86-891969bd2bee.jpg

-  Pyglmnet provides a wide range of noise models (and paired canonical
   link functions): ``'gaussian'``, ``'binomial'``, ``'probit'``,
   ``'gamma'``, ``'poisson'``, and ``'softplus'``.

-  It supports a wide range of regularizers: ridge, lasso, elastic net,
   group lasso, and Tikhonov regularization.

-  We have implemented a cyclical coordinate descent optimizer with
   Newton update, active sets, update caching, and warm restarts. This
   optimization approach is identical to the one used in the R glmnet
   package.

-  A number of Python wrappers exist for the R glmnet package, but in
   contrast to these, Pyglmnet is a pure Python implementation.
   Therefore, it is easy to modify and to extend with additional noise
   models and regularizers in the future.
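
To make the regularizers listed above concrete, here is a minimal sketch of an elastic-net penalty and of the soft-thresholding step that coordinate-descent solvers commonly use for the lasso term. It is written in plain Python and is not taken from the Pyglmnet source. The function names and the parameterization (``reg_lambda`` for overall strength, ``alpha`` for the L1/L2 mix) are assumptions chosen for illustration.

```python
def elastic_net_penalty(beta, reg_lambda, alpha):
    """Elastic-net penalty: alpha mixes the L1 (lasso) and L2 (ridge) terms."""
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return reg_lambda * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


def soft_threshold(z, threshold):
    """Shrink z toward zero by threshold; small values become exactly 0."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0


beta = [0.0, 2.0, -1.0]
ridge = elastic_net_penalty(beta, reg_lambda=0.1, alpha=0.0)  # pure L2
lasso = elastic_net_penalty(beta, reg_lambda=0.1, alpha=1.0)  # pure L1
mixed = elastic_net_penalty(beta, reg_lambda=0.1, alpha=0.5)  # equal mix
print(ridge, lasso, mixed)
```

Under this convention, ``alpha=0`` gives ridge and ``alpha=1`` gives lasso. The soft-thresholding step is what sets small coefficients exactly to zero, which is why lasso-type penalties produce sparse models.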

Installation
~~~~~~~~~~~~

Install the stable PyPI version with ``pip``:

.. code:: bash

    $ pip install pyglmnet

For the bleeding-edge development version, install directly from the
``master`` branch on GitHub:

.. code:: bash

    $ pip install https://api.github.com/repos/glm-tools/pyglmnet/zipball/master
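
After either install, a quick way to confirm which version ended up in the environment is to query the package metadata. This is a small sketch using only the standard library (Python 3.8+); the helper name ``installed_version`` is hypothetical and not part of Pyglmnet.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pyglmnet")
print("pyglmnet version:", version if version else "not installed")
```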

Getting Started
~~~~~~~~~~~~~~~


Here is an example of how to use the ``GLM`` estimator.

.. code:: python

    import numpy as np
    import scipy.sparse as sps

    import matplotlib.pyplot as plt
    from pyglmnet import GLM, simulate_glm

    n_samples, n_features = 1000, 100
    distr = 'poisson'

    # sample a sparse model
    np.random.seed(42)
    beta0 = np.random.rand()
    beta = sps.random(1, n_features, density=0.2).toarray()[0]

    # simulate data
    Xtrain = np.random.normal(0.0, 1.0, [n_samples, n_features])
    ytrain = simulate_glm(distr, beta0, beta, Xtrain)
    Xtest = np.random.normal(0.0, 1.0, [n_samples, n_features])
    ytest = simulate_glm(distr, beta0, beta, Xtest)

    # create an instance of the GLM class
    glm = GLM(distr=distr, score_metric='pseudo_R2', reg_lambda=0.01)

    # fit the model on the training data
    glm.fit(Xtrain, ytrain)

    # predict using fitted model on the test data
    yhat = glm.predict(Xtest)

    # score the model on test data
    pseudo_R2 = glm.score(Xtest, ytest)
    print('Pseudo R^2 is %.3f' % pseudo_R2)

    # plot the true coefficients and the estimated ones
    plt.stem(beta, markerfmt='r.', label='True coefficients')
    plt.stem(glm.beta_, markerfmt='b.', label='Estimated coefficients')
    plt.ylabel(r'$\beta$')
    plt.legend(loc='upper right')

    # plot the true vs predicted label
    plt.figure()
    plt.plot(ytest, yhat, '.')
    plt.xlabel('True labels')
    plt.ylabel('Predicted labels')
    plt.plot([0, ytest.max()], [0, ytest.max()], 'r--')
    plt.show()

More pyglmnet examples and use cases are available in the documentation.
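
The example above scores the model with ``score_metric='pseudo_R2'``. As a rough guide to reading that number, the sketch below computes one common deviance-based pseudo R² for Poisson counts in plain Python. It is an illustration under stated assumptions, not Pyglmnet's implementation, which may differ in details. The function names are hypothetical, and the predicted means are assumed to be strictly positive wherever the observed count is positive.

```python
import math


def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y under predicted means mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        # y * log(mu) is taken to be 0 when y == 0 (its limiting value)
        term = yi * math.log(mi) if yi > 0 else 0.0
        total += term - mi - math.lgamma(yi + 1)
    return total


def poisson_pseudo_r2(y, yhat):
    """Deviance-based pseudo R^2: 1 - D(model) / D(null)."""
    ybar = sum(y) / len(y)
    ll_model = poisson_loglik(y, yhat)
    ll_null = poisson_loglik(y, [ybar] * len(y))  # intercept-only model
    ll_sat = poisson_loglik(y, y)                 # saturated: mu_i = y_i
    return 1.0 - (ll_sat - ll_model) / (ll_sat - ll_null)


y = [0, 1, 3, 2, 6]
r2 = poisson_pseudo_r2(y, [0.5, 1.0, 2.5, 2.0, 5.5])
print("Pseudo R^2 is %.3f" % r2)
```

With this definition, a model that predicts only the mean count scores 0 and a model that reproduces every count scores 1, so values closer to 1 indicate a better fit relative to the intercept-only baseline.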

Tutorial
~~~~~~~~

The documentation includes an extensive tutorial on GLMs, optimization
and pseudo-code.

Slides from a talk at PyData Chicago 2016, the corresponding tutorial
notebooks, and a video of the talk are also available online.

How to contribute?
~~~~~~~~~~~~~~~~~~

We welcome pull requests. Please see our developer documentation page
for more details.

Citation
~~~~~~~~

If you use the ``pyglmnet`` package in your publication, please cite our
JOSS publication using the following BibTeX entry:

.. code::

   @article{Jas2020,
   doi = {10.21105/joss.01959},
   url = {https://doi.org/10.21105/joss.01959},
   year = {2020},
   publisher = {The Open Journal},
   volume = {5},
   number = {47},
   pages = {1959},
   author = {Mainak Jas and Titipat Achakulvisut and Aid Idrizović
             and Daniel Acuna and Matthew Antalek and Vinicius Marques
             and Tommy Odland and Ravi Garg and Mayank Agrawal
             and Yu Umegaki and Peter Foley and Hugo Fernandes
             and Drew Harris and Beibin Li and Olivier Pieters
             and Scott Otterson and Giovanni De Toni and Chris Rodgers
             and Eva Dyer and Matti Hamalainen and Konrad Kording and Pavan Ramkumar},
   title = {{P}yglmnet: {P}ython implementation of elastic-net regularized generalized linear models},
   journal = {Journal of Open Source Software}
   }

Acknowledgments
~~~~~~~~~~~~~~~

-  Konrad Kording for funding and support
-  Sara Solla for masterful GLM lectures

License
~~~~~~~

MIT License Copyright (c) 2016-2019 Pavan Ramkumar

.. |License| image:: https://img.shields.io/badge/license-MIT-blue.svg?style=flat
   :target: https://github.com/glm-tools/pyglmnet/blob/master/LICENSE
.. |Travis| image:: https://api.travis-ci.org/glm-tools/pyglmnet.svg?branch=master
   :target: https://travis-ci.org/glm-tools/pyglmnet
.. |Codecov| image:: https://codecov.io/github/glm-tools/pyglmnet/coverage.svg?precision=0
   :target: https://codecov.io/gh/glm-tools/pyglmnet
.. |Circle| image:: https://circleci.com/gh/glm-tools/pyglmnet.svg?style=svg
   :target: https://circleci.com/gh/glm-tools/pyglmnet
.. |Gitter| image:: https://badges.gitter.im/glm-tools/pyglmnet.svg
   :target: https://gitter.im/pavanramkumar/pyglmnet?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge
.. |DOI| image:: https://zenodo.org/badge/55302570.svg
   :target: https://zenodo.org/badge/latestdoi/55302570
.. |JOSS| image:: https://joss.theoj.org/papers/10.21105/joss.01959/status.svg
   :target: https://doi.org/10.21105/joss.01959
.. _[Documentation (stable version)]: http://glm-tools.github.io/pyglmnet

Owner

  • Name: glm-tools
  • Login: glm-tools
  • Kind: organization

JOSS Publication

Pyglmnet: Python implementation of elastic-net regularized generalized linear models
Published
March 01, 2020
Volume 5, Issue 47, Page 1959
Authors
Mainak Jas ORCID
Massachusetts General Hospital, Harvard Medical School
Titipat Achakulvisut ORCID
University of Pennsylvania
Aid Idrizović
Loyola University
Daniel Acuna
University of Syracuse
Matthew Antalek
Northwestern University
Vinicius Marques
Loyola University
Tommy Odland
Sonat Consulting
Ravi Prakash Garg
Northwestern University
Mayank Agrawal
Princeton University
Yu Umegaki
NTT DATA Mathematical Systems Inc
Peter Foley ORCID
605
Hugo Fernandes ORCID
Rockets of Awesome
Drew Harris
Epoch Capital
Beibin Li
University of Washington
Olivier Pieters ORCID
IDLab-AIRO -- Ghent University -- imec, Research Institute for Agriculture, Fisheries and Food
Scott Otterson
Clean Power Research
Giovanni De Toni ORCID
University of Trento
Chris Rodgers ORCID
Columbia University
Eva Dyer
Georgia Tech
Matti Hamalainen
Massachusetts General Hospital, Harvard Medical School
Konrad Kording
University of Pennsylvania
Pavan Ramkumar ORCID
System1 Biosciences Inc
Editor
Ariel Rokem ORCID
Tags
glm machine-learning lasso elastic-net group-lasso

Papers & Mentions

Total mentions: 2

Active learning of cortical connectivity from two-photon imaging data
Last synced: 3 months ago
Modern Machine Learning as a Benchmark for Fitting Neural Responses
Last synced: 3 months ago

GitHub Events

Total
  • Watch event: 6
  • Fork event: 1
Last Year
  • Watch event: 6
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 736
  • Total Committers: 29
  • Avg Commits per committer: 25.379
  • Development Distribution Score (DDS): 0.598
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
======================  =============  =======
Name                    Email          Commits
======================  =============  =======
Pavan Ramkumar          p****r@g****m  296
Mainak Jas              m****s@g****m  177
Titipat Achakulvisut    m****t@g****m  59
Giovanni De Toni        g****t@g****m  37
titipata                t****a@u****u  28
Aid Idrizović           i****d@g****m  17
Matthew Antalek         m****k@g****m  16
Daniel E. Acuna         d****a@g****m  15
Vinicius de F. Marques  v****e@g****m  14
Tommy Odland            t****d@g****m  13
Mayank Agrawal          m****6@g****m  9
Ravi Prakash Garg       r****7@g****m  9
Peter Foley             p****y@g****m  7
Yu Umegaki              y****i@g****m  7
Hugo Fernandes          h****h@g****m  4
Olivier Pieters         o****s@u****e  4
Hugo L Fernandes        h****s@n****u  4
Thomas Kunwar           y****i         3
Beibin Li               B****i         2
Chris Rodgers           x****s@g****m  2
Drew Harris             d****z@g****m  2
Pattarawat Chormai      p****i@g****m  2
Scott Otterson          s****o@s****g  2
Talles Alves            t****v@l****u  2
Eva Dyer                e****h@g****m  1
Ilias Tapeinos          I****9@g****m  1
Yu Umegaki              u****i@m****p  1
Zan Markan              z****n@m****e  1
debroize                4****e         1
======================  =============  =======

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 49
  • Total pull requests: 53
  • Average time to close issues: 9 months
  • Average time to close pull requests: 19 days
  • Total issue authors: 25
  • Total pull request authors: 12
  • Average comments per issue: 3.49
  • Average comments per pull request: 5.68
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • pavanramkumar (12)
  • jasmainak (10)
  • titipata (3)
  • cxrodgers (2)
  • geektoni (2)
  • whoisnnamdi (1)
  • foster999 (1)
  • jpainam (1)
  • cyradil (1)
  • AnchorBlues (1)
  • ivarzap (1)
  • themantalope (1)
  • idc9 (1)
  • ravigarg27 (1)
  • cruyffturn (1)
Pull Request Authors
  • jasmainak (17)
  • pavanramkumar (9)
  • titipata (8)
  • yathomasi (6)
  • geektoni (5)
  • p16i (2)
  • timshell (1)
  • opieters (1)
  • zmarkan (1)
  • AnchorBlues (1)
  • ita07 (1)
Top Labels
Issue Labels
MAINT (9) enhancement (6) easy (6) docs (5) bug (3) API (1) tests (1)
Pull Request Labels
docs (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 2,295 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 5
    (may contain duplicates)
  • Total versions: 3
  • Total maintainers: 2
pypi.org: pyglmnet

Elastic-net regularized generalized linear models.

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 5
  • Downloads: 2,295 Last month
  • Docker Downloads: 0
Rankings
Docker downloads count: 3.7%
Stargazers count: 3.9%
Forks count: 4.9%
Average: 6.0%
Dependent repos count: 6.6%
Downloads: 6.6%
Dependent packages count: 10.1%
Maintainers (2)
Last synced: 4 months ago
conda-forge.org: pyglmnet
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Forks count: 18.9%
Stargazers count: 21.7%
Average: 31.5%
Dependent repos count: 34.0%
Dependent packages count: 51.2%
Last synced: 4 months ago

Dependencies

requirements.txt pypi
  • coverage *
  • flake8 *
  • numpy >=1.11
  • pandas >=0.20
  • pydocstyle *
  • pytest *
  • pytest-cov *
  • scikit-learn >=0.18
  • scipy >=0.17
  • tqdm >=4.46