Osprey

Osprey: Hyperparameter Optimization for Machine Learning - Published in JOSS (2016)

https://github.com/msmbuilder/osprey

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 9 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    7 of 14 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

cross-validation hyperparameter-optimization machine-learning models optimization pretty-logo python scikit-learn

Scientific Fields

Artificial Intelligence and Machine Learning Computer Science - 32% confidence
Last synced: 4 months ago

Repository

🦅Hyperparameter optimization for machine learning pipelines 🦅

Basic Info
  • Host: GitHub
  • Owner: msmbuilder
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage: http://msmbuilder.org/osprey
  • Size: 974 KB
Statistics
  • Stars: 73
  • Watchers: 11
  • Forks: 26
  • Open Issues: 18
  • Releases: 3
Created about 11 years ago · Last pushed about 5 years ago
Metadata Files
Readme Contributing License

README.md

Osprey


Osprey is an easy-to-use tool for hyperparameter optimization of machine learning algorithms in Python using scikit-learn (or using scikit-learn compatible APIs).

Each Osprey experiment combines a dataset, an estimator, a search space (and engine), cross-validation, and asynchronous serialization for distributed parallel optimization of model hyperparameters.
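To make these pieces concrete, here is a minimal, standard-library-only sketch of how a dataset, an estimator, a set of candidate hyperparameters, and cross-validation fit together. It is not Osprey's implementation: `ShrunkenMean`, `k_fold_indices`, and `run_experiment` are hypothetical names invented for illustration, a fixed list of candidates stands in for the search engine, and an in-memory list stands in for the trials database.

```python
import random
import statistics


def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists for contiguous k-fold splits."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `remainder` folds get one extra sample.
        stop = start + fold_size + (1 if fold < remainder else 0)
        yield indices[:start] + indices[stop:], indices[start:stop]
        start = stop


class ShrunkenMean:
    """Toy estimator (hypothetical): predicts the training mean times `shrinkage`."""

    def __init__(self, shrinkage=1.0):
        self.shrinkage = shrinkage

    def fit(self, y):
        self.prediction_ = self.shrinkage * statistics.fmean(y)
        return self

    def score(self, y):
        # Negative mean squared error, so that a larger score is better.
        return -statistics.fmean([(v - self.prediction_) ** 2 for v in y])


def run_experiment(data, candidates, n_folds=5):
    """Score each candidate hyperparameter setting by k-fold cross-validation."""
    trials = []  # stand-in for a persistent trials database
    for params in candidates:
        fold_scores = []
        for train, test in k_fold_indices(len(data), n_folds):
            model = ShrunkenMean(**params).fit([data[i] for i in train])
            fold_scores.append(model.score([data[i] for i in test]))
        trials.append({"params": params, "mean_score": statistics.fmean(fold_scores)})
    return trials


rng = random.Random(0)
data = [rng.gauss(5.0, 1.0) for _ in range(50)]           # toy dataset
candidates = [{"shrinkage": s} for s in (0.5, 0.75, 1.0)]  # toy search space
trials = run_experiment(data, candidates)
best = max(trials, key=lambda trial: trial["mean_score"])
print(best["params"], round(best["mean_score"], 3))
```

Each trial record holds the parameters and the mean score across folds. In a distributed setting, workers would append such records to a shared store rather than to a local list.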

Documentation

For full documentation, please visit the Osprey homepage.

Installation

If you have an Anaconda Python distribution, installation is as easy as:

    $ conda install -c omnia osprey

You can also install Osprey with pip:

    $ pip install osprey

Alternatively, you can install directly from this GitHub repo:

    $ git clone https://github.com/msmbuilder/osprey.git
    $ cd osprey && git checkout 1.1.0
    $ python setup.py install

Example using MSMBuilder

Below is an example of an Osprey config file to cross-validate Markov state models based on varying the number of clusters and dihedral angles used in a model:

```yaml
estimator:
  eval_scope: msmbuilder
  eval: |
    Pipeline([
        ('featurizer', DihedralFeaturizer(types=['phi', 'psi'])),
        ('cluster', MiniBatchKMeans()),
        ('msm', MarkovStateModel(n_timescales=5, verbose=False)),
    ])

search_space:
  cluster__n_clusters:
    min: 10
    max: 100
    type: int
  featurizer__types:
    choices:
      - ['phi', 'psi']
      - ['phi', 'psi', 'chi1']
    type: enum

cv: 5

dataset_loader:
  name: mdtraj
  params:
    trajectories: ~/local/msmbuilder/Tutorial/XTC/*/*.xtc
    topology: ~/local/msmbuilder/Tutorial/native.pdb
    stride: 1

trials:
  uri: sqlite:///osprey-trials.db
```
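As a rough illustration of what the search-space section describes, the sketch below draws one random point from a Python mirror of it. This is not Osprey's sampler. The dictionary layout, the function names, and the assumption that `min` and `max` are both inclusive are illustrative choices. The only convention borrowed from scikit-learn is that a pipeline parameter is addressed as `step__parameter`.

```python
import random

# Assumed Python mirror of the search-space section of the config above.
SEARCH_SPACE = {
    "cluster__n_clusters": {"min": 10, "max": 100, "type": "int"},
    "featurizer__types": {
        "choices": [["phi", "psi"], ["phi", "psi", "chi1"]],
        "type": "enum",
    },
}


def sample_params(space, rng):
    """Draw one random point from the search space (illustrative only)."""
    params = {}
    for name, spec in space.items():
        if spec["type"] == "int":
            # Assumption: both bounds are inclusive.
            params[name] = rng.randint(spec["min"], spec["max"])
        elif spec["type"] == "enum":
            params[name] = rng.choice(spec["choices"])
        else:
            raise ValueError(f"unsupported variable type: {spec['type']!r}")
    return params


def split_param_name(name):
    """Split 'step__parameter' into (step, parameter), scikit-learn style."""
    step, _, parameter = name.partition("__")
    return step, parameter


rng = random.Random(42)
print(sample_params(SEARCH_SPACE, rng))
print(split_param_name("cluster__n_clusters"))
```

The `step__parameter` form is what lets a single flat dictionary of sampled values be routed to the right stage of the pipeline, for example with the pipeline's `set_params` method.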

Then run `osprey worker`. You can also run multiple instances of `osprey worker` in parallel on a cluster.

```
$ osprey worker config.yaml

...

Beginning iteration 1 / 1

History contains: 0 trials
Choosing next hyperparameters with random...
  {'cluster__n_clusters': 20, 'featurizer__types': ['phi', 'psi']}

Fitting 5 folds for each of 1 candidates, totalling 5 fits
[Parallel(n_jobs=1)]: Done 1 jobs | elapsed: 0.3s
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 1.8s finished

Success! Model score = 4.080646
(best score so far = 4.080646)

1/1 models fit successfully.
time: October 27, 2014 10:44 PM
elapsed: 4 seconds.
osprey worker exiting.
```

You can dump the database to JSON or CSV with `osprey dump`.
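Because the trials are stored in an ordinary SQLite file, the database can also be inspected directly. The sketch below uses only Python's standard `sqlite3` module and assumes nothing about the table or column names inside the trials database, since the schema is not described here. It simply lists whatever tables exist and counts their rows. The function name `summarize_tables` and the stand-in `demo` table are hypothetical.

```python
import sqlite3


def summarize_tables(connection):
    """Return {table_name: row_count} for every table in a SQLite database."""
    names = [
        row[0]
        for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    summary = {}
    for name in names:
        # Names come from sqlite_master itself; escape embedded double quotes
        # so each one is a valid quoted identifier.
        quoted = '"' + name.replace('"', '""') + '"'
        summary[name] = connection.execute(
            f"SELECT COUNT(*) FROM {quoted}"
        ).fetchone()[0]
    return summary


# Stand-in in-memory database so the sketch runs anywhere.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE demo (score REAL)")
connection.executemany("INSERT INTO demo VALUES (?)", [(4.08,), (3.95,)])
print(summarize_tables(connection))
connection.close()
```

To look at a real run, pass `sqlite3.connect("osprey-trials.db")` instead, where the file name is the one assumed from the `uri` in the config above.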

Dependencies

  • python>=2.7.11
  • six>=1.10.0
  • pyyaml>=3.11
  • numpy>=1.10.4
  • scipy>=0.17.0
  • scikit-learn>=0.17.0
  • sqlalchemy>=1.0.10
  • bokeh>=0.12.0
  • matplotlib>=1.5.0
  • pandas>=0.18.0
  • GPy (optional, required for gp strategy)
  • hyperopt (optional, required for hyperopt_tpe strategy)
  • nose (optional, for testing)

Contributing

In case you encounter any issues with this package, please consider submitting a ticket to the GitHub Issue Tracker. We also welcome any feature requests and highly encourage users to submit pull requests for bug fixes and improvements.

For more detailed information, please refer to our documentation.

Citing

If you use Osprey in your research, please cite:

    @misc{osprey,
      author = {Robert T. McGibbon and Carlos X. Hernández and Matthew P. Harrigan and Steven Kearnes and Mohammad M. Sultan and Stanislaw Jastrzebski and Brooke E. Husic and Vijay S. Pande},
      title  = {Osprey: Hyperparameter Optimization for Machine Learning},
      month  = sep,
      year   = 2016,
      doi    = {10.21105/joss.00034},
      url    = {http://dx.doi.org/10.21105/joss.00034}
    }

Owner

  • Name: MSMBuilder
  • Login: msmbuilder
  • Kind: organization
  • Email: msmbuilder-user@lists.stanford.edu

Statistical models for biomolecular dynamics

JOSS Publication

Osprey: Hyperparameter Optimization for Machine Learning
Published
September 07, 2016
Volume 1, Issue 5, Page 34
Authors
  • Robert T. McGibbon, Stanford University
  • Carlos X. Hernández, Stanford University
  • Matthew P. Harrigan, Stanford University
  • Steven Kearnes, Stanford University
  • Mohammad M. Sultan, Stanford University
  • Stanislaw Jastrzebski, Jagiellonian University
  • Brooke E. Husic, Stanford University
  • Vijay S. Pande, Stanford University
Editor
  • Arfon Smith
Tags
optimization cross-validation machine learning

Papers & Mentions

Total mentions: 3

Visualization of protein interaction networks: problems and solutions
Last synced: 2 months ago
Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface
Last synced: 2 months ago
A critical analysis of computational protein design with sparse residue interaction graphs
Last synced: 2 months ago

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 531
  • Total Committers: 14
  • Avg Commits per committer: 37.929
  • Development Distribution Score (DDS): 0.682
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Robert McGibbon r****o@g****m 169
Carlos Hernandez c****h@s****u 126
Robert Arbon r****n@g****m 48
Carlos Hernández c****z 46
Matthew Harrigan h****n@s****u 45
skearnes k****s@s****u 33
Unknown r****n@b****k 31
Mohammad Muneeb Sultan m****n@g****m 21
Steven Kearnes s****s@g****m 3
Stanislw Jastrzebski s****i@g****m 2
Joshua L. Adelman j****n@g****m 2
Brooke Husic b****c@s****u 2
Juan Eiros j****4@i****k 2
bhusic@stanford.edu b****c@s****u 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 52
  • Total pull requests: 48
  • Average time to close issues: 4 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 11
  • Total pull request authors: 6
  • Average comments per issue: 2.13
  • Average comments per pull request: 0.33
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • cxhernandez (22)
  • RobertArbon (7)
  • synapticarbors (5)
  • jeiros (5)
  • brookehus (4)
  • mpharrigan (4)
  • msultan (1)
  • orestxherija (1)
  • rafwiewiora (1)
  • rsatijaUT (1)
  • nhstanley (1)
Pull Request Authors
  • cxhernandez (31)
  • RobertArbon (9)
  • mpharrigan (5)
  • synapticarbors (1)
  • brookehus (1)
  • jeiros (1)
Top Labels
Issue Labels
bug (8) docs (7)
Pull Request Labels
docs (7) bug (4) needs docs (2) needs test (1) tested (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 63 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 2
  • Total versions: 6
  • Total maintainers: 2
pypi.org: osprey

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 2
  • Downloads: 63 Last month
Rankings
Forks count: 7.6%
Stargazers count: 8.0%
Dependent packages count: 10.1%
Dependent repos count: 11.5%
Average: 11.9%
Downloads: 22.3%
Maintainers (2)
Last synced: 4 months ago

Dependencies

docs/requirements.txt pypi
  • numpydoc =0.7
  • python *
requirements.txt pypi
  • bokeh >=0.12.0
  • matplotlib >=1.5.0
  • numpy >=1.10.4
  • pandas >=0.18.0
  • pyyaml >=3.11
  • scikit-learn >=0.17.0
  • scipy >=0.17.0
  • six >=1.10.0
  • sqlalchemy >=1.0.10