openml
OpenML's Python API for a World of Data and More 💫
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 5 of 53 committers (9.4%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.9%) to scientific vocabulary
Repository
OpenML's Python API for a World of Data and More 💫
Basic Info
- Host: GitHub
- Owner: openml
- License: other
- Language: Python
- Default Branch: develop
- Homepage: http://openml.github.io/openml-python/latest/
- Size: 201 MB
Statistics
- Stars: 304
- Watchers: 22
- Forks: 153
- Open Issues: 94
- Releases: 14
Metadata Files
README.md
OpenML-Python
The Python API for a World of Data and More :dizzy:
OpenML-Python provides an easy-to-use and straightforward Python interface for OpenML, an online platform for open science collaboration in machine learning. It can download or upload data from OpenML, such as datasets and machine learning experiment results.
:joystick: Minimal Example
Use the following code to get the credit-g dataset:
```python
import openml

dataset = openml.datasets.get_dataset("credit-g")  # or by ID: get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(target="class")
```
Get a task for supervised classification on credit-g:
```python
import openml

task = openml.tasks.get_task(31)
dataset = task.get_dataset()
X, y, categorical_indicator, attribute_names = dataset.get_data(target=task.target_name)

# Get splits for the first fold of 10-fold cross-validation
train_indices, test_indices = task.get_train_test_split_indices(fold=0)
```
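The split indices returned above are row positions into the dataset. A minimal sketch of how they might be applied, using small stand-in lists so it runs without a download. The data values, the index lists, and the `select_rows` helper are hypothetical and not part of OpenML-Python:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_rows(rows: Sequence[T], indices: Sequence[int]) -> List[T]:
    """Return the rows at the given positional indices, in index order."""
    return [rows[i] for i in indices]


# Stand-in data (hypothetical values); in practice X and y come from dataset.get_data(...)
X = [[12, 1000], [24, 2500], [6, 800], [36, 9000], [18, 1500]]
y = ["good", "bad", "good", "bad", "good"]

# Stand-in indices; in practice these come from task.get_train_test_split_indices(fold=0)
train_indices = [0, 1, 3]
test_indices = [2, 4]

X_train, y_train = select_rows(X, train_indices), select_rows(y, train_indices)
X_test, y_test = select_rows(X, test_indices), select_rows(y, test_indices)

print(len(X_train), len(X_test))
```

If `X` and `y` are pandas objects, positional indexing with `X.iloc[train_indices]` and `y.iloc[train_indices]` does the same job.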
Use an OpenML benchmarking suite to get a curated list of machine-learning tasks:
```python
import openml

suite = openml.study.get_suite("amlb-classification-all")  # Get a curated list of tasks for classification
for task_id in suite.tasks:
    task = openml.tasks.get_task(task_id)
```
:magic_wand: Installation
OpenML-Python is supported on Python 3.8 - 3.13 and is available on Linux, macOS, and Windows.
You can install OpenML-Python with:
```bash
pip install openml
```
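Before or after installing, it can help to confirm that the interpreter falls inside the supported range stated above. The following is a small sketch using only the standard library. The helper names `python_is_supported` and `openml_is_installed` are made up for illustration and are not part of OpenML-Python:

```python
import importlib.util
import sys

# Supported range as stated in this README (Python 3.8 - 3.13).
SUPPORTED_MIN = (3, 8)
SUPPORTED_MAX = (3, 13)


def python_is_supported(version=None) -> bool:
    """Check whether a (major, minor, ...) version lies in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


def openml_is_installed() -> bool:
    """Report whether the `openml` package can be found, without importing it."""
    return importlib.util.find_spec("openml") is not None


print("Python supported:", python_is_supported())
print("openml installed:", openml_is_installed())
```

Using `find_spec` avoids importing the package itself, so the check stays cheap and does not fail when `openml` is absent.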
:page_facing_up: Citing OpenML-Python
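If you use OpenML-Python in a scientific publication, we would appreciate a reference to the following paper:
Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter. OpenML-Python: an extensible Python API for OpenML. Journal of Machine Learning Research, 22(100):1-5, 2021.
BibTeX entry: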
```bibtex
@article{JMLR:v22:19-920,
  author  = {Matthias Feurer and Jan N. van Rijn and Arlind Kadra and Pieter Gijsbers and Neeratyoy Mallik and Sahithya Ravi and Andreas Müller and Joaquin Vanschoren and Frank Hutter},
  title   = {OpenML-Python: an extensible Python API for OpenML},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {100},
  pages   = {1--5},
  url     = {http://jmlr.org/papers/v22/19-920.html}
}
```
Owner
- Name: OpenML
- Login: openml
- Kind: organization
- Email: openmlhq@googlegroups.com
- Location: The Future
- Website: http://www.openml.org
- Twitter: open_ml
- Repositories: 56
- Profile: https://github.com/openml
Open, Networked Machine Learning
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software in a publication, please cite the metadata from preferred-citation."
preferred-citation:
  type: article
  authors:
  - family-names: "Feurer"
    given-names: "Matthias"
    orcid: "https://orcid.org/0000-0001-9611-8588"
  - family-names: "van Rijn"
    given-names: "Jan N."
    orcid: "https://orcid.org/0000-0003-2898-2168"
  - family-names: "Kadra"
    given-names: "Arlind"
  - family-names: "Gijsbers"
    given-names: "Pieter"
    orcid: "https://orcid.org/0000-0001-7346-8075"
  - family-names: "Mallik"
    given-names: "Neeratyoy"
    orcid: "https://orcid.org/0000-0002-0598-1608"
  - family-names: "Ravi"
    given-names: "Sahithya"
  - family-names: "Müller"
    given-names: "Andreas"
    orcid: "https://orcid.org/0000-0002-2349-9428"
  - family-names: "Vanschoren"
    given-names: "Joaquin"
    orcid: "https://orcid.org/0000-0001-7044-9805"
  - family-names: "Hutter"
    given-names: "Frank"
    orcid: "https://orcid.org/0000-0002-2037-3694"
  journal: "Journal of Machine Learning Research"
  title: "OpenML-Python: an extensible Python API for OpenML"
  abstract: "OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper, we introduce OpenML-Python, a client API for Python, which opens up the OpenML platform for a wide range of Python-based machine learning tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results which are stored on OpenML. Furthermore, it comes with a scikit-learn extension and an extension mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at https://github.com/openml/openml-python/."
  volume: 22
  year: 2021
  start: 1
  end: 5
  pages: 5
  number: 100
  url: https://jmlr.org/papers/v22/19-920.html
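As an illustration of how the `preferred-citation` fields above map onto a plain-text reference, here is a minimal sketch. The field values are copied from the metadata. The dictionary layout and the `format_reference` helper are assumptions made for this example, and the values are hard-coded rather than parsed from the file to keep the sketch dependency-free:

```python
# Values copied from the preferred-citation entry; authors are (given, family) pairs.
preferred_citation = {
    "authors": [
        ("Matthias", "Feurer"),
        ("Jan N.", "van Rijn"),
        ("Arlind", "Kadra"),
        ("Pieter", "Gijsbers"),
        ("Neeratyoy", "Mallik"),
        ("Sahithya", "Ravi"),
        ("Andreas", "Müller"),
        ("Joaquin", "Vanschoren"),
        ("Frank", "Hutter"),
    ],
    "title": "OpenML-Python: an extensible Python API for OpenML",
    "journal": "Journal of Machine Learning Research",
    "volume": 22,
    "number": 100,
    "start": 1,
    "end": 5,
    "year": 2021,
}


def format_reference(citation: dict) -> str:
    """Build a one-line reference: authors, title, journal, volume(number):pages, year."""
    names = ", ".join(f"{given} {family}" for given, family in citation["authors"])
    pages = f'{citation["start"]}-{citation["end"]}'
    return (
        f'{names}. {citation["title"]}. {citation["journal"]}, '
        f'{citation["volume"]}({citation["number"]}):{pages}, {citation["year"]}.'
    )


reference = format_reference(preferred_citation)
print(reference)
```

In practice a YAML parser would load `CITATION.cff` directly; the layout chosen here only shows which fields feed which part of the reference.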
GitHub Events
Total
- Create event: 35
- Release event: 1
- Issues event: 25
- Watch event: 25
- Delete event: 42
- Issue comment event: 95
- Push event: 189
- Pull request review comment event: 48
- Pull request review event: 82
- Pull request event: 83
- Fork event: 12
Last Year
- Create event: 35
- Release event: 1
- Issues event: 25
- Watch event: 25
- Delete event: 42
- Issue comment event: 95
- Push event: 189
- Pull request review comment event: 48
- Pull request review event: 82
- Pull request event: 83
- Fork event: 12
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Matthias Feurer | f****m@i****e | 411 |
| Jan van Rijn | j****n@g****m | 289 |
| PGijsbers | p****s@t****l | 159 |
| Andreas Mueller | a****r@n****u | 134 |
| neeratyoy | n****y@g****m | 103 |
| Joaquin Vanschoren | j****n@g****m | 32 |
| Lennart Purucker | p****r@c****e | 30 |
| sahithyaravi1493 | s****3@g****m | 29 |
| Arlind Kadra | a****a@g****m | 28 |
| Zardaloop | f****i@g****m | 22 |
| dependabot[bot] | 4****] | 13 |
| Sahithya Ravi | 4****3 | 13 |
| Eddie Bergman | e****s@g****m | 12 |
| pre-commit-ci[bot] | 6****] | 12 |
| janvanrijn | v****n@c****e | 11 |
| Guillaume Lemaitre | g****8@g****m | 7 |
| a-moadel | 4****l | 4 |
| Jesper van Engelen | c****t@j****l | 3 |
| Vishal Parmar | v****2@g****m | 3 |
| allcontributors[bot] | 4****] | 2 |
| nabenabe0928 | s****o@g****m | 2 |
| prabhant | p****h@g****m | 2 |
| toon | t****k@g****m | 2 |
| Pieter Gijsbers | P****s | 1 |
| Abraham Francis | a****9@g****m | 1 |
| zikun | 3****n | 1 |
| chadmarchand | 3****d | 1 |
| William Raynaut | w****t@g****m | 1 |
| Will Martin | 3****n | 1 |
| Tim Andrews | t****1@g****m | 1 |
| and 23 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 108
- Total pull requests: 207
- Average time to close issues: about 2 years
- Average time to close pull requests: 3 months
- Total issue authors: 36
- Total pull request authors: 30
- Average comments per issue: 2.55
- Average comments per pull request: 2.05
- Merged pull requests: 138
- Bot issues: 0
- Bot pull requests: 30
Past Year
- Issues: 7
- Pull requests: 84
- Average time to close issues: 3 months
- Average time to close pull requests: 15 days
- Issue authors: 6
- Pull request authors: 13
- Average comments per issue: 0.71
- Average comments per pull request: 1.13
- Merged pull requests: 45
- Bot issues: 0
- Bot pull requests: 4
Top Authors
Issue Authors
- PGijsbers (32)
- mfeurer (11)
- joaquinvanschoren (7)
- eddiebergman (7)
- amueller (5)
- ArlindKadra (5)
- ArturDev42 (4)
- LennartPurucker (2)
- janvanrijn (2)
- Neeratyoy (2)
- learsi1911 (2)
- pseudotensor (2)
- Taniya-Das (2)
- remram44 (1)
- xieleo5 (1)
Pull Request Authors
- LennartPurucker (68)
- PGijsbers (58)
- dependabot[bot] (28)
- eddiebergman (21)
- pre-commit-ci[bot] (19)
- mfeurer (16)
- v-parmar (10)
- SubhadityaMukherjee (9)
- samplecatalina (6)
- knyazer (4)
- janvanrijn (4)
- Kang13531 (4)
- ArlindKadra (3)
- BrunoBelucci (2)
- Taniya-Das (2)
Packages
- Total packages: 3
- Total downloads:
  - pypi: 29,364 last month
- Total docker downloads: 2,304
- Total dependent packages: 29 (may contain duplicates)
- Total dependent repositories: 188 (may contain duplicates)
- Total versions: 22
- Total maintainers: 3
pypi.org: openml
Python API for OpenML
- Documentation: https://openml.readthedocs.io/
- License: BSD 3-Clause License Copyright (c) 2014-2019, Matthias Feurer, Jan van Rijn, Andreas Müller, Joaquin Vanschoren and others. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. License of the files CONTRIBUTING.md, ISSUE_TEMPLATE.md and PULL_REQUEST_TEMPLATE.md: Those files are modifications of the respecting templates in scikit-learn and they are licensed under a New BSD license: New BSD License Copyright (c) 2007–2018 The scikit-learn developers. All rights reserved. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: a. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. b. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. c. Neither the name of the Scikit-learn Developers nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Latest release: 0.15.1 (published about 1 year ago)
conda-forge.org: openml
- Homepage: https://openml.org/
- License: BSD-3-Clause
- Latest release: 0.12.2 (published over 4 years ago)
anaconda.org: openml
OpenML-Python provides an easy-to-use and straightforward Python interface for OpenML, an online platform for open science collaboration in machine learning. It can download or upload data from OpenML, such as datasets and machine learning experiment results.
- Homepage: https://openml.org
- License: BSD-3-Clause
- Latest release: 0.15.1 (published 7 months ago)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- docker/build-push-action v2 composite
- docker/login-action v1 composite
- docker/setup-buildx-action v1 composite
- docker/setup-qemu-action v1 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- python 3 build
- liac-arff >=2.4.0
- minio *
- numpy >=1.6.2
- pandas >=1.0.0
- pyarrow *
- python-dateutil *
- requests *
- scikit-learn >=0.18
- scipy >=0.13.3
- xmltodict *