autopytorch
Automatic architecture search and hyperparameter optimization for PyTorch
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org
- ✓ Committers with academic emails: 2 of 13 committers (15.4%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.7%) to scientific vocabulary
Keywords
Repository
Basic Info
Statistics
- Stars: 2,477
- Watchers: 46
- Forks: 300
- Open Issues: 75
- Releases: 4
Topics
Metadata Files
README.md
Auto-PyTorch
Copyright (C) 2021 AutoML Groups Freiburg and Hannover
While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).
Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting). The newest features in Auto-PyTorch for tabular data are described in the paper "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL" (see below for bibtex ref). Details about Auto-PyTorch for multi-horizontal time series forecasting tasks can be found in the paper "Efficient Automated Deep Learning for Time Series Forecasting" (also see below for bibtex ref).
Also, find the documentation here.
From v0.1.0, Auto-PyTorch has been updated to further improve usability, robustness, and efficiency by using SMAC as the underlying optimization package and by restructuring the code. As a result, moving from v0.0.2 to v0.1.0 breaks backward compatibility.
If you would like to use the old API, you can find it on the master_old branch.
Workflow
The following figure gives a rough overview of the Auto-PyTorch workflow.

In the figure, Data is provided by the user, and
Portfolio is a set of neural-network configurations that work well on diverse datasets.
The current version supports only the greedy portfolio described in the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL.
This portfolio is used to warm-start the SMAC optimization:
the portfolio configurations are evaluated on the provided data as initial configurations.
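Conceptually, warm-starting just means that the optimizer's first evaluations come from the portfolio rather than from random samples. A minimal sketch of the idea (illustrative only; the function and argument names here are placeholders, not Auto-PyTorch APIs):

```python
def warm_started_search(portfolio, sample_config, evaluate, n_evaluations):
    """Evaluate portfolio configurations first, then continue sampling.

    All names are illustrative placeholders, not Auto-PyTorch APIs.
    `evaluate(config)` returns a loss (lower is better).
    """
    observations = []
    # Phase 1: the initial design is taken from the portfolio.
    for config in portfolio:
        observations.append((config, evaluate(config)))
    # Phase 2: the optimizer proposes further configurations,
    # informed (in a real optimizer) by the warm-start observations.
    for _ in range(n_evaluations - len(portfolio)):
        config = sample_config()
        observations.append((config, evaluate(config)))
    # Return the best configuration seen so far and its loss.
    return min(observations, key=lambda obs: obs[1])
```

In the real system SMAC fits its surrogate model on these warm-start observations, so the search starts from configurations already known to work well on diverse datasets.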
The API then runs the following procedure:
1. Validate input data: process each data type (e.g. encode categorical data) so that Auto-PyTorch can handle it.
2. Create dataset: create a dataset that this API can handle, with a choice of cross-validation or holdout splits.
3. Evaluate baselines *1:
    * **Tabular datasets**: train each algorithm in the predefined pool with a fixed hyperparameter configuration, as well as a dummy model from sklearn.dummy that represents the worst possible performance.
    * **Time series forecasting datasets**: train a dummy predictor that repeats the last observed value in each series.
4. Search by SMAC:\
    a. Determine budget and cut-off rules by Hyperband\
    b. Sample a pipeline hyperparameter configuration *2 by SMAC\
    c. Update the observations with the obtained results\
    d. Repeat a.--c. until the budget runs out
5. Build the best ensemble for the provided dataset from the observations and model selection of the ensemble.

*1: Baselines are a predefined pool of machine learning algorithms, e.g. LightGBM and support vector machines, that solve either a regression or a classification task on the provided dataset.
*2: A pipeline hyperparameter configuration specifies the choice of components in each step, e.g. the target algorithm or the shape of the neural network, together with their corresponding hyperparameters.
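Step 4 is a multi-fidelity scheme: many configurations are tried on a small budget (e.g. a few epochs or a subset of the data), and only the promising ones are promoted to larger budgets. A simplified successive-halving sketch of that idea (illustrative only; SMAC and Hyperband handle this internally, and none of these names are Auto-PyTorch APIs):

```python
def successive_halving(configs, evaluate, min_budget=1, max_budget=8, eta=2):
    """Keep the best 1/eta of the candidates at each rung, multiplying the budget by eta.

    `evaluate(config, budget)` returns a loss (lower is better); all names
    here are illustrative, not part of the Auto-PyTorch API.
    """
    candidates = list(configs)
    budget = min_budget
    while len(candidates) > 1 and budget <= max_budget:
        # Evaluate every surviving candidate at the current budget.
        scored = sorted(candidates, key=lambda c: evaluate(c, budget))
        # Promote only the top 1/eta fraction to the next, larger budget.
        candidates = scored[: max(1, len(candidates) // eta)]
        budget *= eta
    return candidates[0]
```

Hyperband runs several such successive-halving brackets with different trade-offs between the number of sampled configurations and the starting budget.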
Installation
PyPI Installation
```sh
pip install autoPyTorch
```
Auto-PyTorch for Time Series Forecasting requires additional dependencies
```sh
pip install autoPyTorch[forecasting]
```
Manual Installation
We recommend using Anaconda for development, as follows:
```sh
# The following commands assume the user is in a cloned directory of Auto-PyTorch.
# We also need to initialize the automl_common repository.
# You can find more information about this here:
# https://github.com/automl/automl_common/
git submodule update --init --recursive

# Create the environment and install
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
python setup.py install
```
Similarly, to install all the dependencies for Auto-PyTorch-TimeSeriesForecasting:
```sh
git submodule update --init --recursive
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
pip install -e ".[forecasting]"
```
Examples
In a nutshell:
```py
from autoPyTorch.api.tabular_classification import TabularClassificationTask

# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

# initialise Auto-PyTorch api
api = TabularClassificationTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Calculate test accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)
```
For Time Series Forecasting Tasks
```py
from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask

# data and metric imports
from sktime.datasets import load_longley
targets, features = load_longley()

# define the forecasting horizon
forecasting_horizon = 3

# Dataset optimized by APT-TS can be a list of np.ndarray/pd.DataFrame where each series represents an element in the
# list, or a single pd.DataFrame that records the series.
# Index information: to which series does the timestep belong? This id can be stored as the DataFrame's index or a
# separate column.
# Within each series, we take the last forecasting_horizon items as test targets and the items before that as training
# targets. Normally the value to be forecasted should follow the training sets.
y_train = [targets[: -forecasting_horizon]]
y_test = [targets[-forecasting_horizon:]]

# same for features. For univariate models, X_train and X_test can be omitted and set as None
X_train = [features[: -forecasting_horizon]]
# Here X_test indicates the 'known future features': they are the features known beforehand; features that are unknown
# could be replaced with NaN or zeros (which will not be used by our networks). If no feature is known beforehand,
# we could also omit X_test
known_future_features = list(features.columns)
X_test = [features[-forecasting_horizon:]]

start_times = [targets.index.to_timestamp()[0]]
freq = '1Y'

# initialise Auto-PyTorch api
api = TimeSeriesForecastingTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    optimize_metric='mean_MAPE_forecasting',
    n_prediction_steps=forecasting_horizon,
    memory_limit=16 * 1024,  # Currently, forecasting models use much more memory
    freq=freq,
    start_times=start_times,
    func_eval_time_limit_secs=50,
    total_walltime_limit=60,
    min_num_test_instances=1000,  # proxy validation sets; this only works for tasks with more than 1000 series
    known_future_features=known_future_features,
)

# our dataset could directly generate sequences for new datasets
test_sets = api.dataset.generate_test_seqs()

# Calculate test accuracy
y_pred = api.predict(test_sets)
score = api.score(y_pred, y_test)
print("Forecasting score", score)
```
For more examples, including customising the search space, parallelising the code, etc., check out the examples folder:
```sh
$ cd examples/
```
Code for the paper is available under examples/ensemble in the TPAMI.2021.3067763 branch.
Contributing
If you want to contribute to Auto-PyTorch, clone the repository and check out our current development branch:
```sh
$ git checkout development
```
License
This program is free software: you can redistribute it and/or modify it under the terms of the Apache License 2.0 (please see the LICENSE file).
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
You should have received a copy of the Apache License 2.0 along with this program (see the LICENSE file).
Reference
Please refer to the branch TPAMI.2021.3067763 to reproduce the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL.
```bibtex
@article{zimmer-tpami21a,
  author  = {Lucas Zimmer and Marius Lindauer and Frank Hutter},
  title   = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2021},
  note    = {also available under https://arxiv.org/abs/2006.13799},
  pages   = {3079--3090}
}
```
```bibtex
@incollection{mendoza-automlbook18a,
  author    = {Hector Mendoza and Aaron Klein and Matthias Feurer and Jost Tobias Springenberg and Matthias Urban and Michael Burkart and Max Dippel and Marius Lindauer and Frank Hutter},
  title     = {Towards Automatically-Tuned Deep Neural Networks},
  year      = {2018},
  month     = dec,
  editor    = {Hutter, Frank and Kotthoff, Lars and Vanschoren, Joaquin},
  booktitle = {AutoML: Methods, Systems, Challenges},
  publisher = {Springer},
  chapter   = {7},
  pages     = {141--156}
}
```
```bibtex
@inproceedings{deng-ecml22,
  author    = {Difan Deng and Florian Karl and Frank Hutter and Bernd Bischl and Marius Lindauer},
  title     = {Efficient Automated Deep Learning for Time Series Forecasting},
  year      = {2022},
  booktitle = {Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, {ECML} {PKDD} 2022},
  url       = {https://doi.org/10.48550/arXiv.2205.05511},
}
```
Contact
Auto-PyTorch is developed by the AutoML Groups of the University of Freiburg and Hannover.
Owner
- Name: AutoML-Freiburg-Hannover
- Login: automl
- Kind: organization
- Location: Freiburg and Hannover, Germany
- Website: www.automl.org
- Repositories: 186
- Profile: https://github.com/automl
Citation (CITATION.cff)
preferred-citation:
type: article
authors:
- family-names: "Zimmer"
given-names: "Lucas"
affiliation: "University of Freiburg, Germany"
- family-names: "Lindauer"
given-names: "Marius"
affiliation: "University of Freiburg, Germany"
- family-names: "Hutter"
given-names: "Frank"
affiliation: "University of Freiburg, Germany"
doi: "10.1109/TPAMI.2021.3067763"
journal-title: "IEEE Transactions on Pattern Analysis and Machine Intelligence"
title: "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"
year: 2021
note: "also available under https://arxiv.org/abs/2006.13799"
start: 3079
end: 3090
GitHub Events
Total
- Issues event: 2
- Watch event: 109
- Member event: 3
- Issue comment event: 9
- Fork event: 17
Last Year
- Issues event: 2
- Watch event: 109
- Member event: 3
- Issue comment event: 9
- Fork event: 17
Committers
Last synced: 12 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| chico | f****e@g****m | 60 |
| Lucas Zimmer | z****l@i****e | 53 |
| Ravin Kohli | 1****i | 43 |
| Matthias Urban | u****m@i****e | 27 |
| nabenabe0928 | s****o@g****m | 14 |
| Marius Lindauer | m****s@g****m | 8 |
| LMZimmer | 5****r | 7 |
| dwoiwode | f****n@d****e | 5 |
| bastiscode | s****8@g****m | 3 |
| ntnguyen88 | 6****8 | 1 |
| Tim Hatch | t****m@t****m | 1 |
| Johnny Burns | j****2@g****m | 1 |
| Daiki Katsuragawa | 5****a | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 65
- Total pull requests: 55
- Average time to close issues: 6 months
- Average time to close pull requests: 4 months
- Total issue authors: 42
- Total pull request authors: 10
- Average comments per issue: 1.89
- Average comments per pull request: 0.69
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 3
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 3
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- nabenabe0928 (11)
- RobbyW551 (5)
- ravinkohli (4)
- franchuterivera (3)
- ArlindKadra (3)
- shabir1 (2)
- Yuang-Deng (2)
- CHDNY (1)
- jmrichardson (1)
- Songenyu (1)
- zym604 (1)
- alirostami9972 (1)
- LokeshBadisa (1)
- LuciusMos (1)
- bakirillov (1)
Pull Request Authors
- ravinkohli (31)
- nabenabe0928 (7)
- dengdifan (5)
- ArlindKadra (3)
- theodorju (2)
- franchuterivera (2)
- marcelovca90 (2)
- Borda (1)
- dwoiwode (1)
- na018 (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 2
- Total downloads: pypi 181 last-month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 1 (may contain duplicates)
- Total versions: 9
- Total maintainers: 4

proxy.golang.org: github.com/automl/Auto-PyTorch
- Documentation: https://pkg.go.dev/github.com/automl/Auto-PyTorch#section-documentation
- License: apache-2.0
- Latest release: v0.2.1 (published over 3 years ago)
Rankings
Rankings
pypi.org: autopytorch
Auto-PyTorch searches neural architectures using smac
- Homepage: https://github.com/automl/Auto-PyTorch
- Documentation: https://autopytorch.readthedocs.io/
- License: 3-clause BSD
- Latest release: 0.2.1 (published over 3 years ago)
Rankings
Maintainers (4)
Dependencies
- ConfigSpace >=0.4.14,<0.5
- catboost *
- dask *
- distributed >=2.2.0
- flaky *
- imgaug >=0.4.0
- lightgbm *
- lockfile *
- numpy *
- pandas *
- pynisher >=0.6.3
- pyrfr >=0.7,<0.9
- scikit-learn >=0.24.0,<0.25.0
- scipy >=1.7
- smac ==0.14.0
- tabulate *
- tensorboard *
- torch *
- torchvision *
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- docker/build-push-action ad44023a93711e3deb337508980b4b5e9bcdc5dc composite
- docker/login-action f054a8b539a109f9f41c372932f1ae047eff08c9 composite
- docker/metadata-action 98669ae865ea3cffbcbaa878cf57c20bbf1c6c38 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- actions/checkout master composite
- actions/setup-python v2 composite
- pypa/gh-action-pypi-publish master composite
- ubuntu 20.04 build