autopytorch

Automatic architecture search and hyperparameter optimization for PyTorch

https://github.com/automl/auto-pytorch

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    2 of 13 committers (15.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary

Keywords

automl deep-learning pytorch tabular-data
Last synced: 6 months ago

Repository

Automatic architecture search and hyperparameter optimization for PyTorch

Basic Info
  • Host: GitHub
  • Owner: automl
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 19.4 MB
Statistics
  • Stars: 2,477
  • Watchers: 46
  • Forks: 300
  • Open Issues: 75
  • Releases: 4
Topics
automl deep-learning pytorch tabular-data
Created about 7 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

Auto-PyTorch

Copyright (C) 2021 AutoML Groups Freiburg and Hannover

While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed Auto-PyTorch, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).

Auto-PyTorch is mainly developed to support tabular data (classification, regression) and time series data (forecasting). The newest features in Auto-PyTorch for tabular data are described in the paper "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL" (see below for bibtex ref). Details about Auto-PyTorch for multi-horizon time series forecasting tasks can be found in the paper "Efficient Automated Deep Learning for Time Series Forecasting" (also see below for bibtex ref).

Also, find the documentation here.

From v0.1.0, AutoPyTorch has been updated to further improve usability, robustness and efficiency by using SMAC as the underlying optimization package as well as changing the code structure. Therefore, moving from v0.0.2 to v0.1.0 will break compatibility. In case you would like to use the old API, you can find it at master_old.

Workflow

A rough description of the Auto-PyTorch workflow is shown in the following figure.

AutoPyTorch Workflow

In the figure, Data is provided by the user and Portfolio is a set of neural-network configurations that work well on diverse datasets. The current version only supports the greedy portfolio described in the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL. This portfolio is used to warm-start the SMAC optimization; in other words, the portfolio configurations are evaluated on the provided data as initial configurations. The API then runs the following procedure:

1. Validate input data: process each data type, e.g. encode categorical data, so that Auto-PyTorch can handle it.
2. Create dataset: create a dataset that this API can handle, with a choice of cross-validation or holdout splits.
3. Evaluate baselines (*1):
   • Tabular datasets: train each algorithm in the predefined pool with a fixed hyperparameter configuration, plus a dummy model from sklearn.dummy that represents the worst possible performance.
   • Time series forecasting datasets: train a dummy predictor that repeats the last observed value in each series.
4. Search with SMAC:
   a. Determine budget and cut-off rules with Hyperband.
   b. Sample a pipeline hyperparameter configuration (*2) with SMAC.
   c. Update the observations with the obtained results.
   d. Repeat a.–c. until the budget runs out.
5. Build the best ensemble for the provided dataset from the observations and model selection of the ensemble.

*1: Baselines are a predefined pool of machine learning algorithms, e.g. LightGBM and support vector machines, used to solve either the regression or classification task on the provided dataset.

*2: A pipeline hyperparameter configuration specifies the choice of components in each step, e.g. the target algorithm or the shape of the neural network, together with their corresponding hyperparameters.
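The budget allocation in step 4 can be sketched with a toy, stdlib-only successive-halving loop (the core idea behind Hyperband). This is an illustration, not Auto-PyTorch's actual internals: plain random sampling stands in for SMAC's model-based suggestions, and `evaluate` fakes a training run.

```python
import random

def evaluate(config, budget):
    # Stand-in for training a pipeline for `budget` epochs and returning a
    # validation loss; real Auto-PyTorch trains a network here.
    random.seed((hash(tuple(sorted(config.items()))) + budget) % 2**32)
    return random.random() / budget

def successive_halving(sample_config, min_budget=1, max_budget=9, eta=3, n_configs=9):
    """One Hyperband-style bracket: start many configs on a small budget,
    keep the best 1/eta at each rung, and raise the budget (steps 4a-4d)."""
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    while budget <= max_budget and configs:
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))  # 4c: observe results
        configs = ranked[: max(1, len(ranked) // eta)]               # keep top 1/eta
        budget *= eta                                                # 4a: raise the budget
    return configs[0]

def sample_config():
    # 4b: in Auto-PyTorch this sample comes from SMAC; plain random
    # sampling over a hypothetical search space stands in for it here.
    return {"lr": random.choice([1e-1, 1e-2, 1e-3]),
            "num_layers": random.randint(1, 4)}

best = successive_halving(sample_config)
print(best)
```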

Installation

PyPI Installation

```sh
pip install autoPyTorch
```

Auto-PyTorch for Time Series Forecasting requires additional dependencies:

```sh
pip install autoPyTorch[forecasting]
```

Manual Installation

We recommend using Anaconda for developing as follows:

```sh
# The following commands assume the user is in a cloned Auto-PyTorch directory.
# We also need to initialize the automl_common submodule; more information:
# https://github.com/automl/automl_common/
git submodule update --init --recursive

# Create the environment
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
python setup.py install
```

Similarly, to install all the dependencies for Auto-PyTorch-TimeSeriesForecasting:

```sh
git submodule update --init --recursive
conda create -n auto-pytorch python=3.8
conda activate auto-pytorch
conda install swig
pip install -e ".[forecasting]"
```

Examples

In a nutshell:

```py
from autoPyTorch.api.tabular_classification import TabularClassificationTask

# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics

X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

# initialise Auto-PyTorch api
api = TabularClassificationTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    y_test=y_test,
    optimize_metric='accuracy',
    total_walltime_limit=300,
    func_eval_time_limit_secs=50
)

# Calculate test accuracy
y_pred = api.predict(X_test)
score = api.score(y_pred, y_test)
print("Accuracy score", score)
```
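The ensemble that `search` returns is built in step 5 of the workflow. A toy, stdlib-only sketch of greedy ensemble selection (Caruana et al., 2004), the strategy used by auto-sklearn-style systems in this step, with entirely made-up model predictions:

```python
def accuracy(pred, truth):
    """Fraction of correctly predicted labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def greedy_ensemble(model_preds, truth, ensemble_size=5):
    """Repeatedly add (with replacement) the model whose inclusion most
    improves the majority-vote accuracy of the ensemble."""
    ensemble = []
    for _ in range(ensemble_size):
        def vote_with(extra):
            members = ensemble + [extra]
            # majority vote per instance across the current members
            return [max(set(col), key=col.count)
                    for col in zip(*(model_preds[m] for m in members))]
        best = max(model_preds, key=lambda m: accuracy(vote_with(m), truth))
        ensemble.append(best)
    return ensemble

# Hypothetical validation labels and per-model predictions
truth = [0, 1, 1, 0, 1, 0]
model_preds = {
    "mlp":  [0, 1, 1, 0, 0, 0],   # 5/6 correct
    "lgbm": [0, 1, 0, 0, 1, 1],   # 4/6 correct
    "svm":  [1, 1, 1, 0, 1, 0],   # 5/6 correct
}
ensemble = greedy_ensemble(model_preds, truth, ensemble_size=3)
print(ensemble)
```

Selecting with replacement lets strong models appear several times, which implicitly weights them in the vote.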

For Time Series Forecasting Tasks:

```py
from autoPyTorch.api.time_series_forecasting import TimeSeriesForecastingTask

# data and metric imports
from sktime.datasets import load_longley
targets, features = load_longley()

# define the forecasting horizon
forecasting_horizon = 3

# A dataset optimized by APT-TS can be a list of np.ndarray / pd.DataFrame where each
# series represents an element in the list, or a single pd.DataFrame that records the series
# index information: to which series does the timestep belong? This id can be stored as the
# DataFrame's index or a separate column.
# Within each series, we take the last forecasting_horizon values as test targets and the
# items before them as training targets. Normally the values to be forecasted should follow
# the training sets.
y_train = [targets[: -forecasting_horizon]]
y_test = [targets[-forecasting_horizon:]]

# same for features. For univariate models, X_train and X_test can be omitted (set to None)
X_train = [features[: -forecasting_horizon]]
# Here X_test indicates the 'known future features': the features known in advance.
# Features that are unknown could be replaced with NaN or zeros (which will not be used by
# our networks). If no feature is known beforehand, we could also omit X_test.
known_future_features = list(features.columns)
X_test = [features[-forecasting_horizon:]]

start_times = [targets.index.to_timestamp()[0]]
freq = '1Y'

# initialise Auto-PyTorch api
api = TimeSeriesForecastingTask()

# Search for an ensemble of machine learning algorithms
api.search(
    X_train=X_train,
    y_train=y_train,
    X_test=X_test,
    optimize_metric='mean_MAPE_forecasting',
    n_prediction_steps=forecasting_horizon,
    memory_limit=16 * 1024,  # Currently, forecasting models use much more memory
    freq=freq,
    start_times=start_times,
    func_eval_time_limit_secs=50,
    total_walltime_limit=60,
    min_num_test_instances=1000,  # proxy validation sets; only works for tasks with more than 1000 series
    known_future_features=known_future_features,
)

# our dataset can directly generate sequences for new datasets
test_sets = api.dataset.generate_test_seqs()

# Calculate test accuracy
y_pred = api.predict(test_sets)
score = api.score(y_pred, y_test)
print("Forecasting score", score)
```
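The forecasting baseline from step 3 of the workflow, a dummy predictor that repeats the last observed value, is simple enough to sketch in a few lines. The series data below is made up, and the plain MAPE here only illustrates the metric family behind `'mean_MAPE_forecasting'`:

```python
def naive_forecast(history, horizon):
    """Repeat the last observed training value for the whole horizon."""
    return [history[-1]] * horizon

def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

series = [100.0, 104.0, 103.0, 108.0, 110.0, 111.0, 115.0]
horizon = 3
train, test = series[:-horizon], series[-horizon:]

pred = naive_forecast(train, horizon)
print(pred)   # [108.0, 108.0, 108.0]
print(mape(test, pred))
```

Any searched model has to beat this baseline to be worth keeping, which is why it anchors the evaluation.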

For more examples, including customising the search space, parallelising the code, etc., check out the examples folder:

```sh
$ cd examples/
```

Code for the paper is available under examples/ensemble in the TPAMI.2021.3067763 branch.

Contributing

If you want to contribute to Auto-PyTorch, clone the repository and check out our current development branch:

```sh
$ git checkout development
```

License

This program is free software: you can redistribute it and/or modify it under the terms of the Apache license 2.0 (please see the LICENSE file).

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the Apache license 2.0 along with this program (see LICENSE file).

Reference

Please refer to the branch TPAMI.2021.3067763 to reproduce the paper Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL.

```bibtex
@article{zimmer-tpami21a,
  author  = {Lucas Zimmer and Marius Lindauer and Frank Hutter},
  title   = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2021},
  note    = {also available under https://arxiv.org/abs/2006.13799},
  pages   = {3079--3090}
}
```

```bibtex
@incollection{mendoza-automlbook18a,
  author    = {Hector Mendoza and Aaron Klein and Matthias Feurer and Jost Tobias Springenberg and Matthias Urban and Michael Burkart and Max Dippel and Marius Lindauer and Frank Hutter},
  title     = {Towards Automatically-Tuned Deep Neural Networks},
  year      = {2018},
  month     = dec,
  editor    = {Hutter, Frank and Kotthoff, Lars and Vanschoren, Joaquin},
  booktitle = {AutoML: Methods, Systems, Challenges},
  publisher = {Springer},
  chapter   = {7},
  pages     = {141--156}
}
```

```bibtex
@inproceedings{deng-ecml22,
  author    = {Difan Deng and Florian Karl and Frank Hutter and Bernd Bischl and Marius Lindauer},
  title     = {Efficient Automated Deep Learning for Time Series Forecasting},
  year      = {2022},
  booktitle = {Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, {ECML} {PKDD} 2022},
  url       = {https://doi.org/10.48550/arXiv.2205.05511}
}
```

Contact

Auto-PyTorch is developed by the AutoML Groups of the University of Freiburg and Hannover.

Owner

  • Name: AutoML-Freiburg-Hannover
  • Login: automl
  • Kind: organization
  • Location: Freiburg and Hannover, Germany

Citation (CITATION.cff)

preferred-citation:
  type: article
  authors:
  - family-names: "Zimmer"
    given-names: "Lucas"
    affiliation: "University of Freiburg, Germany"    
  - family-names: "Lindauer"
    given-names: "Marius"
    affiliation: "University of Freiburg, Germany"    
  - family-names: "Hutter"
    given-names: "Frank"
    affiliation: "University of Freiburg, Germany"
  doi: "10.1109/TPAMI.2021.3067763"
  journal-title: "IEEE Transactions on Pattern Analysis and Machine Intelligence"
  title: "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"
  year: 2021
  note: "also available under https://arxiv.org/abs/2006.13799"
  start: 3079
  end: 3090

GitHub Events

Total
  • Issues event: 2
  • Watch event: 109
  • Member event: 3
  • Issue comment event: 9
  • Fork event: 17
Last Year
  • Issues event: 2
  • Watch event: 109
  • Member event: 3
  • Issue comment event: 9
  • Fork event: 17

Committers

Last synced: 12 months ago

All Time
  • Total Commits: 224
  • Total Committers: 13
  • Avg Commits per committer: 17.231
  • Development Distribution Score (DDS): 0.732
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
chico f****e@g****m 60
Lucas Zimmer z****l@i****e 53
Ravin Kohli 1****i 43
Matthias Urban u****m@i****e 27
nabenabe0928 s****o@g****m 14
Marius Lindauer m****s@g****m 8
LMZimmer 5****r 7
dwoiwode f****n@d****e 5
bastiscode s****8@g****m 3
ntnguyen88 6****8 1
Tim Hatch t****m@t****m 1
Johnny Burns j****2@g****m 1
Daiki Katsuragawa 5****a 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 65
  • Total pull requests: 55
  • Average time to close issues: 6 months
  • Average time to close pull requests: 4 months
  • Total issue authors: 42
  • Total pull request authors: 10
  • Average comments per issue: 1.89
  • Average comments per pull request: 0.69
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 3
  • Pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • nabenabe0928 (11)
  • RobbyW551 (5)
  • ravinkohli (4)
  • franchuterivera (3)
  • ArlindKadra (3)
  • shabir1 (2)
  • Yuang-Deng (2)
  • CHDNY (1)
  • jmrichardson (1)
  • Songenyu (1)
  • zym604 (1)
  • alirostami9972 (1)
  • LokeshBadisa (1)
  • LuciusMos (1)
  • bakirillov (1)
Pull Request Authors
  • ravinkohli (31)
  • nabenabe0928 (7)
  • dengdifan (5)
  • ArlindKadra (3)
  • theodorju (2)
  • franchuterivera (2)
  • marcelovca90 (2)
  • Borda (1)
  • dwoiwode (1)
  • na018 (1)
Top Labels
Issue Labels
enhancement (13) bug (9) fix-later (9) not urgent (6) Documentation (2) refactoring (2) needs-more-information (1)
Pull Request Labels
enhancement (6) bug (2) first priority (1) refactoring (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 181 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 9
  • Total maintainers: 4
proxy.golang.org: github.com/automl/Auto-PyTorch
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 7.0%
Average: 8.2%
Dependent repos count: 9.3%
Last synced: 6 months ago
pypi.org: autopytorch

Auto-PyTorch searches neural architectures using smac

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 181 Last month
Rankings
Stargazers count: 1.5%
Forks count: 3.2%
Average: 9.3%
Dependent packages count: 10.1%
Downloads: 10.1%
Dependent repos count: 21.6%
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • ConfigSpace >=0.4.14,<0.5
  • catboost *
  • dask *
  • distributed >=2.2.0
  • flaky *
  • imgaug >=0.4.0
  • lightgbm *
  • lockfile *
  • numpy *
  • pandas *
  • pynisher >=0.6.3
  • pyrfr >=0.7,<0.9
  • scikit-learn >=0.24.0,<0.25.0
  • scipy >=1.7
  • smac ==0.14.0
  • tabulate *
  • tensorboard *
  • torch *
  • torchvision *
.github/workflows/dist.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/docker-publish.yml actions
  • actions/checkout v2 composite
  • docker/build-push-action ad44023a93711e3deb337508980b4b5e9bcdc5dc composite
  • docker/login-action f054a8b539a109f9f41c372932f1ae047eff08c9 composite
  • docker/metadata-action 98669ae865ea3cffbcbaa878cf57c20bbf1c6c38 composite
.github/workflows/docs.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/long_regression_test.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/pre-commit.yaml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/pytest.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v1 composite
.github/workflows/release.yml actions
  • actions/checkout master composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish master composite
Dockerfile docker
  • ubuntu 20.04 build
.binder/requirements.txt pypi
setup.py pypi