Recent Releases of amlb

amlb - Bump NAML version, fix evaluation sparse target

This update switches NAML from a commit on a fork to the stable NAML release. The stable release addresses several issues, most importantly a memory leak that led to a high failure rate and a bug that could leave NAML stuck in an infinite loop. Without these fixes, NAML was too unstable to evaluate.

This also includes a small patch for a bug where the evaluation script would crash if the target was provided in sparse format and returned as such by the integration script.

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.6 - Fixes

Fixes:

  • Set task type explicitly for naive automl
  • Unsparsify target variable for naive automl (required to work with sparse data)
  • Use numpy data for autosklearn if the pandas dataframe is sparse, as sparse dataframes are not supported (yet).

- Python
Published by PGijsbers over 2 years ago

amlb - Use different NAML version instead

The previous one had a fix based on master, but that contains a bug for regression datasets. So instead we point to a version that, as far as I can tell, has neither bug.

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.4

Adds NaiveAutoML (https://github.com/fmohr/naiveautoml) as a new integrated framework! The framework isn't really designed to run for a long time, and currently may encounter segmentation faults.

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.3

  • Store more artifacts by default
  • Retry on AWS VolumeLimitExceeded
  • Fix issues with storing autosklearn artifacts and add print logging to work around their logging configuration.
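
The retry on `VolumeLimitExceeded` can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the exception class is a local stand-in for the AWS error, and `start_instance`, `max_attempts`, and `wait_seconds` are hypothetical names, not the benchmark's actual code or configuration.

```python
import time


class VolumeLimitExceeded(Exception):
    """Hypothetical stand-in for the AWS error of the same name."""


def retry_on_volume_limit(start_instance, max_attempts=3, wait_seconds=0.0,
                          sleep=time.sleep):
    """Call start_instance(), retrying only when the volume limit is hit."""
    for attempt in range(1, max_attempts + 1):
        try:
            return start_instance()
        except VolumeLimitExceeded:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            sleep(wait_seconds)  # give other jobs time to release volumes
```

Any other exception propagates immediately, so only the volume-limit case is treated as transient.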

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.2

What's Changed

  • Add an option to keep columns with all missing data around when using impute_array
  • Fix a bug where inference batches were not generated correctly if the data contained columns with all missing values.
  • Fix a bug where the arff header of the split arff files incorrectly could label booleans as numeric when they should be treated as categorical (this only affected frameworks that depend on ARFF).
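
To illustrate the distinction in the ARFF fix, here is a small sketch of how a header line could be chosen per column. The function name `arff_attribute` and the `{False,True}` labels are assumptions for illustration; this is not the benchmark's code and does not claim to reproduce the original bug. It does show one way such a mislabel can arise in Python, where `bool` is a subclass of `int`.

```python
def arff_attribute(name, values):
    """Return an ARFF @ATTRIBUTE line for a column, given its Python values."""
    present = [v for v in values if v is not None]
    # bool is a subclass of int, so booleans must be checked first;
    # otherwise they would fall through to the NUMERIC branch.
    if present and all(isinstance(v, bool) for v in present):
        return f"@ATTRIBUTE {name} " + "{False,True}"
    if present and all(isinstance(v, (int, float)) for v in present):
        return f"@ATTRIBUTE {name} NUMERIC"
    # Everything else is written as a nominal (categorical) attribute.
    categories = sorted({str(v) for v in present})
    return f"@ATTRIBUTE {name} " + "{" + ",".join(categories) + "}"
```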

Full Changelog: https://github.com/openml/automlbenchmark/compare/v2.1.1...v2.1.2

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.1

What's Changed

AWS:

  • start, stop, and log time to failures.csv log

Docker:

  • No longer assign user and user permissions when creating docker
  • Introduce docker.run_as configuration option, which lets you specify under which user the docker container should execute the benchmark script.
  • Further cut down on the files included in the docker image

Frameworks:

  • Add additional logging to framework integration scripts

AutoGluon:

  • reduce maximum runtime for good_quality and high_quality presets, which otherwise exceed the runtime by design
  • allow larger models to persist in memory, this matches an upcoming default

GAMA:

  • update for 23.0.0 release

Full Changelog: https://github.com/openml/automlbenchmark/compare/v2.1.0...v2.1.1

- Python
Published by PGijsbers over 2 years ago

amlb - v2.1.0

Highlights:

  • The benchmark now requires Py3.9+ and its dependencies are updated.
  • AMIs and Docker images now use Ubuntu 22.04
  • Upgrades support for newer versions of the various frameworks.
  • Support for uploading results to OpenML and connecting to the OpenML test server
  • Experimental support for time series with AutoGluon
  • Results can now be stored incrementally
  • Add option to measure inference time in more standardized fashion for most frameworks.

Note that sharing built docker images currently has some complications due to permission issues. As a workaround, patch start as root (see also: https://github.com/openml/automlbenchmark/pull/495#issuecomment-1598703676).

GAMA integration is currently broken, as the goal parameter was incorrectly removed in the last release. This will be fixed in the next GAMA release.

Thanks to all contributors!

Full Changelog: https://github.com/openml/automlbenchmark/compare/v2.0.6...v2.1.0

- Python
Published by PGijsbers over 2 years ago

amlb - v2.0.5

What's Changed

  • Signal to encode predictions as proba now works by @PGijsbers in https://github.com/openml/automlbenchmark/pull/447
  • Monkeypatch openml to keep whitespace in features by @PGijsbers in https://github.com/openml/automlbenchmark/pull/446
  • fix for mlr3automl installation by @Coorsaa in https://github.com/openml/automlbenchmark/pull/443

Full Changelog: https://github.com/openml/automlbenchmark/compare/v2.0.4...v2.0.5

- Python
Published by PGijsbers about 4 years ago

amlb - v2.0.4

What's Changed

  • Fix a bug which could prevent building docker images by @sebhrusen in https://github.com/openml/automlbenchmark/pull/437
  • Avoid querying terminated instance with CloudWatch by @PGijsbers in https://github.com/openml/automlbenchmark/pull/438
  • Add precision to runtimes in results.csv by @ledell in https://github.com/openml/automlbenchmark/pull/433
  • Iteratively build the forest to honor constraints by @PGijsbers in https://github.com/openml/automlbenchmark/pull/439
  • Iterative fit for TunedRandomForest to meet memory and time constraints by @PGijsbers in https://github.com/openml/automlbenchmark/pull/441
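
The idea behind building the forest iteratively can be sketched as a loop that adds trees in small steps and stops before the time budget is exceeded. This is only an illustration under assumptions: `grow_forest`, `add_trees`, `step`, and `max_trees` are hypothetical names, the memory constraint is not modelled, and the benchmark's actual implementation may differ. With scikit-learn, one option for the `add_trees` step is a forest created with `warm_start=True` whose `n_estimators` is raised before each call to `fit`.

```python
import time


def grow_forest(add_trees, time_budget_s, step=10, max_trees=2000,
                clock=time.monotonic):
    """Fit trees in steps until another step would exceed the time budget.

    `add_trees(n)` is a hypothetical callback that fits n additional trees.
    Returns the total number of trees fitted.
    """
    start = clock()
    n_trees = 0
    last_step_s = 0.0
    while n_trees < max_trees:
        elapsed = clock() - start
        # Assume the next step costs about as much as the previous one.
        if elapsed + last_step_s > time_budget_s:
            break
        n_new = min(step, max_trees - n_trees)
        step_start = clock()
        add_trees(n_new)
        last_step_s = clock() - step_start
        n_trees += n_new
    return n_trees
```

Using the duration of the previous step as the estimate for the next one is a simple heuristic; a more conservative version could add a safety margin.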

Full Changelog: https://github.com/openml/automlbenchmark/compare/v2.0.3...v2.0.4

- Python
Published by PGijsbers about 4 years ago

amlb - v2.0.2

  • Add constraint sets used for the new evaluation (includes 100 GB of gp3 SSD)
  • Log information about the used AWS volumes (type, size and id)

- Python
Published by PGijsbers over 4 years ago

amlb - v2.0.1

  • If a container image is built from a clean state on a commit with a version tag, this version tag is appended to the image tag
  • randomforest:latest and tunedrandomforest:latest now correctly pull from main instead of master (thanks to @eddiebergman)

- Python
Published by PGijsbers over 4 years ago

amlb - V2.0

V2.0

Almost a year has passed since the last release, and too much has changed to list everything. Some highlights include:

  • AWS spot instance support
  • Sparse dataset support
  • Optimized data loading from OpenML
  • Added frameworks:
    • MLNET
    • FLAML
    • Light AutoML
    • mlr3automl
  • Many bug fixes and improvements

Going forward we hope to release new versions more frequently.


Thanks to everyone who contributed through commits, issues, discussions or any other way. In particular we would like to thank the following contributors for their code contributions since v1.6:

  • @mfeurer
  • @Innixma
  • @franchuterivera
  • @LittleLittleCloud
  • @pplonski
  • @qingyun-wu
  • @a-hanf
  • @dev-rinchin
  • @mwever

- Python
Published by PGijsbers over 4 years ago

amlb - Adding support for new frameworks

New frameworks added since 1.5:

  • AutoXGBoost
  • MLJar-supervised
  • MLPlan

Upgraded existing framework versions.

Improved frameworks version management

In most cases, users can try an older or a more recent version of a given framework simply by creating a local framework definition with the version they want to use (https://github.com/openml/automlbenchmark/blob/master/docs/HOWTO.md#framework-definition) and forcing the framework setup (python runbenchmark.py my_framework -s force).

Run on OpenML suites and tasks directly

Specify benchmark as openml/s/X or openml/t/Y to run on a suite or task, e.g.: python runbenchmark.py randomforest openml/s/218.
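
As a rough illustration of the two forms, the sketch below splits such a benchmark argument into its kind and numeric ID. The function `parse_benchmark` is hypothetical and written only for this example; it is not how runbenchmark.py is known to parse its arguments.

```python
import re

# Matches 'openml/s/<id>' (suite) or 'openml/t/<id>' (task).
_OPENML_REF = re.compile(r"^openml/(?P<kind>[st])/(?P<id>\d+)$")


def parse_benchmark(spec):
    """Return ('suite' | 'task', id) for an OpenML reference, else None."""
    match = _OPENML_REF.match(spec)
    if match is None:
        return None  # e.g. the name of a locally defined benchmark
    kind = "suite" if match.group("kind") == "s" else "task"
    return kind, int(match.group("id"))
```

For example, `parse_benchmark("openml/s/218")` yields `("suite", 218)`.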

Bug fixes

- Python
Published by sebhrusen over 5 years ago