Releases | Open Source Science

zeldarose - v0.12.0

Changed

Python bumped to >= 3.10, tests up to 3.13
Bumped datasets to >= 3.0, < 3.2
Bumped lightning to < 2.6
Removed the hard dependency on sentencepiece. Users can still install it if the tokenizer they use needs it, but sentencepiece's release policy is too brittle to let it block us, especially since it is only a quality-of-life dependency for us.
Bumped torch to < 2.7
Bumped tokenizers to < 0.22
Bumped transformers to allow < 5.0, skipping versions from 4.41 to 4.43
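
The transformers constraint above combines an upper bound with a skipped range. A minimal sketch of that check in plain Python follows. The function names are hypothetical, it only handles plain release strings such as "4.42.1", and it assumes that "from 4.41 to 4.43" covers every 4.41.x, 4.42.x and 4.43.x release. No lower bound is checked because these notes do not state one.

```python
def parse_release(version: str) -> tuple[int, int]:
    """Return (major, minor) from a plain release string such as '4.42.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def transformers_allowed(version: str) -> bool:
    """Check the upper bound and skipped range described for v0.12.0.

    Assumption: the skipped range includes all 4.41.x, 4.42.x and 4.43.x
    releases. The real packaging metadata may express this differently.
    """
    release = parse_release(version)
    if release >= (5, 0):
        # Upper bound: anything from 5.0 onwards is excluded.
        return False
    # Skipped range: 4.41 through 4.43, both ends assumed inclusive.
    return not ((4, 41) <= release <= (4, 43))


if __name__ == "__main__":
    for candidate in ("4.40.2", "4.42.0", "4.44.0", "5.0.0"):
        print(candidate, transformers_allowed(candidate))
```

Tuple comparison keeps the sketch free of third-party dependencies; a real check would more likely rely on a proper version-specifier parser.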

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.11.0...v0.12.0

- Python
Published by LoicGrobol over 1 year ago

zeldarose - v0.11.0

Changed

Transformer training now writes several environment dumps to its output dir to help with reproducibility and bug reporting.

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.10.0...v0.11.0

- Python
Published by LoicGrobol about 2 years ago

zeldarose - v0.10.0

Changed

Bumped minimal (Pytorch) Lightning version to 2.0.0
Pytorch compatibility changed to >= 2.0, < 2.4
🤗 datasets compatibility changed to >= 2.18, < 2.20
Added support for the new lightning precision plugins.

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.9.0...v0.10.0

- Python
Published by LoicGrobol about 2 years ago

zeldarose - v0.9.0

Fixed

Training a m2m100 model on a language (code) not originally included in its tokenizer now works.

Changed

Pytorch compatibility changed to >= 2.0, < 2.3
🤗 datasets compatibility changed to >= 2.18, < 2.19

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.8.0...v0.9.0

- Python
Published by LoicGrobol about 2 years ago

zeldarose - v0.8.0

Fixed

Fixed multiple saves when using step-save-period in conjunction with batch accumulation (closes #30)

Changed

Maximum Pytorch compatibility bumped to 2.1
max_steps and max_epochs can now be set in the tuning config. Setting them via command line options is deprecated and will be removed in a future version.

- Python
Published by LoicGrobol over 2 years ago

zeldarose - v0.7.3 — Bug Fix

Fixed

Behaviour when asking for denoising in mBART with a model that has no mask token.

- Python
Published by LoicGrobol over 3 years ago

zeldarose - v0.7.2 — Now with a doc??!?

Fixed

In mBART training, loss scaling now works as it was supposed to.
We have documentation now! Check it out at https://zeldarose.readthedocs.io, it will get better over time (hopefully!).

- Python
Published by LoicGrobol over 3 years ago

zeldarose - v0.7.1 Bug fix

Fixed

Translate loss logging is not always zero anymore.

- Python
Published by LoicGrobol over 3 years ago

zeldarose - Now with mBART translations!

The main highlight of this release is the addition of mBART training as a task, so far slightly different from the original one, but similar enough to work in our tests.

Added

The --tf32-mode option selects the level of NVIDIA Ampere matmul optimisations.
The --seed option fixes the random seed.
The mbart task allows training general seq2seq and translation models.
A zeldarose command that serves as entry point for both tokenizer and transformer training.

Changed

BREAKING --use-fp16 has been replaced by --precision, which also allows using fp64 and bfloat. Previous behaviour can be emulated with --precision 16.
Removed GPU stats logging from the profile mode since Lightning stopped supporting it
Switched TOML library from toml to tomli
BREAKING Bumped the min version of several dependencies
- pytorch-lightning >= 1.8.0
- torch >= 1.12
Bumped max version of several dependencies
- datasets < 2.10
- pytorch-lightning < 1.9
- tokenizers < 0.14
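
For scripts that still pass --use-fp16, the breaking change can be handled by rewriting the argument list before invoking the tool. The sketch below is one way to do that. The helper name is hypothetical and not part of zeldarose; the only thing taken from the notes is that --precision 16 emulates the old flag.

```python
def migrate_precision_flag(argv: list[str]) -> list[str]:
    """Replace the removed --use-fp16 flag with '--precision 16'.

    Hypothetical helper for illustration; all other arguments pass through
    unchanged and in their original order.
    """
    migrated: list[str] = []
    for arg in argv:
        if arg == "--use-fp16":
            # Documented equivalent of the old behaviour.
            migrated.extend(["--precision", "16"])
        else:
            migrated.append(arg)
    return migrated


if __name__ == "__main__":
    print(migrate_precision_flag(["--use-fp16"]))
```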

- Python
Published by LoicGrobol over 3 years ago

zeldarose - v0.6.0 — Dependencies compatibilities

This release fixes compatibility issues with our dependencies: it bumps minimal versions and adds upper version limits.

Changed

Bumped torchmetrics minimal version to 0.9
Bumped datasets minimal version to 2.4
Bumped torch max version to 1.12

Fixed

Dataset fingerprinting/caching issues #31

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.5.0...v0.6.0

- Python
Published by LoicGrobol almost 4 years ago

zeldarose - v0.5.0 — Housekeeping

The minor bump is because we have several new minimal version requirements (and fairly recent ones at that). Otherwise, this is mostly internal stuff.

Added

A lint extra that installs linting tools and plugins
Config for flakeheaven
Support for pytorch-lightning 1.6

Changed

Move packaging config to pyproject.toml and require setuptools>=61.
click_pathlib is no longer a dependency and click has a minimal version of 8.0.3

Full Changelog: https://github.com/LoicGrobol/zeldarose/compare/v0.4.0...v0.5.0

- Python
Published by LoicGrobol about 4 years ago

zeldarose - v0.4.0 — experimental ELECTRA

Added

Replaced Token Detection (ELECTRA-like) pretraining
- Some of the API is still provisional, the priority was to get it out, a nicer interface will hopefully come later.
--val-check-period and --step-save-period options, which allow evaluating and saving a model independently of epochs. This should be useful for training with very long epochs.
The datasets path in zeldarose-transformer can now be 🤗 hub handles. See --help.

Changed

The command line options have been changed to reflect changes in Lightning
- --accelerator is now used for devices, tested values are "cpu" and "gpu"
- --strategy now specifies how to train, tested values are None (missing), "ddp", "ddp_sharded", "ddp_spawn" and "ddp_sharded_spawn".
- No more option to select sharded training, use the strategy alias for that
- --n-gpus has been renamed to --num-devices.
- --n-workers and --n-nodes have been respectively renamed to --num-workers and --num-nodes.
Training task configs now have a type config key to specify the task type
Lightning progress bars are now provided by Rich
Now supports Pytorch 1.11 and Python 3.10
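
The pure renames in the list above (--n-gpus, --n-workers, --n-nodes) are mechanical, so old command lines can be updated with a small lookup table. The sketch below assumes options are passed either as separate tokens or in --name=value form. The helper and table names are hypothetical, and the options whose meaning changed (--accelerator, --strategy) are deliberately left alone because they need a manual review.

```python
# Old option name -> new option name, as listed in the v0.4.0 notes.
RENAMED_OPTIONS = {
    "--n-gpus": "--num-devices",
    "--n-workers": "--num-workers",
    "--n-nodes": "--num-nodes",
}


def rename_options(argv: list[str]) -> list[str]:
    """Rewrite renamed options, keeping values and argument order intact."""
    renamed: list[str] = []
    for arg in argv:
        # Split '--name=value' once; plain tokens yield empty sep and value.
        name, sep, value = arg.partition("=")
        renamed.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return renamed


if __name__ == "__main__":
    print(rename_options(["--n-gpus", "2", "--n-workers=4"]))
```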

Internal

Tests now run in Pytest using the console-scripts plugin for smoke tests.
Smoke tests now include ddp_spawn tests and tests on gpu devices if available.
Some refactoring for better factorization of the common utilities for MLM and RTD.

- Python
Published by LoicGrobol over 4 years ago

zeldarose - v0.3.4 — Lightning bump

Just bumping pytorch-lightning to the current minor version.

- Python
Published by LoicGrobol over 4 years ago

zeldarose - v0.3.3 — bugfix release

Changed

max_steps is automatically inferred from the tuning config if a number of lr decay steps is given
max_epochs is now optional (if both max_steps and max_epochs are unset and no lr schedule is provided, Lightning's default will be used)
find_unused_parameters is now disabled in DDP mode, unless in profile mode
Bumped lightning to 1.4.x

Fixed

Linear decay now properly takes the warmup period into account
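
To illustrate what taking the warmup period into account can mean, here is a sketch of a common linear warmup plus linear decay schedule. It is not zeldarose's actual implementation: the function name, the parameters and the choice to decay over the steps remaining after warmup are all assumptions made for this example.

```python
def lr_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier applied to the base learning rate at a given step.

    Assumed schedule: rise linearly from 0 to 1 during warmup, then fall
    linearly from 1 to 0 over the remaining total_steps - warmup_steps.
    """
    if total_steps <= warmup_steps:
        raise ValueError("total_steps must be greater than warmup_steps")
    if step < warmup_steps:
        return step / warmup_steps
    # Decay is measured from the end of warmup, not from step 0.
    remaining = total_steps - step
    return max(0.0, remaining / (total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 5, 10, 60, 110):
        print(step, lr_factor(step, warmup_steps=10, total_steps=110))
```

Measuring the decay from the end of warmup means the factor reaches exactly 1 when warmup finishes and exactly 0 at total_steps, with no jump between the two phases.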

- Python
Published by LoicGrobol over 4 years ago

zeldarose - v0.3.2 — switch to torchmetrics

We now internally use torchmetrics, which improves the stability of accuracy computations

Fixed

Accuracy should stop NaN-ing
Empty lines in datasets are now ignored

- Python
Published by LoicGrobol about 5 years ago

zeldarose - v0.3.0 — flattening some creases

Changed

Stop saving tokenizers in legacy format
Create data dirs if they don't exist

- Python
Published by LoicGrobol about 5 years ago

zeldarose - v0.2.0 – Now eating less RAM

Added

--checkpoint option to load an existing lightning checkpoint
DDP sharding is now also possible with ddp_spawn

Changed

Text datasets are now loaded line-by-line by default and the block mode has been removed.
We now use 🤗 datasets as backend, so the datasets are implemented as memory-mapped files with dynamic loaders instead of being held in RAM. This significantly decreases RAM consumption for a very decent speed cost and allows us to train on much larger datasets.
GPU usage is now logged in --profile mode when relevant.
LR is now logged.

Removed

The --line-by-line flag has been removed, since this is now the default behaviour.
The zeldarose-create-cache has been removed, since dataset processing now works correctly in ddp.
The data module has been completely rewritten and the Dataset classes are no more.
mlm.masked_accuracy since it was not used anywhere.

Fixed

Logging has been improved for internal pytorch warnings, pytorch-lightning and 🤗 transformers.

- Python
Published by LoicGrobol about 5 years ago

zeldarose - v0.1.1

- Python
Published by LoicGrobol about 5 years ago
