Recent Releases of tango

tango - v1.3.2

What's new

Fixed ✅

  • Fix issues with gcloud auth in beaker executor.

Commits

2ad79f8 Fix issues with gcloud auth (#609) 4a1ebea Fix readthedocs config (#608)

- Python
Published by github-actions[bot] over 2 years ago

tango - v1.3.1

What's new

Fixed ✅

  • Minor bugs in the GSWorkspace().

Changed ⚠️

  • Added CLI-style execution functions for experiments defined in Python.
  • Added display() to ExecutorOutput for producing a table that summarizes the run.

Commits

4c8ae5a CLI-style execution for Python-defined experiments (#600) 8bb3472 GS workspace bug fixes (#607)

- Python
Published by github-actions[bot] over 2 years ago

tango - v1.3.0

What's new

Added 🎉

  • Added the Workspace.remove_step() method to safely remove steps.
  • The GSWorkspace() can now be initialized with Google Cloud bucket subfolders.

Changed ⚠️

  • The BeakerExecutor now uses the HEAD commit at the time the executor is instantiated to execute a step, instead of the HEAD commit at the time the step is run.

Fixed ✅

  • Removed unnecessary code coverage dev requirements.
  • Fixed issue where new version of torch caused no LR schedulers to be registered.
  • Updated pinned versions of jax, jaxlib, and flax.

Commits

ed72140 Prepare for release v1.3.0 5d776d5 Revert "Prepare for release v1.3.0" 8eec3df Revert "remove folder" 671525f remove folder 1a384f7 Prepare for release v1.3.0 56c1476 Use commit at time executor is instantiated (#605) 11b5229 'Remove step' feature for local workspaces (#588) 3857415 GS workspaces can be bucket subfolders (#604) b955ef7 CI errors be gone (#601) 01077eb LR schedulers (#573) 3a19688 Bug fix in getting results from GS workspace (#582) 4c3edce style: migrate to ruff (#562) 416ffa6 Add CITATION.cff file (#572) 70f2681 Update torch requirement from <1.14,>=1.9 to >=1.9,<2.1 (#539) 1ee2c56 Update wandb requirement from <0.13.11,>=0.12 to >=0.12,<0.14.3 (#547) fcf1010 Bump sentencepiece from 0.1.97 to 0.1.98 (#548) 7d04cde Bump mypy from 1.0.1 to 1.2.0 (#555) fbb068b Bump allenai/beaker-run-action from 1.1 to 1.2 (#534) a717d70 Bump black from 23.1.0 to 23.3.0 (#554) 42322bb fix readthedocs 825ec19 remove stray pkl file 3638cdc refactor: move packaging information to pyproject.toml (#549) e86ad65 Bump black from 23.1.0 to 23.3.0 (#543) 69e7574 remove unnecessary code coverage deps (#550)

- Python
Published by github-actions[bot] over 2 years ago

tango - v1.2.1

What's new

Added 🎉

  • Added the following workspace methods to support the Tango viz UI: Workspace.search_registered_runs(), Workspace.search_step_info(), Workspace.num_registered_runs(), and Workspace.num_steps().

Fixed ✅

  • Fixed a bug where FromParams would fail to parse when an object takes a Step argument directly.
  • Changed a name so we don't override the built-in name set.
  • Fixed a bug that would cause O(n^2) memory consumption in dense step graphs.

Commits

258f440 Only one step object (#545) 07abee5 Don't override the name set (#544) 4784bab Return more info from Workspace.search_registered_runs() (#536) 3d2d890 Fix bug when a FromParams object takes a Step argument directly (#535) 38561d0 New paginated workspace search methods (#489) 7a25e3e Fix datasets typing issue (#531) 810b742 minor updates to CI (#529)

- Python
Published by github-actions[bot] almost 3 years ago

tango - v1.2.0

What's new

Added 🎉

  • You can now add arguments to steps without invalidating the cache. See Step.SKIP_DEFAULT_ARGUMENTS.
  • Fixed integration status messages in tango info command.
  • Added abstractions for RemoteClient, RemoteStepCache, and RemoteWorkspace.
  • Added a GS integration that comes with GSWorkspace, a remote Workspace implementation that uses Google Cloud Storage.
  • You can now bind functional steps to the underlying Step instance with @step(bind=True), meaning the first argument to the function will be a Step.
  • Added ShellStep for running arbitrary shell commands.
  • Added @make_registrable decorator to make arbitrary functions registrable, to make it easier to refer to them in tango configurations.
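
As a rough illustration of the Step.SKIP_DEFAULT_ARGUMENTS idea, the sketch below computes a cache key that ignores an argument whenever it still has its declared default, so adding that argument later does not change the key. This is not Tango's implementation: the cache_key function, the JSON-plus-SHA-256 hashing scheme, and the num_workers argument are all hypothetical and exist only for this example.

```python
import hashlib
import json

# Hypothetical stand-in for a step's SKIP_DEFAULT_ARGUMENTS mapping:
# argument name -> default value that should not influence the cache key.
SKIP_DEFAULT_ARGUMENTS = {"num_workers": 0}


def cache_key(step_name: str, kwargs: dict) -> str:
    """Hash the step name and its arguments, ignoring skipped defaults."""
    relevant = {
        name: value
        for name, value in kwargs.items()
        # Drop an argument only when it is listed AND still has its default value.
        if not (name in SKIP_DEFAULT_ARGUMENTS and SKIP_DEFAULT_ARGUMENTS[name] == value)
    }
    payload = json.dumps([step_name, relevant], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


old_key = cache_key("train", {"lr": 0.1})                          # before the argument existed
new_key = cache_key("train", {"lr": 0.1, "num_workers": 0})      # new argument, default value
changed_key = cache_key("train", {"lr": 0.1, "num_workers": 4})  # new argument, non-default value
```

In this sketch old_key and new_key are identical, while changed_key differs because the argument no longer has its default value.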

Fixed ✅

  • Jsonnet parsing is now much faster and works on Windows.
  • Warnings about locks are now reliably printed every 30 seconds.
  • We now make sure Beaker jobs have the latest version of beaker-py, so that we're compatible with the latest API changes.
  • Stopping early now works when the metric doesn't change at all.
  • Fixed bug with FromParams which didn't handle variable length tuples correctly.
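
For readers unfamiliar with the variable-length tuple case, here is a minimal sketch of how a Tuple[T, ...] annotation can be told apart from a fixed-length one using the standard typing helpers. It is not the FromParams code; coerce_tuple is a hypothetical helper that only handles simple callable element types such as int or str.

```python
from typing import Any, Tuple, get_args, get_origin


def coerce_tuple(annotation: Any, values: list) -> tuple:
    """Convert a list of raw values to a tuple matching a Tuple[...] annotation."""
    if get_origin(annotation) is not tuple:
        raise TypeError(f"expected a Tuple annotation, got {annotation!r}")
    args = get_args(annotation)
    if len(args) == 2 and args[1] is Ellipsis:
        # Tuple[T, ...]: any number of elements, all of type T.
        return tuple(args[0](value) for value in values)
    if len(args) != len(values):
        raise ValueError(f"expected {len(args)} values, got {len(values)}")
    # Tuple[A, B, C]: fixed length, one type per position.
    return tuple(type_(value) for type_, value in zip(args, values))


variable = coerce_tuple(Tuple[int, ...], ["1", "2", "3"])
fixed = coerce_tuple(Tuple[int, str], ["1", 2])
```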

Changed ⚠️

  • The default log level for Tango is now warning.
  • You can specify multiple steps with -s from the tango run command.

Commits

985f6fa fix lint f77c0e0 fix release_notes script 2c9456d Prepare for release v1.2.0 f1dc63d Update wandb requirement from <=0.13.5,>=0.12 to >=0.12,<0.13.11 (#523) 32598c4 Update rich requirement from <13.0,>=12.3 to >=12.3,<14.0 (#498) 35ca0f0 Bump actions/checkout from 1 to 3 (#524) 739e40c Various dependencies (#525) 49f3afc Fix bug with variable length tuples (#527) 28ea796 minor workspace fixes (#526) 379095d GCSWorkspace (#417) c949416 Shell step + Registrable functions (#521) 308689b Allow specifying multiple steps with -s from tango run (#516) 9002169 Stop early same metric (#515) 4a21132 Add bind option to @step decorator (#512) 8a6775e Beaker-py upgrade in Beaker jobs (#509) 679700b Rjsonnet (#505) 3760419 Default log level warning (#508) 8d29321 Lock warning (#506) 521de99 Pin wandb requirement until fixed (#507) 6618a04 upgrades for beaker-py upgrades (#504) 34dbec4 Fix integration status messages in tango info command (#502) fbb7581 quick fix for Beaker-py upgrade a10fb3b clone to a src dir (#495) 65f699d Fix #483 (#484) 4a183de Fix bug with extra uncacheable dependencies (#480) ac0a193 Skip default args (#481)

- Python
Published by github-actions[bot] about 3 years ago

tango - v1.1.0

What's new

Added 🎉

  • Added gpu_type field to StepResources. The BeakerExecutor can use this to determine which clusters to submit a step to.
  • Added machine field to StepResources. You can set this to "local" when using the BeakerExecutor to force it to run the step locally.
  • Added --ext-var argument to tango run for setting JSONNET external variables when loading the experiment config.
  • Added @step() decorator to create Step classes from functions.
  • Added the transformers::with_soft_prompt integration, to make soft-prompted prefix transformers easy.

Removed 👋

  • Removed PyTorch Lightning integration.
  • Removed tango server command and --serve/--no-serve option for tango run.
  • Removed source_release.py, which was checked in by accident.

Fixed ✅

  • Fixed issue where Executor parallelism option in a Tango settings file would be ignored.
  • Fixed a bug where the unique ID of a step that depends on a key-value of the result of another step could change if the name of the other step changes.
  • Fixed a bug where importing certain libraries (like torchmetrics) would mess with our exception handling because they set sys.excepthook for some reason. Now we always reset sys.excepthook after importing.
  • The type hints for the flax trainer suggested that the training split is optional when in fact it's mandatory.
  • Made BeakerWorkspace / BeakerStepLock more robust when a job is preempted.
  • Minor performance improvements for the Beaker executor and workspace.
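
The sys.excepthook fix above follows a simple save-and-restore pattern, sketched below with only the standard library. This is an illustration of the pattern, not Tango's code; import_preserving_excepthook is a hypothetical helper name.

```python
import importlib
import sys


def import_preserving_excepthook(module_name: str):
    """Import a module, then restore whatever sys.excepthook was set beforehand."""
    saved_hook = sys.excepthook
    try:
        return importlib.import_module(module_name)
    finally:
        # Undo any hook the imported library installed as an import side effect.
        sys.excepthook = saved_hook
```

Restoring in a finally clause means the hook is reset even when the import itself raises.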

Commits

73bfa86 Soft prompts (#231) 79b7d01 Beaker integration perf improvements d455541 Train is not optional (#474) 3eab580 Beaker integration perf improvements (#475) 241b4eb Remove step that was never supposed to be there (#478) 5f5ba41 Add @step() decorator (#476) c0c4ae0 Make BeakerStepLock robust to preempted jobs 39d3d66 Remove tango server (#470) fef9bba reset sys.excepthook after importing other modules, add --ext-var CLI option (#471) ccebb4c Automatically remove ephemeral Beaker datasets 81d773e BeakerExecutor improvements, fix bug with StepIndexer (#469) c05a80a Bump sphinx-copybutton from 0.5.0 to 0.5.1 (#467) 147e408 Remove PyTorch Lightning integration (#468) cd4f626 Update more-itertools requirement from <9.0,>=8.0 to >=8.0,<10.0 (#456) d219dff Fix doc build failure ee71c09 Fix bug with setting parallelism in settings file (#466) 1609eb4 Bump mypy from 0.982 to 0.991 (#465)

- Python
Published by github-actions[bot] about 3 years ago

tango - v1.0.2

What's new

Changed ⚠️

  • BeakerScheduler can now return a list of clusters.

Commits

64215d8 Bump mypy from 0.982 to 0.990 (#463) acc1082 Update torch requirement from <1.13,>=1.9 to >=1.9,<1.14 (#460) 570d24e Use new constraints field for cluster assignment (#462) 5afd89f Set Beaker client User-Agent header to Tango v..* (#459) 9d9628f Fix progress logging statement from BeakerExecutor (#458)

- Python
Published by github-actions[bot] over 3 years ago

tango - v1.0.1

What's new

Fixed ✅

  • LightningTrainStep can now take a Lazy model object, which results in a guaranteed deterministic hash.
  • Fixed issue where remote Workspace implementations like WandbWorkspace and BeakerWorkspace would use the same local cache regardless of the W&B / Beaker workspace being used.
  • Fixed bug with TorchEvalStep when constructing callbacks.
  • Fixed some import error issues caused when an integration is not installed.
  • Fix incorrect reporting of final results in MulticoreExecutor.

Changed ⚠️

  • Wandb step cache retries the API call in case of timeout.
  • beaker-py >= 1.11 required.

Commits

26e6416 Bump sphinx from 5.2.3 to 5.3.0 (#455) c10cec5 Retry wandb call (#450) f950b12 Use separate local cache dirs for each workspace (#451) 4c70161 Uncacheable step failure should be reported (#453) 9476942 Bump black from 22.8.0 to 22.10.0 (#445) 821c0fc Bump furo from 2022.9.15 to 2022.9.29 (#435) 8d0b146 Allow Lazy model input to LightningTrainStep (#448) 12eec56 Fix import issues when missing integrations (#447) d16997c Fix bug with TorchEvalStep (#442) 9a5f792 Bump beaker-py to version >=1.11 (#443) f5c2e52 Add to FAQ (#440)

- Python
Published by github-actions[bot] over 3 years ago

tango - v1.0.0

This is the first stable release of AI2 Tango, the culmination of over a year's work and nearly 1000 commits!

We've been working on this project quietly for a while on the AllenNLP team, and it's being used daily by researchers here. So we're excited to officially announce it now that we're happy with the API 🎉

What's new since the last release

Added 🎉

  • Added step_extra_dependencies input field to Step class that can be used to force a dependency on another step even if the current step doesn't directly depend on the output of the other step. See #418 for more context.

Changed ⚠️

  • beaker-py >= 1.10 required.

Fixed ✅

  • Long log lines will be soft-wrapped to ensure that links are clickable.
  • Fixed a bug where some workspaces could be left in a bad state if a step's Format failed to serialize the step's result in Workspace.step_finished().
  • Sometimes functions and methods end up as arguments to steps, which means we have to hash them. Instead of taking a hash of the function, we now take a hash of the function's module and name.
  • Fixed a bug with the Beaker executor where it would hang at the end of a run if a step failed that is a dependency of another step.
  • Fixed tests to work with new version of transformers.
  • Fixed Executor.execute_sub_graph_for_step() to be able to run the step's dependencies in parallel.
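
The function-hashing change above can be pictured with a small sketch that fingerprints a function by where it lives rather than by its contents. This is an assumption-laden illustration, not Tango's hashing code: function_fingerprint is a hypothetical name, and the use of __qualname__ and SHA-256 is a choice made for this example.

```python
import hashlib
import json


def function_fingerprint(fn) -> str:
    """Fingerprint a function by its module and qualified name, not its bytecode."""
    identity = f"{fn.__module__}.{fn.__qualname__}"
    return hashlib.sha256(identity.encode("utf-8")).hexdigest()


dumps_fingerprint = function_fingerprint(json.dumps)
loads_fingerprint = function_fingerprint(json.loads)
```

A fingerprint built this way stays the same across interpreter sessions, but it also does not change when the function body is edited.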

Commits

a48a825 Only install gh in entrypoint if needed (#439) 69b948d Improve error handling for step state edge case (#429) 39b3132 Fix Executor.execute_sub_graph_for_step() (#438) 408944e Loosen range on some dependencies (#437) 76cd46d Follow up fix for #401 (#434) 04e7963 Fix bug with Beaker executor (#430) fb077e6 Bump fairscale from 0.4.9 to 0.4.11 (#432) 25f7373 Bump furo from 2022.6.21 to 2022.9.15 (#410) f11925c Bump mypy from 0.971 to 0.982 (#433) 4845317 Bump sphinx from 5.1.1 to 5.2.3 (#431) 572f83b Bump myst-parser from 0.18.0 to 0.18.1 (#426) e1fc2d3 Add step_extra_dependencies option to Step class (#419) 19927d3 Don't pin protobuf anymore (#428) 3dbe8c7 We need click 8 to work. (#427) cbdbe68 Hashing functions (#424) 747469f Mark Format serialization failures as step failure (#421) b1d6431 Ensure base path for included modules kept in sys.path (#406) e1a1cd1 Minor improvements to BeakerExecutor internals (#416) d3e1891 Ensure log lines get soft-wrapped so links are always clickable (#415)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.14.0

What's new

Added 🎉

  • Added a function to modify a Hugging Face transformer with IA3 adaptors.
  • Added a BeakerScheduler registrable class, specified as the argument scheduler to BeakerExecutor, which controls the resources assigned to steps run on Beaker. Users can implement their own BeakerScheduler subclasses to customize the resource assignment behavior.

Changed ⚠️

  • In the tango run command, --no-server is now the default. Use --server to start the server.

Fixed ✅

  • Made BeakerExecutor more robust to connection, timeout, SSL, and other recoverable HTTP errors.
  • Made the BeakerStepLock more robust, and as a result BeakerWorkspace is more robust and should require less manual intervention for locks in a bad state.
  • Fixed a bug with the internal scheduling logic of the BeakerExecutor which could delay submitting some steps in parallel.
  • Fixed a bug where creating a StepInfo object from params might result in unnecessary imports.
  • Fixed a bug where canceling the Beaker executor might not work properly.
  • Fixed a bug where the trainer trains too much when train_epochs is set and you're using gradient accumulation.
  • Fixed how the results of uncacheable steps are displayed by tango run.
  • Beaker executor won't run duplicate cacheable steps at the same time.

Commits

0828adc BeakerExecutor won't run duplicate cacheable steps (#414) 7382019 IA3 adaptors (#403) d498cf7 Hot fix to final output bff9ebf Add warning when steps can't be run yet, bug fixes (#408) c72552e Don't start server by default (#409) d34fe09 Added BeakerScheduler class for handling resource assignment (#407) 5dcbb56 Gradient accumulation and train_epochs (#402) d27bbef Make BeakerStepLock more robust (#401) cd9b5fd Fix bug with StepInfo.from_params(), canceling BeakerExecutor, reserve "ref" name (#400) 6ff6b9e Fix bug with scheduling logic (#399) 15196f2 Deterministic hashing for tensors (#398) 230d78e Make BeakerExecutor more robust to all recoverable errors types (connection, HTTP, SSL, timeout, etc) (#397) 5f63a27 Bump fairscale from 0.4.8 to 0.4.9 (#391)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.13.0

What's new

Added 🎉

  • You can now reference into a particular index of the result of another step in a config. For example: {type: "ref", ref: "some_previous_step", key: 0}. The key field can be an integer if the result of the referenced step is a list or tuple, or a string if the result of the referenced step is a dictionary.
  • Added priority parameter to Beaker executor for setting the default task priority for Beaker jobs.
  • Added Workspace.step_result() method for getting a step's result from the latest run.
  • tango run will now display a URL to the logs for failed steps when you use the BeakerExecutor.
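
To make the new "ref" plus "key" syntax concrete, the sketch below resolves such a reference against a plain dictionary of step results. It only mirrors the behavior described above (integer keys for lists and tuples, string keys for dictionaries). The resolve_ref function and the results mapping are hypothetical and are not part of Tango.

```python
from typing import Any, Dict


def resolve_ref(value: Any, results: Dict[str, Any]) -> Any:
    """Replace a {"type": "ref", ...} object with the referenced step result."""
    if isinstance(value, dict) and value.get("type") == "ref":
        result = results[value["ref"]]
        if "key" in value:
            # Integer key for list/tuple results, string key for dict results.
            return result[value["key"]]
        return result
    return value  # Not a reference: pass the value through unchanged.


# Hypothetical results of two earlier steps.
results = {
    "some_previous_step": ["first", "second"],
    "metrics": {"accuracy": 0.9},
}
first = resolve_ref({"type": "ref", "ref": "some_previous_step", "key": 0}, results)
accuracy = resolve_ref({"type": "ref", "ref": "metrics", "key": "accuracy"}, results)
```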

Changed ⚠️

  • The TorchTrainStep now enables monitoring arbitrary model outputs during training. TorchTrainEngine.forward_train now returns a tuple loss, model_outputs for each micro batch and the list of model outputs for all micro batches in a batch is passed to the TrainCallback.log_batch and TrainCallback.post_batch.
  • Tango will now automatically search Python modules in the current working directory for registered classes so that you don't always need to use the --include-package setting.
  • The minimum supported Python version is now 3.8.
  • Added support for PyTorch Lightning 1.7.x
  • The Beaker Executor will no longer live-stream logs from Beaker jobs, but logs will be viewable on Beaker and more readable.
  • Only the Beaker executor requires a clean working directory.

Fixed ✅

  • Fixed a bug that did not allow a wandb artifact's type to be set from a step's metadata dictionary.
  • Fixed a bug with how the Beaker executor streams log lines from Beaker which sometimes resulted in messages missing some starting characters, and tqdm lines being duplicated.
  • Fixed a bug in the Beaker workspace where the lock dataset wouldn't be removed if the step was found to be in an invalid state.
  • Improved cluster choice logic in BeakerExecutor to ensure greater diversity of clusters when submitting many steps at once.
  • Fixed bug where sub-processes of the multicore executor would use the wrong executor if executor was defined in a tango.yml file.

Commits

4f89d55 Improve Beaker cluster choice logic (#392) e1ceae2 Display URL to logs for failed steps (#390) 3dc9591 Bump black from 22.6.0 to 22.8.0 (#380) c9ce257 Catch when Beaker experiments are stopped (#389) 0fe12e9 Fix issues with WandbWorkspace causing CI crash (#388) 342eb26 Keep parameters in Params objects to make error messages more readable (#375) 92f0354 Simplified beaker logging (#383) fd9d3cc Only the Beaker executor needs clean working directories (#373) 06f26ae Update wandb artifact type (#378) f6a6b70 Update base images, get us out of the latest infinite loop of pip madness (#382) 306986b Catch all errors when attempting log record decode (#379) 628caff Allowing indexing into step results in config (#371) 858cef8 Minor improvement to Beaker logging (#377) 7a5619e Add Workspace.step_result() method (#374) 0750d76 Fix bugs with how BeakerExecutor streams logs (#372) 6e8b107 Detailed train outputs (#369) bcd50d8 Update pytorch-lightning requirement from <1.7,>=1.6 to >=1.6,<1.8 (#349) 8ed0c86 Bump fairscale from 0.4.6 to 0.4.8 (#347) 62f2746 Python minimum version is 3.8 (#368) 45e02fe Auto import local Python modules when searching for registered classes (#367)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.12.0

What's new

Added 🎉

  • Step resources:
    • Added a step_resources parameter to the Step class which should be used to describe the computational resources required to run a step. Executor implementations can use this information. For example, if your step needs 2 GPUs, you should set step_resources=StepResources(gpu_count=2) ("step_resources": {"gpu_count": 2} in the configuration language).
    • Added a Step.resources() property method. By default this returns the value specified by the step_resources parameter. If your step implementation always requires the same resources, you can just override this method so you don't have to provide the step_resources parameter.
  • Step execution:
    • Added an executor field to the tango.yml settings. You can use this to define the executor you want to use by default.
    • Added a Beaker Executor to the Beaker integration, registered as an Executor with the name "beaker". To use this executor, add these lines to your tango.yml file:

      ```yaml
      executor:
        type: beaker
        beaker_workspace: ai2/my-workspace
        clusters:
          - ai2/general-cirrascale
      ```

      See the docs for the BeakerExecutor for more information on the input parameters.
  • Step class:
    • Added a metadata field to the step class API. This can be set through the class variable METADATA or through the constructor argument step_metadata.
  • Weights & Biases integration:
    • You can now change the artifact kind for step result artifacts by adding a field called "artifact_kind" to a step's metadata. For models, setting "artifact_kind" to "model" will add the corresponding artifact to W&B's new model zoo.

Changed ⚠️

  • CLI:
    • The tango run command will throw an error if you have uncommitted changes in your repository, unless you use the --allow-dirty flag.
    • The tango run command will use the lightweight base executor (single process) by default. To use the multi-process executor, set -j/--parallelism to 1 or higher or -1 to use all available CPU cores.
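
The -j/--parallelism rules above can be summarized in a few lines of Python. This is a sketch of the described behavior only, not the CLI's actual code. The worker_count helper is hypothetical, and treating 0 the same as "not set" is an assumption of this sketch.

```python
import os
from typing import Optional


def worker_count(parallelism: Optional[int]) -> Optional[int]:
    """Return the number of worker processes, or None for the single-process executor."""
    if parallelism is None or parallelism == 0:
        # Option not given (or 0, by assumption): lightweight base executor.
        return None
    if parallelism == -1:
        # -1 means "use all available CPU cores".
        return os.cpu_count() or 1
    if parallelism >= 1:
        return parallelism
    raise ValueError(f"invalid parallelism value: {parallelism}")
```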

Fixed ✅

  • Fixed bug where StepInfo environment and platform metadata could be out-of-date if a step is run again due to failure.
  • Fixed a bug where an unfortunate combination of early stopping and decreasing model performance could result in a crash in the torch trainer.

Commits

befb00a Add workspace_metadata arg to Step class, allow changing artifact kind in W&B workspace (#363) 5ab1c2a Fix undefined behavior with TorchTrainStep (#366) bf3c1a0 Update filelock requirement from <3.8,>=3.4 to >=3.4,<3.9 (#354) b4e48a7 Update jsonpickle requirement from <2.2.0,>=2.1.0 to >=2.1.0,<2.3.0 (#351) 1c491f0 Update wandb requirement from <0.13,>=0.12 to >=0.12,<0.14 (#350) 93d5eb4 Bump allenai/setup-beaker from 1 to 2 (#359) dc0f89a Fix #355 - ensure git metadata is up-to-date (#361) 258e880 Raise better error msg from step_result_for_run() (#360) 43916d1 Print debugging information about the repo used. (#353) 928aa7a Add BeakerExecutor (#340)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.11.0

What's new

Added 🎉

  • Added a Flax integration along with an example config.

Commits

b4cd2b3 Flax Integration (#313) b9a7422 Bump sphinx from 5.0.2 to 5.1.1 (#346) d7952ef Bump mypy from 0.961 to 0.971 (#339) 6a58bfd Put PIP install instructions first (#348)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.10.1

What's new

Fixed ✅

  • Fixed issue where the StepInfo config argument could be parsed into a Step.
  • Restored capability to run tests out-of-tree.

Commits

2498318 Fix issue where StepInfo config could be parsed into a Step (#344) 57096b2 Make tests runnable out-of-tree for help with conda-packaging (#307)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.10.0

What's new

Changed ⚠️

  • Renamed workspace parameter of BeakerWorkspace class to beaker_workspace.
  • Executor class is now a Registrable base class. MulticoreExecutor is registered as "multicore".

Removed 👋

  • Removed StepExecutionMetadata. Its fields have been absorbed into StepInfo.

Fixed ✅

  • Improved Step.ensure_result() such that the step's result doesn't have to be read from the cache.
  • Fixed an issue with the output from MulticoreExecutor such that it's now consistent with the default Executor for steps that were found in the cache.
  • One of our error messages referred to a configuration file that no longer exists.
  • Improved performance of BeakerWorkspace.

Added 🎉

  • Added the ability to train straight Model instead of just Lazy[Model]

Commits

4e809f5 Eager models (#319) 361777b Metadata changes, make executor registrable (#331) a6b0be9 Beaker workspace performance (#328) f43e5ea Update torch requirement from <1.12,>=1.9 to >=1.9,<1.13 (#330) 8495c64 update dev dependencies (#333) 712d862 Make multicore executor output consistent with default (#325) 903569c Refer to the right config file (#324) bd9e4be Modernize our issue templates (#323)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.9.1

What's new

Fixed ✅

  • Fixed non-deterministic behavior in TorchTrainStep.
  • Fixed bug in BeakerWorkspace where .step_info(step) would raise a KeyError if the step hasn't been registered as part of a run yet.
  • Fixed a bug in BeakerWorkspace where it would send too many requests to the beaker service.
  • Fixed a bug where WandbWorkspace.step_finished() or .step_failed() would crash if called from a different process than .step_starting().
  • Fixed a bug in WandbWorkspace.step_finished() which led to a RuntimeError sometimes while caching the result of a step.

Commits

c6fc5be Fix bugs with Workspace and WandbWorkspace, specifically (#321) 80c90ca Beaker DOS fix (#315) 8b75591 Log from BeakerStepLock at WARNING level (#316) 4d46d67 fix non-deterministic behavior in TorchTrainStep (#314) c59b6b3 Bump actions/setup-python from 3 to 4 (#311) b02cf40 Bump sphinx from 4.5.0 to 5.0.1 (#305) 4501815 Bump furo from 2022.6.4 to 2022.6.4.1 (#309) da9c29c Fix bug in Beaker workspace (#312) e8422cb Bump mypy from 0.960 to 0.961 (#308) 8256a74 Bump myst-parser from 0.17.2 to 0.18.0 (#310) 44ae92e Bump furo from 2022.4.7 to 2022.6.4 (#306) 39923ae Update protobuf requirement from <=3.20.0 to <4.22.0 (#301) e7ef1f5 Registerables first steps eg (#304)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.9.0

What's new

Added 🎉

  • Added a Beaker integration that comes with BeakerWorkspace, a remote Workspace implementation that uses Beaker Datasets under the hood.
  • Added a datasets::dataset_remix step that provides the split remixing functionality of tango.steps.dataset_remix.DatasetRemixStep, now for Hugging Face DatasetDict.

Changed ⚠️

  • If you try to import something from a tango integration that is not fully installed due to missing dependencies, an IntegrationMissingError will be raised instead of ModuleNotFound.
  • You can now set -j 0 in tango run to disable multicore execution altogether.

Fixed ✅

  • Improved how steps and workspaces handle race conditions when different processes are competing to execute the same step. This would result in a RuntimeError before with most workspaces, but now it's handled gracefully.
  • Fixed bug which caused GradScaler state to not be saved and loaded with checkpoints.

Commits

0ddd2ac Add Beaker integration (#296) 6bdd1dd Updates the Euler example (#297) bc89470 GradScaler state saving and loading (#293) b8562db fix old filename in CONTRIBUTING.md (#300) 4aff1bb Dataset remix (#298) eb1fcd8 Bump mypy from 0.950 to 0.960 (#295) 903741e Update filelock requirement from <3.7,>=3.4 to >=3.4,<3.8 (#284) b58b823 Handle missing integrations (#292)

- Python
Published by github-actions[bot] over 3 years ago

tango - v0.8.0

What's new

Added 🎉

  • Added a Weights & Biases remote Workspace implementation: WandbWorkspace, registered as "wandb". This can be instantiated from a workspace URL in the form "wandb://entity/project".
  • Added a method Workspace.step_result_for_run which gives the result of a step given the run name and step name within that run.
  • Added property Workspace.url, which returns a URL for the workspace that can be used to instantiate the exact same workspace using Workspace.from_url(). Subclasses must implement this.
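
The "wandb://entity/project" URL form splits cleanly with the standard library, as the sketch below shows. It is only an illustration of the URL shape. The parse_wandb_url helper is hypothetical and says nothing about how Workspace.from_url() is actually implemented.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_wandb_url(url: str) -> Tuple[str, str]:
    """Split a wandb://entity/project URL into (entity, project)."""
    parsed = urlparse(url)
    if parsed.scheme != "wandb":
        raise ValueError(f"not a wandb URL: {url!r}")
    entity = parsed.netloc            # the part right after "wandb://"
    project = parsed.path.strip("/")  # the path without its slashes
    if not entity or not project:
        raise ValueError(f"expected wandb://entity/project, got {url!r}")
    return entity, project


entity, project = parse_wandb_url("wandb://entity/project")
```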

Changed ⚠️

  • StepInfo start and end times will always be in UTC now.
  • WandbTrainCallback now logs system metrics from each worker process in distributed training.
  • StepCache.__contains__() and StepCache.__getitem__() now accept either a Step or StepInfo as an argument (Union[Step, StepInfo]).
  • Refactored tango.step_graph.StepGraph to allow initialization from a Dict[str, Step].
  • Executor.execute_step_graph() now attempts to execute all steps and summarizes success/failures.

Fixed ✅

  • Fixed bug with LocalWorkspace.from_parsed_url() (#278).
  • Deprecation warnings will now be logged from tango CLI.
  • Fixed the text format in the case of serializing an iterator of string.
  • Added missing default value of None to TangoGlobalSettings.find_or_default().
  • Mypy has become incompatible with transformers and datasets, so we have to disable the checks in some places.
  • The VERSION member of step arguments that were wrapped in Lazy were not respected. Now they are.

Commits

3069226 Makes sure the VERSION parameter of classes is respected even when we construct them inside of a Lazy object. (#289) dd71446 Add Weights & Baises remote workspace (#232) e3f2bd2 Adds a dependency that's missing from transformers (#285) 25919e1 Fixes the text format (#283) 381de74 Add missing default to TangoGlobalSettings.find_or_default() (#282) 9ac708a Update click requirement from <8.1.3,>=7.0 to >=7.0,<8.1.4 (#277) 749357e Bump mypy from 0.942 to 0.950 (#276) 2c59c96 Bump allenai/beaker-run-action from 1.0 to 1.1 (#274) 53ffe80 refactor (#275)

- Python
Published by github-actions[bot] almost 4 years ago

tango - v0.7.0

What's new

Added 🎉

  • Added the "-n/--name" option to tango run. This option allows the user to give the run an arbitrary name.
  • Added a convenience property .workspace to Step class that can be called from a step's .run() method to get the current Workspace being used.
  • Gave FromParams objects (which includes all Registrable objects) the ability to version themselves.
  • Added CLI option to run a single step in a config using --step-name or -s.
  • Added a MulticoreExecutor that executes steps in parallel.
  • Added an ExecutorOutput dataclass that is returned by Executor.execute_step_graph().
  • StepGraph now prints itself in a readable way.
  • Tango now automatically detects when it's running under a debugger, and disables multicore support accordingly. Many debuggers can't properly follow sub-processes, so this is a convenience for people who love debuggers.
  • Added more models to the stuff we can import from the transformers library.
  • Added new example for finetuning text-to-text models.
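
One common way to detect a debugger in Python is to check whether a trace function is installed, and the sketch below uses that check to fall back to a single process. Whether Tango uses this exact check is an assumption. Both function names are hypothetical, and other tools that install a trace function (coverage tools, for instance) would also trigger it.

```python
import sys


def running_under_debugger() -> bool:
    """Heuristic: most Python debuggers install a trace function via sys.settrace()."""
    return sys.gettrace() is not None


def effective_parallelism(requested: int) -> int:
    """Fall back to a single process when a debugger appears to be attached."""
    return 1 if running_under_debugger() else requested
```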

Changed ⚠️

  • Renamed click_logger to cli_logger, and we now use rich's logging Handler as the default handler, which means prettier output, better tracebacks, and you can use rich's markup syntax with the cli_logger to easily add style to text.
  • Refactored tango.step_graph.StepGraph to allow initialization from a Dict[str, Step].
  • Executor.execute_step_graph() now attempts to execute all steps and summarizes success/failures.
  • Upgraded PyTorch version in tango Docker image to latest v1.11.0+cu113.
  • RunGeneration now allows model object as input.

Fixed ✅

  • Fixed bug that mistakenly disallowed fully-qualified names containing "_" (underscores) in the config.
  • Fixed bug where TorchTrainStep working directory would be left in an unrecoverable state if training failed after saving the final model weights.
  • Fixed bug in FromParams where **kwargs might be passed down to the constructors of arguments.
  • Fixed bug in the way dependencies are tracked between steps.
  • Fixed bug that caused MulticoreExecutor to hang in case of a failing step that was required recursively (not directly) downstream.
  • Compatibility with PyTorch Lightning 1.6

Commits

1083049 Finetuning (#255) 42b1dba Bug fix with failing steps (#257) 7bd251a Bump myst-parser from 0.17.0 to 0.17.2 (#273) cc9a1dd Bump actions/upload-artifact from 2 to 3 (#262) 66777d9 Bump actions/download-artifact from 2 to 3 (#261) 14d4adb use new beaker-action for building test image (#265) af47287 Update pytorch-lightning requirement from <1.6,>=1.5 to >=1.5,<1.7 (#248) b1df9a4 use beaker-run action for GPU Tests (#263) 0a7468e fix release job (#260) c1b16b2 Bump furo from 2022.3.4 to 2022.4.7 (#259) b55aaf2 use beaker-py to submit GPU tests (#258) b2a93a9 Logging part 2: denoising run logging and making Dirk happy (#252) ff6be8d Update click requirement from <=8.0.4,>=7.0 to >=7.0,<8.1.3 (#254) 83d78cc Bump mypy from 0.941 to 0.942 (#243) 3769327 Bump sphinx from 4.4.0 to 4.5.0 (#245) 81fc5c5 Bump black from 21.12b0 to 22.3.0 (#246) e46059b Update tqdm requirement from <4.64,>=4.62 to >=4.62,<4.65 (#256) bbdeb6f Revert "Set `$TEMP` (#241)" b9fd9e9 Fix tracking dependencies between steps (#249) 53502e1 Pretty-print a step graph (#250) d5328c9 Fix dissimilar objects hashing to the same thing (#240) ccc37ce Autodetect debugger and turn off multicore (#251) 5c39f61 Pin click 5bb0fad Logging improvements (#233) 037e4a0 fix bug with FromParams (#242) e142530 Bump actions/cache from 2 to 3 (#236) 878402d Set `$TEMP` (#241) 2d9fa0c fix bug w/ TorchTrainStep working dir (#238) 410faeb Multicore Parallelism (#204) 9e8e99f Update datasets requirement from <2,>=1.12 to >=1.12,<3 (#234) 40e0a1a Bump mypy from 0.940 to 0.941 (#230) ede7428 add name to changelog workflow 4bb659b Bump actions/setup-python from 2 to 3 (#229) 5db1a6a Bump actions/checkout from 1 to 3 (#228) 8049104 Update torch version where it's hard-coded, add an automatic remind to do this stuff in the future (#227) fe05449 add back intersphinx inventory links for HF libraries (#222) 9927749 Bump mypy from 0.931 to 0.940 (#226) 29ab68b Update torch requirement from <1.11,>=1.9 to >=1.9,<1.12 (#225) a3fc83b 
Bump furo from 2022.2.23 to 2022.3.4 (#218) 28e839e Bump fairscale from 0.4.5 to 0.4.6 (#224) f18d393 Update tqdm requirement from <4.63,>=4.62 to >=4.62,<4.64 (#213) 54c4a8d automatically keep copyright up-to-date (#221) 06adb07 Allow setting the run name as a command-line option (#212) 71e0639 Update cached-path requirement from <1.1,>=1.0 to >=1.0,<1.2 (#217) 5d4660a Temporarily remove intersphinx links to HF docs (#220) 13c7f3f Merge pull request #216 from allenai/VersionForFromParams 0027cb2 Merge pull request #215 from allenai/fix-fully-qualified-name-recognition 76f9922 Add "Step.workspace" property (#210)

- Python
Published by github-actions[bot] almost 4 years ago

tango - v0.6.0

What's new

Added 🎉

  • New example that finetunes a pre-trained ResNet model on the Cats & Dogs dataset.
  • Added a '@requires_gpus' decorator for marking tests as needing GPUs. Tests marked with this decorator are run in the "GPU Tests" workflow on dual K80 GPUs via Beaker.
  • Added the "-w/--workspace" option to tango run and tango server commands. This option takes a path or URL, and instantiates the workspace from the URL using the newly added Workspace.from_url() method.
  • Added the "workspace" field to TangoGlobalSettings.
  • Added the "environment" field to TangoGlobalSettings for setting environment variables each time tango is run.
  • Added a utility function to get a StepGraph directly from a file.
  • Added tango.settings module and tango settings group of commands.
  • A format for storing sequences as SqliteSparseSequence
  • A way to massage kwargs before they determine the unique ID of a Step
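
The "-w/--workspace" option above accepts either a path or a URL. As a rough illustration of how a from_url()-style constructor can dispatch on the URL scheme, here is a minimal standard-library sketch. Everything in it (the registry, the "local" and "memory" scheme names, the tuple return values) is a hypothetical stand-in and not Tango's actual implementation.

```python
from urllib.parse import urlparse

# Hypothetical registry mapping URL schemes to factory functions.
# Illustrative only; not Tango's real workspace classes.
_WORKSPACE_SCHEMES = {}


def register_scheme(scheme):
    """Register a factory function for the given URL scheme."""
    def decorator(factory):
        _WORKSPACE_SCHEMES[scheme] = factory
        return factory
    return decorator


@register_scheme("local")
def make_local(parsed):
    # A real implementation would return a workspace object; a tuple is
    # enough to show which branch was taken and with what path.
    return ("local", parsed.netloc + parsed.path)


@register_scheme("memory")
def make_memory(parsed):
    return ("memory", None)


def workspace_from_url(url):
    """Pick a factory based on the URL scheme; bare paths count as local."""
    parsed = urlparse(url)
    scheme = parsed.scheme or "local"
    if scheme not in _WORKSPACE_SCHEMES:
        raise ValueError(f"No workspace registered for scheme '{scheme}'")
    return _WORKSPACE_SCHEMES[scheme](parsed)
```

A registry keyed by scheme keeps the command-line option unchanged when new workspace types are added, since each type only has to register its own scheme.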

Changed ⚠️

  • local_workspace.ExecutorMetadata renamed to StepExecutionMetadata and now saved as execution-metadata.json.
  • tango run without the option "-w/--workspace" or "-d/--workspace-dir" will now use a MemoryWorkspace instead of a LocalWorkspace in a temp directory, unless you've specified a default workspace in a TangoGlobalSettings file.
  • Moved tango.workspace.MemoryWorkspace and tango.local_workspace.LocalWorkspace to tango.workspaces.*.
  • Moved tango.step_cache.MemoryStepCache and tango.step_cache.LocalStepCache to tango.step_caches.*.
  • Deprecated the -d/--workspace-dir command-line option. Please use -w/--workspace instead.

Fixed ✅

  • Fixed a small bug where LocalWorkspace would fail to capture the conda environment in our Docker image.
  • Fixed activation of FILE_FRIENDLY_LOGGING when set from the corresponding environment variable.
  • Fixed setting log level via the environment variable TANGO_LOG_LEVEL.
  • Use relative paths within the work_dir for symbolic links to the latest and the best checkpoints in TorchTrainStep.
  • Fixed some scenarios where Tango can hang after finishing all steps.
  • distributed_port and log_every parameters won't factor into TorchTrainStep's unique ID.
  • MappedSequence now works with slicing.
  • MappedSequence now works with Huggingface Dataset.
  • Uncacheable steps are now visible in Tango UI.
  • Fixed bug in Registrable.list_available() where an error might be raised if the default implementation hadn't been explicitly imported.
  • Fixed issue where having a default argument to the run() method wasn't getting applied to the step's unique ID.
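
Two of the fixes above concern what goes into a step's unique ID: default arguments of run() should count, while arguments such as log_every should not. The sketch below shows one way to get both behaviours with the standard library. The function names, the skip set, and the hashing scheme are assumptions made for illustration and do not reflect how Tango computes its IDs.

```python
import hashlib
import inspect
import json

# Illustrative set of argument names that should not influence the ID.
SKIPPED_ARGUMENTS = {"log_every"}


def unique_id(func, *args, **kwargs):
    """Hash the fully bound arguments of func, defaults included."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in defaults the caller did not pass
    relevant = {
        name: value
        for name, value in bound.arguments.items()
        if name not in SKIPPED_ARGUMENTS
    }
    payload = json.dumps(relevant, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{func.__name__}-{digest[:16]}"


def run(lr=0.1, epochs=3, log_every=10):
    """Stand-in for a step's run() method."""
    return lr * epochs


# Passing a default explicitly, or changing a skipped argument,
# leaves the ID unchanged; changing a real input does not.
same_as_default = unique_id(run) == unique_id(run, lr=0.1)
skip_is_ignored = unique_id(run) == unique_id(run, log_every=1)
input_changes_id = unique_id(run) != unique_id(run, lr=0.2)
```

Applying defaults before hashing means a cached result stays valid whether or not the caller spelled out the default value.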

Commits

f9da0af Merge pull request #211 from allenai/Massage e78dcbe Allow setting environment variables in tango settings, fix bug with TANGOLOGLEVEL env var (#209) 82404b6 Re-create LICENSE so GitHub will show it (#208) 0fadecf Bump furo from 2022.2.14.1 to 2022.2.23 (#207) 787b6e6 Merge pull request #206 from allenai/settings c3401f2 Merge pull request #205 from allenai/RobustnessFixes 7ceda9c Merge pull request #201 from allenai/workspace-prep 6dd7d86 Merge pull request #200 from allenai/uncacheable-steps-in-server 5ad3f44 Bump furo from 2022.1.2 to 2022.2.14.1 (#199) 3528230 Update filelock requirement from <3.5,>=3.4 to >=3.4,<3.7 (#202) 21d6d40 Merge pull request #193 from allenai/StepGraphFromFile 258a7d2 skip 'distributed_port' and 'log_every' in unique ID (#197) dd4c47f Merge pull request #192 from allenai/CloseSqliteHarder cc94e1c Merge pull request #156 from allenai/DocumentationRefresh 5cc86b8 Rename "ExecutorMetadata" -> "StepExecutionMetadata" (#195) 6aecab7 Bump myst-parser from 0.16.1 to 0.17.0 (#191) 6478293 make pushing test image to Beaker more robust (#190) 7c1ac5b Finetune resnet Example for Tango (#150) 5187b01 update docs for integration tests and gpu tests timeout 95b78b5 Add new manually triggered workflow for integration tests, other bug fixes (#188) 19f7b31 Merge pull request #189 from allenai/fix-checkpoint-path-link a438b26 Workflow quickfix 671a6dc verify exit code of beaker job (#187) 7ccad94 Merge pull request #186 from allenai/add-tests bf6ecd0 Run GPU tests on Beaker (#183)

- Python
Published by github-actions[bot] almost 4 years ago

tango - v0.5.0

What's new

Added 🎉

  • Added TrainingEngine abstraction to torch integration.
  • Added FairScale with a FairScaleTrainingEngine that leverages FairScale's FullyShardedDataParallel. This is meant to be used within the TorchTrainStep.
  • All PyTorch components (such as learning rate schedulers, optimizers, data collators, etc.) from the transformers library are now registered under the corresponding class in the torch integration. For example, the transformers Adafactor optimizer is registered as an Optimizer under the name "transformers::Adafactor". More details can be found in the documentation for the transformers integration.

Changed ⚠️

  • Various changes to the parameters of the TorchTrainStep due to the introduction of the TrainingEngine class.
  • Params logged as DEBUG level instead of INFO to reduce noise in logs.
  • The waiting message for FileLock is now clear about which file it's waiting for.
  • Added an easier way to get the default Tango global config
  • Most methods of TorchTrainCallback now also take an epoch parameter.
  • WandbTrainCallback now logs peak GPU memory occupied by PyTorch tensors per worker. This is useful because W&B's system metrics only display the total GPU memory reserved by PyTorch, which is always higher than the actual amount of GPU memory occupied by tensors. So these new metrics give a more accurate view into how much memory your training job is actually using.
  • Plain old Python functions can now be used in Lazy objects.
  • LocalWorkspace now creates a symlink to the outputs of the latest run.
  • Tango is now better at guessing when a step has died and should be re-run.
  • Tango is now more lenient about registering the same class under the same name twice.
  • When you use dict instead of Dict in your type annotations, you now get a legible error message. Same for List, Tuple, and Set.
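
To illustrate what it means for a plain function to be used in a lazy object, here is a minimal sketch of a deferred-construction wrapper. The Lazy class below is a simplified, hypothetical stand-in written for this example. It is not Tango's Lazy, and make_schedule is an invented function.

```python
from typing import Any, Callable, Generic, List, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Hypothetical stand-in: stores a callable plus some of its keyword
    arguments, and only calls it once the remaining ones are known."""

    def __init__(self, constructor: Callable[..., T], **fixed_kwargs: Any) -> None:
        self._constructor = constructor
        self._fixed_kwargs = fixed_kwargs

    def construct(self, **extra_kwargs: Any) -> T:
        # Later keyword arguments override the ones fixed up front.
        return self._constructor(**{**self._fixed_kwargs, **extra_kwargs})


def make_schedule(base_lr: float, warmup_steps: int) -> List[float]:
    """A plain function, not a class: linear warmup up to base_lr."""
    return [base_lr * (step + 1) / warmup_steps for step in range(warmup_steps)]


# The warmup length is known from configuration; the learning rate is
# supplied only when the schedule is actually needed.
lazy_schedule = Lazy(make_schedule, warmup_steps=4)
schedule = lazy_schedule.construct(base_lr=1.0)
```

Because the wrapper only needs a callable, the same mechanism works for classes and functions alike.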

Fixed ✅

  • Fixed a bug in Registrable and FromParams where registered function constructors would not properly construct arguments that were classes.
  • Fixed a bug in FromParams that would cause a crash when an argument to the constructor had the name params.
  • Made FromParams more efficient by only trying to parse the params as a Step when it looks like it actually could be a step.
  • Fixed bug where Executor would crash if git command could not be found.
  • Fixed bug where validation settings were not interpreted the right way by the torch trainer.
  • When you register the same name twice using Registrable, you get an error message. That error message now contains the correct class name.

Commits

a39a69f Merge pull request #161 from allenai/FromParamsDuJour 3063a92 CHANGELOG quick fix cd006ae Add TrainEngine abstraction to TorchTrainStep, add FairScale integration, improve transformers integration (#77) 93438eb Update setuptools requirement from <=59.5.0 to <60.8.0 (#170) e57dd91 Bump sphinx-copybutton from 0.4.0 to 0.5.0 (#174) a8b1bdc split Docker build into seperate workflow, only run when necessary (#178) 59c91f7 make install comments work on all shells (#179) a059416 Merge pull request #160 from allenai/GuessStepDirBetter de7195d more fixes for conda-forge (#177) 75e9d42 use conda in Docker image, multi-stage build (#172) 611e446 Merge pull request #176 from allenai/latest-outputs 7241d20 Merge pull request #175 from allenai/self-contained-tests 83aa692 Merge pull request #153 from allenai/LazyWithoutFromParams 893e601 use virtualenv within Docker (#167) 178b8bd Merge pull request #171 from allenai/LenientRegister 6c765c8 Merge pull request #169 from allenai/InformativeFileLock 91ff7ac Merge pull request #168 from allenai/DefaultGlobalConfig 5d602fb push Docker images to GHCR.io (#166) 2b26fc8 set 'resume' to 'allow' instead of 'auto' (#155) 26771e7 fix bug when git missing (#163) 9009119 Add Dockerfile (#162) a02155d Add a required flag to the README for gpt2-example (#159)

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0

What's new

Changed ⚠️

  • Default log level is WARNING instead of ERROR.
  • The web UI now renders the step graph left-to-right.
  • The web UI now shows runs by date, with the most recent run at the top.
  • The web UI now shows steps in a color-coded way.
  • The --include-package flag now also accepts paths instead of module names.

Fixed ✅

  • Ensure tqdm log lines always make it into the log file out.log even when log level is WARNING or ERROR.

Commits

5ff51d6 Fix GPT2 example. (#158) 4011482 make --include-package accept paths (#157) 92b8fe5 Merge pull request #148 from allenai/RunsWithDates a4417e5 fix gpt2 config 797e3e8 minor logging tweaks (#145) 42654e6 Prepare for release v0.4.0rc5 df301ef Merge pull request #119 from allenai/RunGeneration 42535c9 Add TorchEvalStep to torch integration, use "devicecount" in TorchTrainStep instead of "devices" (#120) ecc6087 Store log output to a file in run directory, other logging improvements (#132) 9008255 Merge pull request #141 from allenai/dependabot/pip/sphinx-4.4.0 8e09b66 Merge pull request #139 from allenai/remove-no-logging 26318b2 Merge pull request #133 from allenai/dependabot/pip/mypy-0.931 7c91c4b Merge pull request #126 from allenai/dependabot/pip/sphinx-4.3.2 a178076 Merge pull request #128 from allenai/dependabot/pip/mypy-0.930 106a8cf Merge pull request #130 from allenai/dependabot/pip/furo-2022.1.2 6db7b4c Merge pull request #129 from allenai/run-name-at-end b93d22d Add typing info to package (PEP 561) (#131) e8867ae fix StopEarlyCallback state recovery 465a525 Prepare for release v0.4.0rc4 7a65540 CHANGELOG quick fix 31622d1 add logo to docs and README (#121) b17d325 fix bug with StepInfo (#122) d5698a0 Bump myst-parser from 0.16.0 to 0.16.1 (#118) 69d1dc8 Bump mypy from 0.910 to 0.920 (#117) 9739944 Prepare for release v0.4.0rc3 20138ce improve release notes generation script 760b4f2 Add DatasetsFormat, making LoadDataset cacheable, fix bug with KeyboardInterrupt (#114) e51691f Improvements to W&B callback (#115) d044f6e Add pre/post epoch callbacks (#113) ae1ae0b Bump myst-parser from 0.15.2 to 0.16.0 (#111) c605a1e Merge pull request #90 from allenai/SqliteDictFormat 8d6804d Prepare for release v0.4.0rc2 f404541 Merge pull request #110 from allenai/Conda c154c92 Merge pull request #101 from allenai/Euler 288e02f ensure all integrations are imported if we can't find registered name (#109) e033170 Bump black from 21.11b1 to 21.12b0 (#102) b2781fd FAQ 
not FAQs :) (#108) 81e1225 add FAQs to docs (#107) 2148b74 Merge pull request #106 from allenai/Favicon f812b73 fix bug with resolving lazy step inputs (#105) 110fc79 Merge pull request #104 from allenai/why-tango eab62d8 fix bug in distributed training (#103) 901631f Removed scary warning e86b843 Better summary 3bfd7c1 adjust dependency pinning (#100) 72aaa53 Merge pull request #79 from allenai/jon/html-viz d45ec0c Merge pull request #99 from allenai/SkipGitTests 98b022b make sure workspaces can be imported from base module (#98) 07ac494 Merge pull request #97 from allenai/lower-click-pin 5ccefa9 fix prelease indicator condition in CI 04a5ab8 Prepare for release v0.4.0rc1 b2c09e3 fix typo in example (#96) aba5758 Merge pull request #94 from allenai/dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0 48b0b24 Merge pull request #92 from allenai/dependabot/pip/datasets-gte-1.12-and-lt-1.17 0107672 Merge pull request #91 from allenai/dependabot/pip/sphinx-4.3.1 7d9d919 Merge pull request #93 from allenai/NoOverrides b5907de Merge pull request #67 from allenai/ResponsibleSteps 20951ea Bump furo from 2021.11.16 to 2021.11.23 (#89) 8bb00c4 Bump black from 21.11b0 to 21.11b1 (#88) f240ac4 update filelock + cachedpath, improve release scripts (#87) eba4b8e Bump black from 21.10b0 to 21.11b0 (#86) bc80bb8 Merge pull request #85 from allenai/dependabot/pip/filelock-gte-3.3-and-lt-3.5 17d28c7 Bump furo from 2021.11.15 to 2021.11.16 (#84) 4118912 Merge pull request #82 from allenai/dependabot/pip/furo-2021.11.15 cb1b853 Merge pull request #62 from allenai/dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 30f7a13 W&B log as step+1 (#76) aab58b6 add some conda instructions to CONTRIBUTING.md (#81)

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0rc5

What's new

Added 🎉

  • Added TorchEvalStep to torch integration, registered as "torch::eval".

Changed ⚠️

  • Renamed aggregate_val_metric to auto_aggregate_val_metric in TorchTrainStep.
  • devices parameter to TorchTrainStep replaced with device_count: int.
  • Run name printed at the end of a run so it's easier to find.
  • Type information added to package data. See PEP 561 for more information.
  • A new integration, transformers, with two new steps for running seq2seq models.
  • Added logging_tqdm, if you don't want a progress bar, but you still want to see progress in the logs.
  • Added threaded_generator(), for wrapping generators so that they run in a separate thread from the generator's consumer.
  • Added a new example for evaluating the T0 model on XSum, a summarization task.
  • Added MappedSequence for functionally wrapping sequences.
  • Added TextFormat, in case you want to store the output of your steps in raw text instead of JSON.
  • Steps can now list arguments in SKIP_ID_ARGUMENTS to indicate that the argument should not affect a step's unique id. This is useful for arguments that affect the execution of a step, but not the output.
  • Step now implements __str__, so steps look pretty in the debugger.
  • Added DatasetCombineStep, a step that combines multiple datasets into one.
  • Added common.logging.initialize_worker_logging() function for configuring logging from worker processes/threads.
  • Logs from tango run ... will be written to a file called out.log in the run directory.
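
The idea behind threaded_generator(), running a generator in a separate thread from its consumer, can be sketched with a bounded queue. The function below is an illustrative re-implementation written under that assumption, with a made-up name and queue_size parameter. It is not Tango's code and it is deliberately simplified.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

_SENTINEL = object()  # marks the end of the stream


def threaded_generator_sketch(source: Iterable[T], queue_size: int = 8) -> Iterator[T]:
    """Yield items from source while a background thread produces them.

    Simplifications: exceptions raised by the producer are not re-raised
    in the consumer, and a consumer that stops early leaves the daemon
    producer thread blocked on a full queue.
    """
    buffer: queue.Queue = queue.Queue(maxsize=queue_size)

    def producer() -> None:
        try:
            for item in source:
                buffer.put(item)  # blocks while the queue is full
        finally:
            buffer.put(_SENTINEL)

    thread = threading.Thread(target=producer, daemon=True)
    thread.start()

    while True:
        item = buffer.get()
        if item is _SENTINEL:
            break
        yield item
    thread.join()


squares = list(threaded_generator_sketch(x * x for x in range(5)))
```

The bounded queue lets the producer run ahead of the consumer by at most queue_size items, which helps when producing an item (for example, loading a batch) is slow compared to consuming it.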

Fixed ✅

  • Fixed torch StopEarlyCallback state not being recovered properly on restarts.
  • Fixed file friendly logging by removing special styling characters.
  • Ensured exceptions are captured in logs.
  • LocalWorkspace now works properly with uncacheable steps.
  • When a Tango run got killed hard, with kill -9, or because the machine lost power, LocalWorkspace would sometimes keep a step marked as "running", preventing further executions. This still happens sometimes, but it is now much less likely (and Tango gives you instructions for how to fix it).
  • To make all this happen, LocalWorkspace now saves step info in a Sqlite database. Unfortunately that means that the workspace format changes and existing workspace directories won't work properly with it.
  • Fixed premature cleanup of temporary directories when using MemoryWorkspace

Commits

df301ef Merge pull request #119 from allenai/RunGeneration 42535c9 Add TorchEvalStep to torch integration, use "device_count" in TorchTrainStep instead of "devices" (#120) ecc6087 Store log output to a file in run directory, other logging improvements (#132) 9008255 Merge pull request #141 from allenai/dependabot/pip/sphinx-4.4.0 8e09b66 Merge pull request #139 from allenai/remove-no-logging 26318b2 Merge pull request #133 from allenai/dependabot/pip/mypy-0.931 7c91c4b Merge pull request #126 from allenai/dependabot/pip/sphinx-4.3.2 a178076 Merge pull request #128 from allenai/dependabot/pip/mypy-0.930 106a8cf Merge pull request #130 from allenai/dependabot/pip/furo-2022.1.2 6db7b4c Merge pull request #129 from allenai/run-name-at-end b93d22d Add typing info to package (PEP 561) (#131) e8867ae fix StopEarlyCallback state recovery

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0rc4

What's new

Fixed ✅

  • Fixed a bug where StepInfo fails to deserialize when error is an exception that can't be pickled.

Commits

7a65540 CHANGELOG quick fix 31622d1 add logo to docs and README (#121) b17d325 fix bug with StepInfo (#122) d5698a0 Bump myst-parser from 0.16.0 to 0.16.1 (#118) 69d1dc8 Bump mypy from 0.910 to 0.920 (#117)

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0rc3

What's new

Added 🎉

  • Added DatasetsFormat format and LoadStreamingDataset step to datasets integration.
  • Added SqliteDictFormat for datasets.
  • Added pre_epoch() and post_epoch() callback methods to PyTorch TrainCallback.
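
As a sketch of where pre_epoch() and post_epoch() hooks sit relative to a training loop, consider the toy loop below. The class names and the (step, epoch) signature are assumptions made for this example. The actual TrainCallback methods may take different arguments.

```python
from typing import List, Tuple


class TrainCallbackSketch:
    """Hypothetical base class with no-op epoch hooks."""

    def pre_epoch(self, step: int, epoch: int) -> None:
        pass

    def post_epoch(self, step: int, epoch: int) -> None:
        pass


class RecordingCallback(TrainCallbackSketch):
    """Records every hook call so the ordering can be inspected."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, int, int]] = []

    def pre_epoch(self, step: int, epoch: int) -> None:
        self.events.append(("pre", step, epoch))

    def post_epoch(self, step: int, epoch: int) -> None:
        self.events.append(("post", step, epoch))


def train(num_epochs: int, steps_per_epoch: int, callbacks: List[TrainCallbackSketch]) -> int:
    """Toy loop: hooks fire once before and once after each epoch."""
    step = 0
    for epoch in range(num_epochs):
        for callback in callbacks:
            callback.pre_epoch(step, epoch)
        for _ in range(steps_per_epoch):
            step += 1  # a real loop would run forward/backward here
        for callback in callbacks:
            callback.post_epoch(step, epoch)
    return step


recorder = RecordingCallback()
total_steps = train(num_epochs=2, steps_per_epoch=3, callbacks=[recorder])
```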

Changed ⚠️

  • LoadDataset step from datasets integration is now cacheable, using the DatasetsFormat format by default. But this only works with non-streaming datasets. For streaming datasets, you should use the LoadStreamingDataset step instead.

Fixed ✅

  • Fixed bug where KeyboardInterrupt exceptions were not handled properly by steps and workspaces.
  • WandbTrainCallback will now use part of the step's unique ID as the name for the W&B run by default, to make it easier to identify which tango step corresponds to each run in W&B.
  • WandbTrainCallback will save the entire TrainConfig object to the W&B config.

Commits

20138ce improve release notes generation script 760b4f2 Add DatasetsFormat, making LoadDataset cacheable, fix bug with KeyboardInterrupt (#114) e51691f Improvements to W&B callback (#115) d044f6e Add pre/post epoch callbacks (#113) ae1ae0b Bump myst-parser from 0.15.2 to 0.16.0 (#111) c605a1e Merge pull request #90 from allenai/SqliteDictFormat

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0rc2

What's new

Added 🎉

  • Sample experiment configurations that prove Euler's identity

Changed ⚠️

  • Loosened Click dependency to include v7.0.
  • Loosened datasets dependency.
  • Tightened petname dependency to exclude next major release for safety.

Fixed ✅

  • Workspace, MemoryWorkspace, and LocalWorkspace can now be imported directly from the tango base module.
  • Uncacheable leaf steps would never get executed. This is now fixed.
  • Failed steps were accidentally being treated as if they had completed.
  • The visualization had a problem with showing steps that never executed because a dependency failed.
  • Fixed a bug where Lazy inputs to a Step would fail to resolve arguments that come from the result of another step.
  • Fixed a bug in TorchTrainStep where some arguments for distributed training (devices, distributed_port) weren't being set properly.

Commits

f404541 Merge pull request #110 from allenai/Conda 4c347db Merge branch 'main' into Conda c154c92 Merge pull request #101 from allenai/Euler b3a8ae6 Revert "Make sure default steps are available when you run tango run" 76bda76 Merge remote-tracking branch 'origin/main' into Euler e073925 Revert "Import integrations safely" 5550ff8 Adds Conda to the readme 288e02f ensure all integrations are imported if we can't find registered name (#109) e033170 Bump black from 21.11b1 to 21.12b0 (#102) 2fd48cc Merge branch 'main' into Euler b2781fd FAQ not FAQs :) (#108) b31c8f2 Merge branch 'main' into Euler 81e1225 add FAQs to docs (#107) dec4a86 Merge remote-tracking branch 'origin/Euler' into Euler 23d7824 Brings back the Euler example 4327e22 Merge branch 'main' into Euler 2148b74 Merge pull request #106 from allenai/Favicon 5ce5c64 Merge branch 'main' into Favicon 18e556a Revert "Changelog" 97dd79d Merge branch 'main' into Euler f812b73 fix bug with resolving lazy step inputs (#105) 0a05047 Merge branch 'main' into Euler 89b315a Tango Favicon 110fc79 Merge pull request #104 from allenai/why-tango a67c617 Update README.md a5f5bd0 Merge branch 'main' into why-tango eab62d8 fix bug in distributed training (#103) e73a4a0 Update README.md eb3ef41 Merge branch 'main' into Euler c1c9bd0 Changelog 901631f Removed scary warning e86b843 Better summary cf9088d We no longer start the server during tests b257e46 Import integrations safely 10ec4f4 Pick the right mutable mapping a175370 Make sure default steps are available when you run tango run 1d4051a Make mypy happy 67e8a02 Moved complex arithmetic to https://github.com/allenai/tango-example 8a5c0d7 Adds test for steps that fail 3d551b3 Test for uncacheable leaf steps 21f0280 Don't explicitly run steps that don't need it 5fa4731 isort 9a3cd52 Changelog 6df6682 clarify comment c46e0b2 also pick any uncacheable direct dependencies of leaf steps c32a5cc Example steps for Euler's identity 1c97f10 Write step info for all steps before 
running any of them 3336a7e Reset step info when re-running a failed step 228b0f3 How did this slip by? f9f1d05 Make sure to execute uncacheable steps that are not a dependency of anything 3d44720 Add steps for complex arithmetic 3bfd7c1 adjust dependency pinning (#100) 72aaa53 Merge pull request #79 from allenai/jon/html-viz adea827 Merge branch 'jon/html-viz' of github.com:allenai/tango into jon/html-viz # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit. e771da7 make alert look nice d21a172 Merge branch 'main' into jon/html-viz d45ec0c Merge pull request #99 from allenai/SkipGitTests 28f17fa Don't check for git if we're not running in a repo 7db639e Merge branch 'main' into jon/html-viz 98b022b make sure workspaces can be imported from base module (#98) 07ac494 Merge pull request #97 from allenai/lower-click-pin f10b178 Merge branch 'main' into lower-click-pin 5ccefa9 fix prelease indicator condition in CI 9843c5a update CHANGELOG 91d6551 Merge branch 'main' into lower-click-pin 58dfd3e CHANGELOG.md 8a069f0 loosen Click requirement 85b7417 Shows an ugly but functional brief popup when you copy something 626f1b6 Merge branch 'main' into jon/html-viz b0cbf63 make tooltips meaningful since i cannot remove them 422ae0d Update tango/server/report.js 18869b7 Make local results work 3b1326a Python 3.7 again 5a282ff Warning to the future 2f54569 Give the option of tracking dependencies properly through Workspace f9f7bfb We don't have this method anymore. 77ae786 Silence a warning 0d532c5 Reformat setup.py only in Python 3.7? 
4f1e786 Changelog b500f0e Merge branch 'main' into jon/html-viz 5c0b794 Merge branch 'main' into jon/html-viz 5c455e2 Include assets in package 4153e9e Fix import order 85dd1af Print a direct link to the run if we have one 01f0343 Makes it so you can run the server from any directory 300b722 Make frontend and backend consistent in their terminology 0ef1350 Fix some errors in the server bd07ad5 Merge remote-tracking branch 'origin/ResponsibleSteps' into jon/html-viz 0cde4e7 pr updates 7bf9488 Makes tests pass e96f6f3 Merge branch 'ResponsibleSteps' into jon/html-viz 8800bc9 Merge branch 'ResponsibleSteps' into jon/html-viz 378f649 pr gix 5047f6b add logo 4a7487b add in reloading every second a1131ed pr fixes 95f1cbe pr fixes d47b624 pr fixes 5962699 pr fixes a974369 fix merge conflicts 3280491 initial code to serve and display tango viz

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.4.0rc1

What's new

Added 🎉

  • Introduced the concept of the Workspace, with LocalWorkspace and MemoryWorkspace as initial implementations.
  • Added a stub of a webserver that will be able to visualize runs as they happen.
  • Added separate classes for LightningTrainingTypePlugin, LightningPrecisionPlugin, LightningClusterEnvironmentPlugin, LightningCheckpointPlugin for compatibility with pytorch-lightning>=1.5.0.

Removed 👋

  • Removed old LightningPlugin class
  • Removed requirement of the overrides package

Changed ⚠️

  • Made it possible to construct a step graph out of Step objects, instead of constructing it out of StepStub objects.
  • Removed dataset fingerprinting code, since we can now use Step to make sure things are cached.
  • Made steps deterministic by default.
  • Brought back MemoryStepCache, so we can run steps without configuring anything.
  • W&B torch::TrainCallback logs with step=step+1 now so that training curves in the W&B dashboard match up with checkpoints saved locally and are easier to read (e.g. step 10000 instead of 9999).
  • filelock >= 3.4 required, parameter poll_intervall to tango.common.file_lock.FileLock.acquire renamed to poll_interval.

Fixed ✅

  • Fixed bug in FromParams where a parameter to a FromParams class may not be instantiated correctly if it's a class with a generic type parameter.

Commits

b2c09e3 fix typo in example (#96) aba5758 Merge pull request #94 from allenai/dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0 4ae6115 Update requirements.txt d4d0655 Merge branch 'main' into dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0 48b0b24 Merge pull request #92 from allenai/dependabot/pip/datasets-gte-1.12-and-lt-1.17 6044f8b Update cached-path requirement from <0.4.0,>=0.3.3 to >=0.3.3,<1.1.0 1ae82aa Merge branch 'main' into dependabot/pip/datasets-gte-1.12-and-lt-1.17 0107672 Merge pull request #91 from allenai/dependabot/pip/sphinx-4.3.1 fa47e54 Merge branch 'main' into dependabot/pip/datasets-gte-1.12-and-lt-1.17 38c0b42 Merge branch 'main' into dependabot/pip/sphinx-4.3.1 7d9d919 Merge pull request #93 from allenai/NoOverrides dfa461c Removes the dependency on the `overrides` package ff18197 Update datasets requirement from <1.16,>=1.12 to >=1.12,<1.17 fc69dd0 Bump sphinx from 4.3.0 to 4.3.1 b5907de Merge pull request #67 from allenai/ResponsibleSteps 267a6e4 clean up config usage 82862ef Merge branch 'main' into ResponsibleSteps 20951ea Bump furo from 2021.11.16 to 2021.11.23 (#89) 8d8670a Optional server 03049fa Handle the log level consistently c620405 Merge branch 'main' into ResponsibleSteps 8bb00c4 Bump black from 21.11b0 to 21.11b1 (#88) cd5a70a Fix tests 71cfbd7 Don't cache uncacheable steps 7839fd8 Merge branch 'main' into ResponsibleSteps f240ac4 update filelock + cached_path, improve release scripts (#87) 967ecb2 Merge branch 'main' into ResponsibleSteps 10634aa Don't show inherited from_params 4b104be Fix test 4e9910d Avoid a naming conflict in computer science 6db2c29 Improved documentation 44f79ec Added blurb 7f68a97 Use enum for step states 19e6de2 Click logging is disabled by default, enabled in the CLI use case eba4b8e Bump black from 21.10b0 to 21.11b0 (#86) 1d80766 Log the start of a run 6190e2d Merge branch 'main' into ResponsibleSteps 3550c3f Cleaner workspace docs 08d4056 Better StepCache docs d5ede4e Formatting bc80bb8 Merge 
pull request #85 from allenai/dependabot/pip/filelock-gte-3.3-and-lt-3.5 d69bac4 Check whether a run name already exists cde0f14 Unused import a80a87f Fix the case where a step's cacheability changes across restarts de5f248 Improve comment f13d717 Losely pin petname 5d72923 Merge branch 'main' into ResponsibleSteps 914902b Merge branch 'ResponsibleSteps' of https://github.com/allenai/tango into ResponsibleSteps 2f5a266 Merge pull request #83 from allenai/petew-ResponsibleSteps b3df88f Update filelock requirement from <3.4,>=3.3 to >=3.3,<3.5 17d28c7 Bump furo from 2021.11.15 to 2021.11.16 (#84) 8f2b48e add failing test case 1fc9860 Merge remote-tracking branch 'origin/main' into ResponsibleSteps 4118912 Merge pull request #82 from allenai/dependabot/pip/furo-2021.11.15 d5eb968 Merge branch 'main' into dependabot/pip/furo-2021.11.15 abac28a Fix the shortcut for running all (many) checks c4009de Merge branch 'main' into ResponsibleSteps e364766 Merge pull request #73 from allenai/petew-ResponsibleSteps 8ae0e27 More doctests 6211025 Format docs better 5814724 Fix docs cb1b853 Merge pull request #62 from allenai/dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 50caada Update requirements.txt 9c79b97 Makes the docs build 3ecb952 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 55c5fae Bump furo from 2021.11.12 to 2021.11.15 83add7e Important fixes 301347d Fix tests cfe29de Create workdir when requested 7b67142 Bring back "needed by" a46b3e8 This wasn't meant to be checked in. af4cc3c Use click through the logger ee81446 Merge branch 'ResponsibleSteps' into petew-ResponsibleSteps 7247528 We don't need this TODO right now. 9b8446c Merge branch 'main' into ResponsibleSteps 30f7a13 W&B log as step+1 (#76) eab2b7f Merge pull request #74 from allenai/Workspaces d91302b Simplify! 
e605399 changelog 9f360da Bring back deterministic step randomness, without breaking random step names a2c5cd8 fix order of imports 02ad624 Merge branch 'dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0' of https://github.com/allenai/tango into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 7f78b0a remove comment 8f8cbeb separating different plugin types b139181 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 761f873 Merge branch 'ResponsibleSteps' into Workspaces 12a480a Merge branch 'ResponsibleSteps' into petew-ResponsibleSteps b7d4e88 merge main 63f6029 Merge branch 'main' into ResponsibleSteps aab58b6 add some conda instructions to CONTRIBUTING.md (#81) e7e5c5e Fix symlink creation b547472 Adds a command to keep a server running permanently 596c278 Not sure how this line got lost 06d7681 Fixes and cleanup 3cebc24 fix bug caused by random seed d145216 fix comment 067436d clean up a4ce577 Creates and uses the concept of a workspace, so that the server can consume it 110eb07 ci e811f47 fix tests 460dd87 executor fixes 837a454 handle generic non-FromParams classes 7fed5a6 fix merge conflicts 12ae8e9 Fixing the torch test 0c208e1 Remove stale comment b44a0a0 Fix doctest 3e4eb42 Makes det_hash consistent across Python versions f23b56c Executable documentation! 38388ca Formatting 6aaa01e Fix some documentation 57825a0 Fix docs cb7591d Order imports correctly 🙄 8caea06 Removing unused imports f7f5d2e Changelog again 05ca4dd Changelog da16b5a Merge branch 'main' into ResponsibleSteps 8c51b45 Make nested steps work for classes that aren't FromParams 083516b Remember which extra modules we imported 54e7dbc We can't restore the registry like that. 
455b756 Refactors the test to fail in new and exciting ways 62cd1d2 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 cd68752 Don't need this comment anymore 76c10f1 Remove fingerprint stuff a1f1af8 Merge branch 'FixImport' into ResponsibleSteps ed9ebc9 Fix Import 14bc2ee Merge branch 'main' into ResponsibleSteps 4a4ab58 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0 e5ea1af Fix after merge 7b92dd9 Merge remote-tracking branch 'origin/main' into ResponsibleSteps d066768 We were not actually using this function df07690 Mypy inspired changes 87ee96f Formatting 3823f98 🤦🏼 d018838 Fix executor test a7fb4ea Fix executor 0b1f310 Quiet, you 449a70d Fixes circular references 4fc1f34 Actually write the circular reference test 0df3538 Throw the error for the right reason afb3b61 WithUnresolvedSteps 1be6816 Update pytorch-lightning requirement 3c4cb19 Dicts are iterable, so these have to be swapped efc84d4 Adds new failing test 1f2bf16 Relative imports don't work 430112f Fix bug in test c3dd08d Add test that fails 118f8c2 Better name for the test a6ac3c9 Detect unsatisfiable dependencies 8eb7a48 Bring back parsing everything as a Step first :-/ bc6f886 Type checks 00b2df6 Makes the test pass a6ba11a Make the code more compatible with the IDE a611d48 We don't need these anymore 132153d Formatting d019499 Start fixing hard tests f4d77c1 Fix trivial tests fa6de92 Make steps responsible for their own execution cfe7007 Slightly more readable error message c0ded32 Typo

- Python
Published by github-actions[bot] about 4 years ago

tango - v0.3.6

What's new

Added 🎉

  • Added a .log_batch() method on torch::TrainCallback which is given the average loss across distributed workers, but only called every log_every steps.

Removed 👋

  • Removed .pre_log_batch() method on torch::TrainCallback.

Fixed ✅

  • Fixed typo in parameter name remove_stale_checkpoints in TorchTrainStep (previously was remove_state_checkpoints).
  • Fixed bug in FromParams that would cause failures when from __future__ import annotations was used with Python older than 3.10. See PEP 563 for details.
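
As a rough illustration of the PEP 563 issue (not tango's actual fix), the sketch below shows why postponed annotations matter to any code that inspects type hints: the annotations arrive as strings and have to be resolved first. The `Config` class and `resolved_hints` helper are hypothetical, and the annotations are written as explicit strings to mimic what `from __future__ import annotations` produces.

```python
import typing


class Config:
    # Explicit string annotations mimic what PEP 563 stores for every annotation.
    hidden_size: "int"
    dropout: "float" = 0.1


def resolved_hints(cls: type) -> dict:
    """Evaluate string annotations back into real type objects."""
    return typing.get_type_hints(cls)


raw = Config.__annotations__     # values are the strings 'int' and 'float'
hints = resolved_hints(Config)   # values are the types int and float
```

Code that reads `__annotations__` directly sees strings, while `typing.get_type_hints()` returns the evaluated types.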

Commits

6b5cb24 support for PEP 563 in older Python versions (#80) 27657d9 Bump furo from 2021.10.9 to 2021.11.12 (#78) c8ac858 Bump sphinx from 4.2.0 to 4.3.0 (#75) 5516244 refactor TorchTrainStep (#70) b13bba7 Bump isort from 5.10.0 to 5.10.1 (#71)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.5

What's new

Fixed ✅

  • Fixed a bug in FromParams where the "type" parameter was ignored in some cases where the Registrable base class did not directly inherit from Registrable.

Commits

5bdad24 Merge pull request #69 from allenai/weird-mix-fix 5434e23 test case e1108a5 CHANGELOG.md 2bbfa6f fix weird FromParams/Registrable bug

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.4

What's new

Added 🎉

  • Added StopEarlyCallback, a TorchTrainCallback for early stopping.
  • Added parameter remove_stale_checkpoints to TorchTrainStep.

Changed ⚠️

  • Minor changes to TorchTrainCallback interface.
  • Weights & Biases TorchTrainCallback now logs best validation metric score.

Commits

c126442 torch train improvements (#64)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.3

What's new

Added 🎉

  • Added support for PEP 604 in FromParams, i.e. writing union types as "X | Y" instead of "Union[X, Y]".
  • [internals] Added a spot for miscellaneous end-to-end integration tests (not to be confused with "tests of integrations") in tests/end_to_end/.
  • [internals] Core tests now run on all officially supported Python versions.
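
For the PEP 604 item above, a minimal sketch of how annotation-driven code can treat "X | Y" and "Union[X, Y]" alike. This is an assumption about the general technique, not tango's implementation; `union_members` is a hypothetical helper.

```python
import sys
import types
import typing
from typing import Optional, Union

# types.UnionType only exists on Python 3.10+, so look it up defensively.
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def union_members(annotation):
    """Return the member types of a union annotation, or None if it is not a union."""
    if typing.get_origin(annotation) in _UNION_ORIGINS:
        return typing.get_args(annotation)
    return None


pairs = union_members(Union[int, str])    # (int, str)
optional = union_members(Optional[int])   # (int, NoneType)
plain = union_members(int)                # None

if sys.version_info >= (3, 10):
    # The "X | Y" spelling is only valid at runtime on Python 3.10 and later.
    assert union_members(int | str) == (int, str)
```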

Fixed ✅

  • Fixed a bug in FromParams where non-FromParams class parameters were not instantiated properly (or at all).
  • Fixed a bug in FromParams where kwargs were not passed on from a wrapper class to the wrapped class.
  • Fixed small bug where some errors from git would be printed when executor metadata is created outside of a git repository.

Commits

ea6d2e5 another FromParams fix (#66) eeb1560 Update datasets requirement from <1.15,>=1.12 to >=1.12,<1.16 (#60) fe6dbe0 Bump isort from 5.9.3 to 5.10.0 (#61) 9f302b3 PEP 604 support (#59) 634bd71 Merge pull request #63 from allenai/StepGraphTests 9f513d6 Rename stepgraph.py to stepgraph_test.py 604001f tweak checkpointing/validate step 0920349 fix bug with git metadata (#56) af1b438 fix another FromParams bug, add spot for miscellaneous end-to-end tests (#58)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.2

What's new

Fixed ✅

  • Fixed a bug with FromParams that caused .from_params() to fail when the params contained an object that was already instantiated.
  • The tango command no longer installs a SIGTERM handler, which fixes some bugs with integrations that use multiprocessing.

Commits

ab47e21 Merge pull request #55 from allenai/no-sigterm-handler b759c46 remove sigterm handler c4e96cb fix example a1f9ec5 fix FromParams bug (#54) d5ef0ae Bump black from 21.9b0 to 21.10b0 (#53) b20f5b2 fix typo

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.1

What's new

Changed ⚠️

  • Updated the LightningTrainStep to optionally take in a LightningDataModule as input.

Commits

4d77160 Merge pull request #52 from allenai/fix-typo 92874c2 fix release docs 26e36e4 Merge pull request #51 from allenai/lightning-data-module b10d256 update changelog 6d068a4 fix import order 1744f55 adding option for data module

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.3.0

What's new

Added 🎉

  • Added IterableDatasetDict, a version of DatasetDict for streaming-like datasets.
  • Added a PyTorch Lightning integration with LightningTrainStep.

Fixed ✅

  • Fixed bug with FromParams and Lazy where extra arguments would sometimes be passed down through to a Lazy class when they shouldn't.

Commits

a95dbae fix bugs with initializing lightning loggers and plugins (#50) 7e9c354 fix bug w/ DataLoader and PTL d634ea8 add isort (#49) 277b0e2 add IterableDatasetDict (#46) db06e70 Merge pull request #45 from allenai/pytorch-lightning 6174381 fix CHANGELOG 501bf73 fix failing test 686be74 Merge branch 'main' into pytorch-lightning 4f3f328 add torch:: to torch integrations 293fac1 doc and general fixes 2b2a6a3 Update tango/integrations/pytorchlightning/init.py d46a0ba Update docs/source/api/integrations/pytorchlightning.rst 3aeb825 fix docs 375ff58 update docs c6a1e38 update ci 8110242 PyTorch Lightning Integration 60121c0 update docs, print unicode characters by name (#44) 7beab21 clean up docs 2f6871b only print ascii characters (#43)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.2.4

What's new

Changed ⚠️

  • The --file-friendly-logging flag is now an option of the main tango command, so it must be passed before run, e.g. tango --file-friendly-logging run ....
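
The ordering requirement follows from where the option is declared. The sketch below uses the standard library's argparse as a stand-in (the source does not say how tango's CLI is built), and the config file name is made up for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="tango")
    # Declared on the top-level parser, so it has to come before the subcommand.
    parser.add_argument("--file-friendly-logging", action="store_true")
    subparsers = parser.add_subparsers(dest="command")
    run = subparsers.add_parser("run")
    run.add_argument("config")
    return parser


# Equivalent to: tango --file-friendly-logging run config.jsonnet
args = build_parser().parse_args(["--file-friendly-logging", "run", "config.jsonnet"])
```

With this layout the `run` subparser does not know about the flag, so placing it after `run` would be rejected as an unrecognized argument.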

Fixed ✅

  • Fixed bug with Step.from_params.
  • Ensure logging is initialized in spawned processes during distributed training with TorchTrainStep.

Commits

d497e7a Update torch requirement from <1.10.0,>=1.9.0 to >=1.9.0,<1.11.0 (#42) fbad9b2 fix failing test 409b50a ensure logging initialize in spawn distributed workers 1714390 move filefriendlylogging flag back to main command af8bc69 fix bug in Step.from_params 4e5b406 add TangoMetadata to docs

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.2.3

What's new

Added 🎉

  • Added support for global settings file, tango.yml.
  • Added 'include_package' (array of string) param to config spec.
  • Added a custom error StopEarly that a TrainCallback can raise within the TorchTrainStep to stop training early without crashing.
  • Added step config, tango command, and tango version to executor metadata.
  • Executor now also saves pip dependencies and conda environment files to the run directory for each step.
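
The StopEarly item above describes a common pattern: a callback raises a dedicated exception, and the training loop treats it as a normal exit rather than a failure. The sketch below is a self-contained illustration of that pattern only. Every class, method, and function in it is a hypothetical stand-in and does not reflect the real TrainCallback or TorchTrainStep interfaces.

```python
class StopEarly(Exception):
    """Hypothetical stand-in: raised by a callback to end training cleanly."""


class PatienceCallback:
    """Hypothetical callback that stops after `patience` non-improving checks."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.bad_checks = 0

    def post_val(self, loss: float) -> None:
        if loss < self.best:
            self.best, self.bad_checks = loss, 0
        else:
            self.bad_checks += 1
            if self.bad_checks >= self.patience:
                raise StopEarly


def train(val_losses, callback) -> int:
    """Run a toy loop over validation losses and return how many were consumed."""
    checks_run = 0
    try:
        for loss in val_losses:
            checks_run += 1
            callback.post_val(loss)
    except StopEarly:
        pass  # treated as a normal end of training, not a crash
    return checks_run


steps = train([1.0, 0.8, 0.9, 0.85, 0.7], PatienceCallback(patience=2))  # 4
```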

Fixed ✅

  • Ensured **kwargs arguments are logged in FromParams.

Commits

0094888 save pip and conda files to run directory, add step config to metadata (#41) d588886 Early stopping via callbacks in torch train (#40) c437887 ensure '**kwargs' are logged in FromParams 7de1837 add support for global settings file (#39) 4026df1 Update datasets requirement from <1.14,>=1.12 to >=1.12,<1.15 (#38) 7ce3493 add 'include_package' param to config spec

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.2.2

What's new

Added 🎉

  • Added new steps to datasets integration: ConcatenateDatasets ("datasets::concatenate") and InterleaveDatasets ("datasets::interleave").
  • Added __contains__ and __iter__ methods on DatasetDict so that it is now a Mapping class.
  • Added tango info command that - among other things - displays which integrations are installed.
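
To show what being "a Mapping class" buys in practice, here is a small sketch built on `collections.abc.Mapping`. `SplitDict` is a hypothetical stand-in, not the real DatasetDict.

```python
from collections.abc import Mapping


class SplitDict(Mapping):
    """Hypothetical read-only mapping from split name to data."""

    def __init__(self, splits: dict) -> None:
        self._splits = dict(splits)

    def __getitem__(self, key):
        return self._splits[key]

    def __iter__(self):
        return iter(self._splits)

    def __len__(self) -> int:
        return len(self._splits)

    def __contains__(self, key) -> bool:
        return key in self._splits


data = SplitDict({"train": [1, 2, 3], "validation": [4]})
names = list(data)            # ['train', 'validation']
has_test = "test" in data     # False
as_dict = dict(data.items())  # items(), keys(), values() come from Mapping
```

Once the required methods exist, `keys()`, `items()`, `values()`, and `get()` are provided by the abstract base class, and `isinstance(data, Mapping)` is true.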

Commits

d47399d add 'tango info' command (#34) 80f8457 add interleave and concatenate dataset steps (#33) 10f56f7 add test for generics from std lib (#32) f77837a make DatasetDict an actual Mapping (#30)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.2.1

What's new

Added 🎉

  • Added convert_to_tango_dataset_dict() function in the datasets integration. It's important for step caching purposes to use this to convert a HF DatasetDict to a native Tango DatasetDict when that DatasetDict is part of the input to another step. Otherwise the HF DatasetDict will have to be pickled to determine its hash.

Changed ⚠️

  • Format.checksum() is now an abstract method. Subclasses should only compute checksum on the serialized artifact and nothing else in the directory.
  • [internals] Changed the relationship between Executor, StepCache, and Step. Executor now owns the StepCache, and Step never interacts with StepCache directly.
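
The Format.checksum() note above can be pictured with a small sketch using `abc` and `hashlib`. The class names, the `write` method, and the file names are assumptions made for illustration and are not tango's real Format interface.

```python
import hashlib
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ArtifactFormat(ABC):
    """Hypothetical base class: every subclass must define its own checksum."""

    @abstractmethod
    def checksum(self, directory: Path) -> str:
        ...


class TextFormat(ArtifactFormat):
    FILENAME = "data.txt"

    def write(self, text: str, directory: Path) -> None:
        (directory / self.FILENAME).write_text(text)

    def checksum(self, directory: Path) -> str:
        # Hash only the serialized artifact, ignoring anything else in the directory.
        return hashlib.sha256((directory / self.FILENAME).read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    fmt = TextFormat()
    fmt.write("hello", directory)
    before = fmt.checksum(directory)
    # An unrelated file appears next to the artifact, e.g. cache metadata.
    (directory / "cache-metadata.json").write_text("{}")
    after = fmt.checksum(directory)
```

Because the checksum covers only the artifact file, `before` and `after` are equal even though the directory contents changed.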

Commits

bdc8486 make Format.checksum abstract 5514ce5 Refactor Executor, StepCache, and Step, improve hashing of DatasetDict (#29)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.2.0

What's new

Added 🎉

  • Added a Weights & Biases integration with a training callback ("wandb::log") for TorchTrainStep ("torch::train") that logs training and validation metrics to W&B.

Fixed ✅

  • Fixed Format.checksum() when there is a symlink to a directory in the cache folder.

Commits

374f1ad Add W&B integration (#28)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.1.3

What's new

Added 🎉

  • Added the ability to track a metric other than "loss" for validation in TorchTrainStep ("torch::train").

Fixed ✅

  • Final model returned from TorchTrainStep ("torch::train") will have best weights loaded.
  • Checkpoints are saved from TorchTrainStep ("torch::train") even when there is no validation loop.
  • Fixed TorchTrainStep ("torch::train") when validation_split is None.
  • Fixed distributed training with TorchTrainStep ("torch::train") on GPU devices.

Commits

ba05e79 Torch train updates and distributed fixes (#27)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.1.2

What's new

Added 🎉

  • Added support for YAML configuration files.

Commits

a2a6e08 relax requirement on PyYAML 1bd44b9 Update datasets requirement from <1.13,>=1.12 to >=1.12,<1.14 (#25) 55114a6 Bump pyyaml from 5.4.1 to 6.0 (#26) bcbe08e add support for YAML configuration files (#24) 69f6544 Fix more typos (#23) c11b029 minor updates (#22) 289a779 Typo 73706bb start on overview section of docs (#21) 7ab2969 list dependencies in example 659a8ac Add exampels to docs (#20) 4bba28f test examples (#19)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.1.1

What's new

Added 🎉

  • TorchTrainStep now displays a progress bar while saving a checkpoint to file.
  • The default executor now saves an "executor-metadata.json" file to the directory for each step.

Changed ⚠️

  • Renamed DirectoryStepCache to LocalStepCache (registered as "local").
  • LocalStepCache saves metadata to cache-metadata.json instead of metadata.json.

Fixed ✅

  • Fixed bug with TorchTrainStep during distributed training.
  • FromParams will automatically convert strings into Path types now when the annotation is Path.
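
A minimal sketch of annotation-driven coercion of the kind described in the last item, assuming a simple approach based on `inspect.signature`. `coerce_paths` and `Checkpointer` are hypothetical, and the real FromParams logic is more general than this.

```python
import inspect
from pathlib import Path


def coerce_paths(cls: type, params: dict) -> dict:
    """Convert string values to Path wherever the constructor annotation is Path."""
    coerced = dict(params)
    for name, parameter in inspect.signature(cls).parameters.items():
        if parameter.annotation is Path and isinstance(coerced.get(name), str):
            coerced[name] = Path(coerced[name])
    return coerced


class Checkpointer:
    def __init__(self, output_dir: Path, keep: int = 1) -> None:
        self.output_dir = output_dir
        self.keep = keep


kwargs = coerce_paths(Checkpointer, {"output_dir": "runs/exp1", "keep": 3})
checkpointer = Checkpointer(**kwargs)
```

Here a config value that arrives as a plain string ends up as a `Path` object on the instance, while parameters with other annotations are left untouched.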

Commits

e2d5abb save to cache-metadata.json instead of metadata.json (#18) e5118bd Torch train fixes (#17) f3b5178 Bump furo from 2021.9.22 to 2021.10.9 (#13) 7f525fe fix typo 12e70fa add more sphinx extensions (#16) 010f3fb Rename DirectoryStepCache to SimpleStepCache, clean up docs (#15)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.1.0

What's new

Added 🎉

  • Added StepGraph and Executor abstractions.
  • Added a basic PyTorch training step registered as "torch::train", along with other registrable components, such as Model, DataLoader, Sampler, DataCollator, Optimizer, and LRScheduler.
  • Added DatasetRemixStep in tango.steps.
  • Added module tango.common.sequences.
  • Added DatasetDict class in tango.common.dataset_dict.
  • Added 🤗 Datasets integration.
  • Added command-line options to set log level or disable logging completely.

Changed ⚠️

  • Step.work_dir, Step.unique_id, Step.dependencies, and Step.recursive_dependencies are now properties instead of methods.
  • tango run command will acquire a lock on the directory to avoid race conditions.
  • Integrations can now be installed with pip install tango[INTEGRATION_NAME]. For example, pip install tango[torch].
  • Added method Registrable.search_modules() for automatically finding and importing the modules where a given name might be registered.
  • FromParams.from_params() and Registrable.resolve_class_name will now call Registrable.search_modules() to automatically import modules where the type might be defined. Thus, for classes that are defined and registered within any tango.* submodule, it is not necessary to import them explicitly.

Fixed ✅

  • Step implementations can now take arbitrary **kwargs in their run() methods.

Commits

3616040 add step graph and executor abstractions, support for distributed training in TorchTrainStep (#14) 0fa7d5e add torch components and simple train Step (#12) b408b8a auto search and import modules in from params (#10) 2c376f4 Update feature_request.md ff4fe11 Add 🤗 Datasets integration (#8) 2210c62 add DatasetRemixStep (#7) 9dffab3 document installing with integrations 5e225cf more work on integrations (#6) f45b98f rename workflow 971906c rename workflow file d33f57f add extras to setup.py (#5) cdc4393 add source code to docs c51f2a0 acquire lock on directory during run

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.0.3

What's new

Added 🎉

  • Added tango command.

Commits

7284269 add tango command (#4)

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.0.2

What's new

Added 🎉

  • Ported over core tango components from AllenNLP.

Commits

1bb5ed8 Merge pull request #1 from allenai/dependabot/pip/furo-2021.9.22 9d7a7fe Merge pull request #3 from allenai/add-components 21c3c2a merge docs with readme 6d3ba03 add format test e7e1aee fix again d760857 fix for py3.7 6076d17 add step tests 01e258e restore Registrable state after each test case f6aef4c fix typo in README edebbdf add issue and PR templates 18cd9f0 fixes 07ee21e orchestrating -> choreographing 586655a add contributing guide e79f475 fixups 14afd11 prepare docs 0f7f6f6 add tango components from allennlp d65c9bb brand as 'AI2 Tango' 4bf379c ignore more directories in flake8 config daf4e3d add .gitignore cf04957 Merge pull request #2 from allenai/custom-css 514adf8 add custom css

- Python
Published by github-actions[bot] over 4 years ago

tango - v0.0.1

What's new

Added 🎉

  • Added initial project boilerplate.

Commits

c0ec751 change name of PyPI package to ai2-tango

- Python
Published by github-actions[bot] over 4 years ago