Recent Releases of tango
tango - v1.3.1
What's new
Fixed ✅
- Minor bugs in the `GSWorkspace()`.
Changed ⚠️
- Added CLI-style execution functions for experiments defined in Python.
- Added `display()` to `ExecutorOutput` for producing a table that summarizes the run.
Commits
4c8ae5a CLI-style execution for Python-defined experiments (#600)
8bb3472 GS workspace bug fixes (#607)
- Python
Published by github-actions[bot] over 2 years ago
tango - v1.3.0
What's new
Added 🎉
- Added the `Workspace.remove_step()` method to safely remove steps.
- The `GSWorkspace()` can now be initialized with google cloud bucket subfolders.
Changed ⚠️
- The `BeakerExecutor` now uses the HEAD commit at the time the executor is instantiated to execute a step instead of the HEAD commit at the time the step is run.
Fixed ✅
- Removed unnecessary code coverage dev requirements.
- Fixed issue where new version of torch caused no LR schedulers to be registered.
- Updated pinned versions of jax, jaxlib, and flax.
Commits
ed72140 Prepare for release v1.3.0
5d776d5 Revert "Prepare for release v1.3.0"
8eec3df Revert "remove folder"
671525f remove folder
1a384f7 Prepare for release v1.3.0
56c1476 Use commit at time executor is instantiated (#605)
11b5229 'Remove step' feature for local workspaces (#588)
3857415 GS workspaces can be bucket subfolders (#604)
b955ef7 CI errors be gone (#601)
01077eb LR schedulers (#573)
3a19688 Bug fix in getting results from GS workspace (#582)
4c3edce style: migrate to ruff (#562)
416ffa6 Add CITATION.cff file (#572)
70f2681 Update torch requirement from <1.14,>=1.9 to >=1.9,<2.1 (#539)
1ee2c56 Update wandb requirement from <0.13.11,>=0.12 to >=0.12,<0.14.3 (#547)
fcf1010 Bump sentencepiece from 0.1.97 to 0.1.98 (#548)
7d04cde Bump mypy from 1.0.1 to 1.2.0 (#555)
fbb068b Bump allenai/beaker-run-action from 1.1 to 1.2 (#534)
a717d70 Bump black from 23.1.0 to 23.3.0 (#554)
42322bb fix readthedocs
825ec19 remove stray pkl file
3638cdc refactor: move packaging information to pyproject.toml (#549)
e86ad65 Bump black from 23.1.0 to 23.3.0 (#543)
69e7574 remove unnecessary code coverage deps (#550)
Published by github-actions[bot] over 2 years ago
tango - v1.2.1
What's new
Added 🎉
- Added the following workspace methods to support the Tango viz UI: `Workspace.search_registered_runs()`, `Workspace.search_step_info()`, `Workspace.num_registered_runs()`, and `Workspace.num_steps()`.
Fixed ✅
- Fixes a bug where `FromParams` would fail to parse when an object takes a `Step` argument directly.
- Changed a name so we don't override the built-in name `set`.
- Fixed a bug that would cause O(n^2) memory consumption in dense step graphs.
Commits
258f440 Only one step object (#545)
07abee5 Don't override the name set (#544)
4784bab Return more info from Workspace.search_registered_runs() (#536)
3d2d890 Fix bug when a FromParams object takes a Step argument directly (#535)
38561d0 New paginated workspace search methods (#489)
7a25e3e Fix datasets typing issue (#531)
810b742 minor updates to CI (#529)
Published by github-actions[bot] almost 3 years ago
tango - v1.2.0
What's new
Added 🎉
- You can now add arguments to steps without invalidating the cache. See `Step.SKIP_DEFAULT_ARGUMENTS`.
- Fixed integration status messages in `tango info` command.
- Added abstractions for `RemoteClient`, `RemoteStepCache`, and `RemoteWorkspace`.
- Added a GS integration that comes with `GSWorkspace`, a remote `Workspace` implementation that uses google cloud storage.
- You can now bind functional steps to the underlying `Step` instance with `@step(bind=True)`, meaning the first argument to the function will be a `Step`.
- Added `ShellStep` for running arbitrary shell commands.
- Added `@make_registrable` decorator to make arbitrary functions registrable, to make it easier to refer to them in tango configurations.
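The idea behind `Step.SKIP_DEFAULT_ARGUMENTS` can be sketched without Tango itself. The snippet below is a minimal illustration, not Tango's implementation: `unique_id`, the `tokenize` step name, and the argument names are all hypothetical. It shows why leaving an argument out of the hash while it holds its default keeps old cache entries valid.

```python
import hashlib
import json
from typing import Any, Dict


def unique_id(step_name: str, kwargs: Dict[str, Any], skip_defaults: Dict[str, Any]) -> str:
    """Hash a step's arguments, ignoring any argument that still holds a skipped default.

    Hypothetical helper for illustration only; Tango computes unique IDs differently.
    """
    hashed_kwargs = {
        key: value
        for key, value in kwargs.items()
        if not (key in skip_defaults and skip_defaults[key] == value)
    }
    payload = json.dumps({"step": step_name, "kwargs": hashed_kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# ID computed before the new argument existed.
old_id = unique_id("tokenize", {"lowercase": True}, skip_defaults={})

# Same step after adding `strip_accents`, left at its default value.
new_id = unique_id(
    "tokenize",
    {"lowercase": True, "strip_accents": False},
    skip_defaults={"strip_accents": False},
)

# Setting the new argument to a non-default value changes the ID.
changed_id = unique_id(
    "tokenize",
    {"lowercase": True, "strip_accents": True},
    skip_defaults={"strip_accents": False},
)
```

In this sketch `old_id` and `new_id` are equal, so a cached result stays reachable, while `changed_id` differs.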
Fixed ✅
- Jsonnet parsing is now much faster and works on Windows.
- Warnings about locks are now reliably printed every 30 seconds.
- We now make sure Beaker jobs have the latest version of beaker-py, so that we're compatible with the latest API changes.
- Stopping early now works when the metric doesn't change at all.
- Fixed bug with `FromParams` which didn't handle variable length tuples correctly.
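For the early-stopping fix, a small sketch of patience-based stopping may help. This is an assumption about the general technique, not the code from the Tango trainer, and `stopping_step` is a hypothetical name. The key point is that a metric that stays exactly the same counts as "no improvement", so a flat metric still triggers the stop.

```python
from typing import Iterable, Optional


def stopping_step(metrics: Iterable[float], patience: int, minimize: bool = True) -> Optional[int]:
    """Return the index at which training would stop early, or None if it never stops."""
    best: Optional[float] = None
    steps_without_improvement = 0
    for step, metric in enumerate(metrics):
        # Strict comparison: an unchanged metric is not an improvement.
        improved = best is None or (metric < best if minimize else metric > best)
        if improved:
            best = metric
            steps_without_improvement = 0
        else:
            steps_without_improvement += 1
            if steps_without_improvement >= patience:
                return step
    return None


flat = stopping_step([0.5, 0.5, 0.5, 0.5], patience=2)
improving = stopping_step([0.5, 0.4, 0.3], patience=2)
```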
Changed ⚠️
- The default log level for Tango is now `warning`.
- You can specify multiple steps with `-s` from the `tango run` command.
Commits
985f6fa fix lint
f77c0e0 fix release_notes script
2c9456d Prepare for release v1.2.0
f1dc63d Update wandb requirement from <=0.13.5,>=0.12 to >=0.12,<0.13.11 (#523)
32598c4 Update rich requirement from <13.0,>=12.3 to >=12.3,<14.0 (#498)
35ca0f0 Bump actions/checkout from 1 to 3 (#524)
739e40c Various dependencies (#525)
49f3afc Fix bug with variable length tuples (#527)
28ea796 minor workspace fixes (#526)
379095d GCSWorkspace (#417)
c949416 Shell step + Registrable functions (#521)
308689b Allow specifying multiple steps with -s from tango run (#516)
9002169 Stop early same metric (#515)
4a21132 Add bind option to @step decorator (#512)
8a6775e Beaker-py upgrade in Beaker jobs (#509)
679700b Rjsonnet (#505)
3760419 Default log level warning (#508)
8d29321 Lock warning (#506)
521de99 Pin wandb requirement until fixed (#507)
6618a04 upgrades for beaker-py upgrades (#504)
34dbec4 Fix integration status messages in tango info command (#502)
fbb7581 quick fix for Beaker-py upgrade
a10fb3b clone to a src dir (#495)
65f699d Fix #483 (#484)
4a183de Fix bug with extra uncacheable dependencies (#480)
ac0a193 Skip default args (#481)
Published by github-actions[bot] about 3 years ago
tango - v1.1.0
What's new
Added 🎉
- Added `gpu_type` field to `StepResources`. The `BeakerExecutor` can use this to determine which clusters to submit a step to.
- Added `machine` field to `StepResources`. You can set this to "local" when using the `BeakerExecutor` to force it to run the step locally.
- Added `--ext-var` argument to `tango run` for setting JSONNET external variables when loading the experiment config.
- Added `@step()` decorator to create `Step` classes from functions.
- Added the `transformers::with_soft_prompt` integration, to make soft-prompted prefix transformers easy.
Removed 👋
- Removed PyTorch Lightning integration.
- Removed `tango server` command and `--serve/--no-serve` option for `tango run`.
- Removed `source_release.py`, which was checked in by accident.
Fixed ✅
- Fixed issue where Executor `parallelism` option in a Tango settings file would be ignored.
- Fixed a bug where the unique ID of a step that depends on a key-value of the result of another step could change if the name of the other step changes.
- Fixed a bug where importing certain libraries (like torchmetrics) would mess with our exception handling because they set `sys.excepthook` for some reason. Now we always reset `sys.excepthook` after importing.
- The type hints for the flax trainer suggested that the training split is optional when in fact it's mandatory.
- Made `BeakerWorkspace` / `BeakerStepLock` more robust when a job is preempted.
- Minor performance improvements for the Beaker executor and workspace.
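The `sys.excepthook` fix can be pictured with a short sketch. This is not Tango's code: `import_preserving_excepthook` is a hypothetical helper, and restoring the previously installed hook is one assumed way of "resetting" it after an import.

```python
import importlib
import sys
from types import ModuleType


def import_preserving_excepthook(module_name: str) -> ModuleType:
    """Import a module, then put back whatever excepthook was installed beforehand."""
    original_hook = sys.excepthook
    try:
        return importlib.import_module(module_name)
    finally:
        # Undo any hook the imported library may have installed as a side effect.
        sys.excepthook = original_hook


hook_before = sys.excepthook
json_module = import_preserving_excepthook("json")
hook_after = sys.excepthook
```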
Commits
73bfa86 Soft prompts (#231)
79b7d01 Beaker integration perf improvements
d455541 Train is not optional (#474)
3eab580 Beaker integration perf improvements (#475)
241b4eb Remove step that was never supposed to be there (#478)
5f5ba41 Add @step() decorator (#476)
c0c4ae0 Make BeakerStepLock robust to preempted jobs
39d3d66 Remove tango server (#470)
fef9bba reset sys.excepthook after importing other modules, add --ext-var CLI option (#471)
ccebb4c Automatically remove ephemeral Beaker datasets
81d773e BeakerExecutor improvements, fix bug with StepIndexer (#469)
c05a80a Bump sphinx-copybutton from 0.5.0 to 0.5.1 (#467)
147e408 Remove PyTorch Lightning integration (#468)
cd4f626 Update more-itertools requirement from <9.0,>=8.0 to >=8.0,<10.0 (#456)
d219dff Fix doc build failure
ee71c09 Fix bug with setting parallelism in settings file (#466)
1609eb4 Bump mypy from 0.982 to 0.991 (#465)
Published by github-actions[bot] about 3 years ago
tango - v1.0.2
What's new
Changed ⚠️
- `BeakerScheduler` can now return a list of clusters.
Commits
64215d8 Bump mypy from 0.982 to 0.990 (#463)
acc1082 Update torch requirement from <1.13,>=1.9 to >=1.9,<1.14 (#460)
570d24e Use new constraints field for cluster assignment (#462)
5afd89f Set Beaker client User-Agent header to Tango v..* (#459)
9d9628f Fix progress logging statement from BeakerExecutor (#458)
Published by github-actions[bot] over 3 years ago
tango - v1.0.1
What's new
Fixed ✅
- `LightningTrainStep` now can take a `Lazy` model object which results in a guaranteed deterministic hash.
- Fixed issue where remote `Workspace` implementations like `WandbWorkspace` and `BeakerWorkspace` would use the same local cache regardless of the W&B / Beaker workspace being used.
- Fixed bug with `TorchEvalStep` when constructing callbacks.
- Fixed some import error issues caused when an integration is not installed.
- Fix incorrect reporting of final results in `MulticoreExecutor`.
Changed ⚠️
- Wandb step cache retries api call in case of timeout.
- `beaker-py >= 1.11` required.
Commits
26e6416 Bump sphinx from 5.2.3 to 5.3.0 (#455)
c10cec5 Retry wandb call (#450)
f950b12 Use separate local cache dirs for each workspace (#451)
4c70161 Uncacheable step failure should be reported (#453)
9476942 Bump black from 22.8.0 to 22.10.0 (#445)
821c0fc Bump furo from 2022.9.15 to 2022.9.29 (#435)
8d0b146 Allow Lazy model input to LightningTrainStep (#448)
12eec56 Fix import issues when missing integrations (#447)
d16997c Fix bug with TorchEvalStep (#442)
9a5f792 Bump beaker-py to version >=1.11 (#443)
f5c2e52 Add to FAQ (#440)
Published by github-actions[bot] over 3 years ago
tango - v1.0.0
This is the first stable release of AI2 Tango, the culmination of over a year's work and nearly 1000 commits!
We've been working on this project quietly for a while on the AllenNLP team, and it's being used daily by researchers here. So we're excited to officially announce it now that we're happy with the API 🎉
What's new since the last release
Added 🎉
- Added `step_extra_dependencies` input field to `Step` class that can be used to force a dependency on another step even if the current step doesn't directly depend on the output of the other step. See #418 for more context.
Changed ⚠️
- `beaker-py >= 1.10` required.
Fixed ✅
- Long log lines will be soft-wrapped to ensure that links are clickable.
- Fixed a bug where some workspaces could be left in a bad state if a step's `Format` failed to serialize the step's result in `Workspace.step_finished()`.
- Sometimes functions and methods end up as arguments to steps, which means we have to hash them. Instead of taking a hash of the function, we now take a hash of the function's module and name.
- Fixed a bug with the Beaker executor where it would hang at the end of a run if a step failed that is a dependency of another step.
- Fixed tests to work with new version of transformers.
- Fixed `Executor.execute_sub_graph_for_step()` to be able to run the step's dependencies in parallel.
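The change to hashing functions by module and name, mentioned in the fixes above, can be sketched as follows. This is a minimal illustration under the assumption that `__module__` and `__qualname__` are what identify a function; `function_fingerprint` is a hypothetical name and Tango's actual hashing code may differ.

```python
import hashlib
import math
from typing import Callable


def function_fingerprint(fn: Callable) -> str:
    """Hash a function by where it lives and what it is called, not by its contents."""
    identity = f"{fn.__module__}.{fn.__qualname__}"
    return hashlib.sha256(identity.encode("utf-8")).hexdigest()


def normalize(text: str) -> str:
    return text.strip().lower()


# The same function always maps to the same fingerprint, across processes and runs.
first = function_fingerprint(normalize)
second = function_fingerprint(normalize)

# Different functions map to different fingerprints.
sqrt_fp = function_fingerprint(math.sqrt)
floor_fp = function_fingerprint(math.floor)
```

The trade-off in this sketch is that editing a function's body does not change its fingerprint, so the result stays stable but will not notice code changes on its own.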
Commits
a48a825 Only install gh in entrypoint if needed (#439)
69b948d Improve error handling for step state edge case (#429)
39b3132 Fix Executor.execute_sub_graph_for_step() (#438)
408944e Loosen range on some dependencies (#437)
76cd46d Follow up fix for #401 (#434)
04e7963 Fix bug with Beaker executor (#430)
fb077e6 Bump fairscale from 0.4.9 to 0.4.11 (#432)
25f7373 Bump furo from 2022.6.21 to 2022.9.15 (#410)
f11925c Bump mypy from 0.971 to 0.982 (#433)
4845317 Bump sphinx from 5.1.1 to 5.2.3 (#431)
572f83b Bump myst-parser from 0.18.0 to 0.18.1 (#426)
e1fc2d3 Add step_extra_dependencies option to Step class (#419)
19927d3 Don't pin protobuf anymore (#428)
3dbe8c7 We need click 8 to work. (#427)
cbdbe68 Hashing functions (#424)
747469f Mark Format serialization failures as step failure (#421)
b1d6431 Ensure base path for included modules kept in sys.path (#406)
e1a1cd1 Minor improvements to BeakerExecutor internals (#416)
d3e1891 Ensure log lines get soft-wrapped so links are always clickable (#415)
Published by github-actions[bot] over 3 years ago
tango - v0.14.0
What's new
Added 🎉
- Adds a function to modify a Hugging Face transformer with IA3 adaptors.
- Added a `BeakerScheduler` registrable class, specified as the argument `scheduler` to `BeakerExecutor`, which controls the resources assigned to steps run on Beaker. Users can implement their own `BeakerScheduler` subclasses to customize the resource assignment behavior.
Changed ⚠️
- In the `tango run` command, `--no-server` is now the default. Use `--server` to start the server.
Fixed ✅
- Made `BeakerExecutor` more robust to connection, timeout, SSL, and other recoverable HTTP errors.
- Made the `BeakerStepLock` more robust, and as a result `BeakerWorkspace` is more robust and should require less manual intervention for locks in a bad state.
- Fixed a bug with the internal scheduling logic of the `BeakerExecutor` which could delay submitting some steps in parallel.
- Fixed a bug where creating a `StepInfo` object from params might result in unnecessary imports.
- Fixed a bug where canceling the Beaker executor might not work properly.
- Fixed a bug where the trainer trains too much when `train_epochs` is set and you're using gradient accumulation.
- Fixed how the results of uncacheable steps are displayed by `tango run`.
- Beaker executor won't run duplicate cacheable steps at the same time.
Commits
0828adc BeakerExecutor won't run duplicate cacheable steps (#414)
7382019 IA3 adaptors (#403)
d498cf7 Hot fix to final output
bff9ebf Add warning when steps can't be run yet, bug fixes (#408)
c72552e Don't start server by default (#409)
d34fe09 Added BeakerScheduler class for handling resource assignment (#407)
5dcbb56 Gradient accumulation and train_epochs (#402)
d27bbef Make BeakerStepLock more robust (#401)
cd9b5fd Fix bug with StepInfo.from_params(), canceling BeakerExecutor, reserve "ref" name (#400)
6ff6b9e Fix bug with scheduling logic (#399)
15196f2 Deterministic hashing for tensors (#398)
230d78e Make BeakerExecutor more robust to all recoverable errors types (connection, HTTP, SSL, timeout, etc) (#397)
5f63a27 Bump fairscale from 0.4.8 to 0.4.9 (#391)
Published by github-actions[bot] over 3 years ago
tango - v0.13.0
What's new
Added 🎉
- You can now reference into a particular index of the result of another step in a config. For example: `{type: "ref", ref: "some_previous_step", key: 0}`. The `key` field can be an integer if the result of the referenced step is a list or tuple, or a string if the result of the referenced step is a dictionary.
- Added `priority` parameter to Beaker executor for setting the default task priority for Beaker jobs.
- Added `Workspace.step_result()` method for getting a step's result from the latest run.
- `tango run` will now display a URL to the logs for failed steps when you use the `BeakerExecutor`.
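To make the `key` semantics of a `ref` concrete, here is a rough sketch of how such a reference could be resolved against already-computed results. `resolve_ref` and the `results` mapping are hypothetical names for illustration; this is not how Tango resolves references internally.

```python
from typing import Any, Mapping


def resolve_ref(ref: Mapping[str, Any], results: Mapping[str, Any]) -> Any:
    """Look up the referenced step's result and, if a key is given, index into it."""
    result = results[ref["ref"]]
    if "key" in ref:
        # An integer key indexes a list or tuple; a string key indexes a dictionary.
        result = result[ref["key"]]
    return result


# Hypothetical results of two earlier steps.
results = {
    "some_previous_step": ["train", "dev"],
    "stats": {"accuracy": 0.9},
}

first_split = resolve_ref({"type": "ref", "ref": "some_previous_step", "key": 0}, results)
accuracy = resolve_ref({"type": "ref", "ref": "stats", "key": "accuracy"}, results)
whole_result = resolve_ref({"type": "ref", "ref": "stats"}, results)
```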
Changed ⚠️
- The `TorchTrainStep` now enables monitoring arbitrary model outputs during training. `TorchTrainEngine.forward_train` now returns a tuple `loss, model_outputs` for each micro batch and the list of model outputs for all micro batches in a batch is passed to the `TrainCallback.log_batch` and `TrainCallback.post_batch`.
- Tango will now automatically search Python modules in the current working directory for registered classes so that you don't always need to use the `--include-package` setting.
- The minimum supported Python version is now 3.8.
- Added support for PyTorch Lightning 1.7.x.
- The Beaker Executor will no longer live-stream logs from Beaker jobs, but logs will be viewable on Beaker and more readable.
- Only the Beaker executor requires a clean working directory.
Fixed ✅
- Fixed a bug that did not allow a wandb artifact's type to be set from a step's metadata dictionary.
- Fixed a bug with how the Beaker executor streams log lines from Beaker which sometimes resulted in messages missing some starting characters, and tqdm lines being duplicated.
- Fixed a bug in the Beaker workspace where the lock dataset wouldn't be removed if the step was found to be in an invalid state.
- Improved cluster choice logic in `BeakerExecutor` to ensure greater diversity of clusters when submitting many steps at once.
- Fixed bug where sub-processes of the multicore executor would use the wrong executor if `executor` was defined in a `tango.yml` file.
Commits
4f89d55 Improve Beaker cluster choice logic (#392)
e1ceae2 Display URL to logs for failed steps (#390)
3dc9591 Bump black from 22.6.0 to 22.8.0 (#380)
c9ce257 Catch when Beaker experiments are stopped (#389)
0fe12e9 Fix issues with WandbWorkspace causing CI crash (#388)
342eb26 Keep parameters in Params objects to make error messages more readable (#375)
92f0354 Simplified beaker logging (#383)
fd9d3cc Only the Beaker executor needs clean working directories (#373)
06f26ae Update wandb artifact type (#378)
f6a6b70 Update base images, get us out of the latest infinite loop of pip madness (#382)
306986b Catch all errors when attempting log record decode (#379)
628caff Allowing indexing into step results in config (#371)
858cef8 Minor improvement to Beaker logging (#377)
7a5619e Add Workspace.step_result() method (#374)
0750d76 Fix bugs with how BeakerExecutor streams logs (#372)
6e8b107 Detailed train outputs (#369)
bcd50d8 Update pytorch-lightning requirement from <1.7,>=1.6 to >=1.6,<1.8 (#349)
8ed0c86 Bump fairscale from 0.4.6 to 0.4.8 (#347)
62f2746 Python minimum version is 3.8 (#368)
45e02fe Auto import local Python modules when searching for registered classes (#367)
Published by github-actions[bot] over 3 years ago
tango - v0.12.0
What's new
Added 🎉
- Step resources:
  - Added a `step_resources` parameter to the `Step` class which should be used to describe the computational resources required to run a step. `Executor` implementations can use this information. For example, if your step needs 2 GPUs, you should set `step_resources=StepResources(gpu_count=2)` (`"step_resources": {"gpu_count": 2}` in the configuration language).
  - Added a `Step.resources()` property method. By default this returns the value specified by the `step_resources` parameter. If your step implementation always requires the same resources, you can just override this method so you don't have to provide the `step_resources` parameter.
- Step execution:
  - Added an `executor` field to the `tango.yml` settings. You can use this to define the executor you want to use by default.
  - Added a Beaker `Executor` to the Beaker integration, registered as an `Executor` with the name "beaker". To use this executor, add these lines to your `tango.yml` file:
    ```yaml
    executor:
      type: beaker
      beaker_workspace: ai2/my-workspace
      clusters:
        - ai2/general-cirrascale
    ```
    See the docs for the `BeakerExecutor` for more information on the input parameters.
- Step class:
  - Added a metadata field to the step class API. This can be set through the class variable `METADATA` or through the constructor argument `step_metadata`.
- Weights & Biases integration:
  - You can now change the artifact kind for step result artifacts by adding a field called "artifactkind" to a step's metadata. For models, setting "artifactkind" to "model" will add the corresponding artifact to W&B's new model zoo.
Changed ⚠️
- CLI:
  - The `tango run` command will throw an error if you have uncommitted changes in your repository, unless you use the `--allow-dirty` flag.
  - The `tango run` command will use the lightweight base executor (single process) by default. To use the multi-process executor, set `-j/--parallelism` to 1 or higher or -1 to use all available CPU cores.
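The `-j/--parallelism` rule described for `tango run` can be read as a small decision function. The sketch below is an interpretation, not Tango's CLI code: `choose_executor` and the `"default"` label are hypothetical, and treating `0` the same as an unset value is an assumption.

```python
import os
from typing import Optional, Tuple


def choose_executor(parallelism: Optional[int]) -> Tuple[str, int]:
    """Map a -j/--parallelism value to an (executor kind, worker count) pair."""
    if parallelism is None or parallelism == 0:
        # No value given: lightweight single-process executor.
        return ("default", 1)
    if parallelism == -1:
        # -1 means "use all available CPU cores".
        return ("multicore", os.cpu_count() or 1)
    if parallelism >= 1:
        return ("multicore", parallelism)
    raise ValueError(f"unsupported parallelism value: {parallelism}")


unset = choose_executor(None)
four_workers = choose_executor(4)
all_cores = choose_executor(-1)
```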
Fixed ✅
- Fixed bug where `StepInfo` environment and platform metadata could be out-of-date if a step is run again due to failure.
- Fixed a bug where an unfortunate combination of early stopping and decreasing model performance could result in a crash in the torch trainer.
Commits
befb00a Add workspace_metadata arg to Step class, allow changing artifact kind in W&B workspace (#363)
5ab1c2a Fix undefined behavior with TorchTrainStep (#366)
bf3c1a0 Update filelock requirement from <3.8,>=3.4 to >=3.4,<3.9 (#354)
b4e48a7 Update jsonpickle requirement from <2.2.0,>=2.1.0 to >=2.1.0,<2.3.0 (#351)
1c491f0 Update wandb requirement from <0.13,>=0.12 to >=0.12,<0.14 (#350)
93d5eb4 Bump allenai/setup-beaker from 1 to 2 (#359)
dc0f89a Fix #355 - ensure git metadata is up-to-date (#361)
258e880 Raise better error msg from step_result_for_run() (#360)
43916d1 Print debugging information about the repo used. (#353)
928aa7a Add BeakerExecutor (#340)
Published by github-actions[bot] over 3 years ago
tango - v0.11.0
What's new
Added 🎉
- Added a Flax integration along with an example config.
Commits
b4cd2b3 Flax Integration (#313)
b9a7422 Bump sphinx from 5.0.2 to 5.1.1 (#346)
d7952ef Bump mypy from 0.961 to 0.971 (#339)
6a58bfd Put PIP install instructions first (#348)
Published by github-actions[bot] over 3 years ago
tango - v0.10.1
What's new
Fixed ✅
- Fixed issue where the StepInfo config argument could be parsed into a Step.
- Restored capability to run tests out-of-tree.
Commits
2498318 Fix issue where StepInfo config could be parsed into a Step (#344)
57096b2 Make tests runnable out-of-tree for help with conda-packaging (#307)
Published by github-actions[bot] over 3 years ago
tango - v0.10.0
What's new
Changed ⚠️
- Renamed `workspace` parameter of `BeakerWorkspace` class to `beaker_workspace`.
- `Executor` class is now a `Registrable` base class. `MulticoreExecutor` is registered as "multicore".
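For readers unfamiliar with the registry pattern behind a registrable base class, here is a self-contained sketch of the general idea. The classes below are stand-ins written for this example and are not Tango's `Registrable`, `Executor`, or `MulticoreExecutor`; only the name "multicore" comes from the notes above.

```python
from typing import Callable, Dict, Type


class Executor:
    """Stand-in base class that keeps a name -> subclass registry."""

    _registry: Dict[str, Type["Executor"]] = {}

    @classmethod
    def register(cls, name: str) -> Callable[[Type["Executor"]], Type["Executor"]]:
        def decorator(subclass: Type["Executor"]) -> Type["Executor"]:
            cls._registry[name] = subclass
            return subclass

        return decorator

    @classmethod
    def by_name(cls, name: str) -> Type["Executor"]:
        return cls._registry[name]


@Executor.register("multicore")
class MulticoreExecutor(Executor):
    def __init__(self, parallelism: int = 1) -> None:
        self.parallelism = parallelism


# A config such as {"type": "multicore", "parallelism": 2} can now be turned into an object.
config = {"type": "multicore", "parallelism": 2}
executor_class = Executor.by_name(config["type"])
executor = executor_class(parallelism=config["parallelism"])
```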
Removed 👋
- Removed `StepExecutionMetadata`. Its fields have been absorbed into `StepInfo`.
Fixed ✅
- Improved `Step.ensure_result()` such that the step's result doesn't have to be read from the cache.
- Fixed an issue with the output from `MulticoreExecutor` such that it's now consistent with the default `Executor` for steps that were found in the cache.
- One of our error messages referred to a configuration file that no longer exists.
- Improved performance of `BeakerWorkspace`.
Added 🎉
- Added the ability to train straight `Model` instead of just `Lazy[Model]`.
Commits
4e809f5 Eager models (#319)
361777b Metadata changes, make executor registrable (#331)
a6b0be9 Beaker workspace performance (#328)
f43e5ea Update torch requirement from <1.12,>=1.9 to >=1.9,<1.13 (#330)
8495c64 update dev dependencies (#333)
712d862 Make multicore executor output consistent with default (#325)
903569c Refer to the right config file (#324)
bd9e4be Modernize our issue templates (#323)
Published by github-actions[bot] over 3 years ago
tango - v0.9.1
What's new
Fixed ✅
- Fixed non-deterministic behavior in `TorchTrainStep`.
- Fixed bug in `BeakerWorkspace` where `.step_info(step)` would raise a `KeyError` if the step hasn't been registered as part of a run yet.
- Fixed a bug in `BeakerWorkspace` where it would send too many requests to the beaker service.
- Fixed a bug where `WandbWorkspace.step_finished()` or `.step_failed()` would crash if called from a different process than `.step_starting()`.
- Fixed a bug in `WandbWorkspace.step_finished()` which led to a `RuntimeError` sometimes while caching the result of a step.
Commits
c6fc5be Fix bugs with Workspace and WandbWorkspace, specifically (#321)
80c90ca Beaker DOS fix (#315)
8b75591 Log from BeakerStepLock at WARNING level (#316)
4d46d67 fix non-deterministic behavior in TorchTrainStep (#314)
c59b6b3 Bump actions/setup-python from 3 to 4 (#311)
b02cf40 Bump sphinx from 4.5.0 to 5.0.1 (#305)
4501815 Bump furo from 2022.6.4 to 2022.6.4.1 (#309)
da9c29c Fix bug in Beaker workspace (#312)
e8422cb Bump mypy from 0.960 to 0.961 (#308)
8256a74 Bump myst-parser from 0.17.2 to 0.18.0 (#310)
44ae92e Bump furo from 2022.4.7 to 2022.6.4 (#306)
39923ae Update protobuf requirement from <=3.20.0 to <4.22.0 (#301)
e7ef1f5 Registerables first steps eg (#304)
Published by github-actions[bot] over 3 years ago
tango - v0.9.0
What's new
Added 🎉
- Added a Beaker integration that comes with `BeakerWorkspace`, a remote `Workspace` implementation that uses Beaker Datasets under the hood.
- Added a `datasets::dataset_remix` step that provides the split remixing functionality of `tango.steps.dataset_remix.DatasetRemixStep` now for Hugging Face `DatasetDict`.
Changed ⚠️
- If you try to import something from a tango integration that is not fully installed due to missing dependencies, an `IntegrationMissingError` will be raised instead of `ModuleNotFoundError`.
- You can now set `-j 0` in `tango run` to disable multicore execution altogether.
Fixed ✅
- Improved how steps and workspaces handle race conditions when different processes are competing to execute the same step. This would result in a `RuntimeError` before with most workspaces, but now it's handled gracefully.
- Fixed bug which caused GradScaler state to not be saved and loaded with checkpoints.
Commits
0ddd2ac Add Beaker integration (#296)
6bdd1dd Updates the Euler example (#297)
bc89470 GradScaler state saving and loading (#293)
b8562db fix old filename in CONTRIBUTING.md (#300)
4aff1bb Dataset remix (#298)
eb1fcd8 Bump mypy from 0.950 to 0.960 (#295)
903741e Update filelock requirement from <3.7,>=3.4 to >=3.4,<3.8 (#284)
b58b823 Handle missing integrations (#292)
Published by github-actions[bot] over 3 years ago
tango - v0.8.0
What's new
Added 🎉
- Added a Weights & Biases remote `Workspace` implementation: `WandbWorkspace`, registered as "wandb". This can be instantiated from a workspace URL in the form "wandb://entity/project".
- Added a method `Workspace.step_result_for_run` which gives the result of a step given the run name and step name within that run.
- Added property `Workspace.url`, which returns a URL for the workspace that can be used to instantiate the exact same workspace using `Workspace.from_url()`. Subclasses must implement this.
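A workspace URL of the form "wandb://entity/project" splits cleanly with the standard library. The sketch below only illustrates how the pieces of such a URL map to scheme, entity, and project; `split_workspace_url` is a hypothetical helper and says nothing about how `Workspace.from_url()` is implemented.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_workspace_url(url: str) -> Tuple[str, str, str]:
    """Split a workspace URL into (scheme, entity, project)."""
    parsed = urlparse(url)
    # The scheme would select the workspace type; netloc and path carry its settings.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


scheme, entity, project = split_workspace_url("wandb://entity/project")
```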
Changed ⚠️
- `StepInfo` start and end times will always be in UTC now.
- `WandbTrainCallback` now logs system metrics from each worker process in distributed training.
- `StepCache.__contains__()` and `StepCache.__getitem__()` now accept either a `Step` or `StepInfo` as an argument (`Union[Step, StepInfo]`).
- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.
- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success/failures.
Fixed ✅
- Fixed bug with `LocalWorkspace.from_parsed_url()` (#278).
- Deprecation warnings will now be logged from `tango` CLI.
- Fixed the text format in the case of serializing an iterator of string.
- Added missing default value of `None` to `TangoGlobalSettings.find_or_default()`.
- Mypy has become incompatible with transformers and datasets, so we have to disable the checks in some places.
- The `VERSION` member of step arguments that were wrapped in `Lazy` was not respected. Now it is.
Commits
3069226 Makes sure the VERSION parameter of classes is respected even when we construct them inside of a Lazy object. (#289)
dd71446 Add Weights & Baises remote workspace (#232)
e3f2bd2 Adds a dependency that's missing from transformers (#285)
25919e1 Fixes the text format (#283)
381de74 Add missing default to TangoGlobalSettings.find_or_default() (#282)
9ac708a Update click requirement from <8.1.3,>=7.0 to >=7.0,<8.1.4 (#277)
749357e Bump mypy from 0.942 to 0.950 (#276)
2c59c96 Bump allenai/beaker-run-action from 1.0 to 1.1 (#274)
53ffe80 refactor (#275)
Published by github-actions[bot] almost 4 years ago
tango - v0.7.0
What's new
Added 🎉
- Added the "-n/--name" option to `tango run`. This option allows the user to give the run an arbitrary name.
- Added a convenience property `.workspace` to `Step` class that can be called from a step's `.run()` method to get the current `Workspace` being used.
- Gave `FromParams` objects (which includes all `Registrable` objects) the ability to version themselves.
- Added CLI option to run a single step in a config using `--step-name` or `-s`.
- Added a `MulticoreExecutor` that executes steps in parallel.
- Added an `ExecutorOutput` dataclass that is returned by `Executor.execute_step_graph()`.
- `StepGraph` now prints itself in a readable way.
- Tango now automatically detects when it's running under a debugger, and disables multicore support accordingly. Many debuggers can't properly follow sub-processes, so this is a convenience for people who love debuggers.
- Added more models to the stuff we can import from the transformers library.
- Added new example for finetuning text-to-text models.
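The debugger auto-detection mentioned in the list above can be approximated with a common heuristic: checking whether a trace function is installed. This is a sketch of that heuristic only, not necessarily the check Tango performs. Both function names are hypothetical, and some debuggers may not be caught this way.

```python
import sys


def running_under_debugger() -> bool:
    """Heuristic: many debuggers install a trace function via sys.settrace()."""
    return sys.gettrace() is not None


def effective_parallelism(requested: int) -> int:
    """Fall back to a single process when a debugger appears to be attached."""
    return 1 if running_under_debugger() else requested


workers = effective_parallelism(4)
```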
Changed ⚠️
- Renamed `click_logger` to `cli_logger`, and we now use rich's logging `Handler` as the default handler, which means prettier output, better tracebacks, and you can use rich's markup syntax with the `cli_logger` to easily add style to text.
- Refactored `tango.step_graph.StepGraph` to allow initialization from a `Dict[str, Step]`.
- `Executor.execute_step_graph()` now attempts to execute all steps and summarizes success/failures.
- Upgraded PyTorch version in `tango` Docker image to latest `v1.11.0+cu113`.
- `RunGeneration` now allows model object as input.
Fixed ✅
- Fixed bug that mistakenly disallowed fully-qualified names containing `"_"` (underscores) in the config.
- Fixed bug where `TorchTrainStep` working directory would be left in an unrecoverable state if training failed after saving the final model weights.
- Fixed bug in `FromParams` where `**kwargs` might be passed down to the constructors of arguments.
- Fixed bug in the way dependencies are tracked between steps.
- Fixed bug that caused `MulticoreExecutor` to hang in case of a failing step that was required recursively (not directly) downstream.
- Compatibility with PyTorch Lightning 1.6.
Commits
1083049 Finetuning (#255)
42b1dba Bug fix with failing steps (#257)
7bd251a Bump myst-parser from 0.17.0 to 0.17.2 (#273)
cc9a1dd Bump actions/upload-artifact from 2 to 3 (#262)
66777d9 Bump actions/download-artifact from 2 to 3 (#261)
14d4adb use new beaker-action for building test image (#265)
af47287 Update pytorch-lightning requirement from <1.6,>=1.5 to >=1.5,<1.7 (#248)
b1df9a4 use beaker-run action for GPU Tests (#263)
0a7468e fix release job (#260)
c1b16b2 Bump furo from 2022.3.4 to 2022.4.7 (#259)
b55aaf2 use beaker-py to submit GPU tests (#258)
b2a93a9 Logging part 2: denoising run logging and making Dirk happy (#252)
ff6be8d Update click requirement from <=8.0.4,>=7.0 to >=7.0,<8.1.3 (#254)
83d78cc Bump mypy from 0.941 to 0.942 (#243)
3769327 Bump sphinx from 4.4.0 to 4.5.0 (#245)
81fc5c5 Bump black from 21.12b0 to 22.3.0 (#246)
e46059b Update tqdm requirement from <4.64,>=4.62 to >=4.62,<4.65 (#256)
bbdeb6f Revert "Set `$TEMP` (#241)"
b9fd9e9 Fix tracking dependencies between steps (#249)
53502e1 Pretty-print a step graph (#250)
d5328c9 Fix dissimilar objects hashing to the same thing (#240)
ccc37ce Autodetect debugger and turn off multicore (#251)
5c39f61 Pin click
5bb0fad Logging improvements (#233)
037e4a0 fix bug with FromParams (#242)
e142530 Bump actions/cache from 2 to 3 (#236)
878402d Set `$TEMP` (#241)
2d9fa0c fix bug w/ TorchTrainStep working dir (#238)
410faeb Multicore Parallelism (#204)
9e8e99f Update datasets requirement from <2,>=1.12 to >=1.12,<3 (#234)
40e0a1a Bump mypy from 0.940 to 0.941 (#230)
ede7428 add name to changelog workflow
4bb659b Bump actions/setup-python from 2 to 3 (#229)
5db1a6a Bump actions/checkout from 1 to 3 (#228)
8049104 Update torch version where it's hard-coded, add an automatic remind to do this stuff in the future (#227)
fe05449 add back intersphinx inventory links for HF libraries (#222)
9927749 Bump mypy from 0.931 to 0.940 (#226)
29ab68b Update torch requirement from <1.11,>=1.9 to >=1.9,<1.12 (#225)
a3fc83b Bump furo from 2022.2.23 to 2022.3.4 (#218)
28e839e Bump fairscale from 0.4.5 to 0.4.6 (#224)
f18d393 Update tqdm requirement from <4.63,>=4.62 to >=4.62,<4.64 (#213)
54c4a8d automatically keep copyright up-to-date (#221)
06adb07 Allow setting the run name as a command-line option (#212)
71e0639 Update cached-path requirement from <1.1,>=1.0 to >=1.0,<1.2 (#217)
5d4660a Temporarily remove intersphinx links to HF docs (#220)
13c7f3f Merge pull request #216 from allenai/VersionForFromParams
0027cb2 Merge pull request #215 from allenai/fix-fully-qualified-name-recognition
76f9922 Add "Step.workspace" property (#210)
- Python
Published by github-actions[bot] almost 4 years ago
tango - v0.6.0
What's new
Added 🎉
- New example that finetunes a pre-trained ResNet model on the Cats & Dogs dataset.
- Added a `@requires_gpus` decorator for marking tests as needing GPUs. Tests marked with this will be run in the "GPU Tests" workflow on dual k80 GPUs via Beaker.
- Added the "-w/--workspace" option to the `tango run` and `tango server` commands. This option takes a path or URL, and instantiates the workspace from the URL using the newly added `Workspace.from_url()` method.
- Added the "workspace" field to `TangoGlobalSettings`.
- Added the "environment" field to `TangoGlobalSettings` for setting environment variables each time `tango` is run.
- Added a utility function to get a `StepGraph` directly from a file.
- Added the `tango.settings` module and the `tango settings` group of commands.
- A format for storing sequences as `SqliteSparseSequence`.
- A way to massage kwargs before they determine the unique ID of a `Step`.
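The last item, massaging kwargs before they determine a step's unique ID, can be pictured with a small sketch. Everything below is an assumption made for illustration: the function names `massage_kwargs` and `unique_id`, the lowercasing rule, and the SHA-256 based ID are hypothetical and are not taken from Tango's code.

```python
import hashlib
import json
from typing import Any, Dict


def massage_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical hook: normalize arguments that should hash identically."""
    massaged = dict(kwargs)
    if "model_name" in massaged:
        # Assumption for this sketch: model names are case-insensitive.
        massaged["model_name"] = massaged["model_name"].lower()
    return massaged


def unique_id(step_name: str, kwargs: Dict[str, Any]) -> str:
    """Hypothetical ID scheme: hash the massaged kwargs, not the raw ones."""
    payload = json.dumps(massage_kwargs(kwargs), sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{step_name}-{digest[:16]}"


id_upper = unique_id("Train", {"model_name": "GPT2", "epochs": 3})
id_lower = unique_id("Train", {"model_name": "gpt2", "epochs": 3})
```

Because the kwargs are normalized first, the two spellings map to the same ID, so a cached result for one would be reused for the other.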
Changed ⚠️
- `local_workspace.ExecutorMetadata` renamed to `StepExecutionMetadata` and now saved as `execution-metadata.json`.
- `tango run` without the option "-w/--workspace" or "-d/--workspace-dir" will now use a `MemoryWorkspace` instead of a `LocalWorkspace` in a temp directory, unless you've specified a default workspace in a `TangoGlobalSettings` file.
- Moved `tango.workspace.MemoryWorkspace` and `tango.local_workspace.LocalWorkspace` to `tango.workspaces.*`.
- Moved `tango.step_cache.MemoryStepCache` and `tango.step_cache.LocalStepCache` to `tango.step_caches.*`.
- Deprecated the `-d/--workspace-dir` command-line option. Please use `-w/--workspace` instead.
Fixed ✅
- Fixed a small bug where `LocalWorkspace` would fail to capture the conda environment in our Docker image.
- Fixed activation of `FILE_FRIENDLY_LOGGING` when set from the corresponding environment variable.
- Fixed setting log level via the environment variable `TANGO_LOG_LEVEL`.
- Use relative paths within the `work_dir` for symbolic links to the latest and the best checkpoints in `TorchTrainStep`.
- Fixed some scenarios where Tango can hang after finishing all steps.
- `distributed_port` and `log_every` parameters won't factor into `TorchTrainStep`'s unique ID.
- `MappedSequence` now works with slicing.
- `MappedSequence` now works with Huggingface `Dataset`.
- Uncacheable steps are now visible in Tango UI.
- Fixed bug in `Registrable.list_available()` where an error might be raised if the default implementation hadn't been explicitly imported.
- Fixed issue where having a default argument to the `run()` method wasn't getting applied to the step's unique ID.
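To illustrate what "works with slicing" means for a lazily mapped sequence, here is a minimal sketch. The class name `LazyMappedSequence` is a hypothetical stand-in written for this example; it is not Tango's `MappedSequence` and makes no claim about how the fix was implemented.

```python
from collections.abc import Sequence
from typing import Any, Callable


class LazyMappedSequence(Sequence):
    """Hypothetical stand-in: applies `fn` to items of `inner` on access."""

    def __init__(self, fn: Callable[[Any], Any], inner: Sequence):
        self.fn = fn
        self.inner = inner

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice returns another lazy view over the sliced inner
            # sequence instead of failing or mapping everything eagerly.
            return LazyMappedSequence(self.fn, self.inner[index])
        return self.fn(self.inner[index])

    def __len__(self) -> int:
        return len(self.inner)


squares = LazyMappedSequence(lambda x: x * x, range(10))
first_three = list(squares[:3])
```

The mapping function only runs when an individual item is read, so slicing a large sequence stays cheap.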
Commits
f9da0af Merge pull request #211 from allenai/Massage
e78dcbe Allow setting environment variables in tango settings, fix bug with TANGO_LOG_LEVEL env var (#209)
82404b6 Re-create LICENSE so GitHub will show it (#208)
0fadecf Bump furo from 2022.2.14.1 to 2022.2.23 (#207)
787b6e6 Merge pull request #206 from allenai/settings
c3401f2 Merge pull request #205 from allenai/RobustnessFixes
7ceda9c Merge pull request #201 from allenai/workspace-prep
6dd7d86 Merge pull request #200 from allenai/uncacheable-steps-in-server
5ad3f44 Bump furo from 2022.1.2 to 2022.2.14.1 (#199)
3528230 Update filelock requirement from <3.5,>=3.4 to >=3.4,<3.7 (#202)
21d6d40 Merge pull request #193 from allenai/StepGraphFromFile
258a7d2 skip 'distributed_port' and 'log_every' in unique ID (#197)
dd4c47f Merge pull request #192 from allenai/CloseSqliteHarder
cc94e1c Merge pull request #156 from allenai/DocumentationRefresh
5cc86b8 Rename "ExecutorMetadata" -> "StepExecutionMetadata" (#195)
6aecab7 Bump myst-parser from 0.16.1 to 0.17.0 (#191)
6478293 make pushing test image to Beaker more robust (#190)
7c1ac5b Finetune resnet Example for Tango (#150)
5187b01 update docs for integration tests and gpu tests timeout
95b78b5 Add new manually triggered workflow for integration tests, other bug fixes (#188)
19f7b31 Merge pull request #189 from allenai/fix-checkpoint-path-link
a438b26 Workflow quickfix
671a6dc verify exit code of beaker job (#187)
7ccad94 Merge pull request #186 from allenai/add-tests
bf6ecd0 Run GPU tests on Beaker (#183)
- Python
Published by github-actions[bot] almost 4 years ago
tango - v0.5.0
What's new
Added 🎉
- Added `TrainingEngine` abstraction to torch integration.
- Added FairScale with a `FairScaleTrainingEngine` that leverages FairScale's `FullyShardedDataParallel`. This is meant to be used within the `TorchTrainStep`.
- All PyTorch components (such as learning rate schedulers, optimizers, data collators, etc) from the transformers library are now registered under the corresponding class in the torch integration. For example, the transformers `Adafactor` optimizer is registered as an `Optimizer` under the name "transformers::Adafactor". More details can be found in the documentation for the transformers integration.
Changed ⚠️
- Various changes to the parameters of the `TorchTrainStep` due to the introduction of the `TrainingEngine` class.
- Params logged as `DEBUG` level instead of `INFO` to reduce noise in logs.
- The waiting message for `FileLock` is now clear about which file it's waiting for.
- Added an easier way to get the default Tango global config.
- Most methods of `TorchTrainCallback` also take an `epoch` parameter now.
- `WandbTrainCallback` now logs peak GPU memory occupied by PyTorch tensors per worker. This is useful because W&B's system metrics only display the total GPU memory reserved by PyTorch, which is always higher than the actual amount of GPU memory occupied by tensors. So these new metrics give a more accurate view into how much memory your training job is actually using.
- Plain old Python functions can now be used in `Lazy` objects.
- `LocalWorkspace` now creates a symlink to the outputs of the latest run.
- Tango is now better at guessing when a step has died and should be re-run.
- Tango is now more lenient about registering the same class under the same name twice.
- When you use `dict` instead of `Dict` in your type annotations, you now get a legible error message. Same for `List`, `Tuple`, and `Set`.
Fixed ✅
- Fixed a bug in `Registrable` and `FromParams` where registered function constructors would not properly construct arguments that were classes.
- Fixed a bug in `FromParams` that would cause a crash when an argument to the constructor had the name `params`.
- Made `FromParams` more efficient by only trying to parse the params as a `Step` when it looks like it actually could be a step.
- Fixed bug where `Executor` would crash if `git` command could not be found.
- Fixed bug where validation settings were not interpreted the right way by the torch trainer.
- When you register the same name twice using `Registrable`, you get an error message. That error message now contains the correct class name.
Commits
a39a69f Merge pull request #161 from allenai/FromParamsDuJour
3063a92 CHANGELOG quick fix
cd006ae Add TrainEngine abstraction to TorchTrainStep, add FairScale integration, improve transformers integration (#77)
93438eb Update setuptools requirement from <=59.5.0 to <60.8.0 (#170)
e57dd91 Bump sphinx-copybutton from 0.4.0 to 0.5.0 (#174)
a8b1bdc split Docker build into seperate workflow, only run when necessary (#178)
59c91f7 make install comments work on all shells (#179)
a059416 Merge pull request #160 from allenai/GuessStepDirBetter
de7195d more fixes for conda-forge (#177)
75e9d42 use conda in Docker image, multi-stage build (#172)
611e446 Merge pull request #176 from allenai/latest-outputs
7241d20 Merge pull request #175 from allenai/self-contained-tests
83aa692 Merge pull request #153 from allenai/LazyWithoutFromParams
893e601 use virtualenv within Docker (#167)
178b8bd Merge pull request #171 from allenai/LenientRegister
6c765c8 Merge pull request #169 from allenai/InformativeFileLock
91ff7ac Merge pull request #168 from allenai/DefaultGlobalConfig
5d602fb push Docker images to GHCR.io (#166)
2b26fc8 set 'resume' to 'allow' instead of 'auto' (#155)
26771e7 fix bug when git missing (#163)
9009119 Add Dockerfile (#162)
a02155d Add a required flag to the README for gpt2-example (#159)
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0
What's new
Changed ⚠️
- Default log level is `WARNING` instead of `ERROR`.
- The web UI now renders the step graph left-to-right.
- The web UI now shows runs by date, with the most recent run at the top.
- The web UI now shows steps in a color-coded way.
- The `--include-package` flag now also accepts paths instead of module names.
Fixed ✅
- Ensure tqdm log lines always make it into the log file `out.log` even when log level is `WARNING` or `ERROR`.
Commits
5ff51d6 Fix GPT2 example. (#158)
4011482 make --include-package accept paths (#157)
92b8fe5 Merge pull request #148 from allenai/RunsWithDates
a4417e5 fix gpt2 config
797e3e8 minor logging tweaks (#145)
42654e6 Prepare for release v0.4.0rc5
df301ef Merge pull request #119 from allenai/RunGeneration
42535c9 Add TorchEvalStep to torch integration, use "device_count" in TorchTrainStep instead of "devices" (#120)
ecc6087 Store log output to a file in run directory, other logging improvements (#132)
9008255 Merge pull request #141 from allenai/dependabot/pip/sphinx-4.4.0
8e09b66 Merge pull request #139 from allenai/remove-no-logging
26318b2 Merge pull request #133 from allenai/dependabot/pip/mypy-0.931
7c91c4b Merge pull request #126 from allenai/dependabot/pip/sphinx-4.3.2
a178076 Merge pull request #128 from allenai/dependabot/pip/mypy-0.930
106a8cf Merge pull request #130 from allenai/dependabot/pip/furo-2022.1.2
6db7b4c Merge pull request #129 from allenai/run-name-at-end
b93d22d Add typing info to package (PEP 561) (#131)
e8867ae fix StopEarlyCallback state recovery
465a525 Prepare for release v0.4.0rc4
7a65540 CHANGELOG quick fix
31622d1 add logo to docs and README (#121)
b17d325 fix bug with StepInfo (#122)
d5698a0 Bump myst-parser from 0.16.0 to 0.16.1 (#118)
69d1dc8 Bump mypy from 0.910 to 0.920 (#117)
9739944 Prepare for release v0.4.0rc3
20138ce improve release notes generation script
760b4f2 Add DatasetsFormat, making LoadDataset cacheable, fix bug with KeyboardInterrupt (#114)
e51691f Improvements to W&B callback (#115)
d044f6e Add pre/post epoch callbacks (#113)
ae1ae0b Bump myst-parser from 0.15.2 to 0.16.0 (#111)
c605a1e Merge pull request #90 from allenai/SqliteDictFormat
8d6804d Prepare for release v0.4.0rc2
f404541 Merge pull request #110 from allenai/Conda
c154c92 Merge pull request #101 from allenai/Euler
288e02f ensure all integrations are imported if we can't find registered name (#109)
e033170 Bump black from 21.11b1 to 21.12b0 (#102)
b2781fd FAQ not FAQs :) (#108)
81e1225 add FAQs to docs (#107)
2148b74 Merge pull request #106 from allenai/Favicon
f812b73 fix bug with resolving lazy step inputs (#105)
110fc79 Merge pull request #104 from allenai/why-tango
eab62d8 fix bug in distributed training (#103)
901631f Removed scary warning
e86b843 Better summary
3bfd7c1 adjust dependency pinning (#100)
72aaa53 Merge pull request #79 from allenai/jon/html-viz
d45ec0c Merge pull request #99 from allenai/SkipGitTests
98b022b make sure workspaces can be imported from base module (#98)
07ac494 Merge pull request #97 from allenai/lower-click-pin
5ccefa9 fix prelease indicator condition in CI
04a5ab8 Prepare for release v0.4.0rc1
b2c09e3 fix typo in example (#96)
aba5758 Merge pull request #94 from allenai/dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0
48b0b24 Merge pull request #92 from allenai/dependabot/pip/datasets-gte-1.12-and-lt-1.17
0107672 Merge pull request #91 from allenai/dependabot/pip/sphinx-4.3.1
7d9d919 Merge pull request #93 from allenai/NoOverrides
b5907de Merge pull request #67 from allenai/ResponsibleSteps
20951ea Bump furo from 2021.11.16 to 2021.11.23 (#89)
8bb00c4 Bump black from 21.11b0 to 21.11b1 (#88)
f240ac4 update filelock + cached_path, improve release scripts (#87)
eba4b8e Bump black from 21.10b0 to 21.11b0 (#86)
bc80bb8 Merge pull request #85 from allenai/dependabot/pip/filelock-gte-3.3-and-lt-3.5
17d28c7 Bump furo from 2021.11.15 to 2021.11.16 (#84)
4118912 Merge pull request #82 from allenai/dependabot/pip/furo-2021.11.15
cb1b853 Merge pull request #62 from allenai/dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
30f7a13 W&B log as step+1 (#76)
aab58b6 add some conda instructions to CONTRIBUTING.md (#81)
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0rc5
What's new
Added 🎉
- Added `TorchEvalStep` to torch integration, registered as "torch::eval".
Changed ⚠️
- Renamed `aggregate_val_metric` to `auto_aggregate_val_metric` in `TorchTrainStep`.
- `devices` parameter to `TorchTrainStep` replaced with `device_count: int`.
- Run name printed at the end of a run so it's easier to find.
- Type information added to package data. See PEP 561 for more information.
- A new integration, `transformers`, with two new steps for running seq2seq models.
- Added `logging_tqdm`, if you don't want a progress bar, but you still want to see progress in the logs.
- Added `threaded_generator()`, for wrapping generators so that they run in a separate thread from the generator's consumer.
- Added a new example for evaluating the T0 model on XSum, a summarization task.
- Added `MappedSequence` for functionally wrapping sequences.
- Added `TextFormat`, in case you want to store the output of your steps in raw text instead of JSON.
- Steps can now list arguments in `SKIP_ID_ARGUMENTS` to indicate that the argument should not affect a step's unique id. This is useful for arguments that affect the execution of a step, but not the output.
- `Step` now implements `__str__`, so steps look pretty in the debugger.
- Added `DatasetCombineStep`, a step that combines multiple datasets into one.
- Added `common.logging.initialize_worker_logging()` function for configuring logging from worker processes/threads.
- Logs from `tango run ...` will be written to a file called `out.log` in the run directory.
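The idea behind a threaded generator, running the producer in a separate thread from its consumer, can be sketched with the standard library alone. The function name `threaded_generator_sketch` and the `queue_size` parameter are assumptions made for this example; Tango's `threaded_generator()` may have a different signature and behavior.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def threaded_generator_sketch(source: Iterable[T], queue_size: int = 8) -> Iterator[T]:
    """Hypothetical sketch: iterate `source` in a background thread."""
    q: "queue.Queue" = queue.Queue(maxsize=queue_size)

    def producer() -> None:
        try:
            for item in source:
                q.put(("item", item))  # blocks while the queue is full
        except Exception as exc:
            q.put(("error", exc))  # forward producer errors to the consumer
        else:
            q.put(("done", None))

    # Daemon thread: if the consumer stops early, the blocked producer
    # does not keep the interpreter alive.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = q.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


items = list(threaded_generator_sketch(range(5)))
```

The bounded queue gives back-pressure: the producer can run ahead of the consumer by at most `queue_size` items.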
Fixed ✅
- Fixed torch `StopEarlyCallback` state not being recovered properly on restarts.
- Fixed file friendly logging by removing special styling characters.
- Ensured exceptions are captured in logs.
- `LocalWorkspace` now works properly with uncacheable steps.
- When a Tango run got killed hard, with `kill -9`, or because the machine lost power, `LocalWorkspace` would sometimes keep a step marked as "running", preventing further executions. This still happens sometimes, but it is now much less likely (and Tango gives you instructions for how to fix it).
- To make all this happen, `LocalWorkspace` now saves step info in a Sqlite database. Unfortunately that means that the workspace format changes and existing workspace directories won't work properly with it.
- Fixed premature cleanup of temporary directories when using `MemoryWorkspace`.
Commits
df301ef Merge pull request #119 from allenai/RunGeneration
42535c9 Add TorchEvalStep to torch integration, use "device_count" in TorchTrainStep instead of "devices" (#120)
ecc6087 Store log output to a file in run directory, other logging improvements (#132)
9008255 Merge pull request #141 from allenai/dependabot/pip/sphinx-4.4.0
8e09b66 Merge pull request #139 from allenai/remove-no-logging
26318b2 Merge pull request #133 from allenai/dependabot/pip/mypy-0.931
7c91c4b Merge pull request #126 from allenai/dependabot/pip/sphinx-4.3.2
a178076 Merge pull request #128 from allenai/dependabot/pip/mypy-0.930
106a8cf Merge pull request #130 from allenai/dependabot/pip/furo-2022.1.2
6db7b4c Merge pull request #129 from allenai/run-name-at-end
b93d22d Add typing info to package (PEP 561) (#131)
e8867ae fix StopEarlyCallback state recovery
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0rc4
What's new
Fixed ✅
- Fixed a bug where `StepInfo` fails to deserialize when `error` is an exception that can't be pickled.
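For readers unfamiliar with this failure mode, the sketch below shows one way an exception can be unpicklable and one possible guard against it. The helper `picklable_error` and the class `LockedError` are hypothetical names invented for this example; the release notes do not say how Tango itself resolved the bug.

```python
import pickle
import threading


def picklable_error(error: BaseException) -> BaseException:
    """Return `error` if it survives a pickle round trip, else a plain substitute."""
    try:
        pickle.loads(pickle.dumps(error))
        return error
    except Exception:
        # Fall back to an exception that keeps only the type name and message.
        return RuntimeError(f"{type(error).__name__}: {error}")


class LockedError(Exception):
    """Hypothetical exception that carries an unpicklable attribute."""

    def __init__(self, message: str):
        super().__init__(message)
        self.lock = threading.Lock()  # lock objects cannot be pickled


safe = picklable_error(LockedError("step failed"))
```

Checking the round trip at write time means the stored record can always be read back, at the cost of losing the original exception object in the fallback case.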
Commits
7a65540 CHANGELOG quick fix
31622d1 add logo to docs and README (#121)
b17d325 fix bug with StepInfo (#122)
d5698a0 Bump myst-parser from 0.16.0 to 0.16.1 (#118)
69d1dc8 Bump mypy from 0.910 to 0.920 (#117)
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0rc3
What's new
Added 🎉
- Added `DatasetsFormat` format and `LoadStreamingDataset` step to `datasets` integration.
- `SqliteDictFormat` for datasets.
- Added `pre_epoch()` and `post_epoch()` callback methods to PyTorch `TrainCallback`.
Changed ⚠️
- `LoadDataset` step from `datasets` integration is now cacheable, using the `DatasetsFormat` format by default. But this only works with non-streaming datasets. For streaming datasets, you should use the `LoadStreamingDataset` step instead.
Fixed ✅
- Fixed bug where `KeyboardInterrupt` exceptions were not handled properly by steps and workspaces.
- `WandbTrainCallback` now will use part of the step's unique ID as the name for the W&B run by default, to make it easier to identify which tango step corresponds to each run in W&B.
- `WandbTrainCallback` will save the entire `TrainConfig` object to the W&B config.
Commits
20138ce improve release notes generation script
760b4f2 Add DatasetsFormat, making LoadDataset cacheable, fix bug with KeyboardInterrupt (#114)
e51691f Improvements to W&B callback (#115)
d044f6e Add pre/post epoch callbacks (#113)
ae1ae0b Bump myst-parser from 0.15.2 to 0.16.0 (#111)
c605a1e Merge pull request #90 from allenai/SqliteDictFormat
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0rc2
What's new
Added 🎉
- Sample experiment configurations that prove Euler's identity
Changed ⚠️
- Loosened `Click` dependency to include v7.0.
- Loosened `datasets` dependency.
- Tightened `petname` dependency to exclude next major release for safety.
Fixed ✅
- `Workspace`, `MemoryWorkspace`, and `LocalWorkspace` can now be imported directly from the `tango` base module.
- Uncacheable leaf steps would never get executed. This is now fixed.
- We were accidentally treating failed steps as if they were completed.
- The visualization had a problem with showing steps that never executed because a dependency failed.
- Fixed a bug where `Lazy` inputs to a `Step` would fail to resolve arguments that come from the result of another step.
- Fixed a bug in `TorchTrainStep` where some arguments for distributed training (`devices`, `distributed_port`) weren't being set properly.
Commits
f404541 Merge pull request #110 from allenai/Conda
4c347db Merge branch 'main' into Conda
c154c92 Merge pull request #101 from allenai/Euler
b3a8ae6 Revert "Make sure default steps are available when you run tango run"
76bda76 Merge remote-tracking branch 'origin/main' into Euler
e073925 Revert "Import integrations safely"
5550ff8 Adds Conda to the readme
288e02f ensure all integrations are imported if we can't find registered name (#109)
e033170 Bump black from 21.11b1 to 21.12b0 (#102)
2fd48cc Merge branch 'main' into Euler
b2781fd FAQ not FAQs :) (#108)
b31c8f2 Merge branch 'main' into Euler
81e1225 add FAQs to docs (#107)
dec4a86 Merge remote-tracking branch 'origin/Euler' into Euler
23d7824 Brings back the Euler example
4327e22 Merge branch 'main' into Euler
2148b74 Merge pull request #106 from allenai/Favicon
5ce5c64 Merge branch 'main' into Favicon
18e556a Revert "Changelog"
97dd79d Merge branch 'main' into Euler
f812b73 fix bug with resolving lazy step inputs (#105)
0a05047 Merge branch 'main' into Euler
89b315a Tango Favicon
110fc79 Merge pull request #104 from allenai/why-tango
a67c617 Update README.md
a5f5bd0 Merge branch 'main' into why-tango
eab62d8 fix bug in distributed training (#103)
e73a4a0 Update README.md
eb3ef41 Merge branch 'main' into Euler
c1c9bd0 Changelog
901631f Removed scary warning
e86b843 Better summary
cf9088d We no longer start the server during tests
b257e46 Import integrations safely
10ec4f4 Pick the right mutable mapping
a175370 Make sure default steps are available when you run tango run
1d4051a Make mypy happy
67e8a02 Moved complex arithmetic to https://github.com/allenai/tango-example
8a5c0d7 Adds test for steps that fail
3d551b3 Test for uncacheable leaf steps
21f0280 Don't explicitly run steps that don't need it
5fa4731 isort
9a3cd52 Changelog
6df6682 clarify comment
c46e0b2 also pick any uncacheable direct dependencies of leaf steps
c32a5cc Example steps for Euler's identity
1c97f10 Write step info for all steps before running any of them
3336a7e Reset step info when re-running a failed step
228b0f3 How did this slip by?
f9f1d05 Make sure to execute uncacheable steps that are not a dependency of anything
3d44720 Add steps for complex arithmetic
3bfd7c1 adjust dependency pinning (#100)
72aaa53 Merge pull request #79 from allenai/jon/html-viz
adea827 Merge branch 'jon/html-viz' of github.com:allenai/tango into jon/html-viz # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
e771da7 make alert look nice
d21a172 Merge branch 'main' into jon/html-viz
d45ec0c Merge pull request #99 from allenai/SkipGitTests
28f17fa Don't check for git if we're not running in a repo
7db639e Merge branch 'main' into jon/html-viz
98b022b make sure workspaces can be imported from base module (#98)
07ac494 Merge pull request #97 from allenai/lower-click-pin
f10b178 Merge branch 'main' into lower-click-pin
5ccefa9 fix prelease indicator condition in CI
9843c5a update CHANGELOG
91d6551 Merge branch 'main' into lower-click-pin
58dfd3e CHANGELOG.md
8a069f0 loosen Click requirement
85b7417 Shows an ugly but functional brief popup when you copy something
626f1b6 Merge branch 'main' into jon/html-viz
b0cbf63 make tooltips meaningful since i cannot remove them
422ae0d Update tango/server/report.js
18869b7 Make local results work
3b1326a Python 3.7 again
5a282ff Warning to the future
2f54569 Give the option of tracking dependencies properly through Workspace
f9f7bfb We don't have this method anymore.
77ae786 Silence a warning
0d532c5 Reformat setup.py only in Python 3.7?
4f1e786 Changelog
b500f0e Merge branch 'main' into jon/html-viz
5c0b794 Merge branch 'main' into jon/html-viz
5c455e2 Include assets in package
4153e9e Fix import order
85dd1af Print a direct link to the run if we have one
01f0343 Makes it so you can run the server from any directory
300b722 Make frontend and backend consistent in their terminology
0ef1350 Fix some errors in the server
bd07ad5 Merge remote-tracking branch 'origin/ResponsibleSteps' into jon/html-viz
0cde4e7 pr updates
7bf9488 Makes tests pass
e96f6f3 Merge branch 'ResponsibleSteps' into jon/html-viz
8800bc9 Merge branch 'ResponsibleSteps' into jon/html-viz
378f649 pr gix
5047f6b add logo
4a7487b add in reloading every second
a1131ed pr fixes
95f1cbe pr fixes
d47b624 pr fixes
5962699 pr fixes
a974369 fix merge conflicts
3280491 initial code to serve and display tango viz
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.4.0rc1
What's new
Added 🎉
- Introduced the concept of the `Workspace`, with `LocalWorkspace` and `MemoryWorkspace` as initial implementations.
- Added a stub of a webserver that will be able to visualize runs as they happen.
- Added separate classes for `LightningTrainingTypePlugin`, `LightningPrecisionPlugin`, `LightningClusterEnvironmentPlugin`, `LightningCheckpointPlugin` for compatibility with `pytorch-lightning>=1.5.0`.
Removed 👋
- Removed old `LightningPlugin` class.
- Removed requirement of the `overrides` package.
Changed ⚠️
- Made it possible to construct a step graph out of `Step` objects, instead of constructing it out of `StepStub` objects.
- Removed dataset fingerprinting code, since we can now use `Step` to make sure things are cached.
- Made steps deterministic by default.
- Brought back `MemoryStepCache`, so we can run steps without configuring anything.
- W&B `torch::TrainCallback` logs with `step=step+1` now so that training curves in the W&B dashboard match up with checkpoints saved locally and are easier to read (e.g. step 10000 instead of 9999).
- `filelock >= 3.4` required, parameter `poll_intervall` to `tango.common.file_lock.FileLock.acquire` renamed to `poll_interval`.
Fixed ✅
- Fixed bug in `FromParams` where a parameter to a `FromParams` class may not be instantiated correctly if it's a class with a generic type parameter.
Commits
b2c09e3 fix typo in example (#96)
aba5758 Merge pull request #94 from allenai/dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0
4ae6115 Update requirements.txt
d4d0655 Merge branch 'main' into dependabot/pip/cached-path-gte-0.3.3-and-lt-1.1.0
48b0b24 Merge pull request #92 from allenai/dependabot/pip/datasets-gte-1.12-and-lt-1.17
6044f8b Update cached-path requirement from <0.4.0,>=0.3.3 to >=0.3.3,<1.1.0
1ae82aa Merge branch 'main' into dependabot/pip/datasets-gte-1.12-and-lt-1.17
0107672 Merge pull request #91 from allenai/dependabot/pip/sphinx-4.3.1
fa47e54 Merge branch 'main' into dependabot/pip/datasets-gte-1.12-and-lt-1.17
38c0b42 Merge branch 'main' into dependabot/pip/sphinx-4.3.1
7d9d919 Merge pull request #93 from allenai/NoOverrides
dfa461c Removes the dependency on the `overrides` package
ff18197 Update datasets requirement from <1.16,>=1.12 to >=1.12,<1.17
fc69dd0 Bump sphinx from 4.3.0 to 4.3.1
b5907de Merge pull request #67 from allenai/ResponsibleSteps
267a6e4 clean up config usage
82862ef Merge branch 'main' into ResponsibleSteps
20951ea Bump furo from 2021.11.16 to 2021.11.23 (#89)
8d8670a Optional server
03049fa Handle the log level consistently
c620405 Merge branch 'main' into ResponsibleSteps
8bb00c4 Bump black from 21.11b0 to 21.11b1 (#88)
cd5a70a Fix tests
71cfbd7 Don't cache uncacheable steps
7839fd8 Merge branch 'main' into ResponsibleSteps
f240ac4 update filelock + cached_path, improve release scripts (#87)
967ecb2 Merge branch 'main' into ResponsibleSteps
10634aa Don't show inherited from_params
4b104be Fix test
4e9910d Avoid a naming conflict in computer science
6db2c29 Improved documentation
44f79ec Added blurb
7f68a97 Use enum for step states
19e6de2 Click logging is disabled by default, enabled in the CLI use case
eba4b8e Bump black from 21.10b0 to 21.11b0 (#86)
1d80766 Log the start of a run
6190e2d Merge branch 'main' into ResponsibleSteps
3550c3f Cleaner workspace docs
08d4056 Better StepCache docs
d5ede4e Formatting
bc80bb8 Merge pull request #85 from allenai/dependabot/pip/filelock-gte-3.3-and-lt-3.5
d69bac4 Check whether a run name already exists
cde0f14 Unused import
a80a87f Fix the case where a step's cacheability changes across restarts
de5f248 Improve comment
f13d717 Losely pin petname
5d72923 Merge branch 'main' into ResponsibleSteps
914902b Merge branch 'ResponsibleSteps' of https://github.com/allenai/tango into ResponsibleSteps
2f5a266 Merge pull request #83 from allenai/petew-ResponsibleSteps
b3df88f Update filelock requirement from <3.4,>=3.3 to >=3.3,<3.5
17d28c7 Bump furo from 2021.11.15 to 2021.11.16 (#84)
8f2b48e add failing test case
1fc9860 Merge remote-tracking branch 'origin/main' into ResponsibleSteps
4118912 Merge pull request #82 from allenai/dependabot/pip/furo-2021.11.15
d5eb968 Merge branch 'main' into dependabot/pip/furo-2021.11.15
abac28a Fix the shortcut for running all (many) checks
c4009de Merge branch 'main' into ResponsibleSteps
e364766 Merge pull request #73 from allenai/petew-ResponsibleSteps
8ae0e27 More doctests
6211025 Format docs better
5814724 Fix docs
cb1b853 Merge pull request #62 from allenai/dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
50caada Update requirements.txt
9c79b97 Makes the docs build
3ecb952 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
55c5fae Bump furo from 2021.11.12 to 2021.11.15
83add7e Important fixes
301347d Fix tests
cfe29de Create workdir when requested
7b67142 Bring back "needed by"
a46b3e8 This wasn't meant to be checked in.
af4cc3c Use click through the logger
ee81446 Merge branch 'ResponsibleSteps' into petew-ResponsibleSteps
7247528 We don't need this TODO right now.
9b8446c Merge branch 'main' into ResponsibleSteps
30f7a13 W&B log as step+1 (#76)
eab2b7f Merge pull request #74 from allenai/Workspaces
d91302b Simplify!
e605399 changelog
9f360da Bring back deterministic step randomness, without breaking random step names
a2c5cd8 fix order of imports
02ad624 Merge branch 'dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0' of https://github.com/allenai/tango into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
7f78b0a remove comment
8f8cbeb separating different plugin types
b139181 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
761f873 Merge branch 'ResponsibleSteps' into Workspaces
12a480a Merge branch 'ResponsibleSteps' into petew-ResponsibleSteps
b7d4e88 merge main
63f6029 Merge branch 'main' into ResponsibleSteps
aab58b6 add some conda instructions to CONTRIBUTING.md (#81)
e7e5c5e Fix symlink creation
b547472 Adds a command to keep a server running permanently
596c278 Not sure how this line got lost
06d7681 Fixes and cleanup
3cebc24 fix bug caused by random seed
d145216 fix comment
067436d clean up
a4ce577 Creates and uses the concept of a workspace, so that the server can consume it
110eb07 ci
e811f47 fix tests
460dd87 executor fixes
837a454 handle generic non-FromParams classes
7fed5a6 fix merge conflicts
12ae8e9 Fixing the torch test
0c208e1 Remove stale comment
b44a0a0 Fix doctest
3e4eb42 Makes det_hash consistent across Python versions
f23b56c Executable documentation!
38388ca Formatting
6aaa01e Fix some documentation
57825a0 Fix docs
cb7591d Order imports correctly 🙄
8caea06 Removing unused imports
f7f5d2e Changelog again
05ca4dd Changelog
da16b5a Merge branch 'main' into ResponsibleSteps
8c51b45 Make nested steps work for classes that aren't FromParams
083516b Remember which extra modules we imported
54e7dbc We can't restore the registry like that.
455b756 Refactors the test to fail in new and exciting ways
62cd1d2 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
cd68752 Don't need this comment anymore
76c10f1 Remove fingerprint stuff
a1f1af8 Merge branch 'FixImport' into ResponsibleSteps
ed9ebc9 Fix Import
14bc2ee Merge branch 'main' into ResponsibleSteps
4a4ab58 Merge branch 'main' into dependabot/pip/pytorch-lightning-gte-1.4.0-and-lt-1.6.0
e5ea1af Fix after merge
7b92dd9 Merge remote-tracking branch 'origin/main' into ResponsibleSteps
d066768 We were not actually using this function
df07690 Mypy inspired changes
87ee96f Formatting
3823f98 🤦🏼
d018838 Fix executor test
a7fb4ea Fix executor
0b1f310 Quiet, you
449a70d Fixes circular references
4fc1f34 Actually write the circular reference test
0df3538 Throw the error for the right reason
afb3b61 WithUnresolvedSteps
1be6816 Update pytorch-lightning requirement
3c4cb19 Dicts are iterable, so these have to be swapped
efc84d4 Adds new failing test
1f2bf16 Relative imports don't work
430112f Fix bug in test
c3dd08d Add test that fails
118f8c2 Better name for the test
a6ac3c9 Detect unsatisfiable dependencies
8eb7a48 Bring back parsing everything as a Step first :-/
bc6f886 Type checks
00b2df6 Makes the test pass
a6ba11a Make the code more compatible with the IDE
a611d48 We don't need these anymore
132153d Formatting
d019499 Start fixing hard tests
f4d77c1 Fix trivial tests
fa6de92 Make steps responsible for their own execution
cfe7007 Slightly more readable error message
c0ded32 Typo
- Python
Published by github-actions[bot] about 4 years ago
tango - v0.3.6
What's new
Added 🎉
- Added a `.log_batch()` method on `torch::TrainCallback` which is given the average loss across distributed workers, but only called every `log_every` steps.
Removed 👋
- Removed `.pre_log_batch()` method on `torch::TrainCallback`.
Fixed ✅
- Fixed typo in parameter name `remove_stale_checkpoints` in `TorchTrainStep` (previously was `remove_state_checkpoints`).
- Fixed bug in `FromParams` that would cause failures when `from __future__ import annotations` was used with Python older than 3.10. See PEP 563 for details.
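Some background on the PEP 563 item: with postponed evaluation, annotations are stored as strings, so code that inspects constructor signatures sees `"int"` rather than `int`. The sketch below writes the string annotations explicitly, which has the same effect as the `__future__` import, and resolves them with `typing.get_type_hints()`. The `Config` class is hypothetical, and this is not presented as the change Tango made.

```python
import inspect
import typing


class Config:
    """Hypothetical class whose annotations are stored as strings."""

    def __init__(self, size: "int", name: "str" = "default"):
        self.size = size
        self.name = name


# The raw signature only exposes the unevaluated string.
raw = inspect.signature(Config.__init__).parameters["size"].annotation

# typing.get_type_hints() evaluates the strings back into real types.
resolved = typing.get_type_hints(Config.__init__)
```

A library that builds objects from type annotations needs the resolved form, since the string alone cannot be instantiated or compared against a class.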
Commits
6b5cb24 support for PEP 563 in older Python versions (#80)
27657d9 Bump furo from 2021.10.9 to 2021.11.12 (#78)
c8ac858 Bump sphinx from 4.2.0 to 4.3.0 (#75)
5516244 refactor TorchTrainStep (#70)
b13bba7 Bump isort from 5.10.0 to 5.10.1 (#71)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.5
What's new
Fixed ✅
- Fixed a bug in `FromParams` where the "type" parameter was ignored in some cases where the `Registrable` base class did not directly inherit from `Registrable`.
Commits
5bdad24 Merge pull request #69 from allenai/weird-mix-fix 5434e23 test case e1108a5 CHANGELOG.md 2bbfa6f fix weird FromParams/Registrable bug
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.4
What's new
Added 🎉
- Added StopEarlyCallback, a TorchTrainCallback for early stopping.
- Added parameter remove_stale_checkpoints to TorchTrainStep.
Changed ⚠️
- Minor changes to TorchTrainCallback interface.
- Weights & Biases TorchTrainCallback now logs best validation metric score.
Commits
c126442 torch train improvements (#64)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.3
What's new
Added 🎉
- Added support for PEP 604 in FromParams, i.e. writing union types as "X | Y" instead of "Union[X, Y]".
- [internals] Added a spot for miscellaneous end-to-end integration tests (not to be confused with "tests of integrations") in tests/end_to_end/.
- [internals] Core tests now run on all officially supported Python versions.
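The two union spellings from PEP 604 can be compared with the standard library alone. This is only a sketch of the equivalence; how FromParams handles the syntax internally is not shown here, and the variable names are illustrative.

```python
import sys
import typing

# The pre-PEP 604 spelling works on every supported Python version.
old_style = typing.Union[int, str]

if sys.version_info >= (3, 10):
    # The "X | Y" spelling is only evaluated at runtime on Python 3.10+.
    new_style = int | str
    same_members = typing.get_args(new_style) == typing.get_args(old_style)
else:
    new_style = None
    same_members = None
```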
Fixed ✅
- Fixed a bug in FromParams where non-FromParams class parameters were not instantiated properly (or at all).
- Fixed a bug in FromParams where kwargs were not passed on from a wrapper class to the wrapped class.
- Fixed small bug where some errors from git would be printed when executor metadata is created outside of a git repository.
Commits
ea6d2e5 another FromParams fix (#66) eeb1560 Update datasets requirement from <1.15,>=1.12 to >=1.12,<1.16 (#60) fe6dbe0 Bump isort from 5.9.3 to 5.10.0 (#61) 9f302b3 PEP 604 support (#59) 634bd71 Merge pull request #63 from allenai/StepGraphTests 9f513d6 Rename stepgraph.py to stepgraph_test.py 604001f tweak checkpointing/validate step 0920349 fix bug with git metadata (#56) af1b438 fix another FromParams bug, add spot for miscellaneous end-to-end tests (#58)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.2
What's new
Fixed ✅
- Fixed a bug with FromParams that caused .from_params() to fail when the params contained an object that was already instantiated.
- tango command no longer installs a SIGTERM handler, which fixes some bugs with integrations that use multiprocessing.
Commits
ab47e21 Merge pull request #55 from allenai/no-sigterm-handler b759c46 remove sigterm handler c4e96cb fix example a1f9ec5 fix FromParams bug (#54) d5ef0ae Bump black from 21.9b0 to 21.10b0 (#53) b20f5b2 fix typo
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.1
What's new
Changed ⚠️
- Updated the LightningTrainStep to optionally take in a LightningDataModule as input.
Commits
4d77160 Merge pull request #52 from allenai/fix-typo 92874c2 fix release docs 26e36e4 Merge pull request #51 from allenai/lightning-data-module b10d256 update changelog 6d068a4 fix import order 1744f55 adding option for data module
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.3.0
What's new
Added 🎉
- Added IterableDatasetDict, a version of DatasetDict for streaming-like datasets.
- Added a PyTorch Lightning integration with LightningTrainStep.
Fixed ✅
- Fixed bug with FromParams and Lazy where extra arguments would sometimes be passed down through to a Lazy class when they shouldn't.
Commits
a95dbae fix bugs with initializing lightning loggers and plugins (#50) 7e9c354 fix bug w/ DataLoader and PTL d634ea8 add isort (#49) 277b0e2 add IterableDatasetDict (#46) db06e70 Merge pull request #45 from allenai/pytorch-lightning 6174381 fix CHANGELOG 501bf73 fix failing test 686be74 Merge branch 'main' into pytorch-lightning 4f3f328 add torch:: to torch integrations 293fac1 doc and general fixes 2b2a6a3 Update tango/integrations/pytorchlightning/init.py d46a0ba Update docs/source/api/integrations/pytorchlightning.rst 3aeb825 fix docs 375ff58 update docs c6a1e38 update ci 8110242 PyTorch Lightning Integration 60121c0 update docs, print unicode characters by name (#44) 7beab21 clean up docs 2f6871b only print ascii characters (#43)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.2.4
What's new
Added 🎉
- Added support for torch 1.10.0.
Changed ⚠️
- --file-friendly-logging flag is now an option to the main tango command, so it needs to be passed before run, e.g. tango --file-friendly-logging run ....
Fixed ✅
- Fixed bug with Step.from_params.
- Ensure logging is initialized in spawn processes during distributed training with TorchTrainStep.
Commits
d497e7a Update torch requirement from <1.10.0,>=1.9.0 to >=1.9.0,<1.11.0 (#42) fbad9b2 fix failing test 409b50a ensure logging initialize in spawn distributed workers 1714390 move filefriendlylogging flag back to main command af8bc69 fix bug in Step.from_params 4e5b406 add TangoMetadata to docs
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.2.3
What's new
Added 🎉
- Added support for global settings file, tango.yml.
- Added 'include_package' (array of string) param to config spec.
- Added a custom error StopEarly that a TrainCallback can raise within the TorchTrainStep to stop training early without crashing.
- Executor now also saves pip dependencies and conda environment files to the run directory for each step.
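The early-stopping mechanism in this release, a callback raising a dedicated error that the training loop treats as a clean exit, can be sketched in plain Python. Every name below (`StopEarly` as redefined here, `PatienceCallback`, `post_step`, `train`) is a hypothetical stand-in, not Tango's real classes or signatures.

```python
class StopEarly(Exception):
    """Hypothetical stand-in for the custom error described above."""


class PatienceCallback:
    """Hypothetical callback: stop once the loss has not improved for `patience` steps."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.bad_steps = 0

    def post_step(self, loss: float) -> None:
        if loss < self.best:
            self.best = loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps >= self.patience:
                raise StopEarly


def train(losses, callback) -> int:
    """Toy training loop: returns the number of steps run before stopping."""
    steps_done = 0
    try:
        for loss in losses:
            steps_done += 1
            callback.post_step(loss)
    except StopEarly:
        pass  # treated as a clean stop, not a crash
    return steps_done


steps = train([1.0, 0.8, 0.9, 0.85, 0.7], PatienceCallback(patience=2))
```

With these toy losses the loop stops after the fourth step, because steps three and four both fail to improve on the best loss of 0.8.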
Fixed ✅
- Ensured **kwargs arguments are logged in FromParams.
Commits
0094888 save pip and conda files to run directory, add step config to metadata (#41) d588886 Early stopping via callbacks in torch train (#40) c437887 ensure '**kwargs' are logged in FromParams 7de1837 add support for global settings file (#39) 4026df1 Update datasets requirement from <1.14,>=1.12 to >=1.12,<1.15 (#38) 7ce3493 add 'include_package' param to config spec
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.2.2
What's new
Added 🎉
- Added new steps to datasets integration: ConcatenateDatasets ("datasets::concatenate") and InterleaveDatasets ("datasets::interleave").
- Added __contains__ and __iter__ methods on DatasetDict so that it is now a Mapping class.
- Added tango info command that - among other things - displays which integrations are installed.
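To illustrate what being a Mapping buys, here is a small sketch using `collections.abc.Mapping`. `SplitDict` is a hypothetical stand-in, not Tango's DatasetDict; in this sketch `__contains__` comes from the Mapping mixin instead of being written by hand.

```python
from collections.abc import Mapping


class SplitDict(Mapping):
    """Hypothetical stand-in for a dictionary of dataset splits."""

    def __init__(self, splits):
        self._splits = dict(splits)

    def __getitem__(self, key):
        return self._splits[key]

    def __iter__(self):
        return iter(self._splits)

    def __len__(self):
        return len(self._splits)


data = SplitDict({"train": [1, 2, 3], "dev": [4]})

has_train = "train" in data   # membership test via __contains__
split_names = list(data)      # iteration via __iter__
sizes = {name: len(rows) for name, rows in data.items()}  # items() is a Mapping mixin
```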
Commits
d47399d add 'tango info' command (#34) 80f8457 add interleave and concatenate dataset steps (#33) 10f56f7 add test for generics from std lib (#32) f77837a make DatasetDict an actual Mapping (#30)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.2.1
What's new
Added 🎉
- Added convert_to_tango_dataset_dict() function in the datasets integration. It's important for step caching purposes to use this to convert a HF DatasetDict to a native Tango DatasetDict when that DatasetDict is part of the input to another step. Otherwise the HF DatasetDict will have to be pickled to determine its hash.
Changed ⚠️
- Format.checksum() is now an abstract method. Subclasses should only compute checksum on the serialized artifact and nothing else in the directory.
- [internals] Changed the relationship between Executor, StepCache, and Step. Executor now owns the StepCache, and Step never interacts with StepCache directly.
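The checksum guidance above, hashing only the serialized artifact and nothing else in the directory, might look roughly like the sketch below. The function `artifact_checksum`, the file name `data.bin`, and the choice of SHA-256 are all assumptions made for illustration; they are not taken from Tango's Format implementations.

```python
import hashlib
import tempfile
from pathlib import Path


def artifact_checksum(directory: Path, artifact_name: str = "data.bin") -> str:
    """Hash only the serialized artifact, ignoring other files in the directory."""
    digest = hashlib.sha256()
    with open(directory / artifact_name, "rb") as f:
        # Read in chunks so large artifacts do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    work_dir = Path(tmp)
    (work_dir / "data.bin").write_bytes(b"serialized result")
    before = artifact_checksum(work_dir)

    # An unrelated file (for example a log) must not change the checksum.
    (work_dir / "run.log").write_text("some log output")
    after = artifact_checksum(work_dir)
```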
Commits
bdc8486 make Format.checksum abstract 5514ce5 Refactor Executor, StepCache, and Step, improve hashing of DatasetDict (#29)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.2.0
What's new
Added 🎉
- Added a Weights & Biases integration with a training callback ("wandb::log") for TorchTrainStep ("torch::train") that logs training and validation metrics to W&B.
Fixed ✅
- Fixed Format.checksum() when there is a symlink to a directory in the cache folder.
Commits
374f1ad Add W&B integration (#28)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.1.3
What's new
Added 🎉
- Added the ability to track a metric other than "loss" for validation in TorchTrainStep ("torch::train").
Fixed ✅
- Final model returned from TorchTrainStep ("torch::train") will have best weights loaded.
- Checkpoints are saved from TorchTrainStep ("torch::train") even when there is no validation loop.
- Fixed TorchTrainStep ("torch::train") when validation_split is None.
- Fixed distributed training with TorchTrainStep ("torch::train") on GPU devices.
Commits
ba05e79 Torch train updates and distributed fixes (#27)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.1.2
What's new
Added 🎉
- Added support for YAML configuration files.
Commits
a2a6e08 relax requirement on PyYAML 1bd44b9 Update datasets requirement from <1.13,>=1.12 to >=1.12,<1.14 (#25) 55114a6 Bump pyyaml from 5.4.1 to 6.0 (#26) bcbe08e add support for YAML configuration files (#24) 69f6544 Fix more typos (#23) c11b029 minor updates (#22) 289a779 Typo 73706bb start on overview section of docs (#21) 7ab2969 list dependencies in example 659a8ac Add exampels to docs (#20) 4bba28f test examples (#19)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.1.1
What's new
Added 🎉
- TorchTrainStep now displays a progress bar while saving a checkpoint to file.
- The default executor now saves an "executor-metadata.json" file to the directory for each step.
Changed ⚠️
- Renamed DirectoryStepCache to LocalStepCache (registered as "local").
- LocalStepCache saves metadata to cache-metadata.json instead of metadata.json.
Fixed ✅
- Fixed bug with TorchTrainStep during distributed training.
- FromParams will automatically convert strings into Path types now when the annotation is Path.
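The string-to-Path conversion can be pictured with a small annotation-driven helper. This is a sketch of the general idea only. `coerce_paths` and `load` are hypothetical names, and FromParams' real logic is not reproduced here.

```python
import inspect
from pathlib import Path


def coerce_paths(func, params: dict) -> dict:
    """Convert string values to Path where the parameter is annotated as Path."""
    signature = inspect.signature(func)
    converted = dict(params)
    for name, value in params.items():
        parameter = signature.parameters.get(name)
        if (
            parameter is not None
            and parameter.annotation is Path
            and isinstance(value, str)
        ):
            converted[name] = Path(value)
    return converted


def load(data_dir: Path, split: str = "train") -> Path:
    return data_dir / split


# Values as they might arrive from a JSON or YAML config: everything is a string.
kwargs = coerce_paths(load, {"data_dir": "/tmp/data", "split": "dev"})
result = load(**kwargs)
```

Only `data_dir` is converted; `split` stays a string because its annotation is `str`.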
Commits
e2d5abb save to cache-metadata.json instead of metadata.json (#18) e5118bd Torch train fixes (#17) f3b5178 Bump furo from 2021.9.22 to 2021.10.9 (#13) 7f525fe fix typo 12e70fa add more sphinx extensions (#16) 010f3fb Rename DirectoryStepCache to SimpleStepCache, clean up docs (#15)
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.1.0
What's new
Added 🎉
- Added StepGraph and Executor abstractions.
- Added a basic PyTorch training step registered as "torch::train", along with other registrable components, such as Model, DataLoader, Sampler, DataCollator, Optimizer, and LRScheduler.
- Added DatasetRemixStep in tango.steps.
- Added module tango.common.sequences.
- Added DatasetDict class in tango.common.dataset_dict.
- Added 🤗 Datasets integration.
- Added command-line options to set log level or disable logging completely.
Changed ⚠️
- Step.work_dir, Step.unique_id, Step.dependencies, and Step.recursive_dependencies are now properties instead of methods.
- tango run command will acquire a lock on the directory to avoid race conditions.
- Integrations can now be installed with pip install tango[INTEGRATION_NAME]. For example, pip install tango[torch].
- Added method Registrable.search_modules() for automatically finding and importing the modules where a given name might be registered.
- FromParams.from_params() and Registrable.resolve_class_name will now call Registrable.search_modules() to automatically import modules where the type might be defined. Thus for classes that are defined and registered within any tango.* submodules it is not necessary to explicitly import them.
Fixed ✅
- Step implementations can now take arbitrary **kwargs in their run() methods.
Commits
3616040 add step graph and executor abstractions, support for distributed training in TorchTrainStep (#14) 0fa7d5e add torch components and simple train Step (#12) b408b8a auto search and import modules in from params (#10) 2c376f4 Update feature_request.md ff4fe11 Add 🤗 Datasets integration (#8) 2210c62 add DatasetRemixStep (#7) 9dffab3 document installing with integrations 5e225cf more work on integrations (#6) f45b98f rename workflow 971906c rename workflow file d33f57f add extras to setup.py (#5) cdc4393 add source code to docs c51f2a0 acquire lock on directory during run
- Python
Published by github-actions[bot] over 4 years ago
tango - v0.0.2
What's new
Added 🎉
- Ported over core tango components from AllenNLP.
Commits
1bb5ed8 Merge pull request #1 from allenai/dependabot/pip/furo-2021.9.22 9d7a7fe Merge pull request #3 from allenai/add-components 21c3c2a merge docs with readme 6d3ba03 add format test e7e1aee fix again d760857 fix for py3.7 6076d17 add step tests 01e258e restore Registrable state after each test case f6aef4c fix typo in README edebbdf add issue and PR templates 18cd9f0 fixes 07ee21e orchestrating -> choreographing 586655a add contributing guide e79f475 fixups 14afd11 prepare docs 0f7f6f6 add tango components from allennlp d65c9bb brand as 'AI2 Tango' 4bf379c ignore more directories in flake8 config daf4e3d add .gitignore cf04957 Merge pull request #2 from allenai/custom-css 514adf8 add custom css
- Python
Published by github-actions[bot] over 4 years ago