Recent Releases of rl4co

rl4co - v0.6.0

v0.6.0 🚀

:tada: RL4CO got accepted as oral in KDD 2025 :tada:

This release adds several features and bugfixes, including:

What's Changed

  • [Feat] add GFACS and replace PyVRP with HGS for CVRP local search by @hyeok9855 in https://github.com/ai4co/rl4co/pull/236
  • [Feat] Implementing GLOP by @Furffico in https://github.com/ai4co/rl4co/pull/253
  • [Bug fixes for #255] Added environment config files for FLP and MCP by @bokveizen in https://github.com/ai4co/rl4co/pull/256
  • [Additional features for #255] Added embedding functions for MCP by @bokveizen in https://github.com/ai4co/rl4co/pull/257
  • Set env, policy and dataset to be ignored in hparams for logging by @ngastzepeda in https://github.com/ai4co/rl4co/pull/262
  • Ensure baseline for SymNCO is always "symnco" when loading from checkpoint by @ngastzepeda in https://github.com/ai4co/rl4co/pull/263
  • Don't artificially limit the rendering axes for CVRP environment by @ngastzepeda in https://github.com/ai4co/rl4co/pull/264
  • [Feat] update MDCPDP env #220 by @fedebotu in https://github.com/ai4co/rl4co/pull/265
  • New citations for KDD 2025

Full Changelog: https://github.com/ai4co/rl4co/compare/v0.5.2...v0.6.0

- Python
Published by fedebotu 9 months ago

rl4co - v0.5.2

v0.5.2 :rocket:

This release introduces several bugfixes and improvements!

What's Changed

  • [BugFix] Taillard instances in example notebook by @LTluttmann in https://github.com/ai4co/rl4co/pull/234
  • [Fix] fix the position of row and col embedding by @Leaveson in https://github.com/ai4co/rl4co/pull/235
  • [Feat] Update build system by @fedebotu in https://github.com/ai4co/rl4co/pull/238
  • [BugFix] solve PDP tour issues #231
  • [Docs] add installation instructions with uv, poetry, conda
  • [Chore] update compatibility with soon-to-be released Pytorch 2.6.0
  • [Minor] Fix logos

New Contributors

  • @Leaveson made their first contribution in https://github.com/ai4co/rl4co/pull/235

Full Changelog: https://github.com/ai4co/rl4co/compare/v0.5.1...v0.5.2

- Python
Published by fedebotu about 1 year ago

rl4co - v0.5.1

v0.5.1 :rocket:

Minor release with several QOL improvements

What's Changed

  • Implement floyd on tmat_class atsp generation by @abcdhhhh in https://github.com/ai4co/rl4co/pull/226
  • Bump actions/download-artifact from 3 to 4.1.7 in /.github/workflows by @dependabot in https://github.com/ai4co/rl4co/pull/210
  • [DOCS] Modify url in example/1-quickstart.ipynb by @falconlee236 in https://github.com/ai4co/rl4co/pull/215
  • TSPLIB and CVRPLIB testing notebooks
  • Automatically enable sampling if num_samples>1
  • Update Spec names according to torchrl>=0.6.0
  • Now actions are automatically returned by default, no need to specify return_actions=True
  • Fix edge cases for SDPA #228
  • PCTSP distribution problem fix
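
The "floyd" item above presumably refers to the Floyd-Warshall shortest-path closure, which turns an arbitrary asymmetric cost matrix into one that satisfies the triangle inequality. The sketch below illustrates that idea in plain Python; it is not the code from the pull request, and `random_cost_matrix` and `floyd_warshall` are hypothetical helper names.

```python
import random


def random_cost_matrix(n, seed=0):
    """Random asymmetric cost matrix with a zero diagonal (hypothetical helper)."""
    rng = random.Random(seed)
    return [[0.0 if i == j else rng.random() for j in range(n)] for i in range(n)]


def floyd_warshall(cost):
    """Return the shortest-path closure of `cost`.

    After the closure, dist[i][j] <= dist[i][k] + dist[k][j] for all i, j, k.
    """
    n = len(cost)
    dist = [row[:] for row in cost]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


closed = floyd_warshall(random_cost_matrix(6))
```

The triple loop costs O(n^3) per instance, so a batched tensor version would be preferable for large-scale data generation.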

New Contributors

  • @falconlee236 made their first contribution in https://github.com/ai4co/rl4co/pull/215
  • @abcdhhhh made their first contribution in https://github.com/ai4co/rl4co/pull/226

Full Changelog: https://github.com/ai4co/rl4co/compare/v0.5.0...v0.5.1

- Python
Published by fedebotu over 1 year ago

rl4co - v0.5.0

Major release: v0.5.0 is finally here! :rocket:

We are proud to finally release our latest version, 0.5.0, after much work done for NeurIPS! (Will our paper finally get accepted? :crossed_fingers:)

Changelog

:sparkles: Features

  • New documentation released at rl4.co!
  • Add SOTA FJSP environment @LTluttmann
  • Add Improvement methods and respective environments MDP @yining043
    • N2S
    • DACT
    • NeuOpt
  • Add HetGNN model for the FJSP @LTluttmann
  • Add L2D model @LTluttmann
  • Add Multi-task VRP (MTVRP) environment
  • Add temperature in NARGNN policies @Furffico
  • Add multiple batch sizes for different dataset
  • Local search support, DeepACO + Local search @hyeok9855
  • Add MTPOMO, MVMoE model @RoyalSkye @FeiLiu36
  • Supporting the meta learning trainer @jieyibi
  • Supporting the improvement training @yining043
  • Add graph problems: MCP and FLP @bokveizen
  • New PPO versions:
    • Stepwise @LTluttmann
    • Improvement @yining043
  • PolyNet support @ahottung
  • Different distributions support + MDPOMO @jieyibi
  • Add initial support for solvers API from RL4CO (MTVRP): PyVRP, OR-Tools, LKH3 @N-Wouda @leonlan
  • Faster logprobs collection: instead of collecting logprobs for unused trajectories, we now gather only the logprobs of the selected nodes by default, which decreases memory consumption
  • Add Codecov to track the tests coverage
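
To make the logprobs item above concrete, here is a minimal stdlib-only sketch of the general idea: at each decoding step, keep only the log-probability of the node that was actually selected rather than the full distribution. The function names and toy values are assumptions for illustration and do not reflect RL4CO's internal API.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def selected_logprobs(step_logits, actions):
    """Keep one log-probability per decoding step: that of the chosen node."""
    return [log_softmax(logits)[a] for logits, a in zip(step_logits, actions)]


# Two decoding steps over three candidate nodes (toy values).
step_logits = [[2.0, 1.0, 0.0], [0.5, 0.5, 3.0]]
actions = [0, 2]

chosen = selected_logprobs(step_logits, actions)
total_logprob = sum(chosen)  # log-likelihood of the whole trajectory
```

Storing one value per step instead of one value per step and node shrinks the stored log-probabilities by a factor equal to the number of nodes.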

:gear: Refactoring

  • [Environment] Supporting generator_params arguments for environments, more modularized and flexible.
  • Modularization of the Attention Model decoder’s QKV calculation for more flexibility @LTluttmann
  • Refactor the MatNet encoder so that the cross attention only needs to be calculated once @LTluttmann

:memo: Documentation

  • New documentation based on MkDocs
    • Fast search
    • Beautiful (we hope you'll like it!) new homepage
    • New API reference, about section
    • Ad-free website
    • Light/Dark mode
    • New about sections (licensing, citation)
    • ...and more!
  • New tutorial on data distributions
  • Miscellaneous: fix Colab links @wouterkool

:bug: Bug Fixes

  • Fix the log_heuristic calculation bug in DeepACO, which improves performance @Furffico @henry-yeh
  • Solve memory leakage during the autoregressive decoding @LTluttmann
  • Python versioning: remove Python 3.8, compatibility with Python 3.12, and poetry support @ShuN6211
  • Compatibility with tensordict>=0.5.0
  • Memory leak in OP and PCTSP
  • Fix A2C bug: optimize all parameters in module instead of only "policy" by default
  • Fix double logging parameters, better logging in Wandb

- Python
Published by fedebotu over 1 year ago

rl4co - v0.4.0

Major release: v0.4.0 is here! 🚀

This release adds several new features and major refactorings in both modeling and environment sides!

Changelog

✨ Features

  • DeepACO + ACO @Furffico @henry-yeh
  • Non-autoregressive (NAR) models and NARGNN @Furffico @henry-yeh
  • Add modular environment data generator with support to new distributions @cbhua
  • New decoding techniques based on the decoding strategy class @LTluttmann
    • Top-p (nucleus sampling)
    • Top-k
    • Select start nodes functions @LTluttmann
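
As a rough illustration of the two sampling schemes listed above, the following stdlib-only sketch filters a probability vector with top-k and top-p (nucleus) rules and renormalizes it. It shows the generic techniques only; `top_k_filter` and `top_p_filter` are hypothetical names, not the decoding strategy classes shipped in the library.

```python
def top_k_filter(probs, k):
    """Zero out all but the k most likely entries, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of most likely entries whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break  # nucleus is complete once the cumulative mass reaches p
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


probs = [0.5, 0.3, 0.15, 0.05]
top2 = top_k_filter(probs, k=2)        # only the two most likely nodes survive
nucleus = top_p_filter(probs, p=0.9)   # the least likely node is dropped
```

An action can then be drawn from the filtered distribution, for example with `random.choices(range(len(probs)), weights=nucleus)`.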

⚙️ Refactoring

  • Major modeling refactoring (summarized here). Now we categorize NCO approaches (which are not necessarily trained with RL!) into the following: 1) constructive (AR and NAR), 2) improvement, 3) transductive. This translates into code, which is now fully customizable. For instance, in constructive methods, now encoders / decoders can be fully replaced or removed in an abstract way!
  • Major environment refactoring (summarized here): we further modularize the environments into components (logic under env, data generation under generator, and so on), with several components moved inside the RL4COEnvBase. Importantly, we introduce data generators that can be customized!
  • Use Abstract classes if class should not be @ngastzepeda

📝 Documentation

  • Hydra documentation and tutorial @LTluttmann
  • New modularized examples under examples/
  • Updated RL4CO structure in ReadTheDocs
  • Move to MIT license with AI4CO for inclusiveness
  • New RL4CO / AI4CO swag. You may also find them here!

🐛 Bug Fixes

  • MatNet and FFSP bugfix @LTluttmann
  • Best solution gathering from POMO @ahottung
  • Tests now passing on MPS; compatibility with TorchRL https://github.com/pytorch/rl/pull/2125
  • Miscellaneous @LTluttmann, @bokveizen, @tycbony

- Python
Published by fedebotu almost 2 years ago

rl4co - v0.3.3

New Routing Envs and more :rocket:

Changelog

:sparkles: Features

  • Add CVRPTW Environment @ngastzepeda
    • Add Solomon instance / solution loader via vrplib
  • Add basic Skill-VRP (SVRP) @ngastzepeda

:page_with_curl: Documentation

  • [Minor] improve decoding strategies documentation

:bug: Bug Fixes

  • Avoid deepcopy bug by not saving intermediate steps of decoding strategy #123
  • Allow passing select_start_nodes_fn and other kwargs in decoding strategies

- Python
Published by fedebotu almost 2 years ago

rl4co - v0.3.2

New Decoding Types and more :rocket:

Changelog

Features

  • Beam Search #109 #110 @LTluttmann
  • Decoding type class #109 #110 @LTluttmann

Documentation

  • Add (simple, API work in progress!) tutorial notebooks for TSPLib and CVRPLib #84
  • Add decoding strategies notebook @LTluttmann + small fix @Haimrich

Optimization

  • torch.no_grad to torch.inference_mode
  • Faster testing

Bug Fixes

  • Batch size initialization @ngastzepeda
  • Bump up naming to align with 0.4.0 release of TorchRL
  • MatNet bug fix #108

- Python
Published by fedebotu about 2 years ago

rl4co - v0.3.1

QoL and BugFixes 🚀

Changelog

  • Better multi start decoding #102
    • Add modular select_start_nodes function for POMO
    • Improve efficiency of multistart function
    • Add testing and selection function for more envs
    • Fix OP selecting too far away nodes in POMO
    • Automatic multistart, no need to manually choose beforehand when running POMO
  • Fix CVRP capacity bug @ngastzepeda #105
  • Add critic init embedding support
  • Fix data generation and add better docs #106
  • Better dataset handling: add dataset choice; use low CPU usage dataset by default
  • Better solution plotting and better quickstart notebook #103
  • Library winter cleanup
  • Miscellaneous minor fixes here and there

- Python
Published by fedebotu about 2 years ago

rl4co - v0.3.0

Faster Library, Python 3.11 and new TorchRL support, Envs, Models, Multiple Dataloaders, and more 🚀

Faster Library, new Python 3.11 and TorchRL

  • Update to latest TorchRL #72, solving several issues such as #95 #97 (also see this)
  • Benchmarking:
    • Up to 20% speedup in training epochs thanks to faster TensorDict and new env updates
    • Almost instant data generation (avoid list comprehension, e.g. from ~20 seconds to <1 second per epoch!)
    • Python 3.11 now available #97

New SMTWTP environment

  • Add new scheduling problem: Single Machine Total Weighted Tardiness Problem environment as in DeepACO @henry-yeh

New MatNet model

  • Add MatNet version for square matrices (faster implementation ideal for routing problems)
  • Should be easy to implement scheduling from here

Multiple Dataloaders

  • Now it is possible to have multiple dataloaders, with naming as well!
    • For example, to track generalization during training

Miscellaneous

  • Fix POMO shapes @hyeok9855, modularizing PPO etc
  • Fix precision bug for PPO
  • New AI4CO transfer!

- Python
Published by fedebotu over 2 years ago

rl4co - v0.2.3

Add FlashAttention2 support ⚡

  • Add FlashAttention2 support as mentioned here
  • Remove old wrapper for half() precision since Lightning already deals with this
  • Fix scaled_dot_product_attention implementation in PyTorch < 2.0
  • Minor fixes

- Python
Published by fedebotu over 2 years ago

rl4co - v0.2.2

QoL: New Baseline, Testing Search Methods, Downloader, Miscellanea 🚀

Changelog

  • Add mean baseline @hyeok9855
  • Add testing for search methods
  • Move downloader to external repo, extra URL as backup for DPP
  • Small bug fix for duplicate args
  • Add more modular data generation
  • Suppress extra warning in automatic_optimization
  • Minor doc cleaning

- Python
Published by fedebotu over 2 years ago

rl4co - v0.2.1

QoL, Better documentation, Bug Fixes 🚀

  • Add RandomPolicy class
  • Control max_steps for debugging purposes during decoding
  • Better documentation, add tutorials, and references #88 @bokveizen
  • Set bound to < Python 3.11 for the time being #90 @hyeok9855
  • Log more info by default in PPO
  • precompute_cache method can now accept td as well
  • If Trainer is supplied with gradient_clip_val and manual_optimization=False, then remove gradient clipping (e.g. for PPO)
  • Fix test data size defaulting to the training data size instead of the test data size

- Python
Published by fedebotu over 2 years ago

rl4co - v0.2.0

Search Methods, Flexible Embeddings, New Graph Encoders and more 🚀

Search methods

  • New flexible and extensible abstract class
  • Active Search (Bello et al, 2016)
  • Efficient Active Search (Hottung et al, 2022)

Flexible embeddings

  • Support for changing any environment embedding (init, context and dynamic)
  • Add new notebook showcasing how to solve new complex problems (example of multi-depot multi-agent pickup and delivery problem - MDPDP)

Support for torch-geometric

  • Added new template graph neural networks (MPNN, GCN)
  • Example Notebook here

Miscellaneous

  • Separate loggers
  • Better imports
  • Bugfix compatibility with Mac
  • Update configs
  • ... and more!

- Python
Published by fedebotu over 2 years ago

rl4co - v0.1.1

Better training, Bug fixes, and more 🚀

  • Better automatic training with DDP #87
  • Bug Fix RL4COTrainer
  • Avoid broadcasting error warning in critic baselines
  • Fix rollout baseline bug
  • New experiment config structure: interpolate with environment name (we no longer need separate folders for each environment name such as TSP, CVRP etc; simply use one config to rule them all!)

- Python
Published by fedebotu over 2 years ago

rl4co - v0.1.0

Major release: refactoring of models, trainer and pipelines, and more! 🚀

  • Refactored the old task class into a base class (RL4COLitModule) that is the base for RL algorithms (such as REINFORCE and PPO), following the discussion in #67
  • New base class for construction methods: now encoder, decoder, policy, and model can be based on common parent classes to make implementation much more modular
  • Added native loading from the checkpoint, which used to be buggy
  • Nice new logo (we like it, but we are obviously biased, so feel free to give us your opinion ;) )
  • Added mPDP environment (and added some WIP for EquityTransformer)
  • New RL4COTrainer that automatically includes training tricks for RL
  • Added Codecov coverage
  • Better testing: now we thoroughly test most of the library, including training (the Hydra part as well!)
  • Documentation overhaul: add Sphinx plugins for modularized, automatic docs
  • ... and more!

- Python
Published by fedebotu over 2 years ago

rl4co - v0.0.6

Better handling of notebooks, refactoring, plots and more!

Changelog of this release:

  • Add notebook with checkpointing, logging, testing and more, plus related bugfixes + feats #83
  • Refactor env embeddings into init, context and dynamic
  • OP plotting
  • PCTSP plotting
  • Quickfix Lightning problem: https://github.com/Lightning-AI/lightning/pull/18022
  • Quickfix docs
  • Misc

Full Changelog: https://github.com/kaist-silab/rl4co/compare/v0.0.5...v0.0.6

- Python
Published by fedebotu over 2 years ago

rl4co - v0.0.5

Changelog

  • Fix SDVRP dynamic embedding #82
  • Add missing environments in testing
  • Update quickstart notebook to be more informative
  • Remove rendering titles to avoid cluttering
  • Other minor misc. updates

- Python
Published by fedebotu over 2 years ago

rl4co - v0.0.4

Documentation, environment refactoring, SPCTSP and more!

  • Add initial documentation on ReadTheDocs #80
  • Major refactoring of environments: new subclasses (get_action_mask, check_solution_validity) and more modular operations such as get_tour_length, move the base class and utils under common/
  • New SPCTSP environment
  • Fix SDVRP, refactor as subclass of CVRP
  • Fix OP with major refactoring; typos @eltociear
  • Add Slack chat links
  • Add paper link and citation
  • Move dev status to Beta (it was production stable - we wish it was! Perhaps in the future... 🤞🏼)
  • Misc bug fixes

- Python
Published by fedebotu over 2 years ago

rl4co - v0.0.3

Bug Fixes and more!

Changelog:

  • Solve #71 (pip install from PyPI now works!)
  • Fix TSP rendering
  • Add pre-commit-config (we will make a contribution guide in the near future)
  • Linting action with Black+Ruff combo, handled by default by the pre-commit
  • Add working Colab notebook with full training and testing of AM
  • Add badges
  • Misc

- Python
Published by fedebotu over 2 years ago

rl4co - v0.0.2

NOTE: please do not use this version, as it only contains the __init__.py file

- Python
Published by fedebotu over 2 years ago