https://github.com/bytedance/flowrl

Official implementation of "Flow Based Policy for Online Reinforcement Learning"

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
✓ .zenodo.json file (found .zenodo.json file)
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 10.7%, to scientific vocabulary)

Keywords

research

Last synced: 10 months ago

Repository

Official implementation of "Flow Based Policy for Online Reinforcement Learning"

Basic Info

Host: GitHub
Owner: bytedance
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://github.com/bytedance/FlowRL
Size: 81.1 KB

Statistics

Stars: 20
Watchers: 0
Forks: 0
Open Issues: 1
Releases: 0

Topics

research

Created about 1 year ago · Last pushed 10 months ago

Metadata Files

Readme License

README.md

👋 Hi, everyone!
We are ByteDance Seed team.


Flow-based Policy for Online Reinforcement Learning

We are delighted to introduce FlowRL, a new approach to online reinforcement learning that combines a flow-based policy representation with Wasserstein-2-regularized optimization. The result is a promising framework for using generative policies in reinforcement learning.

News

[2025/06/10] 🔥 We have released the PyTorch version of the code.

Introduction

FlowRL is an actor-critic framework that represents the policy with a flow-based model and trains it with Wasserstein-2-regularized optimization. By implicitly constraining the current policy toward the optimal behavior policy via the Wasserstein-2 (W2) distance, FlowRL achieves what the authors report as superior performance on challenging benchmarks such as DMControl (Dog and Humanoid domains) and HumanoidBench.
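To make the two ingredients named above concrete, here is a minimal, dependency-free sketch. It is not the FlowRL implementation: `toy_velocity` is a hypothetical stand-in for a learned velocity network, the Euler integrator and step count are assumptions, and `squared_w2_1d` only illustrates the W2 distance on one-dimensional samples (where the optimal coupling pairs sorted values), not how FlowRL applies its regularizer.

```python
import math
import random


def toy_velocity(action, t, state):
    """Hypothetical stand-in for a learned velocity field v(a, t | s).

    It is the straight-line flow that carries noise z ~ N(0, 1) at t=0
    to the action mean + std * z at t=1, with mean = tanh(state).
    """
    mean, std = math.tanh(state), 0.5  # illustrative values only
    return mean + (std - 1.0) * (action - t * mean) / (1.0 - t + t * std)


def sample_action(state, noise, num_steps=10, velocity=toy_velocity):
    """Sample an action by Euler-integrating da/dt = v(a, t | s) from t=0 to t=1."""
    action = noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        action += dt * velocity(action, t, state)
    return action


def squared_w2_1d(samples_a, samples_b):
    """Squared Wasserstein-2 distance between two equal-size 1-D sample sets."""
    if len(samples_a) != len(samples_b):
        raise ValueError("sample sets must have the same size")
    pairs = zip(sorted(samples_a), sorted(samples_b))
    return sum((a - b) ** 2 for a, b in pairs) / len(samples_a)


rng = random.Random(0)
noises = [rng.gauss(0.0, 1.0) for _ in range(1000)]
actions_s0 = [sample_action(0.0, z) for z in noises]  # policy at state 0.0
actions_s1 = [sample_action(1.0, z) for z in noises]  # policy at state 1.0
print("squared W2 between the two action distributions:",
      squared_w2_1d(actions_s0, actions_s1))
```

Because the toy field follows straight lines, Euler integration reproduces `mean + std * noise` up to floating-point error; a learned field would generally need more care in choosing the number of steps.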

Getting Started

- Setup Conda Environment: Create an environment with
    ```bash
    conda create -n flowrl python=3.11
    ```
- Clone this Repository:
    ```bash
    git clone https://github.com/bytedance/FlowRL.git
    cd FlowRL
    ```
- Install FlowRL Dependencies:
    ```bash
    pip install -r requirements.txt
    ```
- Training Examples:
    - Run a single training instance:
    ```bash
    python3 main.py --domain dog --task run
    ```
    - Run parallel training:
    ```bash
    bash scripts/train_parallel.sh
    ```
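If you want to queue several runs from Python instead of the shell, the single-run command above can be generated per domain and task. The sketch below is an illustration only: it does not reproduce `scripts/train_parallel.sh`, it uses only the `--domain` and `--task` flags shown above, and every pair other than `dog`/`run` is an assumed example that `main.py` may or may not accept.

```python
import shlex


def build_command(domain, task, python="python3"):
    """Build the argument list for one training run (flags taken from the README)."""
    return [python, "main.py", "--domain", domain, "--task", task]


# ("dog", "run") comes from the README; the other pairs are assumed examples.
jobs = [("dog", "run"), ("dog", "walk"), ("humanoid", "walk")]

for domain, task in jobs:
    # Print each command so it can be reviewed before anything is launched.
    print(shlex.join(build_command(domain, task)))
```

To actually launch a run, pass a command list to `subprocess.run(cmd, check=True)` from the repository root.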

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

TODO

[ ] Release JAX version source code

Citation

If you find FlowRL useful for your research and applications, please consider giving us a star ⭐ or cite us using:

    @article{lv2025flow,
      title={Flow-Based Policy for Online Reinforcement Learning},
      author={Lv, Lei and Li, Yunfei and Luo, Yu and Sun, Fuchun and Kong, Tao and Xu, Jiafeng and Ma, Xiao},
      journal={arXiv preprint arXiv:2506.12811},
      year={2025}
    }

About ByteDance Seed Team

Founded in 2023, ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society.

Owner

Name: Bytedance Inc.
Login: bytedance
Kind: organization
Location: Singapore

Website: https://opensource.bytedance.com
Twitter: ByteDanceOSS
Repositories: 255
Profile: https://github.com/bytedance

GitHub Events

Total

Issues event: 1
Watch event: 16
Push event: 1
Public event: 1

Last Year

Issues event: 1
Watch event: 16
Push event: 1
Public event: 1

Dependencies

requirements.txt (pypi)

dm_control *
gymnasium *
imageio *
mujoco *
tensorboard *
torch *
torchvision *
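
After installing, a quick way to confirm the environment is complete is to check that each listed package can be found by Python. This is a small sketch, assuming each package's import name matches its name in the list above; it only locates the modules and does not import them or check versions.

```python
from importlib.util import find_spec

# Import names assumed to match the package names listed in requirements.txt.
REQUIRED_MODULES = [
    "dm_control",
    "gymnasium",
    "imageio",
    "mujoco",
    "tensorboard",
    "torch",
    "torchvision",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
print("missing:", ", ".join(missing) if missing else "none")
```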
