cogment-verse

Research platform for Human-in-the-loop learning (HILL) & Multi-Agent Reinforcement Learning (MARL)

https://github.com/cogment/cogment-verse

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (15.4%) to scientific vocabulary

Keywords

cogment human-in-the-loop-learning reinforcement-learning rlhf


Repository

Basic Info

Host: GitHub
Owner: cogment
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://cogment.ai/cogment_verse
Size: 19.7 MB

Statistics

Stars: 81
Watchers: 9
Forks: 17
Open Issues: 63
Releases: 0

Created over 4 years ago · Last pushed over 2 years ago

Metadata Files

Readme Changelog Contributing License Code of conduct Citation

Cogment Verse

Cogment Verse is an SDK that helps researchers and developers in the fields of human-in-the-loop learning (HILL) and multi-agent reinforcement learning (MARL) train and validate their agents at scale. Cogment Verse instantiates the open-source Cogment platform for environments following the OpenAI Gym mold, making it easy to get started.

Simply clone the repo and start training.

Documentation table of contents

Getting started
Tutorials
- Simple Behavioral Cloning
Develop
Deploy
- Tunnel using ngrok
Experimental results 🚧
- A2C
- REINFORCE
Changelog
Contributors guide
Community code of conduct

Getting started

The following steps show how to set up Cogment Verse locally. It is also possible to use a Docker-based setup instead. Instructions for this can be found here

Clone this repository
Install Python 3.9
Depending on your specific machine, you might also need the following dependencies:

swig, which is required for the Box2d gym environments. It can be installed using apt-get install swig on Ubuntu or brew install swig on macOS.
python3-opencv, which is required on Ubuntu systems. It can be installed using apt-get install python3-opencv.
libosmesa6-dev and patchelf, which are required to run the environment libraries using mujoco. They can be installed using apt-get install libosmesa6-dev patchelf.
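
Before installing, it can help to see which of these tools are already present. The following is a minimal sketch using only the Python standard library. It only looks for the `swig` and `patchelf` executables on the PATH and checks the interpreter version; it does not detect library packages such as python3-opencv or libosmesa6-dev. The name `check_prerequisites` is illustrative and not part of Cogment Verse.

```python
import shutil
import sys

# Executables named in the list above. Library packages such as
# libosmesa6-dev cannot be detected this way and are not checked.
REQUIRED_TOOLS = ("swig", "patchelf")


def check_prerequisites(tools=REQUIRED_TOOLS):
    """Return a dict mapping each check to True (found) or False (missing)."""
    results = {tool: shutil.which(tool) is not None for tool in tools}
    # The getting started steps ask for Python 3.9.
    results["python3.9"] = sys.version_info[:2] == (3, 9)
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing entry is not necessarily a problem: swig is only needed for the Box2d environments and patchelf only for the mujoco-based ones.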

Create and activate a virtual environment

    $ python -m venv .venv
    $ source .venv/bin/activate

Install the python dependencies.

    $ pip install -r requirements.txt

Depending on the environment you want to use, you might need to take additional steps.

In another terminal, launch an mlflow server on port 3000

    $ source .venv/bin/activate
    $ python -m simple_mlflow

Start the default Cogment Verse run using python -m main
Open Chrome (other web browsers might work but have not been tested) and navigate to http://localhost:8080/
Play the game!

That's the basic setup for Cogment Verse. You are now ready to train AI agents.
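
If the web page does not load, a quick way to narrow things down is to check whether anything is listening on the two ports mentioned above. The sketch below uses only the standard library and assumes both services run on localhost; `is_port_open` and `SERVICES` are illustrative names, not part of Cogment Verse.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Ports taken from the steps above: mlflow on 3000, web client on 8080.
SERVICES = {"mlflow": 3000, "web client": 8080}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

An open port only shows that some process accepted the connection; it does not confirm that the process is the expected service.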

Configuration

Cogment Verse relies on hydra for configuration. This makes it possible to define and compose configurations directly from YAML files and the command line.

The configuration files are located in the config directory, with defaults defined in config/config.yaml.
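
In Hydra's override syntax, `group=option` selects an option for a config group that already has a default, while a leading `+` adds an entry that is not in the defaults. As a rough sketch of how such command lines are put together, the helper below builds the argument list for a `python -m main` run. `build_command` is a hypothetical helper written for this illustration and is not provided by Cogment Verse or Hydra.

```python
import shlex
import sys


def build_command(experiment=None, overrides=None, python=sys.executable):
    """Build the argument list for a `python -m main` run.

    experiment: name of an experiment config, passed as `+experiment=<name>`.
    overrides: mapping of further Hydra overrides, for example
        {"services/environment": "lunar_lander"}.
    """
    args = [python, "-m", "main"]
    if experiment is not None:
        args.append(f"+experiment={experiment}")
    for key, value in (overrides or {}).items():
        args.append(f"{key}={value}")
    return args


cmd = build_command(
    experiment="simple_bc/mountain_car",
    overrides={"services/environment": "lunar_lander"},
    python="python",
)
print(shlex.join(cmd))
# prints: python -m main +experiment=simple_bc/mountain_car services/environment=lunar_lander
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the repository root, with the virtual environment activated.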

Here are a few examples:

Launch a Simple Behavior Cloning run with the Mountain Car Gym environment (which is the default environment)

    $ python -m main +experiment=simple_bc/mountain_car

Launch a Simple Behavior Cloning run with the Lunar Lander Gym environment

    $ python -m main +experiment=simple_bc/mountain_car services/environment=lunar_lander

Launch and play a single trial of the Lunar Lander Gym environment with continuous controls

    $ python -m main services/environment=lunar_lander_continuous

Launch an A2C training run with the Cartpole Gym environment

    $ python -m main +experiment=simple_a2c/cartpole

This one is completely headless (training doesn't involve interaction with a human player). It will take a little while to run; you can monitor the progress using mlflow at http://localhost:3000

Launch a DQN self-training run with the Connect Four PettingZoo environment

    $ python -m main +experiment=simple_dqn/connect_four

The same experiment can be launched with a ratio of human-in-the-loop training trials (which are playable in the web client)

    $ python -m main +experiment=simple_dqn/connect_four +run.hill_training_trials_ratio=0.05
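
To get a feel for what a ratio of 0.05 means, the arithmetic can be sketched as follows. This only computes the expected share of trials; how Cogment Verse actually picks which trials involve a human is not described here. The function name and the figure of 1000 trials are made up for the illustration.

```python
def expected_hill_trials(total_trials, hill_ratio):
    """Expected number of trials involving a human, for a given ratio."""
    if not 0.0 <= hill_ratio <= 1.0:
        raise ValueError("hill_ratio must be between 0 and 1")
    return round(total_trials * hill_ratio)


# With the ratio from the command above, about 1 trial in 20 is playable.
print(expected_hill_trials(1000, 0.05))  # prints 50
```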

PettingZoo's Atari Pong Environment

Example #1: Play against an RL agent

    $ python -m main +experiment=ppo_atari_pz/play_pong_pz

Example #2: Observe RL agents playing against each other

    $ python -m main +experiment=ppo_atari_pz/observe_play_pong_pz

Example #3: Training with human demonstrations

    $ python -m main +experiment=ppo_atari_pz/hill_pong_pz

Example #4: Training with human feedback

    $ python -m main +experiment=ppo_atari_pz/hfb_pong_pz

Example #5: Self-training

    $ python -m main +experiment=ppo_atari_pz/pong_pz

NOTE: Examples #3 and #4 require users to open Chrome and navigate to http://localhost:8080 in order to provide either demonstrations or feedback.

List of publications and submissions using Cogment and/or Cogment Verse

Analyzing and Overcoming Degradation in Warm-Start Off-Policy Reinforcement Learning code
Multi-Teacher Curriculum Design for Sparse Reward Environments code

(please open a pull request to add missing entries)

Owner

Name: Cogment
Login: cogment
Kind: organization

Website: https://cogment.ai
Repositories: 6
Profile: https://github.com/cogment

Citation (CITATION.cff)

cff-version: "1.2.0"
message: "If you use Cogment Verse in your research, please cite the article from `preferred-citation`."
title: "Cogment Verse"
authors:
  - family-names: AI Redefined
    website: https://ai-r.com
license: Apache-2.0
url: https://cogment.ai/cogment_verse
repository: https://github.com/cogment/cogment-verse
preferred-citation:
  type: "conference-paper"
  title: "Hiking up that HILL with Cogment-Verse: Train & Operate Multi-agent Systems Learning from Humans"
  booktitle: "Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems"
  year: 2023
  authors:
    - affiliation: AI Redefined
      family-names: Gottipati
      given-names: Sai Krishna
    - affiliation: AI Redefined
      family-names: Nguyen
      given-names: Luong-Ha
    - affiliation: AI Redefined
      family-names: Mars
      given-names: "Clodéric"
    - affiliation: AI Redefined
      family-names: Taylor
      given-names: Matthew E.

GitHub Events

Total

Watch event: 4
Fork event: 3

Last Year

Watch event: 4
Fork event: 3

Dependencies

requirements.txt pypi

black *
mlflow ==1.26.1
pylint *

cogment_verse/web/web_app/package-lock.json npm

1235 dependencies

cogment_verse/web/web_app/package.json npm

@types/react ^17.0.21 development
@types/react-dom ^17.0.9 development
autoprefixer ^10.4.4 development
postcss ^8.4.12 development
tailwindcss ^3.0.24 development
@cogment/cogment-js-sdk ^2.0.1
@fortawesome/fontawesome-svg-core ^6.2.1
@fortawesome/free-solid-svg-icons ^6.2.1
@fortawesome/react-fontawesome ^0.2.0
@types/google-protobuf ^3.15.5
classnames ^2.3.1
google-protobuf ^3.18.0-rc.2
grpc-tools ^1.11.2
jsdoc ^3.6.11
prettier ^2.4.1
protobufjs github:protobufjs/protobuf.js#d13d5d5688052e366aa2e9169f50dfca376b32cf
react ^17.0.2
react-countdown-circle-timer ^3.0.8
react-dom ^17.0.2
react-router-dom ^6.4.3
react-scripts ^5.0.1
tmp ^0.2.1
typescript ^4.4.3
uglify-js ^3.17.4
web-vitals ^1.1.2

isaac_requirements.txt pypi

pyvirtualdisplay *
rl-games ==1.5.2

Dockerfile docker

$BASE_IMAGE latest build

environments/gym/web/package-lock.json npm

@esbuild/android-arm 0.18.0 development
@esbuild/android-arm64 0.18.0 development
@esbuild/android-x64 0.18.0 development
@esbuild/darwin-arm64 0.18.0 development
@esbuild/darwin-x64 0.18.0 development
@esbuild/freebsd-arm64 0.18.0 development
@esbuild/freebsd-x64 0.18.0 development
@esbuild/linux-arm 0.18.0 development
@esbuild/linux-arm64 0.18.0 development
@esbuild/linux-ia32 0.18.0 development
@esbuild/linux-loong64 0.18.0 development
@esbuild/linux-mips64el 0.18.0 development
@esbuild/linux-ppc64 0.18.0 development
@esbuild/linux-riscv64 0.18.0 development
@esbuild/linux-s390x 0.18.0 development
@esbuild/linux-x64 0.18.0 development
@esbuild/netbsd-x64 0.18.0 development
@esbuild/openbsd-x64 0.18.0 development
@esbuild/sunos-x64 0.18.0 development
@esbuild/win32-arm64 0.18.0 development
@esbuild/win32-ia32 0.18.0 development
@esbuild/win32-x64 0.18.0 development
esbuild 0.18.0 development

environments/gym/web/package.json npm

esbuild 0.18.0 development

environments/overcooked/web/package-lock.json npm

@esbuild/android-arm 0.18.0 development
@esbuild/android-arm64 0.18.0 development
@esbuild/android-x64 0.18.0 development
@esbuild/darwin-arm64 0.18.0 development
@esbuild/darwin-x64 0.18.0 development
@esbuild/freebsd-arm64 0.18.0 development
@esbuild/freebsd-x64 0.18.0 development
@esbuild/linux-arm 0.18.0 development
@esbuild/linux-arm64 0.18.0 development
@esbuild/linux-ia32 0.18.0 development
@esbuild/linux-loong64 0.18.0 development
@esbuild/linux-mips64el 0.18.0 development
@esbuild/linux-ppc64 0.18.0 development
@esbuild/linux-riscv64 0.18.0 development
@esbuild/linux-s390x 0.18.0 development
@esbuild/linux-x64 0.18.0 development
@esbuild/netbsd-x64 0.18.0 development
@esbuild/openbsd-x64 0.18.0 development
@esbuild/sunos-x64 0.18.0 development
@esbuild/win32-arm64 0.18.0 development
@esbuild/win32-ia32 0.18.0 development
@esbuild/win32-x64 0.18.0 development
esbuild 0.18.0 development
js-tokens 4.0.0
loose-envify 1.4.0
react 18.2.0
react-countdown-circle-timer 3.2.1

environments/overcooked/web/package.json npm

esbuild 0.18.0 development
react-countdown-circle-timer ^3.2.1

environments/pettingzoo/web/package-lock.json npm

@esbuild/linux-x64 0.18.0 development
esbuild 0.18.0 development
@fortawesome/fontawesome-common-types 6.4.0
@fortawesome/fontawesome-svg-core 6.4.0
@fortawesome/free-solid-svg-icons 6.4.0
@fortawesome/react-fontawesome 0.2.0
js-tokens 4.0.0
loose-envify 1.4.0
object-assign 4.1.1
prop-types 15.8.1
react 18.2.0
react-countdown-circle-timer 3.2.1
react-is 16.13.1

environments/pettingzoo/web/package.json npm

esbuild 0.18.0 development
@fortawesome/free-solid-svg-icons ^6.4.0
@fortawesome/react-fontawesome ^0.2.0
react-countdown-circle-timer ^3.2.1

pyproject.toml pypi
