p3achygo

(Yet Another) AlphaZero-based Go Engine

https://github.com/p3achyjr/p3achygo

Science Score: 28.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, nature.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (5.1%) to scientific vocabulary

Keywords

alphago alphazero machine-learning reinforcement-learning
Last synced: 9 months ago

Repository


Basic Info
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme Citation

README.md

p3achyjr's Go Bot :)

Visit p3achyjr.github.io/p3achygo-page for more details about methods, implementation, and current status.

Building and Running

Assuming you are inside the Docker container [docs tbd], run the following commands.

mkdir /tmp/p3achygo
mkdir /tmp/shuffler
./sh/build_all_container.sh

To run a single process that iteratively runs self-play, trains, and runs eval, do

python -m python.rl_loop.train_sp_eval --sp_bin_path=/app/bazel-bin/cc/selfplay/main --eval_bin_path=/app/bazel-bin/cc/eval/main --run_id=${RUN_ID} 2>&1 | tee /tmp/sp_log.txt
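The loop this command drives can be pictured roughly as follows. This is a minimal sketch, not the repository's `train_sp_eval` implementation: the callables `run_selfplay`, `train`, and `run_eval` are hypothetical stand-ins for the real binaries, and the rule that a candidate replaces the current model only when eval accepts it is an assumption the README does not state.

```python
from typing import Callable, List, Tuple


def rl_loop(
    run_selfplay: Callable[[str], List[str]],
    train: Callable[[str, List[str]], str],
    run_eval: Callable[[str, str], bool],
    initial_model: str,
    num_generations: int,
) -> Tuple[str, List[str]]:
    """Iterate self-play -> train -> eval for a fixed number of generations.

    All three stage callables are hypothetical placeholders; models are
    represented as plain strings (for example, paths) for illustration.
    """
    current = initial_model
    history = [current]
    for _ in range(num_generations):
        games = run_selfplay(current)      # generate games with the current model
        candidate = train(current, games)  # fit a candidate model on those games
        if run_eval(current, candidate):   # assumed gating: promote only if eval accepts
            current = candidate
        history.append(current)            # record which model is current after this generation
    return current, history
```

Passing the stages in as callables keeps the sketch runnable without the compiled binaries; in practice each one would wrap a subprocess call or a training routine.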

To run the shuffler, do

python -m python.rl_loop.shuffle --bin_path=/app/bazel-bin/cc/shuffler/main --run_id=${RUN_ID} --local_run_dir=/tmp/shuffler

Alternatively, you can run the CC binaries themselves. For eval, do

./bazel-bin/cc/eval/main --cur_model_path=${CUR_MODEL_PATH} --cand_model_path=${CAND_MODEL_PATH} --num_games=${NUM_GAMES} --cache_size=${CACHE_SIZE} --cur_n=${CUR_N} --cur_k=${CUR_K} --cand_n=${CAND_N} --cand_k=${CAND_K}
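If you launch eval from a script, it can help to assemble the argument list in one place. The sketch below uses only the binary path and flag names from the command above. The helper `build_eval_command` is hypothetical, the parameter values are made-up placeholders, and requiring all eight flags is an assumption (the README does not say which are optional).

```python
import shlex
from typing import Dict, List, Union

# Binary path and flag names are copied from the README command.
EVAL_BIN = "./bazel-bin/cc/eval/main"
EVAL_FLAGS = (
    "cur_model_path", "cand_model_path", "num_games", "cache_size",
    "cur_n", "cur_k", "cand_n", "cand_k",
)


def build_eval_command(params: Dict[str, Union[str, int]]) -> List[str]:
    """Return the argv list for the eval binary.

    Assumption: every flag in EVAL_FLAGS must be supplied, and no others.
    """
    missing = [f for f in EVAL_FLAGS if f not in params]
    unknown = [k for k in params if k not in EVAL_FLAGS]
    if missing or unknown:
        raise ValueError(f"missing flags: {missing}; unknown flags: {unknown}")
    # Emit flags in a fixed order so the command is reproducible in logs.
    return [EVAL_BIN] + [f"--{flag}={params[flag]}" for flag in EVAL_FLAGS]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    argv = build_eval_command({
        "cur_model_path": "/tmp/cur_model",
        "cand_model_path": "/tmp/cand_model",
        "num_games": 10,
        "cache_size": 1000,
        "cur_n": 64,
        "cur_k": 8,
        "cand_n": 64,
        "cand_k": 8,
    })
    print(shlex.join(argv))  # shell-quoted form, suitable for logging
```

The returned list can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting issues in model paths.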

Resources Consulted:

AlphaGo Fan Paper

AlphaGo Zero Paper

KataGo Paper

Gumbel Policy Scheme for AlphaZero/MuZero

Owner

  • Name: Anatol Liu
  • Login: p3achyjr
  • Kind: user
  • Location: United States

Citation (citations.md)

[David J. Wu, Accelerating Self-Play Learning in Go](https://arxiv.org/pdf/1902.10565.pdf)

[David Silver et al., Mastering the game of Go without human knowledge](https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ)

[Ivo Danihelka et al., Policy Improvement By Planning with Gumbel](https://openreview.net/pdf?id=bERaNdoegnO)

[Brian Lee et al., Minigo: A Case Study in Reproducing Reinforcement Learning Research](https://openreview.net/pdf?id=H1eerhIpLV)

[Alexandre Trudeau, Michael Bowling, Targeted Search Control in AlphaZero for Effective Policy Improvement](https://arxiv.org/pdf/2302.12359.pdf)

Not exhaustive.

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3