p3achygo

(Yet Another) AlphaZero-based Go Engine

https://github.com/p3achyjr/p3achygo

Science Score: 28.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org, nature.com
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (5.1%) to scientific vocabulary

Keywords

alphago alphazero machine-learning reinforcement-learning

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: p3achyjr
Language: C++
Default Branch: main
Homepage: https://p3achyjr.github.io/p3achygo-page/
Size: 8.63 MB

Statistics

Stars: 5
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 3 years ago · Last pushed about 2 years ago

Metadata Files

Readme Citation

p3achyjr's Go Bot :)

Visit p3achyjr.github.io/p3achygo-page for more details about methods, implementation, and current status.

Building and Running

Assuming you are inside the Docker container [docs tbd], run the following commands:

mkdir /tmp/p3achygo
mkdir /tmp/shuffler
./sh/build_all_container.sh
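Plain mkdir exits with an error if a directory already exists, so on repeated setups it can help to create the two scratch directories idempotently. A minimal sketch, assuming Python is available in the container; ensure_run_dirs and its root parameter are hypothetical names introduced here for illustration and are not part of the repository:

```python
from pathlib import Path

# Directory names taken from the mkdir commands above, relative to the root.
RUN_DIRS = ("p3achygo", "shuffler")


def ensure_run_dirs(root="/tmp"):
    """Create the scratch directories under `root` if they are missing.

    Hypothetical helper, not part of the repository.
    """
    created = []
    for name in RUN_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling ensure_run_dirs() with the default root produces the same two directories as the mkdir commands and can be repeated safely.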

To run a single process that iterates through self-play, training, and evaluation, do:

python -m python.rl_loop.train_sp_eval --sp_bin_path=/app/bazel-bin/cc/selfplay/main --eval_bin_path=/app/bazel-bin/cc/eval/main --run_id=${RUN_ID} 2>&1 | tee /tmp/sp_log.txt
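If the loop needs to be launched from a script rather than a shell, the command above can be assembled as an argument list. The sketch below is an illustration only: build_rl_loop_cmd and run_rl_loop are hypothetical helpers, the module name, flags, and paths are copied from the command above, and the log handling approximates what `2>&1 | tee` does.

```python
import subprocess

# Binary paths as given in the command above.
SP_BIN = "/app/bazel-bin/cc/selfplay/main"
EVAL_BIN = "/app/bazel-bin/cc/eval/main"


def build_rl_loop_cmd(run_id, sp_bin=SP_BIN, eval_bin=EVAL_BIN):
    """Return the argv list for the combined self-play/train/eval loop."""
    return [
        "python", "-m", "python.rl_loop.train_sp_eval",
        f"--sp_bin_path={sp_bin}",
        f"--eval_bin_path={eval_bin}",
        f"--run_id={run_id}",
    ]


def run_rl_loop(run_id, log_path="/tmp/sp_log.txt"):
    """Run the loop, echoing merged stdout/stderr and copying it to log_path."""
    cmd = build_rl_loop_cmd(run_id)
    with open(log_path, "w") as log, subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as proc:
        for line in proc.stdout:
            print(line, end="")  # echo to the terminal
            log.write(line)      # and keep a copy, like tee
    return proc.returncode
```

Calling run_rl_loop("my_run") from the repository root inside the container would then mirror the shell command; "my_run" is a placeholder run id.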

To run the shuffler, do:

python -m python.rl_loop.shuffle --bin_path=/app/bazel-bin/cc/shuffler/main --run_id=${RUN_ID} --local_run_dir=/tmp/shuffler

Alternatively, you can run the C++ binaries themselves. For eval, do:

./bazel-bin/cc/eval/main --cur_model_path=${CUR_MODEL_PATH} --cand_model_path=${CAND_MODEL_PATH} --num_games=${NUM_GAMES} --cache_size=${CACHE_SIZE} --cur_n=${CUR_N} --cur_k=${CUR_K} --cand_n=${CAND_N} --cand_k=${CAND_K}
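The eval invocation takes eight flags, so it can be convenient to collect them in one structure and generate the argument list from it. This is a sketch under stated assumptions: EvalArgs and build_eval_cmd are hypothetical names, the integer types for the numeric flags are a guess, and the meaning of each flag is not documented here. Only the flag names and the binary path come from the command above.

```python
from dataclasses import dataclass, fields


@dataclass
class EvalArgs:
    # Field names mirror the flags in the command above. Types are assumed;
    # all values must be supplied by the caller.
    cur_model_path: str
    cand_model_path: str
    num_games: int
    cache_size: int
    cur_n: int
    cur_k: int
    cand_n: int
    cand_k: int


def build_eval_cmd(args, bin_path="./bazel-bin/cc/eval/main"):
    """Return the argv list for the eval binary, one --flag=value per field."""
    flags = [f"--{f.name}={getattr(args, f.name)}" for f in fields(args)]
    return [bin_path] + flags
```

The resulting list can be passed directly to subprocess.run; generating the flags from the dataclass fields keeps the flag names and the values in one place.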

Resources Consulted:

AlphaGo Fan Paper

AlphaGo Zero Paper

KataGo Paper

Gumbel Policy Scheme for AlphaZero/MuZero

Owner

Name: Anatol Liu
Login: p3achyjr
Kind: user
Location: United States

Repositories: 41
Profile: https://github.com/p3achyjr

Citation (citations.md)

[David J. Wu, Accelerating Self-Play Learning in Go](https://arxiv.org/pdf/1902.10565.pdf)

[David Silver et al., Mastering the game of Go without human knowledge](https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ)

[Ivo Danihelka et al., Policy Improvement By Planning with Gumbel](https://openreview.net/pdf?id=bERaNdoegnO)

[Brian Lee et al., Minigo: A Case Study in Reproducing Reinforcement Learning Research](https://openreview.net/pdf?id=H1eerhIpLV)

[Alexander Trudeau, Michael Bowling, Target Search Control in AlphaZero for Effective Policy Improvement](https://arxiv.org/pdf/2302.12359.pdf)

Not exhaustive.
