merlin

Machine Learning for HPC Workflows

https://github.com/llnl/merlin

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 7 of 14 committers (50.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (16.2%) to scientific vocabulary

Keywords

big-data celery-workers hpc machine-learning radiuss redis-server simulation workflow workflows

Keywords from Contributors

interactive serializer packaging numeric build-tool network-simulation hacking autograding observability embedded

Last synced: 6 months ago

Repository

Machine Learning for HPC Workflows

Basic Info

Host: GitHub
Owner: LLNL
License: mit
Language: Python
Default Branch: develop
Homepage:
Size: 7.41 MB

Statistics

Stars: 140
Watchers: 11
Forks: 30
Open Issues: 29
Releases: 41

Topics

big-data celery-workers hpc machine-learning radiuss redis-server simulation workflow workflows

Created over 6 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog Contributing License Code of conduct

A brief introduction to Merlin

Merlin is a tool for running machine-learning-based workflows. Its goal is to make it easy to build, run, and process the kinds of large-scale HPC workflows needed for cognitive simulation.

At its heart, Merlin is a distributed task queuing system, designed to allow complex HPC workflows to scale to large numbers of simulations (we've done 100 million on the Sierra supercomputer).

Why would you want to run that many simulations? To become your own Big Data generator.

Data sets of this size can be large enough to train deep neural networks that can mimic your HPC application, to be used for such things as design optimization, uncertainty quantification and statistical experimental inference. Merlin's been used to study inertial confinement fusion, extreme ultraviolet light generation, structural mechanics and atomic physics, to name a few.

How does it work?

In essence, Merlin coordinates complex workflows through a persistent external queue server that lives outside of your HPC systems, but that can talk to nodes on your cluster(s). As jobs spin up across your ecosystem, workers on those allocations pull work from a central server, which coordinates the task dependencies for your workflow. Since this coordination is done via direct connections to the workers (i.e. not through a file system), your workflow can scale to very large numbers of workers, which means a very large number of simulations with very little overhead.

Furthermore, since the workers pull their instructions from the central server, you can do a lot of other neat things, like having multiple batch allocations contribute to the same work (think surge computing), or specializing workers to different machines (think CPU workers for your application and GPU workers that train your neural network). Another neat feature is that these workers can add more work back to the central server, which enables a variety of dynamic workflows, such as may be necessary for the intelligent sampling of design spaces or reinforcement learning tasks.
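To make the pull model concrete, here is a toy sketch using only the Python standard library. It is not Merlin's implementation (Merlin uses celery with an external broker); the names run_pool and simulate are hypothetical and exist only for this illustration. Threads stand in for workers, a single in-process queue stands in for the central server, and each task may push follow-up tasks back onto the queue, mimicking the dynamic workflows described above.

```python
import queue
import threading


def run_pool(initial_tasks, handler, num_workers=4):
    """Toy pull-based pool: workers take tasks from one shared queue.

    handler(task) must return (result, follow_up_tasks). Follow-ups are
    pushed back onto the same queue, so work can grow while it runs.
    """
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    for task in initial_tasks:
        tasks.put(task)

    def worker():
        while True:
            task = tasks.get()
            if task is None:  # sentinel: no more work, shut down
                tasks.task_done()
                return
            try:
                result, follow_ups = handler(task)
                with lock:
                    results.append(result)
                # Enqueue new work before marking this task done, so the
                # queue never looks empty while follow-ups are pending.
                for new_task in follow_ups:
                    tasks.put(new_task)
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    tasks.join()  # returns once every task, including follow-ups, is done
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return results


def simulate(task):
    """Hypothetical task: emit a value and spawn two children until depth 0."""
    depth, value = task
    if depth == 0:
        return value, []
    return value, [(depth - 1, value * 2), (depth - 1, value * 2 + 1)]


if __name__ == "__main__":
    print(sorted(run_pool([(2, 1)], simulate)))
```

Because workers only ever ask the queue for the next task, adding workers (or adding them later, from a different allocation) requires no change to the tasks themselves.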

Merlin does all of this by leveraging some key HPC and cloud computing technologies, building off open source components. It uses maestro to provide an interface for describing workflows, as well as for defining workflow task dependencies. It translates those dependencies into concrete tasks via celery, which can be configured for a variety of backend technologies (rabbitmq and redis are currently supported). Although not a hard dependency, we encourage the use of flux for interfacing with HPC batch systems, since it can scale to a very large number of jobs.

The integrated system looks a little something like this:

[Figure: A Typical Merlin Workflow]

In this example, here's how it all works:

1. The scientist describes her HPC workflow as a maestro DAG (directed acyclic graph) "spec" file, workflow.yaml.
2. She then sends it to the persistent server with merlin run workflow.yaml. Merlin translates the file into tasks.
3. The scientist submits a job request to her HPC center. These jobs ask for workers via the command merlin run-workers workflow.yaml.
4. Coffee break.
5. As jobs stand up, they pull work from the queue, making calls to flux to get the necessary HPC resources.
6. Later, workers on a different allocation, with GPU resources, connect to the server and contribute to processing the workload.

The central queue server deals with task dependencies and keeps the workers fed.
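As a rough illustration of what "dealing with task dependencies" means for a DAG, the sketch below groups steps into waves in which every step's dependencies have already finished. It uses graphlib from the Python standard library (Python 3.9+), not Merlin or maestro; the function execution_waves and the step names are hypothetical and chosen only for this example.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group steps into waves; each wave depends only on earlier waves.

    dependencies maps a step name to the set of steps it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # mark them finished so dependents become ready
    return waves


# Hypothetical workflow: simulate, collect results, then train and plot
# independently before writing a report.
workflow = {
    "simulate": set(),
    "collect": {"simulate"},
    "train": {"collect"},
    "plot": {"collect"},
    "report": {"train", "plot"},
}

if __name__ == "__main__":
    print(execution_waves(workflow))
```

Steps that land in the same wave, such as train and plot here, have no dependency on each other, so different workers could process them at the same time.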

For more details, check out the rest of the documentation.

Need help? merlin@llnl.gov

Quick Start

Note: Merlin supports Python 3.6+.

To install Merlin and its dependencies, run:

$ pip3 install merlin

Create your application config file:

$ merlin config

That's it.

To run something a little more like what you're interested in, namely a demo workflow that has simulation and machine learning, first generate an example workflow:

$ merlin example feature_demo

Then install the workflow's dependencies:

$ pip install -r feature_demo/requirements.txt

Then process the workflow and create tasks on the server:

$ merlin run feature_demo/feature_demo.yaml

And finally, launch workers that can process those tasks:

$ merlin run-workers feature_demo/feature_demo.yaml

Documentation

Full documentation is available at https://merlin.readthedocs.io/, or run:

$ merlin --help

(or add --help to the end of any sub-command you want to learn more about.)

Code of Conduct

Please note that Merlin has a Code of Conduct. By participating in the Merlin community, you agree to abide by its rules.

License

Merlin is distributed under the terms of the MIT LICENSE.

LLNL-CODE-797170

Owner

Name: Lawrence Livermore National Laboratory
Login: LLNL
Kind: organization
Email: github-admin@llnl.gov
Location: Livermore, CA, USA

Website: https://software.llnl.gov
Twitter: LLNL_OpenSource
Repositories: 520
Profile: https://github.com/LLNL

For over 70 years, the Lawrence Livermore National Laboratory has applied science and technology to make the world a safer place.

GitHub Events

Total

Create event: 7
Issues event: 36
Release event: 2
Watch event: 15
Issue comment event: 31
Push event: 56
Pull request event: 57
Pull request review comment event: 78
Pull request review event: 94
Fork event: 6

Last Year

Create event: 7
Issues event: 36
Release event: 2
Watch event: 15
Issue comment event: 31
Push event: 56
Pull request event: 57
Pull request review comment event: 78
Pull request review event: 94
Fork event: 6

Committers

Last synced: 9 months ago

All Time

Total Commits: 699
Total Committers: 14
Avg Commits per committer: 49.929
Development Distribution Score (DDS): 0.449

Past Year

Commits: 18
Committers: 3
Avg Commits per committer: 6.0
Development Distribution Score (DDS): 0.167

Top Committers

Name	Email	Commits
Benjamin Bay	b**1@l**v	385
Joseph M. Koning	k**1@l**v	105
Brian Gunnarson	4****5	74
Alexander Cameron Winter	w**7@q**v	48
Luc Peterson	p**6@l**v	46
Alexander Winter	8****L	15
Ryan Lee	4****a	12
Bay	b**1@g**v	5
Yamen Mubarka	m**1@l**v	3
Jane Herriman	x****e	2
Wout De Nolf	w**f@e**u	1
dependabot[bot]	4****]	1
fixdocker	6****r	1
robinson96	r**6@l**v	1
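The All Time figures above can be reproduced from this table. The sketch below assumes that the Development Distribution Score is defined as 1 minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported values. The function name commit_stats is hypothetical.

```python
# Commit counts copied from the Top Committers table, in table order.
commits = [385, 105, 74, 48, 46, 15, 12, 5, 3, 2, 1, 1, 1, 1]


def commit_stats(commits):
    """Return (total commits, average per committer, DDS), rounded to 3 places.

    Assumption: DDS = 1 - (commits by the top committer / total commits).
    """
    total = sum(commits)
    average = total / len(commits)
    dds = 1 - max(commits) / total
    return total, round(average, 3), round(dds, 3)


if __name__ == "__main__":
    print(commit_stats(commits))  # (699, 49.929, 0.449)
```

Under the same assumption, a DDS near 0 means one committer wrote almost everything, while a value closer to 1 means commits are spread across many people.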

Committer Domains (Top 20 + Academic)

llnl.gov: 5 esrf.eu: 1 geralt.llnl.gov: 1 quartz1148.llnl.gov: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 56
Total pull requests: 232
Average time to close issues: about 1 year
Average time to close pull requests: 13 days
Total issue authors: 17
Total pull request authors: 15
Average comments per issue: 1.63
Average comments per pull request: 1.54
Merged pull requests: 178
Bot issues: 0
Bot pull requests: 4

Past Year

Issues: 26
Pull requests: 70
Average time to close issues: 1 day
Average time to close pull requests: 9 days
Issue authors: 3
Pull request authors: 3
Average comments per issue: 0.04
Average comments per pull request: 0.53
Merged pull requests: 47
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

bgunnar5 (18)
MarcusHsieh (9)
koning (4)
lucpeterson (4)
AlexanderWinterLLNL (3)
kustowski1 (3)
ryannova (2)
ymubarka (2)
ben-bay (2)
vsoch (2)
srcopela (1)
xorJane (1)
lvandeca (1)
jpbrodsky (1)
papajim (1)

Pull Request Authors

bgunnar5 (143)
lucpeterson (23)
AlexanderWinterLLNL (15)
ryannova (15)
koning (13)
xorJane (4)
MarcusHsieh (4)
dependabot[bot] (4)
KaseyNagleLLNL (2)
ymubarka (2)
woutdenolf (2)
jimagaffney (2)
ben-bay (1)
nkeilbart (1)
dylancliche (1)

Top Labels

Issue Labels

enhancement (22) bug (19) refactor (7) documentation (3) question (2) good first issue (2) Merlin 2.0 (1)

Pull Request Labels

enhancement (10) dependencies (4) bug (3) documentation (1)

Packages

Total packages: 3
Total downloads:
- pypi 1,263 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 9
(may contain duplicates)
Total versions: 61
Total maintainers: 7

pypi.org: merlin

The building blocks of workflows!

Homepage: https://github.com/LLNL/merlin
Documentation: https://merlin.readthedocs.io/
License: MIT
Latest release: 1.12.2
published over 1 year ago

Versions: 39
Dependent Packages: 0
Dependent Repositories: 8
Downloads: 1,218 Last month

Rankings

Dependent repos count: 5.2%

Downloads: 6.7%

Stargazers count: 7.1%

Average: 7.5%

Forks count: 8.4%

Dependent packages count: 10.0%

Maintainers (4)

ben-bay Alexander_Winter luc_peterson nagle5

Last synced: 6 months ago

pypi.org: merlinwf

The 'merlinwf' package has been deprecated and replaced by the 'merlin' package.

Homepage: https://github.com/LLNL/merlin
Documentation: https://merlinwf.readthedocs.io/
License: MIT
Latest release: 2.0.1
published almost 5 years ago

Versions: 17
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 45 Last month

Rankings

Stargazers count: 7.1%

Forks count: 8.4%

Dependent packages count: 10.1%

Average: 18.1%

Dependent repos count: 21.6%

Downloads: 43.1%

Maintainers (4)

ben-bay luc_peterson jsemler robinson96

Last synced: 6 months ago

spack.io: py-merlin

Merlin Workflow for HPC.

Homepage: https://github.com/LLNL/merlin
License: []
Latest release: 1.7.5
published almost 4 years ago

Versions: 5
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent repos count: 0.0%

Stargazers count: 20.4%

Forks count: 22.0%

Average: 24.9%

Dependent packages count: 57.3%

Maintainers (1)

adamjstewart

Last synced: 6 months ago

Dependencies

docs/source/modules/advanced_topics/advanced_requirements.txt pypi

fakers *
merlin *
pandas *

merlin/examples/workflows/feature_demo/requirements.txt pypi

merlin-spellbook *
sklearn *

merlin/examples/workflows/hello/requirements.txt pypi

names *
numpy *

merlin/examples/workflows/hpc_demo/requirements.txt pypi

faker *
maestrowf *
merlin *
pandas *

merlin/examples/workflows/iterative_demo/requirements.txt pypi

faker *
maestrowf *
merlin *
pandas *

merlin/examples/workflows/null_spec/requirements.txt pypi

matplotlib *
numpy *
scipy *

merlin/examples/workflows/openfoam_wf/requirements.txt pypi

Ofpp ==0.11
matplotlib ==3.1.1
scikit-learn ==0.21.3

merlin/examples/workflows/openfoam_wf_no_docker/requirements.txt pypi

Ofpp ==0.11
matplotlib ==3.1.1
scikit-learn ==0.21.3

merlin/examples/workflows/optimization/requirements.txt pypi

Jinja2 *
matplotlib *
merlin-spellbook *
numpy *
pyDOE *
scikit-learn *

merlin/examples/workflows/remote_feature_demo/requirements.txt pypi

merlin-spellbook *
sklearn *

requirements/dev.txt pypi

alabaster * development
black * development
build * development
dep-license * development
flake8 * development
isort * development
johnnydep * development
pylint * development
pytest * development
sphinx >=2.0.0 development
twine * development

requirements/release.txt pypi

cached_property *
celery >=5.0.3
coloredlogs *
cryptography *
importlib_resources *
maestrowf ==1.1.7dev0
numpy *
parse *
psutil >=5.1.0
pyyaml >=5.1.2
tabulate *

.github/workflows/push-pr_workflow.yml actions

actions/cache v2 composite
actions/checkout v2 composite
actions/checkout v1 composite
actions/setup-python v2 composite
redis * docker

.github/workflows/python-publish.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite

Dockerfile docker

ubuntu 18.04 build

docs/source/modules/installation/docker-compose.yml docker

llnl/merlin latest
redis latest

docs/source/modules/installation/docker-compose_rabbit.yml docker

llnl/merlin latest
rabbitmq 3-management
redis latest

docs/source/modules/installation/docker-compose_rabbit_redis_tls.yml docker

rabbitmq 3-management
redis latest

docs/requirements.in pypi

sphinx *

docs/requirements.txt pypi

sphinx >=5.3.0

merlin/examples/workflows/flux/requirements.txt pypi

merlin/examples/workflows/openfoam_wf_singularity/requirements.txt pypi

Ofpp ==0.11
matplotlib ==3.1.1
scikit-learn >=1.0.2

merlin/examples/workflows/slurm/requirements.txt pypi

requirements.txt pypi

setup.py pypi
