https://github.com/converged-computing/slurm-operator

Testing if I can implement slurm in an operator

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.5%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: converged-computing
  • License: mit
  • Language: Go
  • Default Branch: main
  • Size: 165 KB
Statistics
  • Stars: 14
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created about 3 years ago · Last pushed over 1 year ago
Metadata Files
  • Readme
  • License

README.md

slurm-operator

What happens when I run out of things to do on a Monday... ohno

This is an attempt at creating a Slurm operator. I mostly want to learn a production setup for Slurm, and have some fun! Note that it's not working yet! The next step is to turn the configuration files (e.g., slurm.conf and slurmdbd.conf) into ConfigMaps that are customized for each cluster.

Development

Creation

    mkdir slurm-operator
    cd slurm-operator/
    operator-sdk init --domain flux-framework.org --repo github.com/converged-computing/slurm-operator
    operator-sdk create api --version v1alpha1 --kind slurm --resource --controller

Getting Started

You’ll need a Kubernetes cluster to run against. You can use KIND to get a local cluster for testing, or run against a remote cluster. Note: Your controller will automatically use the current context in your kubeconfig file (i.e. whatever cluster kubectl cluster-info shows).
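To make the kubeconfig note a bit more concrete, here is a minimal sketch of how the kubeconfig file location is commonly resolved: the `KUBECONFIG` environment variable if it is set, otherwise `~/.kube/config`. This is a simplification and not the operator's own code. The helper name `kubeconfigPath` is hypothetical, and the real client libraries also consider a command-line flag and in-cluster configuration, and merge multiple files rather than taking only the first.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// kubeconfigPath is a hypothetical helper (not part of this repository).
// It returns the first entry of the KUBECONFIG path list when that
// variable is set, and <home>/.kube/config otherwise.
func kubeconfigPath(kubeconfigEnv, home string) string {
	if kubeconfigEnv != "" {
		// KUBECONFIG may hold several paths joined by the OS list separator.
		return filepath.SplitList(kubeconfigEnv)[0]
	}
	return filepath.Join(home, ".kube", "config")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(err)
	}
	fmt.Println(kubeconfigPath(os.Getenv("KUBECONFIG"), home))
}
```

Running `kubectl config current-context` shows which context in that file the controller would pick up.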

Examples

For examples, see the following subdirectories:

  • hello-world: a basic example with one slurm cluster to submit jobs to
  • federated: more than one cluster connected to the same database.

Note that we don't have pretty rendered docs yet, as this was mostly a quick, few-day project, and we are just returning to it to try out federated slurm. If we use or develop it beyond a few simple experiments, we will definitely spruce up the docs here.

How it works

This project aims to follow the Kubernetes Operator pattern.

It uses Controllers, which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster.
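The reconcile idea can be illustrated with a small, dependency-free sketch. It is a toy model, not this operator's controller: `ClusterState`, `Reconcile`, and `RunUntilSettled` are hypothetical names, and a real controller would compare Kubernetes objects through the API server instead of a worker count held in memory.

```go
package main

import "fmt"

// ClusterState is a hypothetical, simplified view of a Slurm cluster:
// only the number of worker pods. It is not the operator's real CRD.
type ClusterState struct {
	Workers int
}

// Action names the single step a reconcile pass decided to take.
type Action string

const (
	ActionNone      Action = "none"
	ActionScaleUp   Action = "scale-up"
	ActionScaleDown Action = "scale-down"
)

// Reconcile compares desired and observed state and moves the observed
// state at most one step toward the desired one. It returns the new
// observed state, the action taken, and whether another pass is needed.
func Reconcile(desired, observed ClusterState) (ClusterState, Action, bool) {
	switch {
	case observed.Workers < desired.Workers:
		observed.Workers++
		return observed, ActionScaleUp, observed.Workers != desired.Workers
	case observed.Workers > desired.Workers:
		observed.Workers--
		return observed, ActionScaleDown, observed.Workers != desired.Workers
	default:
		return observed, ActionNone, false
	}
}

// RunUntilSettled calls Reconcile repeatedly, much as a work queue would
// requeue a request, and returns the final state and the number of passes.
func RunUntilSettled(desired, observed ClusterState) (ClusterState, int) {
	passes := 0
	for {
		var requeue bool
		observed, _, requeue = Reconcile(desired, observed)
		passes++
		if !requeue {
			return observed, passes
		}
	}
}

func main() {
	final, passes := RunUntilSettled(ClusterState{Workers: 3}, ClusterState{Workers: 0})
	fmt.Printf("workers=%d after %d passes\n", final.Workers, passes)
}
```

Each pass is idempotent: calling `Reconcile` on a state that already matches the desired one takes no action and asks for no requeue, which is the property that lets a controller be re-run safely.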

TODO

  • Generate slurm.conf and slurmdbd.conf as templates, with custom hosts, etc.
  • Custom user generation?
  • If username/password not provided, generate as random
  • Add script logging levels / quiet
  • consider putting node start in loop (won't exit for job, maybe OK for now)
  • make more params in slurm configs variables
  • allow the command given to script to be given to srun (timing will be tough, probably need to ensure sinfo working)
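Two of the items above (templated config files, and random credentials when none are provided) could be sketched with Go's standard `text/template` and `crypto/rand` packages. This is only an illustration under assumptions: the `DBConfig` type, the helper names, the `slurm_` user prefix, and the host names `slurmdbd` and `mariadb` are all hypothetical, and the template covers just a handful of slurmdbd.conf settings.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"text/template"
)

// DBConfig holds hypothetical values for a slurmdbd.conf fragment.
type DBConfig struct {
	DbdHost     string
	StorageHost string
	StorageUser string
	StoragePass string
}

// slurmdbdTemplate is a deliberately small fragment, not a full config.
const slurmdbdTemplate = `DbdHost={{ .DbdHost }}
StorageType=accounting_storage/mysql
StorageHost={{ .StorageHost }}
StorageUser={{ .StorageUser }}
StoragePass={{ .StoragePass }}
`

// randomSecret returns n random bytes, hex-encoded (2n characters).
func randomSecret(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// withDefaults fills in a random user and password when they are empty.
func withDefaults(cfg DBConfig) (DBConfig, error) {
	if cfg.StorageUser == "" {
		suffix, err := randomSecret(4)
		if err != nil {
			return cfg, err
		}
		cfg.StorageUser = "slurm_" + suffix
	}
	if cfg.StoragePass == "" {
		pass, err := randomSecret(16)
		if err != nil {
			return cfg, err
		}
		cfg.StoragePass = pass
	}
	return cfg, nil
}

// render executes the template against cfg and returns the file content.
func render(cfg DBConfig) (string, error) {
	tmpl, err := template.New("slurmdbd.conf").Parse(slurmdbdTemplate)
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	if err := tmpl.Execute(&out, cfg); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	cfg, err := withDefaults(DBConfig{DbdHost: "slurmdbd", StorageHost: "mariadb"})
	if err != nil {
		panic(err)
	}
	out, err := render(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

In an operator, the rendered string would typically end up in a ConfigMap (or a Secret, given that it contains a password) that is mounted into the pods.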

License

HPCIC DevTools is distributed under the terms of the MIT license. All new contributions must be made under this license.

See LICENSE, COPYRIGHT, and NOTICE for details.

SPDX-License-Identifier: (MIT)

LLNL-CODE-842614

Owner

  • Name: Converged Computing
  • Login: converged-computing
  • Kind: organization

The best of cloud and high performance computing: technology and community combined.

GitHub Events

Total
  • Watch event: 4
  • Delete event: 2
  • Push event: 5
  • Pull request event: 4
  • Fork event: 1
  • Create event: 2
Last Year
  • Watch event: 4
  • Delete event: 2
  • Push event: 5
  • Pull request event: 4
  • Fork event: 1
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
proxy.golang.org: github.com/converged-computing/slurm-operator
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.9%
Dependent repos count: 10.6%
Forks count: 13.6%
Average: 14.6%
Stargazers count: 25.5%
Last synced: 10 months ago