Recent Releases of https://github.com/centerforaisafety/cerberus-cluster

https://github.com/centerforaisafety/cerberus-cluster - v03.06.2025

What's Changed

  • 320 backups accounting by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/323
  • Slurm Backups by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/322
  • Added monitoring backups. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/325
  • Updated Slurm.conf to support usage limits. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/329

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v12.13.2024...v03.06.2025

- Python
Published by andriy-safe-ai over 1 year ago

https://github.com/centerforaisafety/cerberus-cluster - v12.13.2024

What's Changed

  • Changed instances of 'include' to 'include_tasks'. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/283
  • Separated oracle linux and ubuntu plays from each other for iptables. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/285
  • Separated oracle linux and ubuntu plays from each other for CAIS play… by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/287
  • Created ubuntu versions of CAIS playbooks. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/289
  • 290 migrate nix playbooks from ol7 to ubuntu by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/291
  • Removed all instances of bastion. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/294
  • Added Slurm conf changes to support Weka. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/296
  • Migrate passwordless SSH playbooks to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/299
  • Migrate billing system to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/301
  • Migrate iptables playbook to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/303
  • Configured motd on cluster by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/305
  • Added ability to deploy home fss separately from /data fss. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/306
  • Added flag to add new lab and new user to billing system. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/308
  • Added invoice by user script for billing. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/318
  • 320 backups ldap by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/321

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/v12.13.2024

Published by andriy-safe-ai over 1 year ago

https://github.com/centerforaisafety/cerberus-cluster - v06.25.2024

What's Changed

  • 262 nix fix by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/263
  • Add ability to update login shell by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/264
  • 258 add dcgm low level gpu resource utilization by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/265
  • Added become true to grab node state on hidden partitions. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/272
  • Updated logic to not update the state of a node when state is mixed. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/273
  • Added onboarding script by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/267
  • Add libaio to cais-compute by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/274
  • Added playbooks to enable and disable passwordless ssh for root user. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/279
  • 250 billing system by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/252

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v02.26.2024...v06.25.2024

Published by andriy-safe-ai about 2 years ago

https://github.com/centerforaisafety/cerberus-cluster - v02.26.2024

What's Changed

  • Comment out monitoring cron if unused. by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/119
  • Admin notifications by @rumiah-safe in https://github.com/centerforaisafety/cerberus-cluster/pull/128
  • Slurm upgrade by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/129
  • update slurm.conf by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/134
  • remove devtoolset-5 from cais compute playbook by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/147
  • Adds opengl/mesa to cais-compute ansible role by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/154
  • removes broken play from cais-compute role by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/164
  • adds cais_* plays to site.yml playbook by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/156
  • added configureforweka.sh script by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/228
  • 205 migrate nix to weka by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/231
  • add script to resize boot volumes on compute nodes by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/177

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v06.21.23...v02.26.2024

Published by andriy-safe-ai over 2 years ago

https://github.com/centerforaisafety/cerberus-cluster - v1.06.21.2023

What's Changed

  • Installs git-lfs by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/57
  • Adds local storage support for slurm by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/83
  • Adds nix package manager by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/89
  • Adds nix bash functions by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/97
  • Adds bash functions for simpler nix usage by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/99
  • Adds ability for Cluster command to change group by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/102
  • Adds Goslmailer install to allow for SLURM notifications by @rumiahkessel in https://github.com/centerforaisafety/cerberus-cluster/pull/109
  • Updates zsh to 5.9 by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/95
  • Turns off swap on all nodes by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/120

New Contributors

  • @rumiahkessel made their first contribution in https://github.com/centerforaisafety/cerberus-cluster/pull/109

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/v1.06.21.2023

Published by steven-basart about 3 years ago

https://github.com/centerforaisafety/cerberus-cluster - v1.04.10.2023

The version of the code that was deployed to production.

What's Changed

  • documentation - change user to group by @AndriyNovykov in https://github.com/centerforaisafety/cerberus-cluster/pull/3
  • Bugfix 230224 by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/5
  • Adds slurm and fss monitoring files. by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/14
  • Update resize_add.yml by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/20
  • Install LLVM, gflags, glog, tmux, and zsh. by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/17
  • Installs Go, GCC v12 and more packages for v10 by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/26
  • Update git to version 2.39.1 by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/27
  • Bugfix enroot by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/30
  • Move Ksplice out of cron.d by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/37

Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/1.04.10.2023

Published by xksteven about 3 years ago