Recent Releases of https://github.com/centerforaisafety/cerberus-cluster
https://github.com/centerforaisafety/cerberus-cluster - v03.06.2025
What's Changed
- 320 backups accounting by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/323
- Slurm Backups by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/322
- Added monitoring backups. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/325
- Updated Slurm.conf to support usage limits. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/329
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v12.13.2024...v03.06.2025
- Python
Published by andriy-safe-ai over 1 year ago
https://github.com/centerforaisafety/cerberus-cluster - v12.13.2024
What's Changed
- Changed instances of 'include' to 'include_tasks'. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/283
- Separated oracle linux and ubuntu plays from each other for iptables. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/285
- Separated oracle linux and ubuntu plays from each other for CAIS play… by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/287
- Created ubuntu versions of CAIS playbooks. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/289
- 290 migrate nix playbooks from ol7 to ubuntu by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/291
- Removed all instances of bastion. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/294
- Added Slurm conf changes to support Weka. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/296
- Migrate passwordless SSH playbooks to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/299
- Migrate billing system to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/301
- Migrate iptables playbook to ubuntu. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/303
- Configured motd on cluster by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/305
- Added ability to deploy home fss separately from /data fss. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/306
- Added flag to add new lab and new user to billing system. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/308
- Added invoice by user script for billing. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/318
- 320 backups ldap by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/321
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/v12.13.2024
Published by andriy-safe-ai over 1 year ago
https://github.com/centerforaisafety/cerberus-cluster - v06.25.2024
What's Changed
- 262 nix fix by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/263
- Add ability to update login shell by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/264
- 258 add dcgm low level gpu resource utilization by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/265
- Added become true to grab node state on hidden partitions. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/272
- Updated logic to not update the state of a node when state is mixed. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/273
- Added onboarding script by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/267
- Add libaio to cais-compute by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/274
- Added playbooks to enable and disable passwordless ssh for root user. by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/279
- 250 billing system by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/252
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v02.26.2024...v06.25.2024
Published by andriy-safe-ai about 2 years ago
https://github.com/centerforaisafety/cerberus-cluster - v02.26.2024
What's Changed
- Comment out monitoring cron if unused. by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/119
- Admin notifications by @rumiah-safe in https://github.com/centerforaisafety/cerberus-cluster/pull/128
- Slurm upgrade by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/129
- update slurm.conf by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/134
- remove devtoolset-5 from cais compute playbook by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/147
- Adds opengl/mesa to cais-compute ansible role by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/154
- removes broken play from cais-compute role by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/164
- adds cais_* plays to site.yml playbook by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/156
- added configureforweka.sh script by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/228
- 205 migrate nix to weka by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/231
- add script to resize boot volumes on compute nodes by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/177
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/compare/v06.21.23...v02.26.2024
Published by andriy-safe-ai over 2 years ago
https://github.com/centerforaisafety/cerberus-cluster - v1.06.21.2023
What's Changed
- Installs git-lfs by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/57
- Adds local storage support for slurm by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/83
- Adds nix package manager by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/89
- Adds nix bash functions by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/97
- Adds bash functions for simpler nix usage by @andriy-safe-ai in https://github.com/centerforaisafety/cerberus-cluster/pull/99
- Adds ability for Cluster command to change group by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/102
- Adds Goslmailer install to allow for SLURM notifications by @rumiahkessel in https://github.com/centerforaisafety/cerberus-cluster/pull/109
- Updates zsh to 5.9 by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/95
- Turns off swap on all nodes by @steven-basart in https://github.com/centerforaisafety/cerberus-cluster/pull/120
New Contributors
- @rumiahkessel made their first contribution in https://github.com/centerforaisafety/cerberus-cluster/pull/109
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/v1.06.21.2023
Published by steven-basart about 3 years ago
https://github.com/centerforaisafety/cerberus-cluster - v1.04.10.2023
Production version of the code that was deployed.
What's Changed
- documentation - change user to group by @AndriyNovykov in https://github.com/centerforaisafety/cerberus-cluster/pull/3
- Bugfix 230224 by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/5
- Adds slurm and fss monitoring files. by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/14
- Update resize_add.yml by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/20
- Install LLVM, gflags, glog, tmux, and zsh. by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/17
- Installs Go, GCC v12 and more packages for v10 by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/26
- Update git to version 2.39.1 by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/27
- Bugfix enroot by @arnaudfroidmont in https://github.com/centerforaisafety/cerberus-cluster/pull/30
- Move Ksplice out of cron.d by @xksteven in https://github.com/centerforaisafety/cerberus-cluster/pull/37
Full Changelog: https://github.com/centerforaisafety/cerberus-cluster/commits/1.04.10.2023
Published by xksteven about 3 years ago