Recent Releases of job-defense-shield
job-defense-shield - v1.2.1
- Added ability to use an external SMTP server
- Added `gpu_mem_eff_pct` setting to `--low-gpu-efficiency` so that jobs with high GPU memory usage can be ignored
- Python
Published by jdh4 10 months ago
job-defense-shield - v1.2.0
- The "GPU Model Too Powerful" alert now supports multi-GPU jobs. This introduced breaking changes to the names of some settings (e.g., `num_cores_threshold` has been replaced by `num_cores_per_gpu`).
- The `-S` and `-E` options can be used to set the starttime and endtime for the call to `sacct`.
- Reports for low CPU/GPU utilization and excessive CPU/GPU time limits can now show all users with the `show_all_users` flag. Previously, only the offending users were shown in the report.
- A debug option (`--dump-files`) has been added to write the raw and processed dataframes.
- Python
Published by jdh4 11 months ago
job-defense-shield - v1.1.2
- When the sliding window cancellation method is used, the minimum elapsed time for a job to receive a warning is the max of `cancel_minutes` plus `sampling_period_minutes` and `sliding_warning_minutes`.
- The default for `warnings_to_admin` was changed to `False`.
- Admins are encouraged to use `warning_frac: 0.5` instead of the default of 1.0.
- Python
Published by jdh4 about 1 year ago
job-defense-shield - v1.1.1
- Added support for multiple alert entries for cancelling GPU jobs at 0% utilization
- `fraction_of_period` can have a max value of 0.7 divided by the number of entries
- Cache filename is different for each entry
- Python
Published by jdh4 about 1 year ago
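The v1.1.1 limit on the per-entry fraction is simple arithmetic, and a short sketch may make it concrete. The helper names below (`max_fraction_of_period`, `entries_within_limit`) are hypothetical and are not part of job-defense-shield; only the cap of 0.7 divided by the number of entries comes from the note above.

```python
MAX_TOTAL_FRACTION = 0.7  # cap stated in the v1.1.1 release note


def max_fraction_of_period(num_entries: int) -> float:
    """Largest per-entry fraction allowed for a given number of alert entries."""
    if num_entries < 1:
        raise ValueError("at least one alert entry is required")
    return MAX_TOTAL_FRACTION / num_entries


def entries_within_limit(fractions: list[float]) -> bool:
    """Return True if every configured fraction respects the shared cap."""
    limit = max_fraction_of_period(len(fractions))
    return all(f <= limit for f in fractions)
```

With two alert entries the cap per entry would be 0.35, so `entries_within_limit([0.3, 0.3])` passes while `entries_within_limit([0.5, 0.2])` does not.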
job-defense-shield - v1.1.0
Can now cancel jobs with 0% GPU utilization over any time window of a specified length. Previously, job cancellations could only be done during the first N minutes of the job.
- Python
Published by jdh4 about 1 year ago
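The difference between checking only the first N minutes and checking any window of a given length can be illustrated with a small sketch. This is not the tool's implementation: the function name `has_idle_window` and its parameters are assumptions made for illustration, and it assumes GPU utilization is sampled at a fixed period, with each sample standing for one full period.

```python
import math


def has_idle_window(gpu_util, sampling_period_minutes, window_minutes):
    """Return True if any run of consecutive 0% samples spans the window.

    gpu_util is a sequence of utilization percentages, one per sampling
    period, ordered from job start to the most recent sample.
    """
    if sampling_period_minutes <= 0 or window_minutes <= 0:
        raise ValueError("periods must be positive")
    # Number of consecutive zero samples needed to cover the window.
    needed = math.ceil(window_minutes / sampling_period_minutes)
    run = 0
    for value in gpu_util:
        # Extend the current run of idle samples, or reset it on activity.
        run = run + 1 if value == 0 else 0
        if run >= needed:
            return True
    return False
```

For samples `[80, 75, 0, 0, 0, 60]` taken every 10 minutes, a 30-minute window is found in the middle of the job, which a check restricted to the first 30 minutes would miss.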
job-defense-shield - v1.0.2
- Code can identify jobs with 0% GPU utilization over the last N minutes
- Logging information sent to stdout and reports
- Python
Published by jdh4 about 1 year ago
job-defense-shield - v1.0.1
- Fixed printing of emails
- Fixed log path for report demo in docs
- Removed dates from reports
- Python
Published by jdh4 about 1 year ago
job-defense-shield - v1.0.0
First release
Code was published to PyPI today.
- Python
Published by jdh4 about 1 year ago