All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- slurmdbd-backup:
- Fix conflict between "--single-transaction" and "-l" option
- Remove "-l" in mysqldump_parameters by default and add a condition to use it
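
The conflict arises because mysqldump documents `--single-transaction` and `--lock-tables` (`-l`) as mutually exclusive. Below is a minimal Python sketch of the kind of condition described in the entries above. It assumes `-l` is only added when single-transaction mode is turned off. The function name and the `single_transaction` flag are hypothetical and are not taken from the backup script.

```python
def build_mysqldump_args(database, single_transaction=True):
    """Return a mysqldump argument list that never mixes the two options."""
    args = ["mysqldump"]
    if single_transaction:
        # Consistent snapshot for transactional tables; not combined with -l.
        args.append("--single-transaction")
    else:
        # Assumption: table locking is only requested when no snapshot is used.
        args.append("-l")
    args.append(database)
    return args
```

For example, `build_mysqldump_args("slurm_acct_db")` yields a command line without `-l` (the database name here is only an example value).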
- job_submit.lua: Fixed a typo where username wasn't spelled properly (it's user_name)
- slurm-gen-qos-conf:
- skip users' QoSes by default
- added multiple command line arguments to:
- specify the output file
- run in dry mode
- include users' QoSes
- include all QoSes (by default we exclude those not containing '_')
- wckeys: modify SQL INSERT request to avoid MariaDB regression with SELECT without FROM
- wckeys: validate format of project/application codes against maximum lengths
- wckeys: fix insertion on empty SlurmDBD wckey table (#8)
- wckeys: make curl fail when the HTTP server does not respond with status code 200 (OK)
- epilog: remove TaskEpilog kerberos_lustre.sh messages on stderr
- Aligned branch versions
- slurm-gen-qos-conf: Fixed syntax errors to make the script usable again
- job_submit.lua:
- Removed unneeded spaces in function definitions
- Added a prefix to all log messages and reformatted them to include more information
- job_submit.lua:
- Added '.' and ':' to the list of valid characters one can use for a job name
- Misc error formatting fixes
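
As an illustration of the job-name rule, here is a small Python sketch of such a check (the plugin itself is written in Lua). The allowed-character set and the 40-character limit are assumptions made for the example. The changelog only confirms that '.' and ':' are accepted and that names are matched against a regex and a fixed maximum length.

```python
import re

# Hypothetical values: only '.' and ':' are confirmed by the changelog.
ALLOWED_CHARS = "a-zA-Z0-9_.:-"
MAX_NAME_LENGTH = 40
NAME_RE = re.compile(r"[" + ALLOWED_CHARS + r"]+")


def check_job_name(name):
    """Return None when the name is valid, else a user-facing error message."""
    if len(name) > MAX_NAME_LENGTH:
        return "job name is longer than %d characters" % MAX_NAME_LENGTH
    if not NAME_RE.fullmatch(name):
        # Show the allowed characters rather than the raw regex.
        return "job name may only contain: letters, digits, '_', '-', '.', ':'"
    return None
```

Returning a message instead of a boolean makes it easy to pass a readable explanation back to the user at submission time.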
- slurm-gen-qos-conf:
- only writes QoSes matching our format (separated by '_' with the first item being a partition)
- job_submit.lua:
- Make regex error message clearer by showing the list of allowed characters instead of the regex
- Exclude non matching QoS' names from partition validation: a QoS not matching the format specified above won't be checked for valid partition
- slurm-gen-qos-conf: some cleaning and only keep QoSes which match partitions
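
The naming rule these entries refer to ('_'-separated, with the first item being a partition) can be sketched as follows. This is an illustrative Python snippet, not the code of slurm-gen-qos-conf: the function name and the partition and QoS names used with it are made up for the example.

```python
def filter_qos(qos_names, partitions):
    """Keep QoS names shaped like '<partition>_<rest>' for a known partition."""
    kept = []
    for name in qos_names:
        prefix, sep, rest = name.partition("_")
        # Require a separator, a non-empty remainder and a known partition.
        if sep and rest and prefix in partitions:
            kept.append(name)
    return kept
```

With the hypothetical partitions `{"cn", "gpu"}`, a QoS named `cn_short` would be kept, while `normal` (no separator) and `foo_bar` (unknown partition) would be dropped.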
- job_submit.lua:
- lots of cleaning
- added a logging function so we can return error messages to the user
- job names have to match a regex
- job names have to be under a fixed length
- now returns an error if we don't find a matching QoS when it's not provided
- now returns an error if we don't find a default partition if none was provided
- ensure job's time limit is compatible with the QoS' time limit
- fixed a bug in to_minute where d-h would be converted to d-h; ensure we actually match digits and return 0 if nothing is found
- slurm only expects a limited set of return results so some of them had to be converted to slurm.ERROR (for example: ESLURM_INVALID_WCKEY and ESLURM_INVALID_QOS)
- when the user provided both QoS and partition, make sure they match
- because we rewrote a portion of the build_qos, now we can iterate the list of QoSes matching a given partition to find one matching the time limit, number of nodes and accounts if needed
- if a QoS wasn't provided and we either used the default partition or the provided one, make sure we have matching QoSes, if not, returns an error
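
To make the time-limit handling concrete, here is a hedged Python sketch of a `to_minute`-style conversion. It is not the Lua code from job_submit.lua. It assumes the time formats documented for Slurm (`minutes`, `minutes:seconds`, `hours:minutes:seconds`, `days-hours`, `days-hours:minutes`, `days-hours:minutes:seconds`), rounds leftover seconds up to the next minute, and returns 0 when nothing matches, as the entry above describes.

```python
import re

# Optional "days-" prefix followed by one to three ':'-separated digit fields.
_TIME_RE = re.compile(r"^(?:(\d+)-)?(\d+)(?::(\d+))?(?::(\d+))?$")


def to_minutes(limit):
    """Convert a Slurm-style time string to minutes; 0 if it does not parse."""
    match = _TIME_RE.match(limit.strip())
    if match is None:
        return 0
    days, first, second, third = (
        int(g) if g is not None else None for g in match.groups()
    )
    if days is not None:
        # days-hours[:minutes[:seconds]]
        hours, minutes, seconds = first, second or 0, third or 0
    elif third is not None:
        # hours:minutes:seconds
        days, hours, minutes, seconds = 0, first, second, third
    else:
        # minutes[:seconds]
        days, hours, minutes, seconds = 0, 0, first, second or 0
    total = days * 1440 + hours * 60 + minutes
    return total + 1 if seconds else total
```

A job's limit can then be compared with a QoS limit as plain integers. Keywords such as `UNLIMITED` are not handled in this sketch and would need a separate case.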
- slurm-sync-accounts: some cleaning and added logic so the script does what's necessary when a user should have multiple associations or when a user should have its current association(s) replaced
- epilog: log meaningful error in cases epilog exits prematurely
- taskprolog: improve renewer for canonical cache
- taskepilog: Add logging at the end of tasks
- taskprolog: improve renewer for canonical cache
- taskprolog: lustre add renewer for canonical ccache
- Use /etc/slurm on all Linux distributions
- epilog: detect if cgroup uses slurm or slurm_HOSTNAME
- wckeys: EL8 compatibility
- taskprolog for lustre kerberos
- Python3 port and EL8 compatibility
- Python3 port and EL8 compatibility
- genscripts: don't remove other jobs private-tmpdir
- slurmdbd-backup: Use single-transaction by default
- fix a bug in slurm-wckeys-setup where tmpfs was not unmounted when exiting with an error code
- genscripts: rework cgroup cleaning in epilog
- admin-utils: remove NUL char from node-list-procs
- admin-utils: Add slurm-check-cgroups-nojob
- add cgroups script to slurm-admin-utils
- admin-utils nodes-only to slurm-check-procs-nojob
- Epilog.d: Fix tmp cleaning bug and cgroups non cleaned problem
- admin-utils: add new admin script to manage kill task failed in slurmd nodes
- admin-utils: do not json decode failing nodes
- introduce new binary package slurm-admin-utils
- admin-utils: introduce slurm-check-procs-nojob
- genscripts: replace squeue with cpuset check
- genscripts: log pkill count in clean epilog
- sync-accounts: allow multiple groups per accounts
- correct another bug introduced by 0e2b780d6ac1
- correct a bug introduced by 0e2b780d6ac1
- new release to be in sync with scibian repo
- job_submit.lua: correct a problem in build_qos_list by setting qos_maxtime to infinite when this variable is not defined
- job_submit.lua: remove getent to get username
- job_submit.lua: replace grep with Lua stdlib io
- job_submit.lua: replace cat + remove sacctmgr
- job_submit.lua: handle non-existing account
- Ensure the files are in ASCII unix mode before operating on it
- Send cron output to /dev/null
- Make slurm-llnl-setup-wckeys Breaks/Replaces old slurm-llnl-job-submit-plugin to ensure smooth upgrades.
- Add dependency between slurm-llnl-job-submit-plugin and slurm-llnl-setup-wckeys
- Remove bsdutils dependency as it's an essential package
- Bugfix crontab for slurm-llnl-setup-wckeys
- Move slurm-wckeys-setup in new binary package
- Add a cronjob for slurm-wckeys-setup
- Use curl instead of wget; simpler logic in script and uniform file locations in config
- Log messages and errors in syslog
- Bugfix packaging : rename script slurm-wckeys-setup in install file
- slurm-wckeys-setup: Update header
- slurm-wckeys-setup: Rename slurm_wckeys_setup.sh in slurm-wckeys-setup
- slurm-wckeys-setup: Bugfix : file not found is ${SLURMDB_FILE} and not ${CODES_FILE}
- slurm-wckeys-setup: Add possibility to download pareo and codes files by http
- mysql-setup: manage password changes
- mysql-setup: do not create slurm DB
- mysql-setup: do not create users with grant opt
- mysql-setup: add feature to restrict slurmro hosts
- job_submit: split optimization
- sync-accounts: fix user account description string
- sync-accounts: handle multiple src posix groups
- pwmgt: disable SSH strict host key checking
- pwmgt: fix debug formatting string
- pwmgt: daemonize stop wrapper cmd
- pkg: pwmgt stop wrapper now depends on daemon lib
- packaging: fix distribution
- job_submit: os.execute compatible with lua 5.1 and 5.2
- slurmdbd-backup: Add SlurmDBD backup script
- slurmdbd-backup: Add packaging
- Introduce pwmgt utility
- slurm_wckeys_setup.sh: Fix bug wckeys (applications and projects order)
- slurm_wckeys_setup.sh: Manage multiple projects and applications CSV
- job_submit.lua: remove useless code branch
- gen qos.conf script now extracts accounts
- job_submit.lua: handle empty maxcpus in CSV
- job_submit.lua: manage multiple qos same settings
- job_submit.lua: check allowed accounts from CSV
- sync-accounts: support multiple groups
- sync-account: add opts creation cmd params
- Fix bugs in slurm-gen-qos-conf
- Fix ugly bugs in sync-account script
- job-submit: Remove examples CSV and user exception files since
- sync-accounts: Remove user before account w/ user_account policy
- remove examples irrelevant in production.
- Remove all trailing whitespaces in slurm_wckeys_setup.sh
- Do not convert in uppercase in final wckeys file
- Fix mysql command params in slurm_wckeys_setup.sh
- Do not convert dash into underscore anymore
- Backport slurm-gen-qos-conf to python 2.6 and fix empty QOS case
- Backport slurm-sync-accounts to python 2.6
- sync-accounts: add missing dep on slurm-client
- Add new sync-accounts package
- Add job fields in slurm log to help debug in Lua submit plugin
- All changes are relative to the job_submit.lua script.
- For exclusive jobs, set job_desc.min_nodes to 1 by default.
- Cosmetic change: update file header.
- Replace tabs with spaces.
- Add ability to use a configuration file (/etc/slurm-llnl/job_submit.conf)
in which administrators can specify the following parameters:
- QOS_CONF
- QOS_SEP
- QOS_NAME_SEP
- NULL
- CORES_PER_NODE
- ESLURM_INVALID_WCKEY
- WCKEY_CONF_FILE
- WCKEY_USER_EXCEPTION_FILE

The aforementioned keys have sensible values in the .lua script. Special care must be taken with the CORES_PER_NODE parameter, which must be configured for each cluster depending on the configuration of the compute nodes. The syntax of /etc/slurm-llnl/job_submit.conf is simple (in fact, it is a Lua script): lines of the shape "KEY = VALUE".

- Emit a message when the user specifies a QOS.
- Fix clean epilog script
- Do not fail when JOB_ID is not numeric
- Do not consider UIDs less than 1000
- Fix job submit Lua script
- Some fixes in shell script slurm_wckeys_setup.sh
- Do not recommend slurm-llnl anymore.
- Add QOS exceptions in slurm-gen-qos-conf
- Add function to force the use of wckey.
- Large update of job_submit.lua for slurm 14.11.x
- Package job submit plugin now depends on slurm >= 14.11
- Updated dep from slurm-llnl-basic-plugins to slurm-wlm-basic-plugins since pkg name has changed.
- Add a python script to setup mysql for SlurmDBD in a new package slurm-llnl-setup-mysql.
- New script slurm-gen-qos-conf to generate qos.conf for
- Add missing dependency to members and infiniband-diags on slurm-llnl-node-health-plugin pkg.
- Moved content of job-submit package in dedicated subdir Lua job submit plugin.
- Check return code of ibstat in check_node_health script
- Restrict IB rate check to port 1 only
- Update job_submit.lua
- Fix detection of fs usage in check_node_health.sh
- Yet another typo fix in generic-script.sh.
- Fix generic-script.sh to check for regular files as well as symlinks.
- LDAP check bug fix
- Update check_node_health.sh script
- Initial Release