Setting Up SLURM Database (slurmdbd) - HPC Series Part 2
This is part 2 of the HPC/SLURM series. Read Part 1: Beginner's Guide to HPCs for an introduction to SLURM and cluster setup.
Why Do You Need SLURM Database?
After setting up your basic SLURM cluster, you'll quickly realize the need to track job history, resource usage, and user accounting. This is where slurmdbd (SLURM Database Daemon) comes in. It provides:
- Job Accounting: Track who ran what jobs and when
- Resource Usage Statistics: Monitor CPU, GPU, and memory utilization over time
- Fair-share Scheduling: Prioritize jobs based on historical usage
- Reporting: Generate reports for billing, auditing, and resource planning
The SLURM database can be deployed either on the control node (simpler for small clusters) or on a separate database server (recommended for larger deployments).
For detailed information on separate node deployment, see: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_database/
Deploying slurmdbd on the Control Node
Step 1: Install and Configure MariaDB
First, install the MariaDB server and client packages:
sudo apt-get install mariadb-server mariadb-client
Secure the MariaDB installation and set a root password:
sudo mysql_secure_installation
This interactive script will guide you through:
- Setting a root password
- Removing anonymous users
- Disallowing root login remotely
- Removing test databases
Next, create the SLURM accounting database and user. Log into MariaDB:
sudo mysql -u root -p
Run the following SQL commands to set up the database:
CREATE DATABASE slurm_acct_db;
CREATE USER 'slurm'@'localhost' IDENTIFIED BY 'password';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
FLUSH PRIVILEGES;
EXIT;
Important: Replace 'password' with a strong, unique password for production use, and use the same value for StoragePass in slurmdbd.conf in Step 2.
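Rather than typing a password by hand, one option is to generate it and fill it into the SQL from a template. The sketch below is only an illustration and not part of the original setup: the helper names gen_password and render_slurm_db_sql are hypothetical, it assumes a Linux system with /dev/urandom, and it restricts the password to letters and digits so it can sit inside the quoted SQL string without escaping.

```shell
# Hypothetical helpers; the names are illustrative, not part of SLURM or MariaDB.

# Print a random alphanumeric password (default length 24).
gen_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' </dev/urandom | head -c "${1:-24}"
}

# Print the SQL from Step 1 with the password ($1) substituted in.
# IF NOT EXISTS (MariaDB syntax) makes the statements safe to re-run.
render_slurm_db_sql() {
  cat <<EOF
CREATE DATABASE IF NOT EXISTS slurm_acct_db;
CREATE USER IF NOT EXISTS 'slurm'@'localhost' IDENTIFIED BY '$1';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
FLUSH PRIVILEGES;
EOF
}

SLURM_DB_PASS="$(gen_password 24)"
SLURM_DB_SQL="$(render_slurm_db_sql "$SLURM_DB_PASS")"

# Review the statements before applying them (this prints the password).
printf '%s\n' "$SLURM_DB_SQL"
```

To apply the result, the output could be piped into the client, for example with render_slurm_db_sql "$SLURM_DB_PASS" | sudo mysql -u root -p. Note that if the slurm user already exists, CREATE USER IF NOT EXISTS leaves its current password unchanged.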
Step 2: Configure SLURMDBD
Install the SLURM database daemon package:
sudo apt install slurmdbd
Create and edit the slurmdbd configuration file at /etc/slurm/slurmdbd.conf:
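DbdHost=control-node
DbdPort=6819
# slurmdbd expects this file to be owned by SlurmUser (default: root),
# so name the slurm user here to match the chown step that follows.
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePass=password
StorageUser=slurm
StorageLoc=slurm_acct_db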
Configuration Parameters Explained:
- DbdHost: The hostname of the server running slurmdbd (use your control node's hostname)
- DbdPort: Port for slurmdbd communication (default: 6819)
- StorageType: Backend storage type (MySQL/MariaDB)
- StorageHost: Database server location (localhost if on same node)
- StoragePass: Password for the slurm database user
- StorageUser: Database username
- StorageLoc: Name of the SLURM accounting database
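A configuration like the one in this step can also be written by a small script, which avoids a window in which the password is readable by other users. This is a sketch under assumptions: write_slurmdbd_conf is a hypothetical helper, the path, password, and hostname are supplied by the caller, and SlurmUser=slurm is assumed so that the daemon user matches the file owner. The demo writes to a temporary directory instead of /etc/slurm.

```shell
# Hypothetical helper: write a slurmdbd.conf to path $1,
# using database password $2 and DbdHost $3.
write_slurmdbd_conf() {
  (
    # Files newly created in this subshell get mode 600.
    # An existing file keeps whatever mode it already has.
    umask 077
    cat >"$1" <<EOF
DbdHost=$3
DbdPort=6819
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePass=$2
StorageUser=slurm
StorageLoc=slurm_acct_db
EOF
  )
}

# Demo against a temporary location, with placeholder values.
DEMO_DIR="$(mktemp -d)"
DEMO_CONF="$DEMO_DIR/slurmdbd.conf"
write_slurmdbd_conf "$DEMO_CONF" 'example-password' 'control-node'
ls -l "$DEMO_CONF"
```

On a real control node the same function would be run as root with /etc/slurm/slurmdbd.conf as the target, followed by the ownership change described in this step.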
Set proper permissions on the configuration file:
sudo chmod 600 /etc/slurm/slurmdbd.conf
sudo chown slurm:slurm /etc/slurm/slurmdbd.conf
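The permission and ownership settings can be double-checked before starting the daemon. The following is a minimal sketch, assuming GNU stat (as shipped on Debian and Ubuntu); check_conf_perms is a hypothetical helper name, not a SLURM tool.

```shell
# Hypothetical helper: succeed only if file $1 has mode 600
# and is owned by user $2. Relies on GNU stat's -c option.
check_conf_perms() {
  [ -f "$1" ] || { echo "missing: $1" >&2; return 2; }
  mode="$(stat -c '%a' "$1")"
  owner="$(stat -c '%U' "$1")"
  if [ "$mode" != "600" ]; then
    echo "$1: mode is $mode, expected 600" >&2
    return 1
  fi
  if [ "$owner" != "$2" ]; then
    echo "$1: owner is $owner, expected $2" >&2
    return 1
  fi
  echo "$1: ok"
}

# Check the real file only when it is present on this machine.
if [ -e /etc/slurm/slurmdbd.conf ]; then
  check_conf_perms /etc/slurm/slurmdbd.conf slurm \
    || echo "re-run the chmod and chown commands" >&2
fi
```

The function returns 0 on success, 1 on a mode or owner mismatch, and 2 when the file is missing, so it can be used as a guard in a provisioning script.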
Enable and start the SLURM database daemon:
sudo systemctl enable slurmdbd
sudo systemctl start slurmdbd
Verify the service is running:
sudo systemctl status slurmdbd
Step 3: Configure SLURM to Use SLURMDBD
Update the SLURM configuration file /etc/slurm/slurm.conf on the control node to enable accounting. Add these lines to your slurm.conf:
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=control-node
AccountingStoragePort=6819
Replace control-node with your actual control node hostname.
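Because the host and port now appear in two files, a mismatch between slurm.conf and slurmdbd.conf is an easy mistake to make. The sketch below compares them. Both conf_get and check_accounting_match are hypothetical helpers, the parser handles only plain key=value lines (trailing inline comments are not stripped), and an unset port is treated as 6819 on the assumption that both settings share that default.

```shell
# Hypothetical helper: print the value of key $1 from a key=value file $2.
# Comment lines are skipped, keys compare case-insensitively,
# and the first match is used.
conf_get() {
  awk -v key="$1" '
    /^[ \t]*#/ { next }
    {
      eq = index($0, "=")
      if (eq == 0) next
      k = substr($0, 1, eq - 1)
      v = substr($0, eq + 1)
      gsub(/[ \t]/, "", k)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      if (tolower(k) == tolower(key)) { print v; exit }
    }
  ' "$2"
}

# Succeed if host and port in slurm.conf ($1) agree with slurmdbd.conf ($2).
check_accounting_match() {
  ctl_host="$(conf_get AccountingStorageHost "$1")"
  ctl_port="$(conf_get AccountingStoragePort "$1")"
  dbd_host="$(conf_get DbdHost "$2")"
  dbd_port="$(conf_get DbdPort "$2")"
  # Assumption: an unset port means the default, 6819.
  [ "$ctl_host" = "$dbd_host" ] && [ "${ctl_port:-6819}" = "${dbd_port:-6819}" ]
}

# Compare the real files only when both are readable on this machine.
if [ -r /etc/slurm/slurm.conf ] && [ -r /etc/slurm/slurmdbd.conf ]; then
  if check_accounting_match /etc/slurm/slurm.conf /etc/slurm/slurmdbd.conf; then
    echo "accounting host and port match"
  else
    echo "accounting host or port differ between the two files" >&2
  fi
fi
```

Since slurmdbd.conf is readable only by its owner, the comparison against the real files would need to run as root or as the slurm user.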
Restart SLURM services on the control node to apply the changes:
sudo systemctl restart slurmctld
Verifying Your Setup
After configuration, verify that accounting is working properly:
# Check cluster status
sacctmgr show cluster

# Add your cluster if it isn't listed (the name must match ClusterName in slurm.conf)
sacctmgr add cluster <cluster-name>

# View accounting data
sacct

# Check job history for all users
sacct -a --format=JobID,JobName,User,Partition,State,Start,End,Elapsed
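Once jobs have run, the same sacct data can be condensed into a quick per-state count, which is a handy sanity check that records are reaching the database. This is a sketch, not an official report: summarize_states is a hypothetical helper, and it expects pipe-delimited "JobID|State" lines such as those produced by sacct with the -P option.

```shell
# Hypothetical helper: read "JobID|State" lines on stdin
# and print one "STATE count" line per state, sorted by name.
summarize_states() {
  awk -F'|' '
    NF >= 2 && $2 != "" {
      split($2, words, " ")   # "CANCELLED by 1000" -> "CANCELLED"
      count[words[1]]++
    }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Query accounting only when sacct is installed on this machine.
if command -v sacct >/dev/null 2>&1; then
  # -a: all users, -X: allocations only (no job steps),
  # -n: no header line, -P: pipe-delimited output
  sacct -a -X -n -P --format=JobID,State | summarize_states
fi
```

An empty result on a freshly configured cluster is expected until at least one job has been submitted after slurmctld was restarted with accounting enabled.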
Next Steps
Now that you have job accounting set up, you can:
- Set up user accounts and associations with sacctmgr
- Configure fair-share scheduling policies
- Generate usage reports for resource planning
- Implement job limits per user or partition
Continue to Part 3: Setting Up SLURM REST API to learn how to interact with your cluster programmatically.