Getting Started

When you log in to the supercomputer, you land on a login node. Login nodes are for editing, compiling, and preparing jobs. They are not for running jobs. Failure to follow this policy may result in account revocation. From the login node you can submit job scripts using sbatch or start interactive jobs with salloc.

Slurm

We use Slurm for cluster/resource management and job scheduling. Slurm is responsible for allocating resources to users; providing a framework for starting, executing, and monitoring work on allocated resources; and scheduling work for future execution.

Job Submissions

sbatch

sbatch submits a batch script to Slurm. The batch script may be given to sbatch as a file name on the command line; if no file name is specified, sbatch reads the script from standard input. The batch script may contain options prefixed with “#SBATCH”, which must appear before any executable commands in the script.
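As an illustration of the layout described above, here is a minimal sketch of a batch script. The job name, time limit, and output file name are hypothetical values chosen for the example, and the partition name is borrowed from the salloc example on this page; adjust all of them for your own work.

```shell
#!/bin/bash
# Name shown for the job in squeue output (hypothetical value).
#SBATCH --job-name=example
# Partition to run in; "shared" is assumed from the salloc example.
#SBATCH --partition=shared
# Request one node and one task.
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Wall-clock limit in HH:MM:SS (hypothetical value).
#SBATCH --time=00:10:00
# File for the job's output; %j expands to the job ID.
#SBATCH --output=example-%j.out

# Executable commands start here. SLURM_JOB_ID is set by Slurm inside a job;
# the fallback "unset" is used if the script is run outside Slurm.
job_id="${SLURM_JOB_ID:-unset}"
echo "Job ${job_id} running on $(uname -n)"
```

Saved as run_example, a script like this could be submitted with sbatch run_example.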

sbatch exits immediately after the script is successfully transferred to the Slurm controller and assigned a Slurm job ID. The batch script is not necessarily granted resources immediately; it may sit in the queue of pending jobs for some time before its required resources become available.

When you submit the job, Slurm responds with the job’s ID, which identifies the job in all reports from Slurm.

login:~> sbatch run_example
Submitted batch job 1234567
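If you want to reuse the job ID in later commands, one possible approach is to pull it out of the confirmation line. The sketch below assumes the single-line format shown in the example; the function name parse_job_id and the variable names are hypothetical. sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler when it fits your workflow.

```shell
# Extract the job ID from sbatch's confirmation line.
parse_job_id() {
    # Keep only the last space-separated field, e.g. "1234567".
    printf '%s\n' "${1##* }"
}

# Confirmation line copied from the example; on the cluster it could be
# captured with: submit_output=$(sbatch run_example)
submit_output="Submitted batch job 1234567"
jobid=$(parse_job_id "$submit_output")
echo "$jobid"
```

The resulting $jobid can then be passed to the scancel and scontrol commands described later on this page.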

salloc

salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.

login:~> salloc -N 1 -p shared
salloc: Granted job allocation 1234567
salloc: Waiting for resource configuration
salloc: Nodes n1234 are ready for your job
n1234:~>
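Once the allocation is granted, srun launches tasks on the allocated resources. The following is a small sketch, not a site-provided tool: the wrapper name run_in_allocation is hypothetical, and it relies on the SLURM_JOB_ID environment variable that Slurm sets in the shell spawned by salloc. It refuses to call srun when no allocation is active.

```shell
# Launch a command with srun, but only from inside an allocation.
run_in_allocation() {
    # SLURM_JOB_ID is empty or missing when no allocation is active.
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "Not inside a Slurm allocation; run salloc first." >&2
        return 1
    fi
    srun "$@"
}

# Example: run hostname as two parallel tasks within the allocation.
# "|| true" keeps the shell going if no allocation is active.
run_in_allocation --ntasks=2 hostname || true
```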

Monitoring Jobs

squeue

You can monitor your jobs using the squeue command. To view only your own jobs, use squeue -u your_username. Please do not call squeue from scripts or run it under “watch”, as this can put unnecessary load on the queuing system.
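For an occasional manual check, squeue's -o option lets you choose the columns that are printed. The sketch below is one possible layout, meant to be run by hand and not in a loop; the variable name squeue_format and the column widths are arbitrary choices for the example.

```shell
# Columns: job ID, partition, job name, state, elapsed time,
# node count, and pending reason or node list.
squeue_format='%.10i %.9P %.20j %.8T %.10M %.6D %R'

# One-off query of your own jobs; skipped where squeue is not installed.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "${USER:-$(id -un)}" -o "$squeue_format"
else
    echo "squeue not found; run this on a login node." >&2
fi
```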

Email Notifications

You can use the email options in your submission scripts to be notified when a job starts, ends, or fails. We recommend using your NETL email address for these notifications. External email addresses may experience delays or may fail to relay through the NETL email system.

#SBATCH --mail-type=begin,end,fail
#SBATCH --mail-user=user@domain.com

Cancel Jobs

Cancel a job by job ID:

login:~> scancel $jobid

Hold Jobs

Prevent a pending job from being started:

login:~> scontrol hold $jobid

Release Held Jobs

Allow a held job to accrue priority and run:

login:~> scontrol release $jobid