Running Jobs

PJM is the utility used for batch processing support, so all jobs must be run through it. This section provides information for getting started with job execution on the cluster.

Job Queues

There are several queues present in the machines, and different users may have access to different ones. You can check at any time which queues you have access to using:

pjshowrsc --rg

[ CLST: compute ]
[ RSCUNIT: rscunit_ft02 ]
RSCGRP       NODE
             TOTAL   FREE   ALLOC
small           24     24       0
middle          96     96       0
important      192    192       0
def_grp         96     96       0
large          192    192       0
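
The names in the RSCGRP column are the values you can pass to the rscgrp option when submitting a job (see the directives below). For instance, to target the "small" resource group with a hypothetical job script named job.sh:

pjsub -L rscgrp=small job.sh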

Submitting Jobs

The method for submitting jobs is to use the PJM directives directly.

A job is the execution unit for PJM. A job is defined by a text file containing a set of directives describing the job's requirements and the commands to execute. These are the basic directives to submit jobs:

pjsub <job_script>

Submits a "job script" to the queue system, similar to sbatch in SLURM.

pjstat 

Shows all the submitted jobs, similar to squeue in SLURM.

pjhold <job_id> / pjrls <job_id>

Holds and releases, respectively, the job(s) with the given job id(s).

pjdel <job_id>

Deletes the job with the given <job_id>.
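
A minimal sketch of a typical management session, assuming a job that was assigned the (hypothetical) id 123456:

pjstat           # list your submitted jobs and their state
pjhold 123456    # put the job on hold
pjrls 123456     # release it so it can be scheduled again
pjdel 123456     # remove it from the queue entirely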

For a deeper explanation of each command, please refer to its man page.

Disclaimer About Job Submissions

caution

If you are used to using our other HPC clusters, there's a big difference that you need to take into account when using CTE-ARM. In this cluster, in order to avoid potential issues when trying to write or access files at job execution time, it is imperative that the output files and the working directories are located inside the /fefs filesystem. This also includes the paths of the job output/error files.

Failing to do so can make your jobs fail unexpectedly, so make sure to follow this general rule.
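
As an illustration, a job script header following this rule could look like the sketch below; the paths and application name are placeholders, the important point is that everything lives under /fefs:

#!/bin/bash
#PJM -L rscgrp=small
#PJM -o /fefs/path/to/workdir/job-%j.out
#PJM -e /fefs/path/to/workdir/job-%j.err

cd /fefs/path/to/workdir   # working directory also inside /fefs
./your_application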

Interactive Sessions

Allocation of an interactive session has to be done through PJM:

pjsub --interact

However, allocating resources without specifying a resource group may lead to issues, so it is recommended to request the allocation using:

pjsub --interact -L rscgrp=large
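
Other resource limits can be requested in the same way as for batch jobs; for instance, a sketch of a one-node, one-hour interactive session (the limits shown are only illustrative):

pjsub --interact -L rscgrp=large -L node=1 -L elapse=01:00:00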

Job Directives

A job must contain a series of directives to inform the batch system about the characteristics of the job. These directives appear as comments in the job script. Here you can find the most common ones:

#PJM -N <name>

Specify the name of the job.

#PJM -j

Store both standard output and standard error in the same file; the -e directive will be ignored if specified.

#PJM -X

Inherit the environment variables set at batch job submission into the running environment of the job.

#PJM -L rscgrp=<name>

Name of the resource group to submit the job, similar to qos in SLURM.

#PJM -L elapse=[[HH:]MM:]SS

The limit of wall clock time. You must set it to a value greater than the real execution time of your application. Notice that your job will be killed once this time has passed.

#PJM -L node=<number>

The number of requested nodes.

pjsub -L proc-core=<size limit> <job_script> 

Generate core files if your processes fail unexpectedly. This option only works correctly when specified at job submission time, as shown above.

The size limit for each core file can be written directly in MB or using units, as an integer followed by a unit symbol.

The possible options are:

SYMBOL   UNIT
K        kilobyte (10³)
M        megabyte (10⁶)
G        gigabyte (10⁹)
T        terabyte (10¹²)
P        petabyte (10¹⁵)

In either case, the maximum size limit is 2147 MB.

Please note that core files are not written in a human-readable format; you can use the xxd command to read a hex dump of them or gdb to debug the execution. Refer to their man pages for further explanation.
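
For instance, you could allow core files of up to 100 MB for a hypothetical script job.sh and later inspect a resulting core file (its exact name will depend on the failed process; core.12345 below is just a placeholder):

pjsub -L proc-core=100M job.sh

xxd core.12345 | less               # hex dump of the (hypothetically named) core file
gdb ./your_application core.12345   # or load it in gdb together with the binary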

#PJM --mpi "parameter[,...]"

This option specifies the parameters of an MPI job. These are the most common parameters:

  • proc="number" -> The number of processes to start.

  • max-proc-per-node="number" -> The maximum number of processes per node.

In order to use both options simultaneously, you have to use the following syntax:

#PJM --mpi "proc=<number1>,max-proc-per-node=<number2>"

This way, you can tune the MPI settings for your job.

#PJM -o filename

The name of the file to collect the standard output (stdout) of the job.

#PJM -e filename

The name of the file to collect the standard error output (stderr) of the job.

#PJM -s 

To print the system job statistical information, such as the memory and CPU limits, node IDs, power consumption, elapsed time, etc. This will be printed in a separate file whose name contains the job ID and ends with ".stats".

#PJM -S 

Prints the same information as the previous option, but adds statistical information about the nodes that executed the job.
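
As an illustration, the flag can also be passed on the pjsub command line instead of inside the script (job.sh is a hypothetical script name); once the job finishes, the statistics file appears in the submission directory:

pjsub -s job.sh
# after completion, look for the file ending in ".stats", e.g.:
ls *.stats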

Standard output/error management

Standard output and standard error output are saved in files. If the output files were not specified, they will be created in the directory where the pjsub command was issued (%n is the job name, which defaults to the name of the job script if not specified, and %j is the job id):

  • %n.%j.out: standard output
  • %n.%j.err: standard error output

Examples

Here you have an example for a sequential job:

#!/bin/bash
#------ pjsub option --------#
#PJM -L "rscgrp=small"
# Name of the resource group (= queue) to submit the job
#PJM -N serial
# Name of the job (optional)
#PJM -L node=1
# Specify the number of required nodes
#PJM -L elapse=00:05:00
# Specify the maximum duration of a job
#PJM -j
# Store stdout and stderr in the same file
#------- Program execution -------#
/usr/bin/hostname

The job would be submitted using "pjsub <jobscript>". The output will be stored in the same directory, in the file serial.<job_id>.out, where <job_id> is the job id.

In this case we have an example of a parallel job using MPI:

#!/bin/bash 
#PJM -N parallel
#PJM -L rscgrp=small
#PJM -L node=2
#PJM -L elapse=0:30:00
#PJM --mpi "proc=6,max-proc-per-node=3"
# The number of MPI processes and the maximum of processes per node
#PJM -o job-%j.out
# File where standard output will be stored
#PJM -e job-%j.err
# File where standard errors will be stored

export PATH=/opt/FJSVxtclanga/tcsds-1.1.18/bin:$PATH
export LD_LIBRARY_PATH=/opt/FJSVxtclanga/tcsds-1.1.18/lib64:$LD_LIBRARY_PATH

mpirun -np 6 /fefs/apps/examples/test

This job will launch six MPI tasks distributed across two nodes.
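
Assuming the script above is saved as parallel.sh, a typical submit-and-check cycle would look like this (the job id reported by pjsub replaces <job_id> in the output file name):

pjsub parallel.sh        # submit the job
pjstat                   # check its state while it is queued or running
cat job-<job_id>.out     # inspect the standard output once the job has finished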