GPU computing
GPU job
To run a GPU calculation, the user needs to:
- specify the number of GPU cards (parameter ngpus), and
- choose one of the GPU queues explicitly.
Name of GPU queue must be specified
Unlike normal jobs, GPU jobs are not routed to the appropriate queue based on the ngpus parameter alone. The name of the queue (parameter -q) has to be specified, too.
GPU queue name | Walltime range |
---|---|
gpu@meta-pbs.metacentrum.cz | 00:00:00 - 24:00:00 |
gpu_long@meta-pbs.metacentrum.cz | 00:00:00 - 336:00:00 |
gpu@cerit-pbs.cerit-sc.cz | 00:00:00 - 24:00:00 |
GPU jobs on the konos cluster can also be run via the priority queue iti@meta-pbs.metacentrum.cz (a queue for users from ITI, the Institute of Theoretical Informatics, Univ. of West Bohemia).
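The walltime limits in the table above can be turned into a small selection rule. The following is a minimal sketch, not an official tool: the function name pick_gpu_queue is hypothetical, it takes the walltime in whole hours, and it only considers the two meta-pbs queues.

```shell
# Hypothetical helper: choose a meta-pbs GPU queue for a walltime in whole hours.
# The limits (24 h for gpu, 336 h for gpu_long) come from the queue table.
pick_gpu_queue() {
    local hours="$1"
    if [ "$hours" -le 24 ]; then
        echo "gpu@meta-pbs.metacentrum.cz"
    elif [ "$hours" -le 336 ]; then
        echo "gpu_long@meta-pbs.metacentrum.cz"
    else
        echo "no GPU queue allows ${hours} h" >&2
        return 1
    fi
}

pick_gpu_queue 48
```

The result can be passed to qsub with the -q parameter, for example qsub -q "$(pick_gpu_queue 48)" followed by the usual -l resource requests.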
Example
qsub -I -q gpu -l select=1:ncpus=1:ngpus=1:scratch_local=10gb -l walltime=24:0:0
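The interactive example above can also be expressed as a batch job by moving the same options into #PBS directives. The sketch below prints such a job script; the function name make_gpu_job_script and the program ./my_gpu_program are placeholders introduced here for illustration, not part of the original documentation.

```shell
# Hypothetical generator: print a batch-job script that requests the same
# resources as the interactive example. "./my_gpu_program" is a placeholder.
make_gpu_job_script() {
    local queue="${1:-gpu}"
    local walltime="${2:-24:00:00}"
    cat <<EOF
#!/bin/bash
#PBS -q ${queue}
#PBS -l select=1:ncpus=1:ngpus=1:scratch_local=10gb
#PBS -l walltime=${walltime}

# Show which GPU IDs were assigned to this job
echo "Assigned GPUs: \$CUDA_VISIBLE_DEVICES"
./my_gpu_program
EOF
}

make_gpu_job_script gpu 24:00:00
```

Redirecting the output to a file (for example job.sh) and running qsub job.sh would submit it; the queue name is still given explicitly, as required for GPU jobs.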
PBS resources
gpu_mem
The PBS parameter gpu_mem specifies the minimum amount of memory that the GPU card must have.
qsub -q gpu -l select=1:ncpus=1:ngpus=1:gpu_mem=10gb ...
gpu_cap
The PBS parameter gpu_cap specifies the CUDA compute capability, as defined on this page.
Warning
With the introduction of a new PBS server running OpenPBS, the gpu_cap parameter is specified in two distinct ways, depending on whether you submit to a PBS Pro or an OpenPBS scheduler.
OpenPBS (new scheduler)
The user can specify a minimal required architecture (compute_XY), or a minimal required version within a given architecture (sm_XY).
Minimal architecture:
gpu_cap=compute_70 # will give you 7.0, 7.1, ... 7.5, but also 8.0, 9.0 ...
Minimal version of a chosen architecture, e.g. 7 ("Volta"):
gpu_cap=sm_72 # will give you 7.2 till 7.5, but not 8.0 and higher
The requirements can be combined in a comma-separated string.
Note
The commas are evaluated as an OR operator.
Example:
qsub -l select=1:ngpus=1:gpu_cap=\"sm_65,compute_70\":mem=4gb # 6.5 or 7.0 and higher
qsub -l 'select=1:ngpus=1:gpu_cap="sm_65,compute_70":mem=4gb' # same as above
Note
The double quotes enclosing the gpu_cap options must be protected from the shell, either by escaping them or by enclosing the whole select argument in single quotes.
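The OpenPBS matching rules described above (compute_XY as a minimum across architectures, sm_XY as a minimum within one architecture, commas as OR) can be modelled in a few lines. This is only an illustrative sketch of the semantics as stated in this section; the function name gpu_cap_matches is hypothetical and the code says nothing about how the scheduler itself is implemented.

```shell
# Hypothetical model of the gpu_cap rules: does a card with compute
# capability CAP (e.g. 7.5) satisfy the request SPEC (e.g. "sm_65,compute_70")?
gpu_cap_matches() {
    local spec="$1" cap="$2"
    local cap_major="${cap%%.*}"
    local cap_minor="${cap##*.}"
    local item digits major minor
    # Commas separate alternatives; any single match is enough (OR).
    for item in $(printf '%s' "$spec" | tr ',' ' '); do
        digits="${item##*_}"        # "70" from "compute_70"
        major="${digits%?}"         # everything except the last digit
        minor="${digits#$major}"    # the last digit
        case "$item" in
            compute_*)  # X.Y or anything higher, including newer architectures
                if [ "$cap_major" -gt "$major" ] ||
                   { [ "$cap_major" -eq "$major" ] && [ "$cap_minor" -ge "$minor" ]; }; then
                    return 0
                fi ;;
            sm_*)       # X.Y or higher, but only within architecture X
                if [ "$cap_major" -eq "$major" ] && [ "$cap_minor" -ge "$minor" ]; then
                    return 0
                fi ;;
        esac
    done
    return 1
}

gpu_cap_matches "sm_65,compute_70" 7.5 && echo "7.5 satisfies the request"
```

With this model, sm_72 accepts a 7.5 card but rejects an 8.0 card, while compute_70 accepts both.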
PBS Pro (old schedulers)
qsub -q gpu -l select=1:ncpus=1:ngpus=1:gpu_cap=cuda70 ...
cuda_version
The PBS parameter cuda_version is the version of CUDA installed.
System variables
The IDs of the GPU cards are stored in the CUDA_VISIBLE_DEVICES variable.
These IDs are mapped to the virtual IDs used by CUDA tools. For example, if CUDA_VISIBLE_DEVICES contains the values 2, 3, CUDA tools will report the IDs 0, 1.
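The renumbering can be made visible with a short helper that lists each entry of CUDA_VISIBLE_DEVICES next to the virtual ID it receives. This is a sketch under the assumption that the variable holds a comma-separated list; the function name show_gpu_id_mapping is hypothetical, and the value "2,3" is passed explicitly only so the example also works outside a GPU job.

```shell
# Hypothetical helper: print which entry of CUDA_VISIBLE_DEVICES corresponds
# to each virtual ID (0, 1, ...) reported by CUDA tools.
show_gpu_id_mapping() {
    # Use the first argument if given, otherwise the job's own variable.
    local devices="${1:-${CUDA_VISIBLE_DEVICES:-}}"
    local virtual=0 physical
    for physical in $(printf '%s' "$devices" | tr ',' ' '); do
        echo "virtual ${virtual} -> physical ${physical}"
        virtual=$((virtual + 1))
    done
}

show_gpu_id_mapping "2,3"
```

Inside a running GPU job, calling show_gpu_id_mapping without arguments would use the value assigned by the scheduler.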