+++
title = "Submitting CUDA or OpenACC Jobs"
description =  "How to submit GPU (CUDA/OpenACC) jobs on HCC resources."
+++

### Available GPUs

Crane has four types of GPUs available in the **gpu** partition. The
type of GPU is configured as a SLURM feature, so you can specify a type
of GPU in your job resource requirements if necessary.

| Description          | SLURM Feature | Available Hardware                                                      |
| -------------------- | ------------- | ----------------------------------------------------------------------- |
| Tesla K20, non-IB    | gpu_k20       | 3 nodes - 2 GPUs with 4 GB mem per node                                 |
| Tesla K20, with IB   | gpu_k20       | 3 nodes - 3 GPUs with 4 GB mem per node                                 |
| Tesla K40, with IB   | gpu_k40       | 5 nodes - 4 K40M GPUs with 11 GB mem per node<br> 1 node - 2 K40C GPUs |
| Tesla P100, with OPA | gpu_p100      | 2 nodes - 2 GPUs with 12 GB mem per node                                |

To run your job on the next available GPU regardless of type, add the
following options to your `srun` or `sbatch` command:

{{< highlight batch >}}
--partition=gpu --gres=gpu
{{< /highlight >}}

To run on a specific type of GPU, constrain your job to require the
corresponding feature. For example, to run on K40 GPUs:

{{< highlight batch >}}
--partition=gpu --gres=gpu --constraint=gpu_k40
{{< /highlight >}}

{{% notice info %}}
You may request multiple GPUs by changing the `--gres` value to
`--gres=gpu:2`. Note that this value is **per node**. For example,
`--nodes=2 --gres=gpu:2` will request 2 nodes with 2 GPUs each, for a
total of 4 GPUs.
{{% /notice %}}
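
The partition, GPU count, and feature options above can be combined freely. As a minimal sketch, the helper `gpu_opts` below assembles the option string from a per-node GPU count and an optional feature name. `gpu_opts` is a hypothetical function written for illustration; it is not part of SLURM or of any HCC tooling.

```shell
#!/bin/sh
# gpu_opts is a hypothetical helper, not part of SLURM or HCC tooling.
# Usage: gpu_opts COUNT [FEATURE]
#   COUNT   - GPUs per node (the --gres value is per node); defaults to 1
#   FEATURE - optional SLURM feature from the table above, e.g. gpu_k40
gpu_opts() {
    count=${1:-1}
    feature=${2:-}
    opts="--partition=gpu --gres=gpu:${count}"
    # Only add a constraint when a specific GPU type was requested
    if [ -n "$feature" ]; then
        opts="$opts --constraint=$feature"
    fi
    printf '%s\n' "$opts"
}

# Print the options for 2 K40 GPUs per node
gpu_opts 2 gpu_k40
```

The printed string can be placed before the script name on the command line, for example `sbatch $(gpu_opts 2 gpu_k40) --nodes=2 cuda.submit`, which would request 2 nodes with 2 K40 GPUs each.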

### Compiling

Compilation of CUDA or OpenACC jobs must be performed on the GPU nodes.
Therefore, you must run an [interactive job]({{< relref "submitting_an_interactive_job" >}})
to compile. An example command to compile in the **gpu** partition could be:

{{< highlight batch >}}
$ srun --partition=gpu --gres=gpu --mem-per-cpu=1024 --ntasks-per-node=6 --nodes=1 --pty $SHELL
{{< /highlight >}}

The above command starts a shell on a GPU node with 6 cores and 6 GB
of RAM (6 tasks at 1024 MB per CPU), which you can use to compile a GPU
job. The same command is also useful for running a test GPU job interactively.

### Submitting Jobs

CUDA and OpenACC submissions require running on GPU nodes.

{{% panel theme="info" header="cuda.submit" %}}
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=cuda
#SBATCH --partition=gpu
#SBATCH --gres=gpu
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out

module load cuda/8.0
./cuda-app.exe
{{< /highlight >}}
{{% /panel %}}
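
Before launching the application, a job script can confirm that it actually received the GPUs it requested. The sketch below assumes that SLURM's GPU GRES support exports `CUDA_VISIBLE_DEVICES` as a comma-separated list of device indices; this is common but depends on how the cluster is configured. `count_visible_gpus` is a hypothetical helper name.

```shell
#!/bin/sh
# count_visible_gpus prints how many device indices are listed in
# CUDA_VISIBLE_DEVICES (0 if the variable is unset or empty).
# Assumption: the scheduler sets CUDA_VISIBLE_DEVICES for GPU jobs.
count_visible_gpus() {
    devices=${CUDA_VISIBLE_DEVICES:-}
    if [ -z "$devices" ]; then
        echo 0
        return
    fi
    # Split the list on commas and count the resulting fields
    old_ifs=$IFS
    IFS=,
    set -- $devices
    IFS=$old_ifs
    echo "$#"
}

echo "GPUs visible to this job: $(count_visible_gpus)"
```

Placed just before `./cuda-app.exe`, this writes the GPU count to the job's output file, so a job requesting `--gres=gpu:2` should report 2 under the stated assumption.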

OpenACC submissions require loading the PGI compiler (which is currently
required to compile as well).


{{% panel theme="info" header="openacc.submit" %}}
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00
#SBATCH --mem-per-cpu=1024
#SBATCH --job-name=cuda-acc
#SBATCH --partition=gpu
#SBATCH --gres=gpu
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out

module load cuda/8.0 compiler/pgi/16
./acc-app.exe
{{< /highlight >}}
{{% /panel %}}
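
Either submit script is handed to `sbatch`, which normally confirms with a line of the form `Submitted batch job <id>`. As a sketch of how that ID can be captured for follow-up commands, the hypothetical helper `job_id_from` below extracts it from the confirmation line. The job number in the example is made up.

```shell
#!/bin/sh
# job_id_from extracts the numeric job ID from sbatch's confirmation line,
# which normally reads "Submitted batch job <id>".
# Prints nothing if the line does not match (e.g. an error message).
job_id_from() {
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Example with a made-up confirmation line (12345 is not a real job)
job_id_from "Submitted batch job 12345"
```

On the cluster this could be used as `jobid=$(job_id_from "$(sbatch openacc.submit)")`, followed by `squeue -j "$jobid"` to check the job's state.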