+++
title = "How to setup X11 forwarding"
description = "Use X11 forwarding to view GUI programs remotely"
weight = "35"
+++
##### If you are connecting to HCC clusters via a PC running Windows, please take the following steps to set up X11 forwarding.
1. Download Xming to your local PC and install. Download
link: https://downloads.sourceforge.net/project/xming/Xming/6.9.0.31/Xming-6-9-0-31-setup.exe
2. Download PuTTY to your local PC and install. Download link: http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe
3. Open Xming and keep it running in the background.
4. Configure PuTTY as shown below (X11 forwarding is enabled under Connection > SSH > X11 in the PuTTY configuration window):
{{< figure src="/images/11637370.png" height="400" >}}
{{< figure src="/images/11637371.jpg" height="400" >}}
5. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
{{< figure src="/images/11637372.png" height="400" >}}
6. Close the xeyes application with "Ctrl + c" from the terminal, or click the close button in the upper-right corner of the graphical window.
##### If you are connecting to HCC clusters via a Macintosh, please take the following steps to set up X11 forwarding.
- Check the OS version on your Mac. If it is below 10.8, you can simply type `ssh -Y username@hostname` in your terminal to log in.
- If your OS version is 10.8 or newer, please do the following:
1. Download and install XQuartz.
Download link: https://dl.bintray.com/xquartz/downloads/XQuartz-2.7.11.dmg
2. Type `ssh -Y username@hostname` in your terminal to log in.
3. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
{{< figure src="/images/11637374.png" height="400" >}}
4. Close the xeyes application with "Control + c" from the terminal, or click the close button in the upper-left corner of the graphical window.
##### If you are connecting to HCC clusters via a Linux laptop, please take the following steps to set up X11 forwarding.
1. Open a terminal on your Linux machine.
2. Type `ssh -Y username@hostname` in your terminal to log in. (An optional SSH client configuration sketch is shown after this list.)
3. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
4. Close the xeyes application with "Ctrl + c" from the terminal, or click the close button in the upper-right corner of the graphical window.
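If you connect frequently, trusted X11 forwarding can also be enabled per host in your OpenSSH client configuration so that a plain `ssh` command behaves like `ssh -Y`. This is a minimal sketch, assuming an OpenSSH client; the `Host` alias and the placeholders should be replaced with your cluster's hostname and your HCC username:
{{< highlight bash >}}
# ~/.ssh/config on your local machine (placeholders: <hostname>, <username>)
Host hcc
    HostName <hostname>
    User <username>
    ForwardX11 yes           # equivalent to ssh -X
    ForwardX11Trusted yes    # together with the line above, equivalent to ssh -Y
{{< /highlight >}}
With this in place, `ssh hcc` opens a session with X11 forwarding already enabled.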
#### Related articles
- [X11 on Windows](http://www.straightrunning.com/XmingNotes)
- [X11 on Mac](https://en.wikipedia.org/wiki/XQuartz)
- [X11 on Linux](http://www.wikihow.com/Configure-X11-in-Linux)
+++
title = "MPI Jobs on HCC"
description = "How to compile and run MPI programs on HCC machines"
weight = "52"
+++
This quick start demonstrates how to implement a parallel (MPI)
Fortran/C program on HCC supercomputers. The sample codes and submit
scripts can be downloaded from [mpi_dir.zip](/attachments/mpi_dir.zip).
#### Log in to an HCC Cluster
Log in to a HCC cluster through PuTTY ([For Windows Users]({{< relref "/quickstarts/for_windows_users">}})) or Terminal ([For Mac/Linux
Users]({{< relref "/quickstarts/for_maclinux_users">}})) and make a subdirectory called `mpi_dir` under the `$WORK` directory.
{{< highlight bash >}}
$ cd $WORK
$ mkdir mpi_dir
{{< /highlight >}}
In the subdirectory `mpi_dir`, save all the relevant code. Here we
include two demo programs, `demo_f_mpi.f90` and `demo_c_mpi.c`, that
compute the sum from 1 to 20 using parallel processes. A
straightforward parallelization scheme is used for demonstration
purposes. First, the master core (i.e. `myid=0`) distributes an equal
computation workload to a certain number of cores (as specified by
`--ntasks` in the submit script). Then, each worker core computes a
partial sum as output; for example, with `--ntasks=5`, rank 0 sums the
values for indices 1-4, rank 1 for indices 5-8, and so on. Finally, the
master core collects the outputs from all worker cores and performs the
overall summation. For easy comparison with the serial code
([Fortran/C on HCC]({{< relref "fortran_c_on_hcc">}})), the
lines added for MPI are marked with "!=" or "//=".
{{%expand "demo_f_mpi.f90" %}}
{{< highlight fortran >}}
Program demo_f_mpi
!====== MPI =====
  use mpi
!================
  implicit none
  integer, parameter :: N = 20
  real*8 w
  integer i
  common/sol/ x
  real*8 x
  real*8, dimension(N) :: y
!============================== MPI =================================
  integer ind
  real*8, dimension(:), allocatable :: y_local
  integer numnodes,myid,rc,ierr,start_local,end_local,N_local
  real*8 allsum
!====================================================================
!============================== MPI =================================
  call mpi_init( ierr )
  call mpi_comm_rank ( mpi_comm_world, myid, ierr )
  call mpi_comm_size ( mpi_comm_world, numnodes, ierr )
!
  N_local = N/numnodes
  allocate ( y_local(N_local) )
  start_local = N_local*myid + 1
  end_local   = N_local*myid + N_local
!====================================================================
  do i = start_local, end_local
    w = i*1d0
    call proc(w)
    ind = i - N_local*myid
    y_local(ind) = x
!   y(i) = x
!   write(6,*) 'i, y(i)', i, y(i)
  enddo
! write(6,*) 'sum(y) =',sum(y)
!============================================== MPI =====================================================
  call mpi_reduce( sum(y_local), allsum, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr )
  call mpi_gather ( y_local, N_local, mpi_real8, y, N_local, mpi_real8, 0, mpi_comm_world, ierr )
  if (myid == 0) then
    write(6,*) '-----------------------------------------'
    write(6,*) '*Final output from... myid=', myid
    write(6,*) 'numnodes =', numnodes
    write(6,*) 'mpi_sum =', allsum
    write(6,*) 'y=...'
    do i = 1, N
      write(6,*) y(i)
    enddo
    write(6,*) 'sum(y)=', sum(y)
  endif
  deallocate( y_local )
  call mpi_finalize(rc)
!========================================================================================================
  Stop
End Program

Subroutine proc(w)
  real*8, intent(in) :: w
  common/sol/ x
  real*8 x
  x = w
  Return
End Subroutine
{{< /highlight >}}
{{% /expand %}}
{{%expand "demo_c_mpi.c" %}}
{{< highlight c >}}
//demo_c_mpi
#include <stdio.h>
//======= MPI ========
#include "mpi.h"
#include <stdlib.h>
//====================

double proc(double w){
  double x;
  x = w;
  return x;
}

int main(int argc, char* argv[]){
  int N=20;
  double w;
  int i;
  double x;
  double y[N];
  double sum;
//=============================== MPI ============================
  int ind;
  double *y_local;
  int numnodes,myid,rc,ierr,start_local,end_local,N_local;
  double allsum;
//================================================================
//=============================== MPI ============================
  MPI_Init(&argc, &argv);
  MPI_Comm_rank( MPI_COMM_WORLD, &myid );
  MPI_Comm_size( MPI_COMM_WORLD, &numnodes );

  N_local = N/numnodes;
  y_local = (double *) malloc(N_local*sizeof(double));
  start_local = N_local*myid + 1;
  end_local   = N_local*myid + N_local;
//================================================================
  for (i = start_local; i <= end_local; i++){
    w = i*1e0;
    x = proc(w);
    ind = i - N_local*myid;
    y_local[ind-1] = x;
    // y[i-1] = x;
    // printf("i,x= %d %lf\n", i, y[i-1]);
  }
  sum = 0e0;
  for (i = 1; i <= N_local; i++){
    sum = sum + y_local[i-1];
  }
  // printf("sum(y)= %lf\n", sum);
//====================================== MPI ===========================================
  MPI_Reduce( &sum, &allsum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD );
  MPI_Gather( &y_local[0], N_local, MPI_DOUBLE, &y[0], N_local, MPI_DOUBLE, 0, MPI_COMM_WORLD );
  if (myid == 0){
    printf("-----------------------------------\n");
    printf("*Final output from... myid= %d\n", myid);
    printf("numnodes = %d\n", numnodes);
    printf("mpi_sum = %lf\n", allsum);
    printf("y=...\n");
    for (i = 1; i <= N; i++){
      printf("%lf\n", y[i-1]);
    }
    sum = 0e0;
    for (i = 1; i <= N; i++){
      sum = sum + y[i-1];
    }
    printf("sum(y) = %lf\n", sum);
  }
  free( y_local );
  MPI_Finalize();
//======================================================================================
  return 0;
}
{{< /highlight >}}
{{% /expand %}}
---
#### Compiling the Code
The compiling of a MPI code requires first loading a compiler "engine"
such as `gcc`, `intel`, or `pgi` and then loading a MPI wrapper
`openmpi`. Here we will use the GNU Complier Collection, `gcc`, for
demonstration.
{{< highlight bash >}}
$ module load compiler/gcc/6.1 openmpi/2.1
$ mpif90 demo_f_mpi.f90 -o demo_f_mpi.x
$ mpicc demo_c_mpi.c -o demo_c_mpi.x
{{< /highlight >}}
The above commands load the `gcc` compiler with the `openmpi` wrapper.
The compiler wrappers `mpif90` and `mpicc` are then used to compile the codes
into `.x` files (executables).
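Optionally, you can check that the wrappers picked up the intended compiler before building. With Open MPI, the wrapper can print the underlying command it will invoke (exact output depends on the modules loaded):
{{< highlight bash >}}
$ gcc --version      # confirm the gcc version provided by the loaded module
$ mpicc --showme     # Open MPI: print the underlying compiler command and flags
$ mpif90 --showme    # same check for the Fortran wrapper
{{< /highlight >}}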
#### Creating a Submit Script
Create a submit script to request 5 cores (with `--ntasks`). On the
last line, the parallel execution command `mpirun ./` must precede the
program name.
{{% panel header="`submit_f.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=Fortran
#SBATCH --error=Fortran.%J.err
#SBATCH --output=Fortran.%J.out
mpirun ./demo_f_mpi.x
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`submit_c.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=C
#SBATCH --error=C.%J.err
#SBATCH --output=C.%J.out
mpirun ./demo_c_mpi.x
{{< /highlight >}}
{{% /panel %}}
#### Submit the Job
The job can be submitted through the command `sbatch`. The job status
can be monitored by entering `squeue` with the `-u` option.
{{< highlight bash >}}
$ sbatch submit_f.mpi
$ sbatch submit_c.mpi
$ squeue -u <username>
{{< /highlight >}}
Replace `<username>` with your HCC username.
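As a small convenience, most shells set the `$USER` environment variable, which matches your HCC username once you are logged in to the cluster, so the same check can be written without typing your username:
{{< highlight bash >}}
$ squeue -u $USER        # list your jobs
$ squeue -u $USER -l     # longer listing with additional columns
{{< /highlight >}}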
#### Sample Output
The sum from 1 to 20 is computed and printed to the `.out` file (see
below). The outputs from the 5 cores are collected and processed by the
master core (i.e. `myid=0`).
{{%expand "Fortran.out" %}}
{{< highlight batchfile>}}
-----------------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.00000000000000
y=...
1.0000000000000000
2.0000000000000000
3.0000000000000000
4.0000000000000000
5.0000000000000000
6.0000000000000000
7.0000000000000000
8.0000000000000000
9.0000000000000000
10.000000000000000
11.000000000000000
12.000000000000000
13.000000000000000
14.000000000000000
15.000000000000000
16.000000000000000
17.000000000000000
18.000000000000000
19.000000000000000
20.000000000000000
sum(y)= 210.00000000000000
{{< /highlight >}}
{{% /expand %}}
{{%expand "C.out" %}}
{{< highlight batchfile>}}
-----------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.000000
y=...
1.000000
2.000000
3.000000
4.000000
5.000000
6.000000
7.000000
8.000000
9.000000
10.000000
11.000000
12.000000
13.000000
14.000000
15.000000
16.000000
17.000000
18.000000
19.000000
20.000000
sum(y) = 210.000000
{{< /highlight >}}
{{% /expand %}}
+++
title = "Running Applications"
description = "How to run various applications on HCC resources."
weight = "20"
+++
# Using Installed Software
HCC clusters use the Lmod module system to manage applications. You can search for, view, and load installed software with the `module` command.
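As a brief sketch of typical Lmod usage (the `python/3.6` module name below is only an illustrative placeholder; run `module avail` to see what is actually installed on your cluster):
{{< highlight bash >}}
$ module avail                 # list software available on this cluster
$ module spider python         # search all modules whose names match "python"
$ module load python/3.6       # load a specific module/version (placeholder name)
$ module list                  # show currently loaded modules
$ module unload python/3.6     # unload a single module
$ module purge                 # unload all loaded modules
{{< /highlight >}}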
## Available Software
## Using Modules
### Searching Available Modules
### Loading Modules
### Unloading Modules
# Installing Software
## Compiling from Source Code
## Using Anaconda
## Request Installation
{{% children %}}
+++
title = "Submitting Jobs"
description = "How to submit jobs to HCC resources"
weight = "10"
+++
Crane and Tusker are managed by
the [SLURM](https://slurm.schedmd.com) resource manager.
In order to run processing on Crane or Tusker, you
must create a SLURM submit script that describes your processing. After
the job is submitted, SLURM will schedule it on an available
worker node.
Before writing a submit file, you may need to
[compile your application]({{< relref "/guides/running_applications/compiling_source_code" >}}).
- [Ensure proper working directory for job output](#ensure-proper-working-directory-for-job-output)
- [Creating a SLURM Submit File](#creating-a-slurm-submit-file)
- [Submitting the job](#submitting-the-job)
- [Checking Job Status](#checking-job-status)
- [Checking Job Start](#checking-job-start)
- [Removing the Job](#removing-the-job)
- [Next Steps](#next-steps)
### Ensure proper working directory for job output
{{% notice info %}}
Because the /home directories are not writable from the worker nodes, all SLURM job output should be directed to your /work path.
{{% /notice %}}
{{% panel theme="info" header="Manual specification of /work path" %}}
{{< highlight bash >}}
$ cd /work/[groupname]/[username]
{{< /highlight >}}
{{% /panel %}}
The environment variable `$WORK` can also be used.
{{% panel theme="info" header="Using environment variable for /work path" %}}
{{< highlight bash >}}
$ cd $WORK
$ pwd
/work/[groupname]/[username]
{{< /highlight >}}
{{% /panel %}}
Review how /work differs from /home [here]({{< relref "/guides/handling_data/_index.md" >}}).
### Creating a SLURM Submit File
{{% notice info %}}
The example below is for a serial job. For submitting MPI jobs, please
look at the [MPI Submission Guide.]({{< relref "submitting_an_mpi_job" >}})
{{% /notice %}}
A SLURM submit file is broken into two sections: the job description and
the processing commands. SLURM job description lines are prefixed with `#SBATCH` in
the submit file.
**SLURM Submit File**
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00 # Run time in hh:mm:ss
#SBATCH --mem-per-cpu=1024 # Maximum memory required per CPU (in megabytes)
#SBATCH --job-name=hello-world
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out
module load example/test
hostname
sleep 60
{{< /highlight >}}
- **time**
  Maximum walltime the job can run. After this time has expired, the
  job will be stopped.
- **mem-per-cpu**
  Memory that is allocated per core for the job. If you exceed this
  memory limit, your job will be stopped.
- **mem**
  Specify the real memory required per node in megabytes. If you
  exceed this limit, your job will be stopped. Note that you should
  ask for less memory than each node actually has. For instance,
  Tusker has 1TB, 512GB, and 256GB of RAM per node. You may only
  request 1000GB of RAM for the 1TB node, 500GB of RAM for the 512GB
  nodes, and 250GB of RAM for the 256GB nodes. For Crane, the max is
  500GB.
- **job-name**
  The name of the job. It will be reported in the job listing.
- **partition**
  The partition the job should run in. Partitions determine the job's
  priority and which nodes the job can run on. See the
  [Partitions]({{< relref "guides/submitting_jobs/partitions" >}}) page for a list of possible partitions. (A submit
  file using this option is sketched after this list.)
- **error**
  Location where stderr will be written for the job. `[groupname]`
  and `[username]` should be replaced with your group name and
  username. Your username can be retrieved with the command `id -un`
  and your group with `id -ng`.
- **output**
  Location where stdout will be written for the job.
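For reference, a submit file that uses the `--mem` and `--partition` options described above might look like the following sketch. The memory value and partition name are placeholders; choose values appropriate for the cluster and for your group's access.
**SLURM Submit File (memory and partition options)**
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00     # Run time in hh:mm:ss
#SBATCH --mem=8192          # Real memory required per node, in megabytes (placeholder value)
#SBATCH --partition=batch   # Partition to run in (placeholder name)
#SBATCH --job-name=hello-world
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out

module load example/test
hostname
sleep 60
{{< /highlight >}}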
More advanced submit commands can be found on the [SLURM Docs](https://slurm.schedmd.com/sbatch.html).
You can also find an example of an MPI submission on [Submitting an MPI Job]({{< relref "submitting_an_mpi_job" >}}).
### Submitting the job
Submitting the SLURM job is done with the command `sbatch`. SLURM will read
the submit file and schedule the job according to the description in
the submit file.
To submit the job described above, run:
{{% panel theme="info" header="SLURM Submission" %}}
{{< highlight batch >}}
$ sbatch example.slurm
Submitted batch job 24603
{{< /highlight >}}
{{% /panel %}}
The job was successfully submitted.
### Checking Job Status
Job status is found with the command `squeue`. It will provide
information such as:
- The State of the job:
- **R** - Running
- **PD** - Pending - Job is awaiting resource allocation.
- Additional codes are available
on the [squeue](http://slurm.schedmd.com/squeue.html)
page.
- Job Name
- Run Time
- Nodes running the job
The easiest way to check the status of your job is to filter by your username,
using the `-u` option to `squeue`.
{{< highlight batch >}}
$ squeue -u <username>
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
24605 batch hello-wo <username> R 0:56 1 b01
{{< /highlight >}}
Additionally, if you want to see the status of a specific partition, for
example if you are part of a [partition]({{< relref "partitions" >}}),
you can use the `-p` option to `squeue`:
{{< highlight batch >}}
$ squeue -p esquared
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
73435 esquared MyRandom tingting R 10:35:20 1 ri19n10
73436 esquared MyRandom tingting R 10:35:20 1 ri19n12
73735 esquared SW2_driv hroehr R 10:14:11 1 ri20n07
73736 esquared SW2_driv hroehr R 10:14:11 1 ri20n07
{{< /highlight >}}
#### Checking Job Start
You may view the start time of your job with the
command `squeue --start`. The output of the command will show the
expected start time of the jobs.
{{< highlight batch >}}
$ squeue --start --user lypeng
JOBID PARTITION NAME USER ST START_TIME NODES NODELIST(REASON)
5822 batch Starace lypeng PD 2013-06-08T00:05:09 3 (Priority)
5823 batch Starace lypeng PD 2013-06-08T00:07:39 3 (Priority)
5824 batch Starace lypeng PD 2013-06-08T00:09:09 3 (Priority)
5825 batch Starace lypeng PD 2013-06-08T00:12:09 3 (Priority)
5826 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5827 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5828 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5829 batch Starace lypeng PD 2013-06-08T00:13:09 3 (Priority)
5830 batch Starace lypeng PD 2013-06-08T00:13:09 3 (Priority)
5831 batch Starace lypeng PD 2013-06-08T00:14:09 3 (Priority)
5832 batch Starace lypeng PD N/A 3 (Priority)
{{< /highlight >}}
The output shows the expected start time of the jobs, as well as the
reason that the jobs are currently idle (in this case, low priority of
the user due to running numerous jobs already).
#### Removing the Job
Removing the job is done with the `scancel` command. The only argument
to the `scancel` command is the job id. For the job above, the command
is:
{{< highlight batch >}}
$ scancel 24605
{{< /highlight >}}
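If you need to remove many jobs at once, `scancel` can also filter by user. Note that this cancels every job you own on the cluster, so use it with care:
{{< highlight batch >}}
$ scancel -u <username>      # cancel all of your queued and running jobs
{{< /highlight >}}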
### Next Steps
{{% children %}}