+++
title = "How to setup X11 forwarding"
description = "Use X11 forwarding to view GUI programs remotely"
weight = "35"
+++
##### If you are connecting to HCC clusters via a PC running Windows, please take the following steps to set up X11 forwarding.
1. Download Xming to your local PC and install. Download
link: https://downloads.sourceforge.net/project/xming/Xming/6.9.0.31/Xming-6-9-0-31-setup.exe
2. Download PuTTY to your local PC and install. Download link: http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe
3. Open Xming and keep it running in the background.
4. Configure PuTTY as shown below (X11 forwarding is enabled under Connection > SSH > X11 in the PuTTY configuration window):
{{< figure src="/images/11637370.png" height="400" >}}
{{< figure src="/images/11637371.jpg" height="400" >}}
5. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
{{< figure src="/images/11637372.png" height="400" >}}
6. Close the xeyes application with "Ctrl + c" from the terminal, or click the close button in the upper-right corner of the graphical window.
##### If you are connecting to HCC clusters via a Macintosh, please take the following steps to set up X11 forwarding.
- Check the OS version on your Mac. If it is below 10.8, you can simply type `ssh -Y username@hostname` in your terminal to log in.
- If your OS version is 10.8 or newer, please do the following:
1. Download and install XQuartz.
Download link: https://dl.bintray.com/xquartz/downloads/XQuartz-2.7.11.dmg
2. Type `ssh -Y username@hostname` in your terminal to log in.
3. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
{{< figure src="/images/11637374.png" height="400" >}}
4. Close the xeyes application with "Control + c" from the terminal, or click the close button in the upper-left corner of the graphical window.
##### If you are connecting to HCC clusters via a Linux laptop, please take the following steps to set up X11 forwarding.
1. Open a terminal on your Linux machine.
2. Type `ssh -Y username@hostname` in your terminal to log in. (An optional SSH client configuration sketch is shown after this list.)
3. To test your X11 setup, after logging in, type the command `xeyes` and press Enter.
4. Close the xeyes application with "Ctrl + c" from the terminal, or click the close button in the upper-right corner of the graphical window.
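If you connect frequently, trusted X11 forwarding can also be enabled per host in your OpenSSH client configuration so that a plain `ssh` command behaves like `ssh -Y`. This is a minimal sketch, assuming an OpenSSH client; the `Host` alias and the placeholders should be replaced with your cluster's hostname and your HCC username:
{{< highlight bash >}}
# ~/.ssh/config on your local machine (placeholders: <hostname>, <username>)
Host hcc
    HostName <hostname>
    User <username>
    ForwardX11 yes           # equivalent to ssh -X
    ForwardX11Trusted yes    # together with the line above, equivalent to ssh -Y
{{< /highlight >}}
With this in place, `ssh hcc` opens a session with X11 forwarding already enabled.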
#### Related articles
- [X11 on Windows](http://www.straightrunning.com/XmingNotes)
- [X11 on Mac](https://en.wikipedia.org/wiki/XQuartz)
- [X11 on Linux](http://www.wikihow.com/Configure-X11-in-Linux)
+++
title = "MPI Jobs on HCC"
description = "How to compile and run MPI programs on HCC machines"
weight = "52"
+++
This quick start demonstrates how to implement a parallel (MPI)
Fortran/C program on HCC supercomputers. The sample codes and submit
scripts can be downloaded from [mpi_dir.zip](/attachments/mpi_dir.zip).
#### Log in to an HCC Cluster
Log in to a HCC cluster through PuTTY ([For Windows Users]({{< relref "/quickstarts/for_windows_users">}})) or Terminal ([For Mac/Linux
Users]({{< relref "/quickstarts/for_maclinux_users">}})) and make a subdirectory called `mpi_dir` under the `$WORK` directory.
{{< highlight bash >}}
$ cd $WORK
$ mkdir mpi_dir
{{< /highlight >}}
In the subdirectory `mpi_dir`, save all the relevant code. Here we
include two demo programs, `demo_f_mpi.f90` and `demo_c_mpi.c`, that
compute the sum from 1 to 20 using parallel processes. A
straightforward parallelization scheme is used for demonstration
purposes. First, the master core (i.e. `myid=0`) distributes an equal
computation workload to a certain number of cores (as specified by
`--ntasks` in the submit script). Then, each worker core computes a
partial sum as output; for example, with `--ntasks=5`, rank 0 sums the
values for indices 1-4, rank 1 for indices 5-8, and so on. Finally, the
master core collects the outputs from all worker cores and performs the
overall summation. For easy comparison with the serial code
([Fortran/C on HCC]({{< relref "fortran_c_on_hcc">}})), the
lines added for MPI are marked with "!=" or "//=".
{{%expand "demo_f_mpi.f90" %}}
{{< highlight fortran >}}
Program demo_f_mpi
!====== MPI =====
  use mpi
!================
  implicit none
  integer, parameter :: N = 20
  real*8 w
  integer i
  common/sol/ x
  real*8 x
  real*8, dimension(N) :: y
!============================== MPI =================================
  integer ind
  real*8, dimension(:), allocatable :: y_local
  integer numnodes,myid,rc,ierr,start_local,end_local,N_local
  real*8 allsum
!====================================================================
!============================== MPI =================================
  call mpi_init( ierr )
  call mpi_comm_rank ( mpi_comm_world, myid, ierr )
  call mpi_comm_size ( mpi_comm_world, numnodes, ierr )
!
  N_local = N/numnodes
  allocate ( y_local(N_local) )
  start_local = N_local*myid + 1
  end_local   = N_local*myid + N_local
!====================================================================
  do i = start_local, end_local
    w = i*1d0
    call proc(w)
    ind = i - N_local*myid
    y_local(ind) = x
!   y(i) = x
!   write(6,*) 'i, y(i)', i, y(i)
  enddo
! write(6,*) 'sum(y) =',sum(y)
!============================================== MPI =====================================================
  call mpi_reduce( sum(y_local), allsum, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr )
  call mpi_gather ( y_local, N_local, mpi_real8, y, N_local, mpi_real8, 0, mpi_comm_world, ierr )
  if (myid == 0) then
    write(6,*) '-----------------------------------------'
    write(6,*) '*Final output from... myid=', myid
    write(6,*) 'numnodes =', numnodes
    write(6,*) 'mpi_sum =', allsum
    write(6,*) 'y=...'
    do i = 1, N
      write(6,*) y(i)
    enddo
    write(6,*) 'sum(y)=', sum(y)
  endif
  deallocate( y_local )
  call mpi_finalize(rc)
!========================================================================================================
  Stop
End Program

Subroutine proc(w)
  real*8, intent(in) :: w
  common/sol/ x
  real*8 x
  x = w
  Return
End Subroutine
{{< /highlight >}}
{{% /expand %}}
{{%expand "demo_c_mpi.c" %}}
{{< highlight c >}}
//demo_c_mpi
#include <stdio.h>
//======= MPI ========
#include "mpi.h"
#include <stdlib.h>
//====================

double proc(double w){
  double x;
  x = w;
  return x;
}

int main(int argc, char* argv[]){
  int N=20;
  double w;
  int i;
  double x;
  double y[N];
  double sum;
//=============================== MPI ============================
  int ind;
  double *y_local;
  int numnodes,myid,rc,ierr,start_local,end_local,N_local;
  double allsum;
//================================================================
//=============================== MPI ============================
  MPI_Init(&argc, &argv);
  MPI_Comm_rank( MPI_COMM_WORLD, &myid );
  MPI_Comm_size( MPI_COMM_WORLD, &numnodes );

  N_local = N/numnodes;
  y_local = (double *) malloc(N_local*sizeof(double));
  start_local = N_local*myid + 1;
  end_local   = N_local*myid + N_local;
//================================================================
  for (i = start_local; i <= end_local; i++){
    w = i*1e0;
    x = proc(w);
    ind = i - N_local*myid;
    y_local[ind-1] = x;
    // y[i-1] = x;
    // printf("i,x= %d %lf\n", i, y[i-1]);
  }
  sum = 0e0;
  for (i = 1; i <= N_local; i++){
    sum = sum + y_local[i-1];
  }
  // printf("sum(y)= %lf\n", sum);
//====================================== MPI ===========================================
  MPI_Reduce( &sum, &allsum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD );
  MPI_Gather( &y_local[0], N_local, MPI_DOUBLE, &y[0], N_local, MPI_DOUBLE, 0, MPI_COMM_WORLD );
  if (myid == 0){
    printf("-----------------------------------\n");
    printf("*Final output from... myid= %d\n", myid);
    printf("numnodes = %d\n", numnodes);
    printf("mpi_sum = %lf\n", allsum);
    printf("y=...\n");
    for (i = 1; i <= N; i++){
      printf("%lf\n", y[i-1]);
    }
    sum = 0e0;
    for (i = 1; i <= N; i++){
      sum = sum + y[i-1];
    }
    printf("sum(y) = %lf\n", sum);
  }
  free( y_local );
  MPI_Finalize();
//======================================================================================
  return 0;
}
{{< /highlight >}}
{{% /expand %}}
---
#### Compiling the Code
The compiling of a MPI code requires first loading a compiler "engine"
such as `gcc`, `intel`, or `pgi` and then loading a MPI wrapper
`openmpi`. Here we will use the GNU Complier Collection, `gcc`, for
demonstration.
{{< highlight bash >}}
$ module load compiler/gcc/6.1 openmpi/2.1
$ mpif90 demo_f_mpi.f90 -o demo_f_mpi.x
$ mpicc demo_c_mpi.c -o demo_c_mpi.x
{{< /highlight >}}
The above commands load the `gcc` compiler with the `openmpi` wrapper.
The compiler wrappers `mpif90` and `mpicc` are then used to compile the codes
into `.x` files (executables).
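Optionally, you can check that the wrappers picked up the intended compiler before building. With Open MPI, the wrapper can print the underlying command it will invoke (exact output depends on the modules loaded):
{{< highlight bash >}}
$ gcc --version      # confirm the gcc version provided by the loaded module
$ mpicc --showme     # Open MPI: print the underlying compiler command and flags
$ mpif90 --showme    # same check for the Fortran wrapper
{{< /highlight >}}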
#### Creating a Submit Script
Create a submit script to request 5 cores (with `--ntasks`). On the
last line, the parallel execution command `mpirun ./` must precede the
program name.
{{% panel header="`submit_f.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=Fortran
#SBATCH --error=Fortran.%J.err
#SBATCH --output=Fortran.%J.out
mpirun ./demo_f_mpi.x
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`submit_c.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=C
#SBATCH --error=C.%J.err
#SBATCH --output=C.%J.out
mpirun ./demo_c_mpi.x
{{< /highlight >}}
{{% /panel %}}
#### Submit the Job
The job can be submitted through the command `sbatch`. The job status
can be monitored by entering `squeue` with the `-u` option.
{{< highlight bash >}}
$ sbatch submit_f.mpi
$ sbatch submit_c.mpi
$ squeue -u <username>
{{< /highlight >}}
Replace `<username>` with your HCC username.
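As a small convenience, most shells set the `$USER` environment variable, which matches your HCC username once you are logged in to the cluster, so the same check can be written without typing your username:
{{< highlight bash >}}
$ squeue -u $USER        # list your jobs
$ squeue -u $USER -l     # longer listing with additional columns
{{< /highlight >}}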
#### Sample Output
The sum from 1 to 20 is computed and printed to the `.out` file (see
below). The outputs from the 5 cores are collected and processed by the
master core (i.e. `myid=0`).
{{%expand "Fortran.out" %}}
{{< highlight batchfile>}}
-----------------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.00000000000000
y=...
1.0000000000000000
2.0000000000000000
3.0000000000000000
4.0000000000000000
5.0000000000000000
6.0000000000000000
7.0000000000000000
8.0000000000000000
9.0000000000000000
10.000000000000000
11.000000000000000
12.000000000000000
13.000000000000000
14.000000000000000
15.000000000000000
16.000000000000000
17.000000000000000
18.000000000000000
19.000000000000000
20.000000000000000
sum(y)= 210.00000000000000
{{< /highlight >}}
{{% /expand %}}
{{%expand "C.out" %}}
{{< highlight batchfile>}}
-----------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.000000
y=...
1.000000
2.000000
3.000000
4.000000
5.000000
6.000000
7.000000
8.000000
9.000000
10.000000
11.000000
12.000000
13.000000
14.000000
15.000000
16.000000
17.000000
18.000000
19.000000
20.000000
sum(y) = 210.000000
{{< /highlight >}}
{{% /expand %}}
+++
title = "Running Applications"
description = "How to run various applications on HCC resources."
weight = "20"
+++
# Using Installed Software
HCC clusters use the Lmod module system to manage applications. You can search for, view, and load installed software with the `module` command.
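As a brief sketch of typical Lmod usage (the `python/3.6` module name below is only an illustrative placeholder; run `module avail` to see what is actually installed on your cluster):
{{< highlight bash >}}
$ module avail                 # list software available on this cluster
$ module spider python         # search all modules whose names match "python"
$ module load python/3.6       # load a specific module/version (placeholder name)
$ module list                  # show currently loaded modules
$ module unload python/3.6     # unload a single module
$ module purge                 # unload all loaded modules
{{< /highlight >}}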
## Available Software
## Using Modules
### Searching Available Modules
### Loading Modules
### Unloading Modules
# Installing Software
## Compiling from Source Code
## Using Anaconda
## Request Installation
{{% children %}}
+++
title = "Submitting Jobs"
description = "How to submit jobs to HCC resources"
weight = "10"
+++
Crane and Tusker are managed by
the [SLURM](https://slurm.schedmd.com) resource manager.
In order to run processing on Crane or Tusker, you
must create a SLURM submit script that describes your processing. After
the job is submitted, SLURM will schedule it on an available
worker node.
Before writing a submit file, you may need to
[compile your application]({{< relref "/guides/running_applications/compiling_source_code" >}}).
- [Ensure proper working directory for job output](#ensure-proper-working-directory-for-job-output)
- [Creating a SLURM Submit File](#creating-a-slurm-submit-file)
- [Submitting the job](#submitting-the-job)
- [Checking Job Status](#checking-job-status)
- [Checking Job Start](#checking-job-start)
- [Removing the Job](#removing-the-job)
- [Next Steps](#next-steps)
### Ensure proper working directory for job output
{{% notice info %}}
Because the /home directories are not writable from the worker nodes, all SLURM job output should be directed to your /work path.
{{% /notice %}}
{{% panel theme="info" header="Manual specification of /work path" %}}
{{< highlight bash >}}
$ cd /work/[groupname]/[username]
{{< /highlight >}}
{{% /panel %}}
The environment variable `$WORK` can also be used.
{{% panel theme="info" header="Using environment variable for /work path" %}}
{{< highlight bash >}}
$ cd $WORK
$ pwd
/work/[groupname]/[username]
{{< /highlight >}}
{{% /panel %}}
Review how /work differs from /home [here]({{< relref "/guides/handling_data/_index.md" >}}).
### Creating a SLURM Submit File
{{% notice info %}}
The example below is for a serial job. For submitting MPI jobs, please
look at the [MPI Submission Guide.]({{< relref "submitting_an_mpi_job" >}})
{{% /notice %}}
A SLURM submit file is broken into two sections: the job description and
the processing commands. SLURM job description lines are prefixed with `#SBATCH` in
the submit file.
**SLURM Submit File**
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00 # Run time in hh:mm:ss
#SBATCH --mem-per-cpu=1024 # Maximum memory required per CPU (in megabytes)
#SBATCH --job-name=hello-world
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out
module load example/test
hostname
sleep 60
{{< /highlight >}}
- **time**
  Maximum walltime the job can run. After this time has expired, the
  job will be stopped.
- **mem-per-cpu**
  Memory that is allocated per core for the job. If you exceed this
  memory limit, your job will be stopped.
- **mem**
  Specify the real memory required per node in megabytes. If you
  exceed this limit, your job will be stopped. Note that you should
  ask for less memory than each node actually has. For instance,
  Tusker has 1TB, 512GB, and 256GB of RAM per node. You may only
  request 1000GB of RAM for the 1TB node, 500GB of RAM for the 512GB
  nodes, and 250GB of RAM for the 256GB nodes. For Crane, the max is
  500GB.
- **job-name**
  The name of the job. It will be reported in the job listing.
- **partition**
  The partition the job should run in. Partitions determine the job's
  priority and which nodes the job can run on. See the
  [Partitions]({{< relref "guides/submitting_jobs/partitions" >}}) page for a list of possible partitions. (A submit
  file using this option is sketched after this list.)
- **error**
  Location where stderr will be written for the job. `[groupname]`
  and `[username]` should be replaced with your group name and
  username. Your username can be retrieved with the command `id -un`
  and your group with `id -ng`.
- **output**
  Location where stdout will be written for the job.
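For reference, a submit file that uses the `--mem` and `--partition` options described above might look like the following sketch. The memory value and partition name are placeholders; choose values appropriate for the cluster and for your group's access.
**SLURM Submit File (memory and partition options)**
{{< highlight batch >}}
#!/bin/sh
#SBATCH --time=03:15:00     # Run time in hh:mm:ss
#SBATCH --mem=8192          # Real memory required per node, in megabytes (placeholder value)
#SBATCH --partition=batch   # Partition to run in (placeholder name)
#SBATCH --job-name=hello-world
#SBATCH --error=/work/[groupname]/[username]/job.%J.err
#SBATCH --output=/work/[groupname]/[username]/job.%J.out

module load example/test
hostname
sleep 60
{{< /highlight >}}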
More advanced submit commands can be found on the [SLURM Docs](https://slurm.schedmd.com/sbatch.html).
You can also find an example of an MPI submission on [Submitting an MPI Job]({{< relref "submitting_an_mpi_job" >}}).
### Submitting the job
Submitting the SLURM job is done with the command `sbatch`. SLURM will read
the submit file and schedule the job according to the description in
the submit file.
To submit the job described above, run:
{{% panel theme="info" header="SLURM Submission" %}}
{{< highlight batch >}}
$ sbatch example.slurm
Submitted batch job 24603
{{< /highlight >}}
{{% /panel %}}
The job was successfully submitted.
### Checking Job Status
Job status is found with the command `squeue`. It will provide
information such as:
- The State of the job:
- **R** - Running
- **PD** - Pending - Job is awaiting resource allocation.
- Additional codes are available
on the [squeue](http://slurm.schedmd.com/squeue.html)
page.
- Job Name
- Run Time
- Nodes running the job
The easiest way to check the status of your job is to filter by your username,
using the `-u` option to `squeue`.
{{< highlight batch >}}
$ squeue -u <username>
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
24605 batch hello-wo <username> R 0:56 1 b01
{{< /highlight >}}
Additionally, if you want to see the status of a specific partition, for
example if you are part of a [partition]({{< relref "partitions" >}}),
you can use the `-p` option to `squeue`:
{{< highlight batch >}}
$ squeue -p esquared
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
73435 esquared MyRandom tingting R 10:35:20 1 ri19n10
73436 esquared MyRandom tingting R 10:35:20 1 ri19n12
73735 esquared SW2_driv hroehr R 10:14:11 1 ri20n07
73736 esquared SW2_driv hroehr R 10:14:11 1 ri20n07
{{< /highlight >}}
#### Checking Job Start
You may view the start time of your job with the
command `squeue --start`. The output of the command will show the
expected start time of the jobs.
{{< highlight batch >}}
$ squeue --start --user lypeng
JOBID PARTITION NAME USER ST START_TIME NODES NODELIST(REASON)
5822 batch Starace lypeng PD 2013-06-08T00:05:09 3 (Priority)
5823 batch Starace lypeng PD 2013-06-08T00:07:39 3 (Priority)
5824 batch Starace lypeng PD 2013-06-08T00:09:09 3 (Priority)
5825 batch Starace lypeng PD 2013-06-08T00:12:09 3 (Priority)
5826 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5827 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5828 batch Starace lypeng PD 2013-06-08T00:12:39 3 (Priority)
5829 batch Starace lypeng PD 2013-06-08T00:13:09 3 (Priority)
5830 batch Starace lypeng PD 2013-06-08T00:13:09 3 (Priority)
5831 batch Starace lypeng PD 2013-06-08T00:14:09 3 (Priority)
5832 batch Starace lypeng PD N/A 3 (Priority)
{{< /highlight >}}
The output shows the expected start time of the jobs, as well as the
reason that the jobs are currently idle (in this case, low priority of
the user due to running numerous jobs already).
#### Removing the Job
Removing the job is done with the `scancel` command. The only argument
to the `scancel` command is the job id. For the job above, the command
is:
{{< highlight batch >}}
$ scancel 24605
{{< /highlight >}}
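If you need to remove many jobs at once, `scancel` can also filter by user. Note that this cancels every job you own on the cluster, so use it with care:
{{< highlight batch >}}
$ scancel -u <username>      # cancel all of your queued and running jobs
{{< /highlight >}}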
### Next Steps
{{% children %}}