Commit 197127fe authored by Adam Caprez

Running applications

parent f225e664
@@ -4,8 +4,4 @@ description = "How to run various applications on HCC resources."
weight = "20"
+++
{{% children %}}
+++
title = "Allinea Profiling & Debugging Tools"
description = "How to use the Allinea suite of tools for profiling and debugging."
+++
HCC provides both the Allinea Forge suite and Performance Reports to
assist with debugging and profiling C/C++/Fortran code.  These tools
@@ -19,9 +13,6 @@ easy-to-read single-page HTML report.
For information on using each tool, see the following pages.
[Using Allinea Forge via Reverse Connect]({{< relref "using_allinea_forge_via_reverse_connect" >}})
[Allinea Performance Reports]({{< relref "allinea_performance_reports" >}})
+++
title = "Allinea Performance Reports"
description = "How to use Allinea Performance Reports to profile application on HCC resources."
+++
| Name    | Version | Resource |
|---------|---------|----------|
| allinea | 4.2     | tusker   |
| allinea | 5.0     | tusker   |
| allinea | 4.2     | crane    |
| allinea | 5.0     | crane    |
[Allinea Performance Reports](https://www.arm.com/products/development-tools/server-and-hpc/performance-reports)
is a performance evaluation tool that provides a scalable and effective way
to understand and analyze the performance of applications executed on
high-performance systems. Allinea Performance Reports can be used with
any application - no source code, recompilation or instrumentation is
@@ -42,7 +23,7 @@ and suggestions of possible application improvements. Allinea
Performance Reports has low runtime overhead of less than 5%.
Using Allinea Performance Reports on HCC
----------------------------------------
The Holland Computing Center owns **512 Allinea Performance Reports
licenses** that can be used to evaluate applications executed on Tusker
@@ -51,67 +32,63 @@ In order to use Allinea Performance Reports on HCC, the appropriate
module needs to be loaded first. To load the module on Tusker or Crane,
use
{{< highlight bash >}}
module load allinea/5.0
{{< /highlight >}}
Once the module is loaded, Allinea Performance Reports runs by adding
the `perf-report` command in front of the standard application
command.
### Basic Allinea Performance Reports Usage
The basic usage of `perf-report` is:
{{% panel theme="info" header="perf-report usage" %}}
{{< highlight bash >}}
perf-report [OPTION...] PROGRAM [PROGRAM_ARGS]
or
perf-report [OPTION...] (mpirun|mpiexec|aprun|...) [MPI_ARGS] PROGRAM [PROGRAM_ARGS]
{{< /highlight >}}
{{% /panel %}}
For example, the command below shows how to run `perf-report` with the
application `hello_world`:
{{% panel theme="info" header="perf-report example" %}}
{{< highlight bash >}}
[<username>@login.tusker ~]$ perf-report ./hello-world
{{< /highlight >}}
{{% /panel %}}
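When `perf-report` finishes, it writes its report to the current working
directory. As a rough sketch (assuming the default behavior of producing both a
plain-text and an HTML report named after the executable, with a suffix that
varies per run), the results can be located with:
{{% panel theme="info" header="Locating the generated reports (sketch)" %}}
{{< highlight bash >}}
# List the text and HTML reports named after the executable;
# the suffix (process count, date) depends on the particular run.
ls hello-world*.txt hello-world*.html
{{< /highlight >}}
{{% /panel %}}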
{{% notice info %}}
If your program normally uses the '`<`' syntax to redirect standard in
to read from a file, you must use the `--input` option to the
`perf-report` command instead.
{{% /notice %}}
{{% panel theme="info" header="perf-report stdin redirection" %}}
{{< highlight bash >}}
[<username>@login.tusker ~]$ perf-report --input=my_input.txt ./hello-world
{{< /highlight >}}
{{% /panel %}}
### Allinea Performance Reports Options
More **perf-report** options can be seen by using:
{{% panel theme="info" header="perf-report options" %}}
{{< highlight bash >}}
[<username>@login.tusker ~]$ perf-report --help
{{< /highlight >}}
{{% /panel %}}
Some of the most useful options are:
{{% panel theme="info" header="perf-report useful options" %}}
{{< highlight bash >}}
--input=FILE (pass the contents of FILE to the target's stdin)
--nompi, --no-mpi (run without MPI support)
--mpiargs=ARGUMENTS (command line arguments to pass to mpirun)
@@ -119,48 +96,21 @@ Some of the most useful options are:
--openmp-threads=NUMTHREADS (configure the number of OpenMP threads for the target)
-n, --np, --processes=NUMPROCS (specify the number of MPI processes)
--procs-per-node=PROCS (configure the number of processes per node for MPI jobs)
{{< /highlight >}}
{{% /panel %}}
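For instance, a hedged sketch of combining several of these options for an MPI
program (the executable name `./my_mpi_app`, the input file, and the process
counts below are placeholders, not part of the original examples):
{{% panel theme="info" header="Combining perf-report options (sketch)" %}}
{{< highlight bash >}}
# Profile a hypothetical MPI program with 32 processes, 16 per node,
# reading its input from a file instead of using '<' redirection.
perf-report --np=32 --procs-per-node=16 --input=run.in ./my_mpi_app
{{< /highlight >}}
{{% /panel %}}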
The following pages show how to run Allinea Performance Reports with
applications using OpenMP, MPI, and standard input/output, respectively:
- [Blast with Allinea Performance Reports]({{< relref "blast_with_allinea_performance_reports" >}})
- [Ray with Allinea Performance Reports]({{< relref "ray_with_allinea_performance_reports" >}})
- [LAMMPS with Allinea Performance Reports]({{< relref "lammps_with_allinea_performance_reports" >}})
 
{{% notice tip %}}
Currently, Allinea Performance Reports works best with compiled
binaries of an application; for some Perl/Python scripts, the
`perf-report` command needs to be added inside the script file itself.
{{% /notice %}}
+++
title = "BLAST with Allinea Performance Reports"
description = "Example of how to profile BLAST using Allinea Performance Reports."
+++
A simple example of using
[BLAST]({{< relref "/guides/running_applications/bioinformatics_tools/alignment_tools/blast/running_blast_alignment" >}})
with Allinea Performance Reports (`perf-report`) on Crane is shown below:
{{% panel theme="info" header="blastn_perf_report.submit" %}}
{{< highlight batch >}}
#!/bin/sh
#SBATCH --job-name=BlastN
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=20:00:00
#SBATCH --mem=50gb
#SBATCH --output=BlastN.info
#SBATCH --error=BlastN.error
module load allinea
module load blast/2.2.29
cd $WORK/<project_folder>
cp -r /work/HCC/DATA/blastdb/nt/ /tmp/
cp input_reads.fasta /tmp/
perf-report --openmp-threads=$SLURM_NTASKS_PER_NODE --nompi `which blastn` \
-query /tmp/input_reads.fasta -db /tmp/nt/nt -out \
blastn_output.alignments -num_threads $SLURM_NTASKS_PER_NODE
cp blastn_output.alignments .
{{< /highlight >}}
{{% /panel %}}
BLAST uses OpenMP and therefore the Allinea Performance Reports options
`--openmp-threads` and `--nompi` are used. The perf-report
part, `perf-report --openmp-threads=$SLURM_NTASKS_PER_NODE --nompi`,
is placed in front of the actual `blastn` command we want
to analyze.
{{% notice info %}}
If you see the error "**Allinea Performance Reports - target file
'application' does not exist on this machine... exiting**", this means
that the full path to the application is required, not just the
executable name. This is why the script above uses `` `which blastn` ``,
which gives the full path of the `blastn` executable, instead of plain
`blastn`.
{{% /notice %}}
When the application finishes, the performance report is generated in
the working directory.
For the executed application, this is what the report looks like:
{{< figure src="/images/11635296.png" width="850" >}}
From the report, we can see that **blastn** is a compute-bound
application. The difference between mean (11.1 GB) and peak (26.3 GB)
memory use is significant, which may be a sign of workload imbalance or
a memory leak. Moreover, 89.6% of the time is spent synchronizing
threads in parallel regions, which points to workload imbalance.
Running Allinea Performance Reports and identifying application
bottlenecks is very useful for improving the application and making
better use of the available resources.
+++
title = "LAMMPS with Allinea Performance Reports"
description = "Example of how to profile LAMMPS using Allinea Performance Reports."
+++
A simple example of using [LAMMPS](http://lammps.sandia.gov)
with Allinea Performance Reports (`perf-report`) on Crane is shown
below:
{{% panel theme="info" header="lammps_perf_report.submit" %}}
{{< highlight batch >}}
#!/bin/sh
#SBATCH --job-name=LAMMPS
#SBATCH --ntasks=64
#SBATCH --time=12:00:00
#SBATCH --mem=2gb
#SBATCH --partition=batch
#SBATCH --output=Lammps.%J.info
#SBATCH --error=Lammps.%J.error
 
module load allinea
module load compiler/gcc/4.8 openmpi/1.8
module load lammps/30Oct2014
perf-report --input=in.copper --np=64 `which lmp_ompi_g++`
{{< /highlight >}}
{{% /panel %}}
LAMMPS runs on a single processor or in parallel using the Message
Passing Interface (MPI), and therefore additional Allinea Performance
Reports options are not required. However, the input file for LAMMPS is
read from standard input. In this case, instead of using "`<`", Allinea
Performance Reports uses the `--input` option to specify the input file.
Therefore, the perf-report part of the command is `perf-report
--input=in.copper`, where `in.copper` is the input file used by LAMMPS.
{{% notice info %}}
If you see the error "**Allinea Performance Reports - target file
'application' does not exist on this machine... exiting**", this means
that the full path to the application is required, not just the
executable name. This is why the script above uses `` `which lmp_ompi_g++` ``,
which gives the full path of the LAMMPS executable, instead of plain
`lmp_ompi_g++`.
{{% /notice %}}
When the application finishes, the performance report is generated in
the working directory.
For the executed application, this is what the report looks like:
{{< figure src="/images/11635309.png" width="850" >}}
From the report, we can see that **LAMMPS** is a compute-bound
application. Most of the MPI communication time is spent in collective
calls with a very low transfer rate, which suggests load imbalance. Also,
significant time is spent on memory accesses, and using a profiler may
help identify time-consuming loops and check their cache performance.
Running Allinea Performance Reports and identifying application
bottlenecks is very useful for improving the application and making
better use of the available resources.
+++
title = "Ray with Allinea Performance Reports"
description = "Example of how to profile Ray using Allinea Performance Reports"
+++
A simple example of using [Ray]({{< relref "/guides/running_applications/bioinformatics_tools/de_novo_assembly_tools/ray" >}})
with Allinea Performance Reports (`perf-report`) on Tusker is shown below:
{{% panel theme="info" header="ray_perf_report.submit" %}}
{{< highlight batch >}}
#!/bin/sh
#SBATCH --job-name=Ray
#SBATCH --ntasks-per-node=16
#SBATCH --time=10:00:00
#SBATCH --mem=70gb
#SBATCH --output=Ray.info
#SBATCH --error=Ray.error
module load allinea
module load compiler/gcc/4.7 openmpi/1.6 ray/2.3
perf-report mpiexec -n 16 Ray -k 31 -p -p input_reads_pair_1.fasta input_reads_pair_2.fasta -o output_directory
{{< /highlight >}}
{{% /panel %}}
Ray uses MPI, and therefore additional Allinea Performance Reports options
are not required. The `perf-report` command is placed in front of the
actual `Ray` command we want to analyze.
When the application finishes, the performance report is generated in
the working directory.
For the executed application, this is what the report looks like:
{{< figure src="/images/11635303.png" width="850" >}}
From the report, we can see that **Ray** is a compute-bound application.
Most of the running time is spent in point-to-point calls with a low
transfer rate, which may be caused by inefficient message sizes.
Therefore, running this application with fewer MPI processes and more
data on each process may be more efficient.
Running Allinea Performance Reports and identifying application
bottlenecks is very useful for improving the application and making
better use of the available resources.
+++
title = "Using Allinea Forge via Reverse Connect"
description = "How to use the Reverse Connect feature of Allinea Forge."
+++
### Set up the Allinea client software to use Reverse Connect
The Allinea DDT/MAP software supports a Reverse Connect feature.  The
GUI is installed and run locally, and information from the job running
on the cluster is sent back to your laptop/workstation via SSH.  This is
the recommended way to use the software for interactive
debugging/profiling of a job on HCC resources.  Traditional X11
forwarding will also work, but is not recommended as the interface can
be much slower to respond.  In order to follow along with the demos, use
these instructions to set up the Allinea client on your laptop.
First, download and install the remote client software for either
Windows or OS X from
[this page](https://developer.arm.com/products/software-development-tools/hpc/downloads/download-arm-forge#remote-client).
Alternatively, download the software directly for your OS:
[[OS X direct link]](http://content.allinea.com/downloads/arm-forge-client-latest-MacOSX-10.7.5-x86_64.dmg)
[[Windows 64-bit direct link]](http://content.allinea.com/downloads/arm-forge-client-latest-Windows-10.0-x64.exe)
Start the Allinea software, and choose *Configure...* from the *Remote
Launch* dropdown menu.
{{< figure src="/images/16516460.png" width="300" >}}
Click the *Add* button on the new window.
{{< figure src="/images/16516459.png" width="400" >}}
To set up a connection to Crane, fill in the fields as follows:
*Connection Name:* Crane
*Host Name:* \<username\>@crane.unl.edu
*Remote Installation Directory:*  /util/opt/allinea/18.2
It should appear similar to this:
{{< figure src="/images/16519633.png" width="500" >}}
Be sure to replace *demo02* with your HCC username.
Click *OK* to close this dialog, and then *Close* on *Configure Remote
Connections* to return back to the main Allinea window.
Next, log in to Crane.  The Allinea software uses a `.allinea` directory
in your home directory to store configuration information.  Since `/home`
is read-only from the nodes in the cluster, the directory will be
created in `/work` and symlinked.  To do so, run the following commands:
{{% panel theme="info" header="Create and symlink .allinea directory" %}}
{{< highlight bash >}}
rm -rf $HOME/.allinea
mkdir -p $WORK/.allinea
ln -s $WORK/.allinea $HOME/.allinea
{{< /highlight >}}
{{% /panel %}}
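As an optional sanity check (assuming the commands above completed without
errors), the home-directory entry should now be a symlink into `$WORK`:
{{% panel theme="info" header="Verify the symlink (optional)" %}}
{{< highlight bash >}}
# The output should show .allinea as a symbolic link pointing to $WORK/.allinea
ls -ld $HOME/.allinea
{{< /highlight >}}
{{% /panel %}}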
### Test the Reverse Connect feature
To test the connection, choose *Crane* from the *Remote Launch* menu.
{{< figure src="/images/16516457.png" width="300" >}}
A *Connect to Remote Host* dialog will appear and prompt for a password.
{{< figure src="/images/16516456.png" width="500" >}}
The login procedure is the same as for PuTTY or any other SSH program.
Enter your HCC password followed by the Duo login.
If the login was successful, you should see
*Connected to: \<username\>@crane.unl.edu* in the lower right corner of
the Allinea window.
The next step is to run a sample interactive job and test the Reverse
Connect connection.  Start an interactive job by running the following
command.
{{% panel theme="info" header="Start an interactive job" %}}
{{< highlight bash >}}
srun --pty --qos=short bash
{{< /highlight >}}
{{% /panel %}}
Once the job has started, load the allinea module and start DDT using
the `--connect` option.
{{% panel theme="info" header="Start DDT" %}}
{{< highlight bash >}}
module load allinea
ddt --connect
{{< /highlight >}}
{{% /panel %}}
On your local machine, a pop-up box should appear prompting you to
accept the Reverse Connect request.
{{< figure src="/images/16516453.png" width="450" >}}
Choose *Accept.*  The *Remote Launch* section should change to indicate
you are connected via tunnel, similar to:
{{< figure src="/images/16516455.png" width="250" >}}
Once that happens, Reverse Connect is working successfully and a
debugging or profiling session can be started.
#### Starting interactive jobs for serial, OpenMP/pthreads, MPI, and CUDA code
The `srun` syntax is slightly different depending on which type of code
you are debugging.  Use the following commands to start interactive jobs
for each type.
{{% panel theme="info" header="Serial code" %}}
{{< highlight bash >}}
srun --pty --time=2:00:00 bash
{{< /highlight >}}
{{% /panel %}}
{{% panel theme="info" header="OpenMP/pthreads code" %}}
{{< highlight bash >}}
srun --pty --time=2:00:00 --nodes=1 --ntasks-per-node=4 bash 
{{< /highlight >}}
{{% /panel %}}
{{% panel theme="info" header="MPI code" %}}
{{< highlight bash >}}
srun --pty --time=2:00:00 --ntasks=4 bash
{{< /highlight >}}
{{% /panel %}}
{{% panel theme="info" header="CUDA code" %}}
{{< highlight bash >}}
srun --pty --time=2:00:00 --partition=gpu --gres=gpu bash
{{< /highlight >}}
{{% /panel %}}
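Once an interactive job of the appropriate type is running, the Reverse Connect
workflow is the same as in the test above. As a sketch for the MPI case (the
executable `./my_mpi_app` is a placeholder, not part of the original
instructions), a debugging session could be launched with:
{{% panel theme="info" header="Start DDT on an MPI program (sketch)" %}}
{{< highlight bash >}}
module load allinea
# Reverse Connect sends the session back to the local Allinea client;
# replace ./my_mpi_app with your own MPI executable.
ddt --connect mpirun -n 4 ./my_mpi_app
{{< /highlight >}}
{{% /panel %}}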