+++
title = "Footer"
+++
{{< icon name="copyright-mark" >}} [Holland Computing Center](https://hcc.unl.edu) | 118 Schorr Center, Lincoln NE 68588 | {{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu) | {{< icon name="phone-alt" >}}402-472-5041
See something wrong? Help us fix it by [contributing](https://git.unl.edu/hcc/hcc-docs/blob/master/CONTRIBUTING.md)!
+++
title = "Header"
+++
{{< figure src="/images/UNMasterwhite.gif" link="https://nebraska.edu" target="_blank" >}}
### [Holland Computing Center](https://hcc.unl.edu)
#### [HCC-DOCS]({{< relref "/" >}})
+++
title = "Available images"
description = "HCC-provided images for Anvil"
+++
HCC provides pre-configured images available to researchers. Below is a
list of available images.
{{< sorttable >}}
{{< readfile file="static/markdown/anvil-images.md" markdown="true" >}}
{{< /sorttable >}}
Additional images can be produced by HCC staff upon request at
{{< icon name="envelope" >}}[hcc-support@unl.edu](mailto:hcc-support@unl.edu).
+++
title = "Formatting and mounting a volume in Linux"
description = "How to format and mount volume as a hard drive in Linux."
+++
{{% notice info %}}
This guide assumes you associated your SSH Key Pair with the instance
when it was created, and that you are connected to the [Anvil VPN]({{< relref "connecting_to_the_anvil_vpn" >}}).
{{% /notice %}}
Once you have [created and attached]({{< relref "creating_and_attaching_a_volume" >}})
your volume, it must be formatted and mounted in your Linux instance to be usable. This
procedure is identical to what would be done when attaching a second
hard drive to a physical machine. In this example, a 1GB volume was
created and attached to the instance. Note that the majority of this
guide is for a newly created volume.
{{% notice note %}}
**If you are attaching an existing volume with data already on it,
skip to [creating a directory and mounting the volume](#mounting-the-volume).**
{{% /notice %}}
#### Formatting the volume
Follow the relevant guide
([Windows]({{< relref "connecting_to_linux_instances_from_windows">}})
| [Mac]({{< relref "connecting_to_linux_instances_from_mac" >}})) for your
operating system to connect to your instance. Formatting and mounting
the volume requires root privileges, so first run the
command `sudo su -` to get a root shell.
{{% panel theme="danger" header="**Running commands as root**" %}}**Extreme care should be taken when running commands as `root`.** It is very easy to permanently delete data or cause irreparable damage to your instance.{{% /panel %}}
{{< figure src="/images/anvil-volumes/1-sudo.png" width="576" >}}
Next, you will need to determine what device the volume is presented as
within Linux. Typically this will be `/dev/vdb`, but it is necessary to
verify this to avoid mistakes, especially if you have more than one
volume attached to an instance. The command `lsblk` will list the
hard drive devices and partitions.
{{< figure src="/images/anvil-volumes/2-lsblk.png" width="576" >}}
Here there is a completely empty (no partitions) disk device matching
the 1GB size of the volume, so `/dev/vdb` is the correct device.
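For reference, the output of `lsblk` in this situation looks roughly like the following (the device names and sizes shown are only illustrative; your instance may differ):
{{< highlight bash >}}
# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    253:0    0   20G  0 disk
└─vda1 253:1    0   20G  0 part /
vdb    253:16   0    1G  0 disk
{{< /highlight >}}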
The `parted` utility will first be used to label the device and then create a partition.
{{< highlight bash >}}
parted /dev/vdb mklabel gpt
parted /dev/vdb mkpart primary 0% 100%
{{< /highlight >}}
{{< figure src="/images/anvil-volumes/3-mkpart.png" width="576" >}}
Now that a partition has been created, it can be formatted. Here, the
ext4 filesystem will be used. This is the default filesystem used by
many Linux distributions including CentOS and Ubuntu, and is a good
general choice. An alternate filesystem may be used by running a
different format command. To format the partition using ext4, run the
command `mkfs.ext4 /dev/vdb1`. You will see a progress message and then
be returned to the shell prompt.
{{< figure src="/images/anvil-volumes/4-mkfs.png" width="576" >}}
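For reference, the formatting command described above is shown below in the same form as the other examples on this page (the XFS line is only one possible alternative filesystem):
{{< highlight bash >}}
# format the new partition as ext4
mkfs.ext4 /dev/vdb1
# or, to use a different filesystem instead, e.g. XFS:
# mkfs.xfs /dev/vdb1
{{< /highlight >}}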
#### Mounting the volume
{{% notice note %}}
If you are attaching a pre-existing volume, start here.
{{% /notice %}}
Finally, the formatted partition must be mounted as a directory to be
used. By convention this is done under `/mnt`, but you may choose to
mount it elsewhere depending on the usage. Here, a directory
called `myvolume` will be created and the volume mounted there. Run the
following commands to make the directory and mount the volume:
{{< highlight bash >}}
mkdir /mnt/myvolume
mount /dev/vdb1 /mnt/myvolume
{{< /highlight >}}
{{< figure src="/images/anvil-volumes/5-mount.png" width="576" >}}
Running the command `df -h` should then show the newly mounted, empty volume.
{{< figure src="/images/anvil-volumes/6-df.png" width="576" >}}
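The relevant line of the `df -h` output will look roughly like this (the sizes shown are illustrative for a 1GB ext4 volume):
{{< highlight bash >}}
# df -h /mnt/myvolume
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb1       976M  2.6M  907M   1% /mnt/myvolume
{{< /highlight >}}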
The volume can now be used.
+++
title = "Running Applications"
weight = "40"
+++
In-depth guides for using applications on HCC resources
--------------------------------------
{{% children description="true" %}}
+++
title = "Jupyter Notebooks"
description = "How to access and use a Jupyter Notebook"
weight = 20
+++
- [Connecting to JupyterHub](#connecting-to-jupyterhub)
- [Running Code](#running-code)
- [Opening a Terminal](#opening-a-terminal)
- [Using Custom Packages](#using-custom-packages)
## Connecting to JupyterHub
The Jupyter project describes its notebooks ("Jupyter Notebooks") as
an open-source web application that allows you to create and share documents that contain live code,
equations, visualizations, and narrative text. Uses include data cleaning and transformation, numerical simulation,
statistical modeling, data visualization, machine learning, and much more.
1. To open a Jupyter notebook, go to the address of the cluster; below, [Crane](https://crane.unl.edu) will be used as an example. Sign in using your HCC credentials (NOT your
campus credentials).
{{< figure src="/images/jupyterLogin.png" >}}
2. Select your preferred authentication method.
{{< figure src="/images/jupyterPush.png" >}}
3. Choose a job profile. Select "Notebook via SLURM Job | Small (1 core, 4GB RAM, 8 hours)" for light tasks such as debugging or small-scale testing.
Select the other options based on your computing needs. Note that a SLURM Job will save to your "work" directory.
{{< figure src="/images/jupyterjob.png" >}}
## Running Code
1. Select the "New" dropdown menu and select the file type you want to create.
{{< figure src="/images/jupyterNew.png" >}}
2. A new tab will open, where you can enter your code. Run your code by selecting the "play" icon.
{{< figure src="/images/jupyterCode.png">}}
## Opening a Terminal
1. From your user home page, select "terminal" from the "New" drop-down menu.
{{< figure src="/images/jupyterTerminal.png">}}
2. A terminal opens in a new tab. You can enter [Linux commands]({{< relref "basic_linux_commands" >}})
at the prompt.
{{< figure src="/images/jupyterTerminal2.png">}}
## Using Custom Packages
Many popular `python` and `R` packages are already installed and available within Jupyter Notebooks.
However, it is possible to install custom packages to be used in notebooks by creating a custom Anaconda
Environment. Detailed information on how to create such an environment can be found at
[Using an Anaconda Environment in a Jupyter Notebook on Crane]({{< relref "/applications/user_software/using_anaconda_package_manager#using-an-anaconda-environment-in-a-jupyter-notebook-on-crane" >}}).
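As a minimal sketch of the general idea (the linked page above has the authoritative steps; the environment name and package list here are only examples), the environment is created from a terminal and should include `ipykernel` so it can be offered as a notebook kernel:
{{< highlight bash >}}
# load Anaconda (module name may vary by cluster) and create a custom environment
module load anaconda
conda create -n mynotebookenv python=3.9 numpy pandas ipykernel
{{< /highlight >}}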
---
+++
title = "Allinea Profiling & Debugging Tools"
description = "How to use the Allinea suite of tools for profiling and debugging."
+++
HCC provides both the Allinea Forge suite and Performance Reports to
assist with debugging and profiling C/C++/Fortran code. These tools
support single-threaded, multi-threaded (pthreads/OpenMP), MPI, and CUDA
code. The Allinea Forge suite consists of two programs: DDT for
debugging and MAP for profiling. The Performance Reports software
provides a convenient way to profile HPC applications. It generates an
easy-to-read single-page HTML report.
For information on using each tool, see the following pages.
- [Using Allinea Forge via Reverse Connect]({{< relref "using_allinea_forge_via_reverse_connect" >}})
- [Allinea Performance Reports]({{< relref "allinea_performance_reports" >}})
+++
title = "Alignment Tools"
description = "How to use various alignment tools on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "Data Manipulation Tools"
description = "How to use data manipulation tools on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "De Novo Assembly Tools"
description = "How to use de novo assembly tools on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "Running Trinity in Multiple Steps"
description = "How to run Trinity in multiple steps on HCC resources"
weight = "10"
+++
## Running Trinity with Paired-End fastq data with 8 CPUs and 100GB of RAM
The first step of running Trinity is to run Trinity with the option **--no_run_chrysalis**:
{{% panel header="`trinity_step1.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Trinity_Step1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Trinity_Step1.%J.out
#SBATCH --error=Trinity_Step1.%J.err
module load trinity/2.6
Trinity --seqType fq --JM 100G --left input_reads_pair_1.fastq --right input_reads_pair_2.fastq --SS_lib_type FR --output trinity_out/ --CPU $SLURM_NTASKS_PER_NODE --no_run_chrysalis
{{< /highlight >}}
{{% /panel %}}
The second step of running Trinity is to run Trinity with the option **--no_run_quantifygraph**:
{{% panel header="`trinity_step2.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Trinity_Step2
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Trinity_Step2.%J.out
#SBATCH --error=Trinity_Step2.%J.err
module load trinity/2.6
Trinity --seqType fq --JM 100G --left input_reads_pair_1.fastq --right input_reads_pair_2.fastq --SS_lib_type FR --output trinity_out/ --CPU $SLURM_NTASKS_PER_NODE --no_run_quantifygraph
{{< /highlight >}}
{{% /panel %}}
The third step of running Trinity is to run Trinity with the option **--no_run_butterfly**:
{{% panel header="`trinity_step3.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Trinity_Step3
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Trinity_Step3.%J.out
#SBATCH --error=Trinity_Step3.%J.err
module load trinity/2.6
Trinity --seqType fq --JM 100G --left input_reads_pair_1.fastq --right input_reads_pair_2.fastq --SS_lib_type FR --output trinity_out/ --CPU $SLURM_NTASKS_PER_NODE --no_run_butterfly
{{< /highlight >}}
{{% /panel %}}
The fourth step of running Trinity is to run Trinity without any additional option:
{{% panel header="`trinity_step4.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Trinity_Step4
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Trinity_Step4.%J.out
#SBATCH --error=Trinity_Step4.%J.err
module load trinity/2.6
Trinity --seqType fq --JM 100G --left input_reads_pair_1.fastq --right input_reads_pair_2.fastq --SS_lib_type FR --output trinity_out/ --CPU $SLURM_NTASKS_PER_NODE
{{< /highlight >}}
{{% /panel %}}
### Trinity Output
Trinity outputs a number of files in its `trinity_out/` output directory after each executed step. The output file `Trinity.fasta` is the final Trinity output that contains the assembled transcripts.
{{% notice tip %}}
The Inchworm (step 1) and Chrysalis (step 2) steps can be memory intensive. A basic recommendation is to have **1GB of RAM per 1M ~76 base Illumina paired-end reads**.
{{% /notice %}}
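The four submit scripts above are intended to run one after another. One convenient way to submit them without waiting on each step manually is SLURM job dependencies; the sketch below uses the standard `sbatch --parsable` and `--dependency=afterok` options and is not specific to Trinity:
{{% panel header="`Submitting the steps with job dependencies (example)`"%}}
{{< highlight bash >}}
# each step starts only after the previous one completes successfully
jid1=$(sbatch --parsable trinity_step1.submit)
jid2=$(sbatch --parsable --dependency=afterok:$jid1 trinity_step2.submit)
jid3=$(sbatch --parsable --dependency=afterok:$jid2 trinity_step3.submit)
sbatch --dependency=afterok:$jid3 trinity_step4.submit
{{< /highlight >}}
{{% /panel %}}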
+++
title = "Running Velvet with Paired-End Data"
description = "How to run velvet with paired-end data on HCC resources"
weight = "10"
+++
## Running Velvet with Paired-End long fastq data with k-mer=43, 8 CPUs and 100GB of RAM
The first step of running Velvet is to run **velveth**:
{{% panel header="`velveth.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velveth
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=10gb
#SBATCH --output=Velveth.%J.out
#SBATCH --error=Velveth.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velveth output_directory/ 43 -fastq -longPaired -separate input_reads_pair_1.fastq input_reads_pair_2.fastq
{{< /highlight >}}
{{% /panel %}}
After running **velveth**, the next step is to run **velvetg** on the `output_directory/` and files generated from **velveth**:
{{% panel header="`velvetg.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velvetg
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Velvetg.%J.out
#SBATCH --error=Velvetg.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velvetg output_directory/ -min_contig_lgth 200
{{< /highlight >}}
{{% /panel %}}
Both **velveth** and **velvetg** are multi-threaded.
### Velvet Output
{{% panel header="`Output directory after velveth`"%}}
{{< highlight bash >}}
$ ls output_directory/
Log Roadmaps Sequences
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`Output directory after velvetg`"%}}
{{< highlight bash >}}
$ ls output_directory/
contigs.fa Graph LastGraph Log PreGraph Roadmaps Sequences stats.txt
{{< /highlight >}}
{{% /panel %}}
The output fasta file `contigs.fa` is the final Velvet output that contains the assembled contigs. More information about the output files is provided in the Velvet manual.
+++
title = "Running Velvet with Single-End and Paired-End Data"
description = "How to run velvet with single-end and paired-end data on HCC resources"
weight = "10"
+++
## Running Velvet with Single-End and Paired-End short fasta data with k-mer=51, 8 CPUs and 100GB of RAM
The first step of running Velvet is to run **velveth**:
{{% panel header="`velveth.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velveth
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=10gb
#SBATCH --output=Velveth.%J.out
#SBATCH --error=Velveth.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velveth output_directory/ 51 -fasta -short input_reads.fasta -fasta -shortPaired2 -separate input_reads_pair_1.fasta input_reads_pair_2.fasta
{{< /highlight >}}
{{% /panel %}}
After running **velveth**, the next step is to run **velvetg** on the `output_directory/` and files generated from **velveth**:
{{% panel header="`velvetg.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velvetg
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Velvetg.%J.out
#SBATCH --error=Velvetg.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velvetg output_directory/ -min_contig_lgth 200
{{< /highlight >}}
{{% /panel %}}
Both **velveth** and **velvetg** are multi-threaded.
### Velvet Output
{{% panel header="`Output directory after velveth`"%}}
{{< highlight bash >}}
$ ls output_directory/
Log Roadmaps Sequences
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`Output directory after velvetg`"%}}
{{< highlight bash >}}
$ ls output_directory/
contigs.fa Graph LastGraph Log PreGraph Roadmaps Sequences stats.txt
{{< /highlight >}}
{{% /panel %}}
The output fasta file `contigs.fa` is the final Velvet output that contains the assembled contigs. More information about the output files is provided in the Velvet manual.
+++
title = "Running Velvet with Single-End Data"
description = "How to run velvet with single-end data on HCC resources"
weight = "10"
+++
## Running Velvet with Single-End short fasta data with k-mer=31, 8 CPUs and 100GB of RAM
The first step of running Velvet is to run **velveth**:
{{% panel header="`velveth.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velveth
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=10gb
#SBATCH --output=Velveth.%J.out
#SBATCH --error=Velveth.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velveth output_directory/ 31 -fasta -short input_reads.fasta
{{< /highlight >}}
{{% /panel %}}
After running **velveth**, the next step is to run **velvetg** on the `output_directory/` and files generated from **velveth**:
{{% panel header="`velvetg.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Velvet_Velvetg
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=100gb
#SBATCH --output=Velvetg.%J.out
#SBATCH --error=Velvetg.%J.err
module load velvet/1.2
export OMP_NUM_THREADS=$SLURM_NTASKS_PER_NODE
velvetg output_directory/ -min_contig_lgth 200
{{< /highlight >}}
{{% /panel %}}
Both **velveth** and **velvetg** are multi-threaded.
### Velvet Output
{{% panel header="`Output directory after velveth`"%}}
{{< highlight bash >}}
$ ls output_directory/
Log Roadmaps Sequences
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`Output directory after velvetg`"%}}
{{< highlight bash >}}
$ ls output_directory/
contigs.fa Graph LastGraph Log PreGraph Roadmaps Sequences stats.txt
{{< /highlight >}}
{{% /panel %}}
The output fasta file `contigs.fa` is the final Velvet output that contains the assembled contigs. More information about the output files is provided in the Velvet manual.
+++
title = "Downloading SRA data from NCBI"
description = "How to download data from NCBI"
weight = "52"
+++
One way to download high-volume data from NCBI is to use command-line
utilities such as **wget**, **ftp**, or the Aspera Connect **ascp**
plugin. The Aspera Connect plugin is a commonly used high-performance transfer
plugin and typically provides the best transfer speed.
This plugin is available on our clusters as a module. In order to use it, load the appropriate module first:
{{< highlight bash >}}
$ module load aspera-cli
{{< /highlight >}}
The basic usage of the Aspera plugin is
{{< highlight bash >}}
$ ascp -i $ASPERA_PUBLIC_KEY -k 1 -T -l <max_download_rate_in_Mbps>m anonftp@ftp.ncbi.nlm.nih.gov:/<files_to_transfer> <local_work_output_directory>
{{< /highlight >}}
where **-k 1** enables resume of partial transfers, **-T** disables encryption for maximum throughput, and **-l** sets the transfer rate.
The **\<files_to_transfer\>** argument in the basic usage above
follows a specifically defined pattern:
{{< highlight bash >}}
<files_to_transfer> = /sra/sra-instant/reads/ByRun/sra/SRR|ERR|DRR/<first_6_characters_of_accession>/<accession>/<accession>.sra
{{< /highlight >}}
where **SRR\|ERR\|DRR** should be either **SRR**, **ERR**, or **DRR** and should match the prefix of the target **.sra** file.
More **ascp** options can be seen by using:
{{< highlight bash >}}
$ ascp --help
{{< /highlight >}}
For example, if you want to download the **SRR304976** file from NCBI into your $WORK **data/** directory with a download speed of **1000 Mbps**, you should use the following command:
{{< highlight bash >}}
$ ascp -i $ASPERA_PUBLIC_KEY -k 1 -T -l 1000m anonftp@ftp.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR304/SRR304976/SRR304976.sra /work/[groupname]/[username]/data/
{{< /highlight >}}
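For comparison, the same file could also be fetched with **wget** (assuming the same FTP path layout as above), albeit typically at a lower transfer speed:
{{< highlight bash >}}
$ wget -P /work/[groupname]/[username]/data/ ftp://ftp.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR304/SRR304976/SRR304976.sra
{{< /highlight >}}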
+++
title = "Pre-processing Tools"
description = "How to use pre-processing tools on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "Reference-Based Assembly Tools"
description = "How to use reference based assembly tools on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "Tools for Removing/Detecting Redundant Sequences"
description = "How to use tools for removing/detecting redundant sequences on HCC machines"
weight = "52"
+++
{{% children %}}
+++
title = "MPI Jobs on HCC"
description = "How to compile and run MPI programs on HCC machines"
weight = "52"
+++
This quick start demonstrates how to implement a parallel (MPI)
Fortran/C program on HCC supercomputers. The sample codes and submit
scripts can be downloaded from [mpi_dir.zip](/attachments/mpi_dir.zip).
#### Log in to an HCC Cluster
[Connect to an HCC cluster]({{< relref "../../connecting/" >}}) and make a
subdirectory called `mpi_dir` under your `$WORK` directory.
{{< highlight bash >}}
$ cd $WORK
$ mkdir mpi_dir
{{< /highlight >}}
In the subdirectory `mpi_dir`, save all the relevant code. Here we
include two demo programs, `demo_f_mpi.f90` and `demo_c_mpi.c`, that
compute the sum from 1 to 20 through parallel processes. A
straightforward parallelization scheme is used for demonstration
purposes. First, the master core (i.e. `myid=0`) distributes an equal
computational workload to a certain number of cores (as specified by
`--ntasks` in the submit script). Then, each worker core computes a
partial summation as output. Finally, the master core collects the
outputs from all worker cores and performs an overall summation. For easy
comparison with the serial code ([Fortran/C on HCC]({{< relref "fortran_c_on_hcc">}})), the
added lines in the parallel code (MPI) are marked with "!=" or "//=".
{{%expand "demo_f_mpi.f90" %}}
{{< highlight fortran >}}
Program demo_f_mpi
!====== MPI =====
use mpi
!================
implicit none
integer, parameter :: N = 20
real*8 w
integer i
common/sol/ x
real*8 x
real*8, dimension(N) :: y
!============================== MPI =================================
integer ind
real*8, dimension(:), allocatable :: y_local
integer numnodes,myid,rc,ierr,start_local,end_local,N_local
real*8 allsum
!====================================================================
!============================== MPI =================================
call mpi_init( ierr )
call mpi_comm_rank ( mpi_comm_world, myid, ierr )
call mpi_comm_size ( mpi_comm_world, numnodes, ierr )
!
N_local = N/numnodes
allocate ( y_local(N_local) )
start_local = N_local*myid + 1
end_local = N_local*myid + N_local
!====================================================================
do i = start_local, end_local
w = i*1d0
call proc(w)
ind = i - N_local*myid
y_local(ind) = x
! y(i) = x
! write(6,*) 'i, y(i)', i, y(i)
enddo
! write(6,*) 'sum(y) =',sum(y)
!============================================== MPI =====================================================
call mpi_reduce( sum(y_local), allsum, 1, mpi_real8, mpi_sum, 0, mpi_comm_world, ierr )
call mpi_gather ( y_local, N_local, mpi_real8, y, N_local, mpi_real8, 0, mpi_comm_world, ierr )
if (myid == 0) then
write(6,*) '-----------------------------------------'
write(6,*) '*Final output from... myid=', myid
write(6,*) 'numnodes =', numnodes
write(6,*) 'mpi_sum =', allsum
write(6,*) 'y=...'
do i = 1, N
write(6,*) y(i)
enddo
write(6,*) 'sum(y)=', sum(y)
endif
deallocate( y_local )
call mpi_finalize(rc)
!========================================================================================================
Stop
End Program
Subroutine proc(w)
real*8, intent(in) :: w
common/sol/ x
real*8 x
x = w
Return
End Subroutine
{{< /highlight >}}
{{% /expand %}}
{{%expand "demo_c_mpi.c" %}}
{{< highlight c >}}
//demo_c_mpi
#include <stdio.h>
//======= MPI ========
#include "mpi.h"
#include <stdlib.h>
//====================
double proc(double w){
double x;
x = w;
return x;
}
int main(int argc, char* argv[]){
int N=20;
double w;
int i;
double x;
double y[N];
double sum;
//=============================== MPI ============================
int ind;
double *y_local;
int numnodes,myid,rc,ierr,start_local,end_local,N_local;
double allsum;
//================================================================
//=============================== MPI ============================
MPI_Init(&argc, &argv);
MPI_Comm_rank( MPI_COMM_WORLD, &myid );
MPI_Comm_size ( MPI_COMM_WORLD, &numnodes );
N_local = N/numnodes;
y_local=(double *) malloc(N_local*sizeof(double));
start_local = N_local*myid + 1;
end_local = N_local*myid + N_local;
//================================================================
for (i = start_local; i <= end_local; i++){
w = i*1e0;
x = proc(w);
ind = i - N_local*myid;
y_local[ind-1] = x;
// y[i-1] = x;
// printf("i,x= %d %lf\n", i, y[i-1]) ;
}
sum = 0e0;
for (i = 1; i<= N_local; i++){
sum = sum + y_local[i-1];
}
// printf("sum(y)= %lf\n", sum);
//====================================== MPI ===========================================
MPI_Reduce( &sum, &allsum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD );
MPI_Gather( &y_local[0], N_local, MPI_DOUBLE, &y[0], N_local, MPI_DOUBLE, 0, MPI_COMM_WORLD );
if (myid == 0){
printf("-----------------------------------\n");
printf("*Final output from... myid= %d\n", myid);
printf("numnodes = %d\n", numnodes);
printf("mpi_sum = %lf\n", allsum);
printf("y=...\n");
for (i = 1; i <= N; i++){
printf("%lf\n", y[i-1]);
}
sum = 0e0;
for (i = 1; i<= N; i++){
sum = sum + y[i-1];
}
printf("sum(y) = %lf\n", sum);
}
free( y_local );
MPI_Finalize ();
//======================================================================================
return 0;
}
{{< /highlight >}}
{{% /expand %}}
---
#### Compiling the Code
Compiling an MPI code requires first loading a compiler "engine"
such as `gcc`, `intel`, or `pgi`, and then loading the MPI wrapper
`openmpi`. Here we will use the GNU Compiler Collection, `gcc`, for
demonstration.
{{< highlight bash >}}
$ module load compiler/gcc/6.1 openmpi/2.1
$ mpif90 demo_f_mpi.f90 -o demo_f_mpi.x
$ mpicc demo_c_mpi.c -o demo_c_mpi.x
{{< /highlight >}}
The above commands load the `gcc` compiler with the `openmpi` wrapper.
The compiler wrappers `mpif90` and `mpicc` are then used to compile the codes
into `.x` files (executables).
#### Creating a Submit Script
Create a submit script to request 5 cores (with `--ntasks`). On the last line
of the script, the executable is launched with the parallel execution command
`mpirun`, e.g. `mpirun ./demo_f_mpi.x`.
{{% panel header="`submit_f.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=Fortran
#SBATCH --error=Fortran.%J.err
#SBATCH --output=Fortran.%J.out
mpirun ./demo_f_mpi.x
{{< /highlight >}}
{{% /panel %}}
{{% panel header="`submit_c.mpi`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --ntasks=5
#SBATCH --mem-per-cpu=1024
#SBATCH --time=00:01:00
#SBATCH --job-name=C
#SBATCH --error=C.%J.err
#SBATCH --output=C.%J.out
mpirun ./demo_c_mpi.x
{{< /highlight >}}
{{% /panel %}}
#### Submit the Job
The job can be submitted through the command `sbatch`. The job status
can be monitored by entering `squeue` with the `-u` option.
{{< highlight bash >}}
$ sbatch submit_f.mpi
$ sbatch submit_c.mpi
$ squeue -u <username>
{{< /highlight >}}
Replace `<username>` with your HCC username.
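While the jobs are waiting or running, `squeue` prints one line per job; the output looks roughly like the following (job IDs, partition, username, and node names here are purely illustrative):
{{< highlight bash >}}
$ squeue -u <username>
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           1234567     batch  Fortran   demo01  R       0:05      1 c1234
           1234568     batch        C   demo01 PD       0:00      1 (Priority)
{{< /highlight >}}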
#### Sample Output
The sum from 1 to 20 is computed and printed to the `.out` file (see
below). The outputs from the 5 cores are collected and processed by the
master core (i.e. `myid=0`).
{{%expand "Fortran.out" %}}
{{< highlight batchfile>}}
-----------------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.00000000000000
y=...
1.0000000000000000
2.0000000000000000
3.0000000000000000
4.0000000000000000
5.0000000000000000
6.0000000000000000
7.0000000000000000
8.0000000000000000
9.0000000000000000
10.000000000000000
11.000000000000000
12.000000000000000
13.000000000000000
14.000000000000000
15.000000000000000
16.000000000000000
17.000000000000000
18.000000000000000
19.000000000000000
20.000000000000000
sum(y)= 210.00000000000000
{{< /highlight >}}
{{% /expand %}}
{{%expand "C.out" %}}
{{< highlight batchfile>}}
-----------------------------------
*Final output from... myid= 0
numnodes = 5
mpi_sum = 210.000000
y=...
1.000000
2.000000
3.000000
4.000000
5.000000
6.000000
7.000000
8.000000
9.000000
10.000000
11.000000
12.000000
13.000000
14.000000
15.000000
16.000000
17.000000
18.000000
19.000000
20.000000
sum(y) = 210.000000
{{< /highlight >}}
{{% /expand %}}
+++
title = "Running Matlab Parallel Server"
description = "How to run Matlab Parallel Server on HCC resources."
+++
This document provides the steps to configure MATLAB to submit jobs to a cluster, retrieve results, and debug errors.
### CONFIGURATION
After logging into the cluster, start MATLAB. Configure MATLAB to run parallel jobs on your cluster by calling configCluster.
```Matlab
>> configCluster
```
Jobs will now default to the cluster rather than submitting to the local machine.
NOTE: If you would like to submit to the local machine, then run the following command:
```Matlab
>> % Get a handle to the local resources
>> c = parcluster('local');
```
### CONFIGURING JOBS
Prior to submitting the job, we can specify various parameters to pass to our jobs, such as queue, e-mail, walltime, etc.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Specify a partition to use for MATLAB jobs. The default partition is batch.
>> c.AdditionalProperties.QueueName = 'partition-name';
>> % Run time in hh:mm:ss
>> c.AdditionalProperties.WallTime = '05:00:00';
>> % Maximum memory required per CPU (in megabytes)
>> c.AdditionalProperties.MemUsage = '4000';
>> % Specify e-mail address to receive notifications about your job
>> c.AdditionalProperties.EmailAddress = 'user-id@company.com';
>> % If you have other SLURM directives to specify such as a reservation, use the command below:
>> c.AdditionalProperties.AdditionalSubmitArgs = '';
```
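For example, to pass an additional SLURM directive such as a reservation (the reservation name below is hypothetical):
```Matlab
>> % Pass extra SLURM arguments, e.g. a reservation (hypothetical name)
>> c.AdditionalProperties.AdditionalSubmitArgs = '--reservation=my_reservation';
```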
Save changes after modifying AdditionalProperties for the above changes to persist between MATLAB sessions.
```Matlab
>> c.saveProfile
```
To see the values of the current configuration options, display AdditionalProperties.
```Matlab
>> % To view current properties
>> c.AdditionalProperties
```
Unset a value when no longer needed.
```Matlab
>> % Turn off email notifications
>> c.AdditionalProperties.EmailAddress = '';
>> c.saveProfile
```
### INTERACTIVE JOBS
To run an interactive pool job on the cluster, continue to use parpool as you’ve done before.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Open a pool of 64 workers on the cluster
>> p = c.parpool(64);
```
Rather than running on the local machine, the pool can now run across multiple nodes on the cluster.
```Matlab
>> % Run a parfor over 1000 iterations
>> parfor idx = 1:1000
a(idx) = …
end
```
Once we’re done with the pool, delete it.
```Matlab
>> % Delete the pool
>> p.delete
```
### INDEPENDENT BATCH JOB
Rather than running interactively, use the batch command to submit asynchronous jobs to the cluster. The batch command will return a job object which is used to access the output of the submitted job. See the MATLAB documentation for more help on batch.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit job to query where MATLAB is running on the cluster
>> j = c.batch(@pwd, 1, {});
>> % Query job for state
>> j.State
>> % If state is finished, fetch the results
>> j.fetchOutputs{:}
>> % Delete the job after results are no longer needed
>> j.delete
```
To retrieve a list of currently running or completed jobs, call parcluster to retrieve the cluster object. The cluster object stores an array of jobs that were run, are running, or are queued to run. This allows us to fetch the results of completed jobs. Retrieve and view the list of jobs as shown below.
```Matlab
>> c = parcluster;
>> jobs = c.Jobs;
```
Once we’ve identified the job we want, we can retrieve the results as we’ve done previously.
fetchOutputs is used to retrieve function output arguments; if calling batch with a script, use load instead. Data that has been written to files on the cluster needs to be retrieved directly from the file system (e.g. via ftp).
To view results of a previously completed job:
```Matlab
>> % Get a handle to the job with ID 2
>> j2 = c.Jobs(2);
```
NOTE: You can view a list of your jobs, as well as their IDs, using the above c.Jobs command.
```Matlab
>> % Fetch results for job with ID 2
>> j2.fetchOutputs{:}
```
### PARALLEL BATCH JOB
Users can also submit parallel workflows with the batch command. Let’s use the following example for a parallel job 'parallel_example.m'.
```Matlab
function t = parallel_example(iter)
if nargin==0, iter = 16; end
disp('Start sim')
t0 = tic;
parfor idx = 1:iter
A(idx) = idx;
pause(2)
end
t = toc(t0);
disp('Sim completed.')
```
This time when we use the batch command, in order to run a parallel job, we’ll also specify a MATLAB Pool.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit a batch pool job using 4 workers for 16 simulations
>> j = c.batch(@parallel_example, 1, {}, 'Pool',4);
>> % View current job status
>> j.State
>> % Fetch the results after a finished state is retrieved
>> j.fetchOutputs{:}
ans =
8.8872
```
The job ran in 8.89 seconds using four workers. Note that these jobs will always request N+1 CPU cores, since one worker is required to manage the batch job and pool of workers. For example, a job that needs eight workers will consume nine CPU cores.
We’ll run the same simulation but increase the Pool size. This time, to retrieve the results later, we’ll keep track of the job ID.
NOTE: For some applications, there will be a diminishing return when allocating too many workers, as the overhead may exceed computation time.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Submit a batch pool job using 8 workers for 16 simulations
>> j = c.batch(@parallel_example, 1, {}, 'Pool', 8);
>> % Get the job ID
>> id = j.ID
id =
4
>> % Clear j from workspace (as though we quit MATLAB)
>> clear j
```
Once we have a handle to the cluster, we’ll call the findJob method to search for the job with the specified job ID.
```Matlab
>> % Get a handle to the cluster
>> c = parcluster;
>> % Find the old job
>> j = c.findJob('ID', 4);
>> % Retrieve the state of the job
>> j.State
ans
finished
>> % Fetch the results
>> j.fetchOutputs{:};
ans =
4.7270
```
The job now runs in 4.73 seconds using eight workers. Run the code with different numbers of workers to determine the ideal number to use.
Alternatively, to retrieve job results via a graphical user interface, use the Job Monitor (Parallel > Monitor Jobs).
### DEBUGGING
If a serial job produces an error, call the getDebugLog method to view the error log file. When submitting independent jobs with multiple tasks, specify the task number.
```Matlab
>> c.getDebugLog(j.Tasks(3))
```
For Pool jobs, only specify the job object.
```Matlab
>> c.getDebugLog(j)
```
When troubleshooting a job, the cluster admin may request the scheduler ID of the job. This can be obtained by calling schedID:
```Matlab
>> schedID(j)
ans
25539
```