add bio pages part 4

Commit 4a59f001 authored 6 years ago by npavlovikj
--- a/content/guides/running_applications/bioinformatics_tools/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/_index.md
 +++
 title = "Bioinformatics Tools"
+description = "How to use various bioinformatics tools on HCC machines"
+weight = "52"
 +++

-<span style="color: rgb(0,0,0);">The following is a categorized list of
-bioinformatics tools available on HCC. Each page contains summary of the
-tool, information about the HCC resources that have the specific
-tool, links to user documentation, as well as example SLURM submit
-scripts. More detailed information about submitting SLURM jobs and
-checking job status on HCC can be
-found [here](Submitting-Jobs_332222.html).</span>
+The following is a categorized list of bioinformatics tools available on HCC. Each page contains a summary of the tool, information about the HCC resources that have the specific tool, links to user documentation, and example SLURM submit scripts.

-<span style="color: rgb(0,0,0);"> </span>
+More detailed information about submitting SLURM jobs and checking job status on HCC can be found [here](../../submitting_jobs).
+
+{{% children %}}


--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/_index.md
 +++
 title = "Alignment Tools"
+description = "How to use various alignment tools on HCC machines"
+weight = "52"
 +++

-1.  [HCC-DOCS](index.html)
-2.  [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
-3.  [HCC Documentation](HCC-Documentation_332651.html)
-4.  [Running Applications](Running-Applications_7471153.html)
-5.  [Bioinformatics Tools](Bioinformatics-Tools_8193279.html)
-
-<span id="title-text"> HCC-DOCS : Alignment Tools </span>
-=========================================================
-
-Created by <span class="author"> Adam Caprez</span> on Sep 04, 2014
-
- 
-
-
+{{% children %}}
\ No newline at end of file
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/_index.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/create_local_blast_database.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/create_local_blast_database.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/running_blast_alignment.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/blast/running_blast_alignment.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/blat.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/blat.md
-1.  [HCC-DOCS](index.html)
-2.  [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
-3.  [HCC Documentation](HCC-Documentation_332651.html)
-4.  [Running Applications](Running-Applications_7471153.html)
-5.  [Bioinformatics Tools](Bioinformatics-Tools_8193279.html)
-6.  [Alignment Tools](Alignment-Tools_8193288.html)
+++
+title = "BLAT"
+description =  "How to run BLAT on HCC resources"
+weight = "10"
+++

-<span id="title-text"> HCC-DOCS : BLAT </span>
-==============================================

-Created by <span class="author"> Adam Caprez</span>, last modified by
-<span class="editor"> Natasha Pavlovikj</span> on Dec 12, 2016
-
-| Name | Version | Resource |
-|------|---------|----------|
-| blat | 35x1    | Tusker   |
-
-|      |      |       |
-|------|------|-------|
-| blat | 35x1 | Crane |
-
-<span style="line-height: 1.4285715;">  
-</span>
-
-<span style="line-height: 1.4285715;">BLAT is a pairwise alignment tool
-similar to BLAST. It is more accurate and about 500 times faster than
-the existing tools for mRNA/DNA alignments and it is about 50 times
-faster with protein/protein alignments. BLAT accepts short and long
-query and database sequences as input files.</span>
+BLAT is a pairwise alignment tool similar to BLAST. It is reported to be more accurate and about 500 times faster than other existing tools for mRNA/DNA alignments, and about 50 times faster for protein/protein alignments. BLAT accepts short and long query and database sequences as input files.

 The basic usage of BLAT is:
-
-**General BLAT Usage**
-
-``` syntaxhighlighter-pre
-blat database query output_alignment.txt [options]
-```
-
-where **database** is the name of the database used for the alignment,
-**query** is the name of the input file of sequence data in
-fasta/nib/2bit format, and **output\_alignment.txt** is the output
-alignment file. Additional parameters for BLAT alignment can be found in
-the
-manual: <a href="http://genome.ucsc.edu/goldenPath/help/blatSpec.html" class="external-link">http://genome.ucsc.edu/goldenPath/help/blatSpec.html</a>,
-or by using
-
-**Additional BLAT Options**
-
-``` syntaxhighlighter-pre
-[<username>@login.tusker~]$ blat
-```
-
-Running BLAT on Tusker with query file **input\_reads.fasta** and
-database **db.fa** is shown below:
-
-**blat\_alignment.submit**
-
-\#!/bin/sh  
-\#SBATCH --job-name=Blat  
-\#SBATCH --nodes=1  
-\#SBATCH --ntasks-per-node=1  
-\#SBATCH --time=168:00:00  
-\#SBATCH --mem=50gb  
-\#SBATCH --output=Blat.%J.out  
-\#SBATCH --error=Blat.%J.err
-
- 
-
-|                       |
-|-----------------------|
-| module load blat/35x1 |
-
-blat db.fa input\_reads.fasta output\_alignment.txt
-
-Although BLAT is a single threaded program (**\#SBATCH --nodes=1**,
-**\#SBATCH --ntasks-per-node=1**) it is still much faster than the other
-alignment tools.
-
- 
-
-**BLAT Output**
-
-BLAT output is a list containing the following information: *the score
-of the alignment*, *the region of query sequence that matches the
-database sequence*, *the size of the query sequence*, *the level of
-identity as a percentage of the alignment* and *the chromosome and
-position that the query sequence maps to*.
-
-Attachments:
------------
-
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[cb\_blat\_module.xsl](attachments/8193292/8127546.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[crane\_blat\_version.xsl](attachments/8193292/8127547.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[crane\_modules.xml](attachments/8193292/8127548.xml)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_blat\_version.xsl](attachments/8193292/8127549.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_modules.xml](attachments/8193292/8127550.xml)
-(application/octet-stream)  
-
-
+{{< highlight bash >}}
+$ blat database query output_alignment.txt [options]
+{{< /highlight >}}
+where **database** is the name of the database used for the alignment, **query** is the name of the input file of sequence data in `fasta/nib/2bit` format, and **output_alignment.txt** is the output alignment file.
+
+Additional parameters for BLAT alignment can be found in the [manual](http://genome.ucsc.edu/FAQ/FAQblat), or by running `blat` with no arguments:
+{{< highlight bash >}}
+$ blat
+{{< /highlight >}}
+
+\\
+Running BLAT on Tusker with query file `input_reads.fasta` and database `db.fa` is shown below:
+{{% panel header="`blat_alignment.submit`"%}}
+{{< highlight bash >}}
+#!/bin/sh
+#SBATCH --job-name=Blat
+#SBATCH --nodes=1
+#SBATCH --ntasks-per-node=1
+#SBATCH --time=168:00:00
+#SBATCH --mem=50gb
+#SBATCH --output=Blat.%J.out
+#SBATCH --error=Blat.%J.err
+
+module load blat/35x1
+
+blat db.fa input_reads.fasta output_alignment.txt
+{{< /highlight >}}
+{{% /panel %}}
+
+Although BLAT is a single-threaded program (`#SBATCH --nodes=1`, `#SBATCH --ntasks-per-node=1`), it is still much faster than other alignment tools.
+
+\\
+<span style="color: rgb(0,0,0);font-size: 20.0px;line-height: 1.5;">BLAT Output</span>
+
+BLAT output is a list containing the following information:
+
+- the score of the alignment
+- the region of query sequence that matches the database sequence
+- the size of the query sequence
+- the level of identity as a percentage of the alignment
+- the chromosome and position that the query sequence maps to
\ No newline at end of file
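
A minimal sketch, not part of this commit, of how the output fields listed above might be pulled out of `output_alignment.txt`. It assumes BLAT wrote its default PSL format (21 tab-separated columns preceded by a few header lines); the helper name `summarize_psl` is hypothetical, and the identity figure is a simplified estimate that ignores gaps.

```shell
#!/bin/sh
# Hypothetical helper: summarize BLAT alignments from PSL input.
# Assumption: default PSL layout, where column 1 = matches, 2 = misMatches,
# 3 = repMatches, 10 = qName, 11 = qSize, 12 = qStart, 13 = qEnd,
# 14 = tName, 16 = tStart, 17 = tEnd.
summarize_psl() {
    awk -F '\t' '
        # Header lines are skipped because their first field is not a number.
        $1 ~ /^[0-9]+$/ && NF >= 17 {
            m = $1; mm = $2; rep = $3
            aligned = m + mm + rep
            # Simplified percent identity over aligned bases (gaps ignored).
            identity = (aligned > 0) ? 100 * (m + rep) / aligned : 0
            # Prints: query, query size, query range, target, target range, identity
            printf "%s\t%d\t%d-%d\t%s\t%d-%d\t%.1f\n", $10, $11, $12, $13, $14, $16, $17, identity
        }
    ' "$@"
}
```

For example, `summarize_psl output_alignment.txt` would print one line per alignment. If BLAT was run with a different output type, the column positions above would not apply.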
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/bowtie.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/bowtie.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/bowtie2.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/bowtie2.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/bwa/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/bwa/_index.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/bwa/running_bwa_commands.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/bwa/running_bwa_commands.md
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/clustal_omega.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/clustal_omega.md
-1.  [HCC-DOCS](index.html)
-2.  [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
-3.  [HCC Documentation](HCC-Documentation_332651.html)
-4.  [Running Applications](Running-Applications_7471153.html)
-5.  [Bioinformatics Tools](Bioinformatics-Tools_8193279.html)
-6.  [Alignment Tools](Alignment-Tools_8193288.html)
+++
+title = "Clustal Omega"
+description =  "How to run Clustal Omega on HCC resources"
+weight = "10"
+++

-<span id="title-text"> HCC-DOCS : Clustal Omega </span>
-=======================================================
-
-Created by <span class="author"> Adam Caprez</span>, last modified by
-<span class="editor"> Natasha Pavlovikj</span> on Dec 12, 2016
-
-| Name          | Version | Resource |
-|---------------|---------|----------|
-| clustal-omega | 1.2     | Tusker   |
-
-|               |     |       |
-|---------------|-----|-------|
-| clustal-omega | 1.2 | Crane |
-
- 
-
-Clustal Omega
-(<a href="http://www.clustal.org/omega/" class="external-link">http://www.clustal.org/omega/</a>)
-is a general purpose multiple sequence alignment (MSA) tool used mainly
-with protein, as well as DNA and RNA sequences. Clustal Omega is fast
-and scalable aligner that can align datasets of hundreds of thousands of
-sequences in reasonable time.
+[Clustal Omega](http://www.clustal.org/omega/) is a general purpose multiple sequence alignment (MSA) tool used mainly with protein, as well as DNA and RNA sequences. Clustal Omega is a fast and scalable aligner that can align datasets of hundreds of thousands of sequences in a reasonable time.

 The general usage of Clustal Omega is:
+{{< highlight bash >}}
+$ clustalo -i input_file.fasta -o output_file.fasta [options]
+{{< /highlight >}}
+where **input_file.fasta** is the multiple sequence input file in `fasta` format, and **output_file.fasta** is the multiple sequence alignment output file in `fasta` format.

-**General Clustal Omega Usage**
-
-``` syntaxhighlighter-pre
-clustalo -i input_file.fasta -o output_file.fasta [options]
-```
-
-where **input\_file.fasta** is the multiple sequence input file in
-*fasta* format, and **output\_file.fasta** is the multiple sequence
-alignment output file in *fasta* format.  
+\\
 Clustal Omega accepts 3 types of sequence input files:

-   sequence file with aligned/unaligned sequences
-
-<!-- -->
+- sequence file with aligned/unaligned sequences
+- multiple alignment in a file/profile of aligned sequences
+- Hidden Markov Model (HMM) 

-   multiple alignment in a file/profile of aligned sequences
-
-<!-- -->
-
-   Hidden Markov Model (HMM) 
-
-These input files must contain at least 2 sequences and must be in one
-of the following MSA file formats: **a2m**, **fa\[sta\]**,
-**clu\[stal\]**, **msf**, **phy\[lip\]**, **selex**, **st\[ockholm\]**,
-**vie\[nna\]**. Moreover, if not specified, the generated output file is
-in *fasta* format.
+These input files must contain at least 2 sequences and must be in one of the following MSA file formats: `a2m`, `fa[sta]`, `clu[stal]`, `msf`, `phy[lip]`, `selex`, `st[ockholm]`, `vie[nna]`. If no output format is specified, the generated output file is in `fasta` format.

+\\
 More Clustal Omega options can be found by typing:
-
-**Additional Clustal Omega Options**
-
-``` syntaxhighlighter-pre
-[<username>@login.tusker~]$ clustalo -h
-```
-
-  
-Running Clustal Omega on Tusker with input
-file **input\_reads.fasta** with **8 threads** and **10GB memory** is
-shown below:
-
-**clustal\_omega.submit**
-
-\#!/bin/sh  
-\#SBATCH --job-name=Clustal\_Omega  
-\#SBATCH --nodes=1  
-\#SBATCH --ntasks-per-node=8  
-\#SBATCH --time=10:00:00  
-\#SBATCH --mem=10gb  
-\#SBATCH --output=ClustalOmega.%J.out  
-\#SBATCH --error=ClustalOmega.%J.err
-
- 
-
-|                               |
-|-------------------------------|
-| module load clustal-omega/1.2 |
-
-clustalo -i input\_reads.fasta -o output\_msa.sto --outfmt=st
--threads=$SLURM\_NTASKS\_PER\_NODE
-
-The output file **output\_msa.sto** contains the resulting multiple
-sequence alignments in Stockholm format (**--outfmt=st**).
+{{< highlight bash >}}
+$ clustalo -h
+{{< /highlight >}}
+
+\\
+Running Clustal Omega on Tusker with input file `input_reads.fasta` with `8 threads` and `10GB memory` is shown below:
+{{% panel header="`clustal_omega.submit`"%}}
+{{< highlight bash >}}
+#!/bin/sh
+#SBATCH --job-name=Clustal_Omega
+#SBATCH --nodes=1
+#SBATCH --ntasks-per-node=8
+#SBATCH --time=10:00:00
+#SBATCH --mem=10gb
+#SBATCH --output=ClustalOmega.%J.out
+#SBATCH --error=ClustalOmega.%J.err
+
+module load clustal-omega/1.2
+
+clustalo -i input_reads.fasta -o output_msa.sto --outfmt=st --threads=$SLURM_NTASKS_PER_NODE
+{{< /highlight >}}
+{{% /panel %}}
+
+The output file `output_msa.sto` contains the resulting multiple sequence alignments in Stockholm format (**--outfmt=st**).

 Moreover, if you change the command above with:
+{{< highlight bash >}}
+$ clustalo -i input_reads.sto --dealign -v
+{{< /highlight >}}
+Clustal Omega will read the input file in Stockholm format, de-align the sequences, and then re-align them, printing a progress report in the meantime (**-v**). Because no output format is specified, the output will be in the default `fasta` format.

-**Clustal Omega with De-align Option**
-
-``` syntaxhighlighter-pre
-clustalo -i input_reads.sto --dealign -v
-```
-
-Clustal Omega will read the input file in Stockholm format, de-align the
-sequences, and then re-align them, printing progress report in meanwhile
-(**-v**). Because it is not specified, the output will be in the default
-**fasta** format.
-
- 
-
-**Clustal Omega Output**
-
-The basic Clustal Omega output produces one alignment file in the
-specified output format. More intermediate outputs can be generated
-using specific Clustal Omega options, such
-as: **--distmat-out=&lt;file&gt;** (*pairwise distance matrix output
-file*) and **--guidetree-out=&lt;file&gt;** (*guide tree output file*).
-
-**  
-Useful Information**
-
-In order to test the Clustal Omega performance on Tusker, we used three
-DNA and protein input fasta files: **data\_1. fasta, data\_2. fasta,
-data\_3.fasta**. Some statistics about the input files and the time and
-memory resources required for Clustal Omega are shown on the table
-below:
-
-<table style="width:100%;">
-<colgroup>
-<col style="width: 14%" />
-<col style="width: 14%" />
-<col style="width: 14%" />
-<col style="width: 14%" />
-<col style="width: 14%" />
-<col style="width: 14%" />
-<col style="width: 14%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th> </th>
-<th><p><strong>total # of sequences</strong></p></th>
-<th><p><strong>average sequence length</strong></p></th>
-<th><p><strong>total size in MB</strong></p></th>
-<th><p><strong>Clustal Omega required time</strong></p></th>
-<th><p><strong>Clustal Omega required memory</strong></p></th>
-<th># of used CPUs</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td><p><strong>data_1.fasta</strong></p></td>
-<td><p>1,200</p></td>
-<td><p>510.17</p></td>
-<td><p>641 KB</p></td>
-<td><p>~ 5 minutes</p></td>
-<td><span>~ 65 MB</span></td>
-<td>8</td>
-</tr>
-<tr class="even">
-<td><p><strong>data_2.fasta</strong></p></td>
-<td><p>5,715</p></td>
-<td><p>174.20</p></td>
-<td><p>1,100 KB</p></td>
-<td>~ 5 minutes</td>
-<td><p>~ 140 MB</p></td>
-<td><p>8</p></td>
-</tr>
-<tr class="odd">
-<td><p><strong>data_3.fasta</strong></p></td>
-<td><p>93,675</p></td>
-<td><p>94.29</p></td>
-<td><p>11,000 KB</p></td>
-<td><p>~ 30 minutes</p></td>
-<td><p>~ 2 GB</p></td>
-<td><p>8</p></td>
-</tr>
-</tbody>
-</table>
-
-Attachments:
------------
+\\
+<span style="color: rgb(0,0,0);font-size: 20.0px;line-height: 1.5;">Clustal Omega Output</span>

-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[crane\_clustal\_omega\_version.xsl](attachments/9470379/9863812.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[cb\_clustal\_omega\_module.xsl](attachments/9470379/9863813.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_clustal\_omega\_version.xsl](attachments/9470379/9863814.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[crane\_modules.xml](attachments/9470379/9863815.xml)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_modules.xml](attachments/9470379/9863816.xml)
-(application/octet-stream)  
+The basic Clustal Omega output produces one alignment file in the specified output format. More intermediate outputs can be generated using specific Clustal Omega options, such as: **--distmat-out=<file>** (*pairwise distance matrix output file*) and **--guidetree-out=<file>** (*guide tree output file*).

+\\
+<span style="color: rgb(0,0,0);font-size: 20.0px;line-height: 1.5;">Useful Information</span>

+In order to test the Clustal Omega performance on Tusker, we used three DNA and protein input fasta files: `data_1.fasta`, `data_2.fasta`, and `data_3.fasta`. Some statistics about the input files and the time and memory resources used by Clustal Omega on Tusker are shown in the table below:
+{{< readfile file="/static/html/clustal_omega.html" >}}
\ No newline at end of file
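
A minimal sketch, not part of this commit, of a pre-submission check for the requirement above that an input file contain at least 2 sequences. It assumes the input is in `fasta` format, where each record starts with a `>` line; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count FASTA records (lines starting with ">").
count_fasta_sequences() {
    grep -c '^>' "$1"
}

# Hypothetical helper: succeed only if the file holds at least 2 sequences,
# the minimum the page above states for Clustal Omega input.
has_enough_sequences() {
    n=$(count_fasta_sequences "$1")
    [ "${n:-0}" -ge 2 ]
}
```

One possible use is `has_enough_sequences input_reads.fasta && sbatch clustal_omega.submit`, so that a job is only queued when the input looks usable. The check does not cover the other accepted input formats.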
--- a/content/guides/running_applications/bioinformatics_tools/alignment_tools/tophat_tophat2.md
+++ b/content/guides/running_applications/bioinformatics_tools/alignment_tools/tophat_tophat2.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/_index.md
 +++
 title = "Data Manipulation Tools"
+description = "How to use data manipulation tools on HCC machines"
+weight = "52"
 +++

-1.  [HCC-DOCS](index.html)
-2.  [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
-3.  [HCC Documentation](HCC-Documentation_332651.html)
-4.  [Running Applications](Running-Applications_7471153.html)
-5.  [Bioinformatics Tools](Bioinformatics-Tools_8193279.html)
-
-<span id="title-text"> HCC-DOCS : Data Manipulation Tools </span>
-=================================================================
-
-Created by <span class="author"> Adam Caprez</span> on Sep 04, 2014
-
- 
-
-
+{{% children %}}
\ No newline at end of file
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/bamtools/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/bamtools/_index.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/bamtools/running_bamtools_commands.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/bamtools/running_bamtools_commands.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/_index.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/_index.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/running_bamtools_commands.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/running_bamtools_commands.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/running_samtools_commands.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/samtools/running_samtools_commands.md
--- a/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/sratoolkit.md
+++ b/content/guides/running_applications/bioinformatics_tools/data_manipulation_tools/sratoolkit.md
--- a/content/guides/running_applications/bioinformatics_tools/de_novo_assembly_tools/oases.md
+++ b/content/guides/running_applications/bioinformatics_tools/de_novo_assembly_tools/oases.md
@@ -35,7 +35,7 @@ A simple SLURM script to run Oases on the Velvet output stored in `output_direct
 #SBATCH --output=Oases.%J.out
 #SBATCH --error=Oases.%J.err

-module load oases/0.2.8
+module load oases/0.2

 oases output_directory/ -min_trans_lgth 200
 {{< /highlight >}}