Natasha Pavlovikj · 6e5a0698
--- a/content/guides/running_applications/bioinformatics_tools/de_novo_assembly_tools/trinity/_index.md

+ 40

− 247
+++ b/content/guides/running_applications/bioinformatics_tools/de_novo_assembly_tools/trinity/_index.md

-1.  [HCC-DOCS](index.html)
+++
-2.  [HCC-DOCS Home](HCC-DOCS-Home_327685.html)
+title = "Trinity"
-3.  [HCC Documentation](HCC-Documentation_332651.html)
+description = "How to use Trinity on HCC machines"
-4.  [Running Applications](Running-Applications_7471153.html)
+weight = "52"
-5.  [Bioinformatics Tools](Bioinformatics-Tools_8193279.html)
+++
-6.  [De Novo Assembly Tools](De-Novo-Assembly-Tools_8193280.html)
-<span id="title-text"> HCC-DOCS : Trinity </span>
-=================================================
-Created by <span class="author"> Adam Caprez</span>, last modified by
-<span class="editor"> Natasha Pavlovikj</span> on Feb 26, 2018
-| Name    | Version       | Resource |
-|---------|---------------|----------|
-| trinity | r2013-02-25   | Tusker   |
-| trinity | r2013-11-10   | Tusker   |
-| trinity | r2014-04-13p1 | Tusker   |
-|         |               |       |
-|---------|---------------|-------|
-| trinity | r2013-11-10   | Crane |
-| trinity | r2014-04-13p1 | Crane |
-Trinity
+[Trinity](https://github.com/trinityrnaseq/trinityrnaseq/wiki) is a method for efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. Trinity combines three independent software modules: `Inchworm`, `Chrysalis`, and `Butterfly`. All these modules can be applied sequentially to process large RNA-Seq datasets.
-(<a href="http://trinityrnaseq.sourceforge.net/" class="external-link">http://trinityrnaseq.sourceforge.net/</a>)
-is a method for efficient and robust de novo reconstruction of
-transcriptomes from RNA-Seq data. Trinity combines three independent
-software modules: Inchworm, Chrysalis, and Butterfly. All these modules
-can be applied sequentially to process large RNA-Seq datasets. 
 The basic usage of Trinity is:
+{{< highlight bash >}}
+$ Trinity --seqType [fa|fq] --JM <jellyfish_memory> --left input_reads_pair_1.[fa|fq] --right input_reads_pair_2.[fa|fq] [options]
+{{< /highlight >}}
+where **input_reads_pair_1.[fa|fq]** and **input_reads_pair_2.[fa|fq]** are the paired-end input files of sequence reads in fasta/fastq format, and **--seqType** specifies the format of these input reads. The option **--JM** sets the amount of system memory, in GB, used for k-mer counting by Jellyfish.
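As a concrete illustration of the usage line above, here is a minimal sketch that fills in the placeholders for fastq input and prints the resulting command instead of running it. The helper name `build_trinity_cmd`, the file names, and the `10G` memory value are assumptions chosen for illustration; check the output of `Trinity` itself for the exact value format that `--JM` expects in your version.

```bash
# Hypothetical helper: prints a Trinity command line for paired-end input.
# $1 = --seqType value (fa or fq)
# $2 = --JM value (memory for Jellyfish k-mer counting; "10G" below is illustrative)
# $3 = left reads file, $4 = right reads file
build_trinity_cmd() {
    printf 'Trinity --seqType %s --JM %s --left %s --right %s\n' "$1" "$2" "$3" "$4"
}

# Dry run: show the command for a pair of fastq files (file names are placeholders).
build_trinity_cmd fq 10G input_reads_pair_1.fq input_reads_pair_2.fq
```

Printing the command first makes it easy to review the arguments before pasting them into a job script.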
-**General Trinity Usage**
+Additional Trinity **options** can be found on the Trinity website, or by typing:
+{{< highlight bash >}}
-``` syntaxhighlighter-pre
+$ Trinity
-Trinity --seqType [fa|fq] --JM <jellyfish_memory> --left input_reads_pair_1.[fa|fq] --right input_reads_pair_2.[fa|fq] [options]
+{{< /highlight >}}
-```
-where **input\_reads\_pair\_1.\[fa\|fq\]**
+Running the Trinity pipeline from beginning to end on large datasets may exceed the walltime limit for a single job. Therefore, Trinity provides a mechanism to run the workflow in four separate steps, where each step resumes from the previous one. The same Trinity command and options are used for every step; each of the first three steps adds one step-specific option, and the last step runs the Trinity command as normal, with no additional option.
-and **input\_reads\_pair\_2.\[fa\|fq\]** are the input paired-end files
-of sequence reads in fasta/fastq format, and **--seqType** is the type
-of these input reads. The option **--JM** defines the number of GB of
-system memory required for k-mer counting by jellyfish. Additional
-Trinity **options** can be found in the Trinity website, or by typing:
-**Additional Trinity Options**
+{{% panel theme="info" header="Step 1 Options" %}}
+{{< highlight bash >}}
-``` syntaxhighlighter-pre
-[<username>@login.tusker ~]$ Trinity
-```
-Running the Trinity pipeline from beginning to end on large datasets may
-exceed the walltime limit for a single job. Therefore, Trinity provides
-a mechanism to run the workflow in four separate steps, where each step
-resumes from the previous one. The same Trinity command and options are
-run for each step, with difference of an additional option that is
-included for the different steps. On the last step, the Trinity command
-is run as normal.
-**Step 1:**
-**Trinity Step 1 Options**
-``` syntaxhighlighter-pre
 Trinity.pl [options] --no_run_chrysalis
-```
+{{< /highlight >}}
+{{% /panel %}}
-**Step 2: **
-**Trinity Step 2 Options**
+{{% panel theme="info" header="Step 2 Options" %}}
+{{< highlight bash >}}
-``` syntaxhighlighter-pre
 Trinity.pl [options] --no_run_quantifygraph
-```
+{{< /highlight >}}
+{{% /panel %}}
-**Step 3:**
-**Trinity Step 3 Options**
-``` syntaxhighlighter-pre
+{{% panel theme="info" header="Step 3 Options" %}}
+{{< highlight bash >}}
 Trinity.pl [options] --no_run_butterfly
-```
+{{< /highlight >}}
+{{% /panel %}}
-**Step 4:**
+{{% panel theme="info" header="Step 4 Options" %}}
+{{< highlight bash >}}
-**Trinity Step 4 Options**
-``` syntaxhighlighter-pre
 Trinity.pl [options]
-```
+{{< /highlight >}}
+{{% /panel %}}
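The four steps above can be sketched as a small helper that appends the step-specific option to a shared set of options. This is only an illustration: the function name `trinity_step_cmd`, the input file names, and the `10G` value are assumptions, and the helper prints each command rather than executing it.

```bash
# Hypothetical helper: prints the Trinity command for one step of the workflow.
# $1 = step number (1-4); the remaining arguments are the options shared by all steps.
trinity_step_cmd() {
    step="$1"
    shift
    case "$step" in
        1) extra="--no_run_chrysalis" ;;
        2) extra="--no_run_quantifygraph" ;;
        3) extra="--no_run_butterfly" ;;
        4) extra="" ;;   # the last step runs the command as normal
        *) echo "unknown step: $step" >&2; return 1 ;;
    esac
    # ${extra:+ $extra} appends the option only when it is non-empty.
    echo "Trinity.pl $*${extra:+ $extra}"
}

# Dry run: list the command for every step (option values are placeholders).
for n in 1 2 3 4; do
    trinity_step_cmd "$n" --seqType fq --JM 10G --left reads_1.fq --right reads_2.fq
done
```

Keeping the shared options in one place helps ensure that every step is invoked with identical arguments, which the step-wise workflow relies on.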
-Each step may be run as its own job, providing a workaround for the
-single job walltime limit. The following page describes how to run each
-step of Trinity as a single job under the SLURM scheduler on HCC:
-**Useful Information**
-In order to test the TRINITY (trinity/r2014-04-13p1) performance on
-Tusker, we used three paired-end input fastq files: **small\_1.fastq**,
-**small\_2.fastq**, **medium\_1.fastq**, **medium\_2.fastq**,
-**large\_1.fastq**, **large\_2.fastq. **Some statistics about the input
-files and the time and memory resources required for TRINITY are shown
-on the table below:
-**total \# of sequences**
-**total \# of bases**
-**total size in MB**
-**Trinity Step 1 required time**
-**Trinity Step 1 required memory**
-Trinity Step 2 required time
-Trinity Step 2 required memory
-Trinity Step 3 required time
-Trinity Step 3 required memory
-Trinity Step 4 required time
-Trinity Step 4 required memory
-\# of used CPUs
-**small\_1.fastq**
-50,121
-2,506,050
-8.010 MB
-\~ 1 minute
-\~ 35 GB
-\~ 0.01 hours
-\~ 0.6 GB
-\~ 0.2 minutes
-\~ 0.07 GB
-\~ 0.008 hours
-\~ 0.8 GB
-8
-**small\_2.fastq**
-50,121
-2,506,050
-8.010 MB
-**medium\_1.fastq**
-786,742
-59,792,392
-152 MB
-\~ 3 minutes
-\~ 68 GB
-\~ 0.1 hours
-\~ 3 GB
-\~ 0.8 minutes
-\~ 0.6 GB
-\~ 0.3 hours
-\~ 5 GB
-8
-**medium\_2.fastq**
-786,742
-59,792,392
-152 MB
-**large\_1.fastq**
-10,174,715
-1,027,646,215
-3,376 MB
-\~ 58 minutes
-\~ 80 GB
-\~ 5 hours
-\~ 30 GB
-\~ 35 minutes
-\~ 8 GB
-\~ 13 hours
-\~ 30 GB
-8
-**large\_2.fastq**
-10,174,715
-1,027,646,215
-3,376 MB
-Memory Requirement
-<span
-class="aui-icon aui-icon-small aui-iconfont-warning confluence-information-macro-icon"></span>
-<span style="color: rgb(0,0,0);">The Inchworm (step 1) and Chrysalis
-(step 2) steps can be memory intensive. A basic recommendation is to
-have **1GB of RAM per 1M** </span><span
-style="color: rgb(0,0,0);">**\~76 base Illumina paired-end
-reads**.</span>
-Attachments:
+Each step may be run as its own job, providing a workaround for the single-job walltime limit. To see how to run each step of Trinity as a single job under the SLURM scheduler on HCC, please check:
------------
+{{% children %}}
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
+\\
-[crane\_modules.xml](attachments/8193286/8127532.xml)
+<span style="color: rgb(0,0,0);font-size: 20.0px;line-height: 1.5;">Useful Information</span>
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[crane\_trinity\_version.xsl](attachments/8193286/8127533.xsl)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_modules.xml](attachments/8193286/8127534.xml)
-(application/octet-stream)  
-<img src="assets/images/icons/bullet_blue.gif" width="8" height="8" />
-[tusker\_trinity\_version.xsl](attachments/8193286/8127535.xsl)
-(application/octet-stream)  
+To test the performance of Trinity (trinity/r2014-04-13p1) on Tusker, we used three pairs of paired-end input fastq files: `small_1.fastq` and `small_2.fastq`, `medium_1.fastq` and `medium_2.fastq`, and `large_1.fastq` and `large_2.fastq`. Some statistics about the input files and the time and memory resources used by Trinity on Tusker are shown in the table below:
+{{< readfile file="/static/html/trinity.html" >}}
+{{% notice tip %}}
+The Inchworm (step 1) and Chrysalis (step 2) steps can be memory intensive. A basic recommendation is to have **1 GB of RAM per 1M ~76-base Illumina paired-end reads**.
+{{% /notice %}}
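The recommendation above amounts to simple arithmetic, sketched here as a shell function. The name `estimate_ram_gb` is hypothetical, and the sketch assumes you pass the read count in whichever way you normally count your reads; the result is a rough starting point for a memory request, not a guarantee.

```bash
# Hypothetical helper: applies the "1 GB of RAM per 1M reads" rule of thumb.
# $1 = number of reads; prints the estimate in GB, rounded up to a whole GB.
estimate_ram_gb() {
    reads="$1"
    echo $(( (reads + 999999) / 1000000 ))
}

# Example with an illustrative read count of 25 million.
estimate_ram_gb 25000000
```

Rounding up keeps the estimate on the safe side for datasets that are not an exact multiple of one million reads.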
+\ No newline at end of file