diff --git a/content/applications/app_specific/bioinformatics_tools/data_manipulation_tools/sratoolkit.md b/content/applications/app_specific/bioinformatics_tools/data_manipulation_tools/sratoolkit.md
index 7efbaf2b1b322a5f96ac2d4486f7feaef3169862..2479607afd00cbc107e59857972f3decb61ba059 100644
--- a/content/applications/app_specific/bioinformatics_tools/data_manipulation_tools/sratoolkit.md
+++ b/content/applications/app_specific/bioinformatics_tools/data_manipulation_tools/sratoolkit.md
@@ -44,6 +44,15 @@ fastq-dump --split-files input_reads.sra
 {{% /panel %}}
 
 This script outputs two fastq paired end reads `input_reads_1.fastq` and `input_reads_2.fastq`.
+
+To download `bam` files from NCBI using the SRA identification, the following commands can be used:
+{{< highlight bash >}}
+$ module load SRAtoolkit/2.11 samtools
+$ sam-dump <sra_id> | samtools view -bS - > <sra_id>.bam
+{{< /highlight >}}
+where `<sra_id>` is the assigned SRA identification in NCBI (e.g., SRR1482462).
+
+
 
 All SRAtoolkit commands are single threaded, and therefore both `#SBATCH --nodes` and `#SBATCH --ntasks-per-node` in the SLURM script are set to **1**.
 
@@ -64,3 +73,7 @@ Other frequently used SRAtoolkit tools are:
 - **vdb-encrypt**: encrypt non-SRA dbGaP data
 - **vdb-decrypt**: decrypt non-SRA dbGaP data
 - **vdb-validate**: validate the integrity of downloaded SRA data
+
+{{% notice info %}}
+**If needed, the location of the caching on a per-user basis can be changed with `vdb-config -i`.**
+{{% /notice %}}
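
Review note: the `sam-dump | samtools view` pipeline added in this diff could be wrapped in a small helper when several accessions need converting. The following is a minimal sketch, not part of SRAtoolkit. The function names `is_sra_run_id`, `bam_name`, and `sra_to_bam` are hypothetical, and the accession check assumes run IDs have the form `SRR`, `ERR`, or `DRR` followed by digits. It also assumes `sam-dump` and `samtools` are already on `PATH`, for example after the `module load` line in the diff.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these ship with SRAtoolkit.

# Succeed if the argument looks like a run accession.
# Assumed pattern: SRR, ERR, or DRR followed by one or more digits.
is_sra_run_id() {
    case "$1" in
        [SED]RR*[!0-9]*) return 1 ;;  # a non-digit follows the prefix
        [SED]RR[0-9]*)   return 0 ;;  # prefix plus digits only
        *)               return 1 ;;
    esac
}

# Print the BAM file name used for a given accession.
bam_name() {
    printf '%s.bam\n' "$1"
}

# Convert one accession to BAM using the pipeline from the diff.
# The pipeline's exit status is that of samtools, the last command.
sra_to_bam() {
    if ! is_sra_run_id "$1"; then
        echo "skipping invalid accession: $1" >&2
        return 1
    fi
    sam-dump "$1" | samtools view -bS - > "$(bam_name "$1")"
}
```

After sourcing the file, `sra_to_bam SRR1482462` would write `SRR1482462.bam` to the current directory. Validating the ID first avoids launching `sam-dump` with an argument it cannot resolve.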