+++
title = "Downloading SRA data from NCBI"
description = "How to download data from NCBI"
weight = "52"
+++

One way to download high-volume data from NCBI is to use command line
utilities, such as **wget**, **ftp** or Aspera Connect **ascp**
plugin. The Aspera Connect plugin is a commonly used high-performance transfer
plugin that usually provides the fastest transfers of these options.

This plugin is available on our clusters as a module. In order to use it, load the appropriate module first:
{{< highlight bash >}}
$ module load aspera-cli
{{< /highlight >}}


The basic usage of the Aspera plugin is:
{{< highlight bash >}}
$ ascp -i $ASPERA_PUBLIC_KEY -k 1 -T -l <max_download_rate_in_Mbps>m anonftp@ftp.ncbi.nlm.nih.gov:/<files_to_transfer> <local_work_output_directory>
{{< /highlight >}}
where **-i** points to the key file, **-k 1** enables resuming partial transfers, **-T** disables encryption for maximum throughput, and **-l** sets the maximum transfer rate.


The **\<files_to_transfer\>** argument in the basic usage above must follow a specific pattern:
{{< highlight bash >}}
<files_to_transfer> = /sra/sra-instant/reads/ByRun/sra/SRR|ERR|DRR/<first_6_characters_of_accession>/<accession>/<accession>.sra
{{< /highlight >}}
where **SRR\|ERR\|DRR** is one of **SRR**, **ERR** or **DRR**, matching the prefix of the target **.sra** file.
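
As an illustration of the pattern, the remote path can be derived from the accession alone. The following is a minimal sketch; the function name `sra_remote_path` is hypothetical and is not part of the `aspera-cli` module.

```shell
# Hypothetical helper (not provided by aspera-cli): print the remote path
# for one accession, following the <files_to_transfer> pattern above.
sra_remote_path() {
    accession="$1"
    # The first 3 characters select the SRR, ERR or DRR directory.
    prefix=$(printf '%s' "$accession" | cut -c1-3)
    # The first 6 characters of the accession name the next directory level.
    first6=$(printf '%s' "$accession" | cut -c1-6)
    case "$prefix" in
        SRR|ERR|DRR) ;;
        *) echo "unsupported accession prefix: $prefix" >&2; return 1 ;;
    esac
    echo "/sra/sra-instant/reads/ByRun/sra/${prefix}/${first6}/${accession}/${accession}.sra"
}

sra_remote_path SRR304976
```

For **SRR304976** this prints `/sra/sra-instant/reads/ByRun/sra/SRR/SRR304/SRR304976/SRR304976.sra`.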


More **ascp** options can be seen by using:
{{< highlight bash >}}
$ ascp --help
{{< /highlight >}}


For example, to download the **SRR304976** file from NCBI into the **data/** directory of your $WORK with a maximum download rate of **1000 Mbps**, use the following command:
{{< highlight bash >}}
$ ascp -i $ASPERA_PUBLIC_KEY -k 1 -T -l 1000m anonftp@ftp.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/SRR/SRR304/SRR304976/SRR304976.sra /work/[groupname]/[username]/data/
{{< /highlight >}}
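
When several runs are needed, the same command can be wrapped in a loop. The sketch below is one possible approach, not a site-provided tool: the `download_sra` function, the `DEST` variable and the `DRY_RUN` switch are assumptions made for this example. With `DRY_RUN` set, it only prints the commands it would run, so the paths can be checked before any transfer starts.

```shell
# Assumed destination; replace the bracketed parts with your own group and user.
DEST="${DEST:-/work/[groupname]/[username]/data/}"

# Hypothetical wrapper: run (or just print) one ascp command per accession.
download_sra() {
    for accession in "$@"; do
        prefix=$(printf '%s' "$accession" | cut -c1-3)   # SRR, ERR or DRR
        first6=$(printf '%s' "$accession" | cut -c1-6)   # e.g. SRR304
        remote="anonftp@ftp.ncbi.nlm.nih.gov:/sra/sra-instant/reads/ByRun/sra/${prefix}/${first6}/${accession}/${accession}.sra"
        if [ -n "${DRY_RUN:-}" ]; then
            # Print the command without running it; the key variable is left unexpanded.
            echo "ascp -i \$ASPERA_PUBLIC_KEY -k 1 -T -l 1000m $remote $DEST"
        else
            # Same options as the example above; stop at the first failed transfer.
            ascp -i "$ASPERA_PUBLIC_KEY" -k 1 -T -l 1000m "$remote" "$DEST" || return 1
        fi
    done
}

# Dry run for the accession used in the example above.
DRY_RUN=1 download_sra SRR304976
```

Calling `download_sra` without `DRY_RUN`, after `module load aspera-cli`, runs the transfers one after another; because of **-k 1**, rerunning it after an interruption resumes partial files.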