+++
title = "Bowtie2"
description =  "How to run Bowtie2 on HCC resources"
weight = "10"
+++

[Bowtie2](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. Although Bowtie and Bowtie2 are both fast read aligners, there are a few main differences between them:

- Bowtie2 supports gapped alignment with affine gap penalties, without restrictions on the number of gaps and gap lengths.
- For reads longer than about 50 bp, Bowtie2 is generally faster, more sensitive, and uses less memory than Bowtie.
- Bowtie supports only end-to-end alignments, while Bowtie2 supports both end-to-end and local alignments.
- Bowtie has an upper limit on read length of around 1,000 bp, while Bowtie2 does not have any.
- Bowtie2's paired-end alignment is more flexible than Bowtie's.
- Bowtie2 does not align colorspace reads.
- Bowtie and Bowtie2 indices are not compatible.

As with Bowtie, the first step of running Bowtie2 is to build a Bowtie2 index from a reference genome sequence. The basic usage of the
command **bowtie2-build** is:
{{< highlight bash >}}
$ bowtie2-build -f input_reference.fasta index_prefix
{{< /highlight >}}
where **input_reference.fasta** is the input file of reference sequences in fasta format, and **index_prefix** is the prefix of the generated index files. Besides the option **-f**, which is used when the reference input file is a fasta file, the option **-c** can be used when the reference sequences are given on the command line.
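
As a minimal sketch of the indexing step, the snippet below writes a tiny two-sequence reference and builds an index from it. The file name `toy_reference.fasta`, the prefix `toy_index`, and the sequences themselves are made up for illustration. The `command -v` guard is an added convenience that skips the build when Bowtie2 is not available on the `PATH`.

```bash
#!/bin/bash
# Write a tiny, made-up reference in fasta format (illustrative sequences only).
cat > toy_reference.fasta <<'EOF'
>toy_chr1
ACGTACGTTAGCTAGCTAGGATCCGATCGATCGTACGATCGATCGTAGCTAGCTAGCTAACG
>toy_chr2
TTGACCGATAGCTAGGCTAAGCTTCGATCGGATATCGCGCTAGCTAGGCTAGCTAGGATCCA
EOF

# Count the reference sequences by counting fasta header lines.
n_seqs=$(grep -c '^>' toy_reference.fasta)
echo "reference sequences: ${n_seqs}"

# Build the index only if bowtie2-build is on the PATH (e.g. after `module load`).
if command -v bowtie2-build >/dev/null 2>&1; then
    bowtie2-build -f toy_reference.fasta toy_index
    ls toy_index*.bt2
else
    echo "bowtie2-build not found; load the Bowtie2 module first" >&2
fi
```

When the build succeeds, the index is written as several files that share the `toy_index` prefix and end in `.bt2`. That prefix, not an individual file name, is what is later passed to **bowtie2** with **-x**.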

The command **bowtie2** takes a Bowtie2 index and a set of sequencing read files and outputs a set of alignments in SAM format. The general **bowtie2** usage is:
{{< highlight bash >}}
$ bowtie2 -x index_prefix [-q|--qseq|-f|-r|-c] [-1 input_reads_pair_1.[fasta|fastq] -2 input_reads_pair_2.[fasta|fastq] | -U input_reads.[fasta|fastq]] -S bowtie2_alignments.sam [options]
{{< /highlight >}}
where **index_prefix** is the prefix of the index generated with the **bowtie2-build** command, and **options** are optional parameters that can be found in the [Bowtie2 manual](http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml). Bowtie2 supports both single-end (`input_reads.[fasta|fastq]`) and paired-end (`input_reads_pair_1.[fasta|fastq]`, `input_reads_pair_2.[fasta|fastq]`) files in fasta or fastq format. The format of the input files also needs to be specified with one of the following flags: **-q** (fastq files), **--qseq** (Illumina's qseq format), **-f** (fasta files), **-r** (raw files with one sequence per line), or **-c** (sequences given on the command line).
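
Since the format flag has to match the read files, one way to avoid a mismatch is to derive it from the file extension. The sketch below uses a hypothetical helper, `bowtie2_format_flag`, which is not part of Bowtie2. The extension mapping, including `_qseq.txt` for qseq files, is an assumption about naming conventions and may need adjusting. The command is only printed, not executed.

```bash
#!/bin/bash
# Hypothetical helper: map a read file name to the matching bowtie2 format flag.
# The extension-to-flag mapping is an assumption about how files are named.
bowtie2_format_flag() {
    case "$1" in
        *.fastq|*.fq)        echo "-q" ;;
        *.fasta|*.fa|*.fna)  echo "-f" ;;
        *_qseq.txt)          echo "--qseq" ;;
        *) echo "unrecognized read file extension: $1" >&2; return 1 ;;
    esac
}

reads_1="input_reads_pair_1.fastq"
reads_2="input_reads_pair_2.fastq"
flag=$(bowtie2_format_flag "$reads_1")

# Print the command rather than running it, so it can be inspected first.
echo "bowtie2 -x index_prefix ${flag} -1 ${reads_1} -2 ${reads_2} -S bowtie2_alignments.sam"
```

Raw (**-r**) and command-line (**-c**) input have no conventional extension, so the helper leaves those cases to be set by hand.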

An example of how to run Bowtie2 local alignment on Tusker with paired-end fasta files and `8 CPUs` is shown below:
{{% panel header="`bowtie2_alignment.submit`"%}}
{{< highlight bash >}}
#!/bin/sh
#SBATCH --job-name=Bowtie2
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=168:00:00
#SBATCH --mem=10gb
#SBATCH --output=Bowtie2.%J.out
#SBATCH --error=Bowtie2.%J.err

module load bowtie/2.3

bowtie2 -x index_prefix -f -1 input_reads_pair_1.fasta -2 input_reads_pair_2.fasta -S bowtie2_alignments.sam --local -p $SLURM_NTASKS_PER_NODE
{{< /highlight >}}
{{% /panel %}}
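
The submit script above relies on `$SLURM_NTASKS_PER_NODE` being set by Slurm. When the same command is tried outside a job, for example in a quick interactive test, that variable may be empty and **-p** would be left without a value. A small sketch of one possible safeguard follows. The fallback to a single thread is an assumption, and the command is only printed so that it can be checked before running.

```bash
#!/bin/bash
# Use the Slurm-provided task count when present, otherwise fall back to 1 thread.
threads="${SLURM_NTASKS_PER_NODE:-1}"

# Assemble the same alignment command as in the submit script and print it.
cmd="bowtie2 -x index_prefix -f -1 input_reads_pair_1.fasta -2 input_reads_pair_2.fasta -S bowtie2_alignments.sam --local -p ${threads}"
echo "$cmd"
```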


### Bowtie2 Output

Bowtie2 outputs alignments in SAM format, which can be processed further with tools such as SAMtools and GATK. After the header lines, each line of the file describes an alignment (or a read that failed to align) and consists of at least 12 tab-separated fields. Detailed information about the Bowtie2 output fields can be found in the Bowtie2 manual.
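
To illustrate the tab-separated layout, the sketch below builds a toy SAM file and counts aligned and unaligned reads from the FLAG field (column 2), where bit `0x4` marks an unaligned read. The file name `toy_alignments.sam` and both records are made up for illustration and are not real Bowtie2 output.

```bash
#!/bin/bash
# One header line plus two made-up SAM records (tab-separated), for illustration only.
printf '@HD\tVN:1.0\tSO:unsorted\n' > toy_alignments.sam
printf 'read1\t0\ttoy_chr1\t5\t42\t8M\t*\t0\t0\tACGTTAGC\tIIIIIIII\tAS:i:0\n' >> toy_alignments.sam
printf 'read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGCCCC\tIIIIIIII\tYT:Z:UU\n' >> toy_alignments.sam

# Skip header lines (starting with @) and test bit 0x4 of the FLAG field (column 2),
# which marks a read as unaligned.
summary=$(awk -F'\t' '
    /^@/ { next }
    { if (int($2 / 4) % 2 == 1) unaligned++; else aligned++ }
    END { printf "aligned=%d unaligned=%d\n", aligned, unaligned }
' toy_alignments.sam)
echo "$summary"
```

The same `awk` command can be pointed at `bowtie2_alignments.sam` from the job above. With SAMtools available, `samtools view -c -f 4` gives the unaligned count directly.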