1 Indexing the genome for Hisat2

Exercise 1

  1. Check your current working directory and if necessary navigate to the Course_Materials/ directory using the command cd (change directory).

pwd - to check present wworking directory

cd ~/Course_Materials - if necessary

  1. Use ls to list the contents of the directory.

ls

  1. Use ls references to list the contents of the references directory.

ls references

  1. Make a directory called hisat2_index_chr1 inside the references directory. This is where we will create our chr1 index.

mkdir references/hisat2_index_chr1

  1. Create the hisat2 index by runnning the following command:

hisat2-build -p 7 references/Mus_musculus.GRCm38.chr1.fa references/hisat2_index_chr1/grcm38

  1. Why do we use -p 7? Take a look at hisat2-build help.

hisat2-build --help The -p flag is used to instruct hisat2 about how many threads (processors) it should use when running an operation. Using multiple processors in parallel speeds up the analysis. In our case, the machines we are using have 8 processors and so we tell hisat2 to use 7 of these which leaves one free.

  1. How many files are created?

Hisat2 always creates 8 index files that start with our base name end with .X.ht2. So in this case we have mmu.GRCm38.1.ht2 to mmu.GRCm38.8.ht2.

2 Align with Hisat2

Exercise 2

  1. Create a directory called bam (BAM will be our final aligned file format, but we have one more step after alignment to get there).

mkdir bam

  1. Use hisat2 to align the fastq file. Use the following parameters
    • Index (the full genome this time) - references/hisat_index/mmu.GRCm38
    • Fastq file - fastq/MCL1.DL.fastq.gz
    • Output file - bam/SMCL1.DL.sam
    • Set the number of threads (number of processors to use) to 7 - check the help page to find the appropriate flag
    • Add the -t flag so that HISAT2 will print a time log to the console
hisat2 -x references/hisat_index/mmu.GRCm38 \
       -U fastq/MCL1.DL.fastq.gz \
       -S bam/MCL1.DL.sam \
       -p 7 \
       -t

3 Convert the SAM output to BAM

Exercise 3

  1. Transform your aligned SAM file in to a BAM file.

samtools view -b -@ 7 bam/MCL1.DL.sam > bam/MCL1.DL.bam

  1. Sort the BAM file

samtools sort -@ 7 bam/MCL1.DL.bam > bam/MCL1.DL.sorted.bam

  1. Index the sorted BAM file

samtools index bam/MCL1.DL.sorted.bam
generates the file bam/MCL1.DL.sorted.bam.bai