- Check your current working directory and if necessary navigate to the
Course_Materials/
directory using the commandcd
(change directory).
pwd
- to check present wworking directory
cd ~/Course_Materials
- if necessary
- Use
ls
to list the contents of the directory.
ls
- Use
ls references
to list the contents of thereferences
directory.
ls references
- Make a directory called
hisat2_index_chr1
inside thereferences
directory. This is where we will create our chr1 index.
mkdir references/hisat2_index_chr1
- Create the hisat2 index by runnning the following command:
hisat2-build -p 7 references/Mus_musculus.GRCm38.chr1.fa references/hisat2_index_chr1/grcm38
- Why do we use
-p 7
? Take a look athisat2-build
help.
hisat2-build --help
The-p
flag is used to instruct hisat2 about how many threads (processors) it should use when running an operation. Using multiple processors in parallel speeds up the analysis. In our case, the machines we are using have 8 processors and so we tell hisat2 to use 7 of these which leaves one free.
- How many files are created?
Hisat2 always creates 8 index files that start with our base name end with
.X.ht2
. So in this case we havemmu.GRCm38.1.ht2
tommu.GRCm38.8.ht2
.
- Create a directory called
bam
(BAM will be our final aligned file format, but we have one more step after alignment to get there).
mkdir bam
- Use hisat2 to align the fastq file. Use the following parameters
- Index (the full genome this time) -
references/hisat_index/mmu.GRCm38
- Fastq file -
fastq/MCL1.DL.fastq.gz
- Output file -
bam/SMCL1.DL.sam
- Set the number of threads (number of processors to use) to 7 - check the help page to find the appropriate flag
- Add the
-t
flag so that HISAT2 will print a time log to the consolehisat2 -x references/hisat_index/mmu.GRCm38 \ -U fastq/MCL1.DL.fastq.gz \ -S bam/MCL1.DL.sam \ -p 7 \ -t
- Transform your aligned SAM file in to a BAM file.
samtools view -b -@ 7 bam/MCL1.DL.sam > bam/MCL1.DL.bam
- Sort the BAM file
samtools sort -@ 7 bam/MCL1.DL.bam > bam/MCL1.DL.sorted.bam
- Index the sorted BAM file
samtools index bam/MCL1.DL.sorted.bam
generates the filebam/MCL1.DL.sorted.bam.bai