1 Just a reminder

Course etiquette
Please read the course etiquette if you have not already done so.

Shared document
We use a shared Google Docs document for each of the main topics covered during the summer school. The document for this section can be found here.

Docker

To open an interactive shell as root (user ID 0) inside the running container named docker:

docker exec -ti -u 0 docker /bin/bash

Software requirements (if not using Docker)
If you want to follow this tutorial on your own machine, you need to install the following command-line tools:

You can install the tools one by one, but a very convenient way to manage installed tools/packages and their dependencies is Conda. If you are new to Conda, please follow this tutorial.

1.1 Exercise 1 (with answers)

  1. Go to the /home/ubuntu/Course_Materials/Introduction/practicals/data/sample2_STAR/ directory.

cd /home/ubuntu/Course_Materials/Introduction/practicals/data/sample2_STAR/
  1. Convert Aligned.sortedByCoord.out.bam to Aligned.sortedByCoord.out.sam

samtools view Aligned.sortedByCoord.out.bam > Aligned.sortedByCoord.out.sam
  1. Compare the sizes of the BAM and SAM files.

ls -al  # lists the files with their sizes in bytes

  44158799  Aligned.sortedByCoord.out.bam
 415515880  Aligned.sortedByCoord.out.sam
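The size difference reflects that BAM is the compressed binary form of SAM. As a rough sketch, the ratio can be computed from the two byte counts listed above (the variable names are only illustrative):

```bash
# Byte counts copied from the ls output above.
bam_bytes=44158799
sam_bytes=415515880

# awk does the floating-point division; LC_ALL=C keeps "." as the decimal mark.
ratio="$(LC_ALL=C awk -v sam="$sam_bytes" -v bam="$bam_bytes" 'BEGIN { printf "%.1f", sam / bam }')"
echo "SAM is about ${ratio}x larger than BAM"
```

For this sample the SAM file is roughly nine times larger than the BAM file.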
  1. How many of the first 10 reads in Aligned.sortedByCoord.out.bam were mapped uniquely? Hint: mapping quality = 255 for uniquely mapped reads.

samtools view -bh -q 255 Aligned.sortedByCoord.out.bam > Aligned.sortedByCoord.out.unique.bam 
samtools view Aligned.sortedByCoord.out.unique.bam | head -n 10

answer: all 10 reads
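Another way to check the first 10 records directly is to look at the mapping quality, which is the fifth tab-separated column of each SAM record. The sketch below is an assumption about how one might do this, not part of the course solution: count_unique_in_head is a hypothetical helper, and the sample records are made up and heavily shortened so the example runs without samtools.

```bash
# Count how many of the first N SAM records on stdin have MAPQ (column 5) equal to 255.
count_unique_in_head() {
    n="$1"
    head -n "$n" | awk -F '\t' '$5 == 255 { c++ } END { print c + 0 }'
}

# Hypothetical, shortened tab-separated records, for illustration only.
sample_sam="$(printf 'r1\t0\tchr1\t100\t255\t50M\nr2\t256\tchr1\t120\t3\t50M\nr3\t16\tchr1\t150\t255\t50M\n')"

unique_count="$(printf '%s\n' "$sample_sam" | count_unique_in_head 10)"
echo "$unique_count"

# With the course data, the same check would read:
# samtools view Aligned.sortedByCoord.out.bam | count_unique_in_head 10
```

Because this reads the unfiltered file, it counts unique reads among the first 10 records as they appear in the BAM, rather than the first 10 records that survive the -q 255 filter.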

  1. Sort Aligned.sortedByCoord.out.bam using the samtools sort command and save the output as Aligned.sortedByCoord.out.sorted.bam

samtools sort Aligned.sortedByCoord.out.bam -o Aligned.sortedByCoord.out.sorted.bam
  1. Index Aligned.sortedByCoord.out.sorted.bam using the samtools index command

samtools index Aligned.sortedByCoord.out.sorted.bam 
  1. Extract only uniquely mapped reads from Aligned.sortedByCoord.out.sorted.bam and save them as Aligned.sortedByCoord.out.sorted.unique.bam

samtools view -bh -q 255 Aligned.sortedByCoord.out.sorted.bam > Aligned.sortedByCoord.out.sorted.unique.bam
  1. [ADVANCED] How many reads were mapped uniquely?

samtools view -c -q 255 Aligned.sortedByCoord.out.sorted.bam

answer: 1261841
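The -q 255 option skips records whose mapping quality is below 255, and -c prints the number of remaining records instead of the records themselves. To see how the other records are distributed, one could tally the MAPQ column. This is a minimal sketch: mapq_tally is a hypothetical helper, and the sample records are invented and shortened so the example runs without samtools.

```bash
# Tally MAPQ values (column 5) of SAM records on stdin; prints "MAPQ<TAB>count" per value.
mapq_tally() {
    cut -f 5 | sort -n | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Hypothetical, shortened tab-separated records, for illustration only (not course data).
tally="$(printf 'r1\t0\tchr1\t100\t255\t50M\nr2\t256\tchr1\t120\t3\t50M\nr3\t16\tchr1\t150\t255\t50M\nr4\t0\tchr1\t170\t0\t50M\n' | mapq_tally)"
printf '%s\n' "$tally"

# With the course data, the same tally would read:
# samtools view Aligned.sortedByCoord.out.sorted.bam | mapq_tally
```

The count on the 255 line of such a tally should match the number reported by samtools view -c -q 255.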

  1. [ADVANCED] How many reads mapped uniquely to PIK3CA?

samtools view -c -q 255 Aligned.sortedByCoord.out.sorted.bam "chr3:179148114-179240093"

answer: 2766
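The quoted argument is a region string of the form chromosome:start-end, and samtools needs the BAM index created earlier to look up a region. As a small sketch, the string can be split with plain shell parameter expansion to see what it covers (the variable names are illustrative, and the length assumes 1-based inclusive coordinates, which is how samtools region strings are interpreted):

```bash
# Region string from the exercise above.
region="chr3:179148114-179240093"

chrom="${region%%:*}"   # text before the colon
range="${region#*:}"    # text after the colon
start="${range%-*}"     # text before the dash
end="${range#*-}"       # text after the dash

# With 1-based inclusive coordinates the span covers end - start + 1 bases.
length=$((end - start + 1))
echo "$chrom $start $end $length"
```

The region used for PIK3CA therefore spans 91980 bases on chr3.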

3 Acknowledgements

Dora Bihary
VIB Center for Cancer Biology, University of Leuven, BE
MRC Cancer Unit, University of Cambridge, UK

Joanna A. Krupka MRC Cancer Unit, University of Cambridge, UK
Shoko Hirosue MRC Cancer Unit, University of Cambridge, UK

Harvard Chan Bioinformatics Core