Course etiquette
Please read the course etiquette if you haven’t done so already.
Shared document
We use a shared Google Docs document for each of the main topics covered during the summer school. The document for this section can be found here.
Docker
docker exec -ti -u 0 docker /bin/bash
Software requirements (if not using Docker)
If you want to follow this tutorial using your own machine, you need to install the following command line tools:
You can install the tools one by one, but a very convenient way to manage installed tools/packages and their dependencies is Conda. If you are new to Conda, please follow this tutorial.
- Go to the /home/ubuntu/Course_Materials/Introduction/practicals/data/sample2_STAR/ directory.
cd /home/ubuntu/Course_Materials/Introduction/practicals/data/sample2_STAR/
- Convert Aligned.sortedByCoord.out.bam to Aligned.sortedByCoord.out.sam
samtools view Aligned.sortedByCoord.out.bam > Aligned.sortedByCoord.out.sam
- Compare the sizes of the BAM and SAM files.
ls -al #will list file sizes
44158799 Aligned.sortedByCoord.out.bam
415515880 Aligned.sortedByCoord.out.sam
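To put a number on the size difference, the ratio of the two file sizes can be computed directly. The following is a minimal sketch; size_ratio is a hypothetical helper name introduced here for illustration, not a samtools command.

```bash
# size_ratio SMALL LARGE
# Hypothetical helper: prints how many times larger LARGE is than SMALL.
size_ratio() {
    local small_bytes large_bytes
    small_bytes=$(wc -c < "$1")   # size of the first file in bytes
    large_bytes=$(wc -c < "$2")   # size of the second file in bytes
    # awk performs the floating-point division that shell arithmetic lacks
    awk -v s="$small_bytes" -v l="$large_bytes" 'BEGIN { printf "%.1f\n", l / s }'
}
```

Usage: size_ratio Aligned.sortedByCoord.out.bam Aligned.sortedByCoord.out.sam
With the sizes listed above (44158799 and 415515880 bytes), the SAM file is about 9.4 times larger than the BAM file.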
- How many of the first 10 reads in Aligned.sortedByCoord.out.bam were mapped uniquely? Hint: mapping quality = 255 for uniquely mapped reads.
samtools view -bh -q 255 Aligned.sortedByCoord.out.bam > Aligned.sortedByCoord.out.unique.bam
samtools view Aligned.sortedByCoord.out.unique.bam |head -n 10
Answer: all 10 reads
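Note that the commands above filter on mapping quality first and then list ten reads, so every read they show is unique by construction. An alternative sketch is to look at the mapping quality (column 5 of the SAM record) of the first 10 reads of the unfiltered file. The helper name count_unique_in_first is hypothetical and introduced only for this example.

```bash
# count_unique_in_first [N]
# Hypothetical helper: reads SAM records (no header) on stdin and prints how
# many of the first N records (default 10) have MAPQ (column 5) equal to 255.
count_unique_in_first() {
    local n="${1:-10}"
    awk -F'\t' -v n="$n" '
        NR > n    { exit }       # stop reading once N records have been seen
        $5 == 255 { c++ }        # column 5 holds the mapping quality
        END       { print c + 0 }
    '
}
```

Usage: samtools view Aligned.sortedByCoord.out.bam | count_unique_in_first 10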
- Sort Aligned.sortedByCoord.out.bam using the samtools sort command and save the output as Aligned.sortedByCoord.out.sorted.bam
samtools sort Aligned.sortedByCoord.out.bam -o Aligned.sortedByCoord.out.sorted.bam
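The input file name suggests that STAR already wrote it in coordinate order, so the sort here mainly serves as practice. One way to check the sort order of a BAM file is to look for the SO:coordinate tag on the @HD line of its header. The sketch below assumes the header is piped in as text; is_coordinate_sorted is a hypothetical helper name.

```bash
# is_coordinate_sorted
# Hypothetical helper: reads a SAM header on stdin and succeeds (exit 0)
# only if the @HD line carries the tag SO:coordinate.
is_coordinate_sorted() {
    awk -F'\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i == "SO:coordinate") found = 1
        }
        END { exit !found }      # exit status 0 when the tag was seen
    '
}
```

Usage: samtools view -H Aligned.sortedByCoord.out.sorted.bam | is_coordinate_sorted && echo "coordinate-sorted"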
- Index Aligned.sortedByCoord.out.sorted.bam using the samtools index command
samtools index Aligned.sortedByCoord.out.sorted.bam
- Extract only uniquely mapped reads from Aligned.sortedByCoord.out.sorted.bam and save them as Aligned.sortedByCoord.out.sorted.unique.bam
samtools view -bh -q 255 Aligned.sortedByCoord.out.sorted.bam > Aligned.sortedByCoord.out.sorted.unique.bam
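A quick sanity check on the filtered file is to tabulate the mapping qualities it contains; after filtering with -q 255, only the value 255 should remain. The following is a minimal sketch, and mapq_histogram is a hypothetical helper name used only here.

```bash
# mapq_histogram
# Hypothetical helper: reads SAM records (no header) on stdin and prints one
# line per distinct MAPQ value as "MAPQ<TAB>count", sorted by MAPQ.
mapq_histogram() {
    awk -F'\t' '
        { n[$5]++ }                              # tally column 5 (MAPQ)
        END { for (q in n) print q "\t" n[q] }
    ' | sort -n
}
```

Usage: samtools view Aligned.sortedByCoord.out.sorted.unique.bam | mapq_histogram
Running the same command on the unfiltered BAM file shows how many reads fall into each mapping quality class.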
- [ADVANCED] How many reads were mapped uniquely?
samtools view -c -q 255 Aligned.sortedByCoord.out.sorted.bam
Answer: 1261841
- [ADVANCED] How many reads mapped uniquely to PIK3CA?
samtools view -c -q 255 Aligned.sortedByCoord.out.sorted.bam "chr3:179148114-179240093"
Answer: 2766
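The region query above relies on the index built earlier. To illustrate what such a query selects, the sketch below counts reads on plain SAM text whose aligned span overlaps the region. It assumes that overlap is judged from the alignment start (column 4) and the reference length implied by the CIGAR string (column 6), including skipped introns (N); count_unique_in_region is a hypothetical helper and is far slower than the indexed samtools query.

```bash
# count_unique_in_region CHR:START-END
# Hypothetical helper: reads SAM records (no header) on stdin and prints the
# number of records with MAPQ >= 255 whose aligned span overlaps the region.
count_unique_in_region() {
    local region="$1"
    local chrom="${region%%:*}"      # text before the colon
    local range="${region#*:}"       # text after the colon
    local start="${range%-*}"
    local end="${range#*-}"
    awk -F'\t' -v chrom="$chrom" -v start="$start" -v end="$end" '
        $3 != chrom || $5 < 255 || $6 == "*" { next }
        {
            # Reference span = summed lengths of the CIGAR operations
            # that consume reference bases (M, D, N, =, X)
            cigar = $6; span = 0
            while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
                len = substr(cigar, 1, RLENGTH - 1) + 0
                op = substr(cigar, RLENGTH, 1)
                if (op ~ /[MDN=X]/) span += len
                cigar = substr(cigar, RLENGTH + 1)
            }
            last = $4 + span - 1     # last reference position covered
            if ($4 <= end && last >= start) count++
        }
        END { print count + 0 }
    '
}
```

Usage: samtools view Aligned.sortedByCoord.out.sorted.bam | count_unique_in_region "chr3:179148114-179240093"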
Benchmarking of the most popular short-read aligners:
Otto C, Stadler PF, Hoffmann S, Lacking alignments? The next-generation sequencing mapper segemehl revisited, Bioinformatics, Volume 30, Issue 13, 1 July 2014, Pages 1837–1843.
Dora Bihary
VIB Center for Cancer Biology, University of Leuven, BE
MRC Cancer Unit, University of Cambridge, UK
Joanna A. Krupka
MRC Cancer Unit, University of Cambridge, UK
Shoko Hirosue
MRC Cancer Unit, University of Cambridge, UK