What makes somatic SNV detection difficult?
CaVEMan and some other calling tools
Statistical approaches to calling somatic SNVs
How well should you expect a tool to perform?
July 2017
What makes somatic SNV detection difficult?
Somatic SNV callers typically set the expected mutation rate to around 5 mutations per megabase, i.e. roughly 15,000 mutations across the ~3,000 Mb human genome.
Source: ICGC Data Portal
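The arithmetic behind that figure can be sketched as follows. This is an illustrative calculation only: the 5 mutations per megabase comes from the slide, the ~3,000 Mb genome size is a round-number assumption, and the function names are hypothetical rather than taken from any caller.

```python
# Rate from the slide; genome size is an assumed round number (~3,000 Mb).
MUTATIONS_PER_MB = 5
GENOME_SIZE_MB = 3_000


def expected_mutations(rate_per_mb, genome_mb):
    """Total mutations expected genome-wide at a given per-megabase rate."""
    return rate_per_mb * genome_mb


def per_base_prior(rate_per_mb):
    """The same rate expressed as a per-base prior probability of mutation."""
    return rate_per_mb / 1_000_000


print(expected_mutations(MUTATIONS_PER_MB, GENOME_SIZE_MB))  # 15000
print(per_base_prior(MUTATIONS_PER_MB))  # 5e-06
```

Expressed per base, the prior is tiny, which is why a caller needs strong evidence in the reads before reporting a somatic variant.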
Low cellularity (tumour DNA content)
Intra-tumour heterogeneity in which multiple tumour cell populations (subclones) exist
Aneuploidy
Unbalanced structural variation (deletions, duplications, etc.)
Matched normal contaminated with cancer DNA:
  adjacent normal tissue may contain residual disease or early tumour-initiating somatic mutations
  circulating tumour DNA in blood normals
Sequencing errors
Alignment artefacts
In this example the tumour was sequenced to an average depth of 50x.
Is this sufficient?
Consider the 50 reads observed at a base where our tumour carries a mutation.
Tumour cellularity
In fact the 'tumour' sample has some normal contamination.
40% of our reads could easily come from normal cells rather than tumour cells.
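To see what this contamination does to the evidence at depth 50, here is a minimal sketch. It assumes a clonal, heterozygous mutation in a diploid region and ignores sequencing error. The function names and the threshold of 20 variant reads are hypothetical choices for illustration, not parameters of CaVEMan or any other caller.

```python
from math import comb


def expected_vaf(cellularity, mutated_copies=1, tumour_copies=2, normal_copies=2):
    """Expected variant allele fraction of a clonal mutation.

    Assumes the mutation sits on `mutated_copies` of the `tumour_copies`
    alleles in every tumour cell and on none of the normal alleles.
    """
    tumour_alleles = cellularity * tumour_copies
    normal_alleles = (1 - cellularity) * normal_copies
    return cellularity * mutated_copies / (tumour_alleles + normal_alleles)


def prob_at_least(k, depth, vaf):
    """P(X >= k) for X ~ Binomial(depth, vaf), i.e. at least k variant reads."""
    return sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(k, depth + 1)
    )


depth = 50
min_variant_reads = 20  # hypothetical threshold, for illustration only
for cellularity in (1.0, 0.6):
    vaf = expected_vaf(cellularity)
    print(
        f"cellularity={cellularity:.0%} expected VAF={vaf:.2f} "
        f"expected variant reads={depth * vaf:.0f} "
        f"P(>= {min_variant_reads} variant reads)="
        f"{prob_at_least(min_variant_reads, depth, vaf):.3f}"
    )
```

Under these assumptions a pure tumour gives an expected 25 variant reads out of 50, while 40% normal contamination lowers the expected allele fraction to 0.3, or about 15 variant reads, before heterogeneity or copy number changes are considered.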
Tumour heterogeneity