March 2021
Differential Gene Expression Analysis Workflow

QC of aligned reads
- Alignment Rate
- Duplication Rate
- Insert Size
- Genomic location of reads
- Transcript coverage
QC of aligned reads - Alignment Rate
- Depends on:
- Quality of Reference Genome
- Quality of library prep and sequencing
- For human and mouse, expect > 95% of reads to align
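
The alignment rate is the percentage of sequenced reads that the aligner placed on the reference. A minimal sketch of the arithmetic follows, assuming the mapped and total read counts have already been taken from the aligner's summary. The function names and the example counts are hypothetical; only the 95% threshold comes from the slide.

```python
def alignment_rate(mapped_reads: int, total_reads: int) -> float:
    """Return the percentage of reads that aligned to the reference."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must be between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads


def passes_alignment_qc(rate_percent: float, threshold: float = 95.0) -> bool:
    """Compare a rate against the > 95% guideline for human and mouse."""
    return rate_percent > threshold


# Illustrative counts only, not real data.
rate = alignment_rate(mapped_reads=19_400_000, total_reads=20_000_000)
print(f"alignment rate: {rate:.1f}%  pass: {passes_alignment_qc(rate)}")
```

A sample that falls below the threshold is a prompt to look at the reference and the library prep, not a verdict on its own.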
QC of aligned reads - Duplication Rate
- The human exome is ~30 Mb, so fewer than 30 million distinct read start positions are possible
- Duplication rates in RNAseq can be > 40%
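
To make the idea concrete, here is a simplified sketch that treats two reads as duplicates when they share the same chromosome, start position, and strand. This is an assumption made for illustration: dedicated tools such as Picard MarkDuplicates apply more detailed rules (for example, using both mates of a pair). The function name and the example positions are hypothetical.

```python
from typing import Iterable, Tuple

ReadPosition = Tuple[str, int, str]  # (chromosome, start, strand)


def duplication_rate(read_positions: Iterable[ReadPosition]) -> float:
    """Percentage of reads whose position was already seen in another read."""
    positions = list(read_positions)
    total = len(positions)
    if total == 0:
        raise ValueError("no reads supplied")
    unique = len(set(positions))  # one representative per distinct position
    return 100.0 * (total - unique) / total


# Illustrative positions only: three of the five reads start at chr1:100 (+).
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 250, "-"),
    ("chr2", 40, "+"),
    ("chr1", 100, "+"),
]
print(f"duplication rate: {duplication_rate(reads):.1f}%")
```

Because the number of distinct start positions is bounded, sequencing more reads than there are positions forces some reads to share a position, which is one reason high duplication rates are not unusual in this kind of data.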
QC of aligned reads - Insert Size
- Insert size is the length of the fragment of mRNA from which the reads are derived
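
For paired-end data, the fragment length can be estimated from where the two mates align. The sketch below assumes a properly oriented pair (forward mate upstream of the reverse mate), 1-based start coordinates, and ungapped alignments. That last assumption matters for RNA-seq, because a pair that spans an intron covers more of the genome than the original fragment. All names and numbers are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def fragment_length(fwd_start: int, rev_start: int, rev_read_length: int) -> int:
    """Distance from the start of the forward mate to the end of the reverse mate."""
    if rev_start < fwd_start:
        raise ValueError("expected the forward mate to start upstream of the reverse mate")
    return rev_start + rev_read_length - fwd_start


def median_insert_size(pairs: List[Tuple[int, int, int]]) -> float:
    """Median fragment length over (fwd_start, rev_start, rev_read_length) tuples."""
    return median(fragment_length(*pair) for pair in pairs)


# Illustrative read pairs only.
pairs = [(1000, 1150, 100), (5000, 5100, 100), (9000, 9200, 100)]
print("median insert size:", median_insert_size(pairs))
```

The median is used rather than the mean so that a few very long apparent fragments do not dominate the summary.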

QC of aligned reads - Genomic location of reads

QC of aligned reads - Transcript coverage

QC Goals
- Ensure the experiment generated the expected data
- Check that the sequencing depth and alignment rates are similar across samples
- Identify poor alignment and its possible causes (sample quality, library prep?)
- Discover contamination from another organism or from DNA
- Identify biases present in the data
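
The goal of checking that depth and alignment rates are similar across samples can be sketched as a simple outlier check: flag any sample whose value differs from the median across samples by more than a chosen relative tolerance. The tolerance values, sample names, and numbers below are assumptions for illustration, not recommended cut-offs.

```python
from statistics import median
from typing import Dict, List


def flag_outlier_samples(metric_by_sample: Dict[str, float], rel_tol: float = 0.2) -> List[str]:
    """Return the names of samples deviating from the median by more than rel_tol."""
    if not metric_by_sample:
        raise ValueError("no samples supplied")
    centre = median(metric_by_sample.values())
    if centre == 0:
        raise ValueError("median is zero, relative deviation is undefined")
    return sorted(
        name
        for name, value in metric_by_sample.items()
        if abs(value - centre) / centre > rel_tol
    )


# Illustrative values only: read depth and alignment rate (%) per sample.
depth = {"s1": 20_000_000, "s2": 21_000_000, "s3": 19_000_000, "s4": 8_000_000}
align = {"s1": 96.5, "s2": 97.1, "s3": 95.8, "s4": 71.0}

print("depth outliers:", flag_outlier_samples(depth))
print("alignment outliers:", flag_outlier_samples(align, rel_tol=0.1))
```

A flagged sample is a candidate for closer inspection (contamination, degraded RNA, a failed library), not an automatic exclusion.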