February 2026
Once clusters and/or cell types have been identified, we want to compare sample groups:
Differential expression - Differences in expression between sample groups within a biological state.
Differential abundance - Differences in cell numbers between sample groups within a biological state.
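Replicates are samples, not cells: cells from the same sample are not independent, so counts are summed per sample and cluster (pseudo-bulk) before testing.
Are genes up- or down-regulated between groups, e.g. treated vs control, wildtype vs mutant, or healthy vs diseased?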
Once the pseudo-bulk matrix is generated, we can use any bulk RNA-seq DE method to perform the analysis, such as DESeq2, edgeR or limma-voom.
We should remove pseudosamples created from very few cells, e.g. < 20 cells.
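The aggregation and the cell-count filter can be illustrated with a minimal sketch. It is written in plain Python purely for illustration and is not how Seurat or DESeq2 implement it; the function name `pseudobulk`, the `min_cells` parameter and the toy data are all hypothetical.

```python
from collections import defaultdict

def pseudobulk(cells, min_cells=20):
    """Sum per-cell counts into one profile per (sample, cluster).

    cells: iterable of (sample_id, cluster, {gene: count}) tuples.
    Returns (profiles, n_cells), keeping only pseudosamples built
    from at least `min_cells` cells.
    """
    sums = defaultdict(lambda: defaultdict(int))
    n_cells = defaultdict(int)
    for sample_id, cluster, counts in cells:
        key = (sample_id, cluster)
        n_cells[key] += 1
        for gene, count in counts.items():
            sums[key][gene] += count
    # Drop pseudosamples that were created from too few cells.
    kept = {k: dict(v) for k, v in sums.items() if n_cells[k] >= min_cells}
    return kept, {k: n_cells[k] for k in kept}

# Toy data (made up): 25 cells from sample "s1", 5 cells from sample "s2",
# all assigned to a cluster labelled "T".
toy_cells = (
    [("s1", "T", {"GeneA": 2, "GeneB": 1})] * 25
    + [("s2", "T", {"GeneA": 1})] * 5
)
profiles, sizes = pseudobulk(toy_cells, min_cells=20)
```

In this toy example the pseudosample for "s2" is dropped because it contains only 5 cells, while "s1" is kept with its counts summed over 25 cells.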
We should remove genes that are lowly expressed. This:
* reduces computational work
* improves the accuracy of mean-variance trend modelling
* decreases the severity of the multiple testing correction
Filter: keep genes above a log-CPM threshold in a minimum number of samples, where that minimum is the size of the smallest sample group.
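The gene filter can be sketched as follows. This is a simplified illustration in plain Python, assuming a log2-CPM computed without the prior count that packages such as edgeR add; the function name `filter_genes`, the threshold value and the toy counts are hypothetical.

```python
import math

def filter_genes(counts, group_sizes, min_log_cpm=1.0):
    """Keep genes whose log2-CPM exceeds `min_log_cpm` in at least
    min(group_sizes) pseudosamples.

    counts: {gene: [count per pseudosample]}, all lists in the same order.
    group_sizes: number of pseudosamples in each sample group.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size = total counts per pseudosample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    min_samples = min(group_sizes)  # size of the smallest sample group
    kept = {}
    for gene, row in counts.items():
        n_pass = 0
        for count, lib in zip(row, lib_sizes):
            if count > 0 and lib > 0:
                cpm = count / lib * 1e6
                if math.log2(cpm) > min_log_cpm:
                    n_pass += 1
        if n_pass >= min_samples:
            kept[gene] = row
    return kept

# Toy pseudo-bulk counts (made up): 4 pseudosamples, 2 per group.
toy_counts = {
    "GeneA": [500, 600, 550, 650],
    "GeneB": [0, 0, 1, 0],
    "GeneC": [499500, 399400, 449449, 349350],
}
filtered = filter_genes(toy_counts, group_sizes=[2, 2], min_log_cpm=1.0)
```

Here GeneB passes the threshold in only one pseudosample, fewer than the smallest group size of 2, so it is removed.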
The Seurat function FindMarkers() can also be used for differential expression.
The default Wilcoxon Rank Sum test was used to find cluster markers, but the function is also a wrapper for several other methods, including MAST, DESeq2 and limma.
We can choose the method with the test.use parameter.
We will use DESeq2.
DESeq2 will ‘normalise’ our pseudo-bulk count data to account for composition biases and differences in sequencing depth between samples.
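DESeq2's default normalisation is based on median-of-ratios size factors. The idea can be sketched in plain Python as below; this is an illustration of the principle under simplifying assumptions, not DESeq2's actual code, and the function name `size_factors` and the toy counts are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts: {gene: [count per sample]}, all lists in the same order.
    Only genes with non-zero counts in every sample are used, so that
    the per-gene geometric mean is defined.
    """
    n_samples = len(next(iter(counts.values())))
    log_rows = [
        [math.log(c) for c in row]
        for row in counts.values()
        if all(c > 0 for c in row)
    ]
    # Log of each gene's geometric mean across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        # Ratio of each gene's count to its geometric mean, on the log scale.
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy counts (made up): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = {"GeneA": [10, 20], "GeneB": [100, 200], "GeneC": [5, 10], "GeneD": [0, 3]}
factors = size_factors(toy_counts)
normalised = {g: [c / f for c, f in zip(row, factors)] for g, row in toy_counts.items()}
```

Dividing each sample's counts by its size factor puts the samples on a common scale. Taking the median across genes makes the estimate robust to a small number of highly expressed or strongly changing genes, which is how composition bias is handled.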
Main Steps: