Motivation
Differential expression
Differential abundance
CRUK bioinformatics summer school - July 2021
Clusters and/or cell types have been identified; we now want to compare sample groups:
Replicates are samples, not cells:
Pseudo-bulk:
An example: t-SNE plots showing clusters and sample groups (left) and samples (right):
Workflow:
compute pseudo-bulk counts by summing counts across the cells of each sample,
perform a bulk-style analysis, with the samples as the (few) replicates
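The summing step can be sketched in a few lines of plain Python. This is an illustration only, not the course's code: the function name `pseudo_bulk`, the toy counts and the choice to aggregate per (label, sample) combination are assumptions made for the example.

```python
from collections import defaultdict

def pseudo_bulk(counts, samples, labels):
    """Sum per-cell counts within each (label, sample) combination.

    counts  : list of per-cell count vectors (one list of gene counts per cell)
    samples : sample of origin of each cell
    labels  : cluster or cell-type label of each cell
    Returns (profiles, n_cells), both keyed by (label, sample).
    """
    n_genes = len(counts[0])
    profiles = defaultdict(lambda: [0] * n_genes)
    n_cells = defaultdict(int)
    for cell_counts, sample, label in zip(counts, samples, labels):
        key = (label, sample)
        profile = profiles[key]
        for g, c in enumerate(cell_counts):
            profile[g] += c          # add this cell's counts to the pseudo-bulk profile
        n_cells[key] += 1            # track how many cells went into the profile
    return dict(profiles), dict(n_cells)

# Toy data (hypothetical): 5 cells x 3 genes
counts = [[1, 0, 3], [2, 1, 0], [0, 0, 5], [4, 2, 1], [1, 1, 1]]
samples = ["s1", "s1", "s2", "s2", "s2"]
labels = ["A", "A", "A", "B", "B"]

profiles, n_cells = pseudo_bulk(counts, samples, labels)
for key in sorted(profiles):
    print(key, profiles[key], "cells:", n_cells[key])
```

Each pseudo-bulk profile then plays the role of one bulk sample; the number of cells behind it is kept because it is useful for filtering later.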
Method:
quasi-likelihood (QL) methods from the edgeR package
negative binomial generalized linear model (NB GLM)
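For reference, the model behind these two bullets can be written compactly. The notation is chosen here for illustration and is not taken from the slides: y_{gi} is the pseudo-bulk count of gene g in sample i, N_i the effective library size, x_i the design-matrix row for sample i, phi the NB dispersion and sigma_g^2 the gene-specific QL dispersion.

```latex
\begin{align}
  \log \mu_{gi} &= \mathbf{x}_i^{\top} \boldsymbol{\beta}_g + \log N_i \\
  \operatorname{var}(y_{gi}) &= \sigma_g^{2} \left( \mu_{gi} + \phi\, \mu_{gi}^{2} \right)
\end{align}
```

The log-fold change between sample groups is a coefficient (or contrast) of beta_g, which is what the test in the steps below examines.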
Steps:
Remove pseudo-bulk samples with very low library sizes, e.g. those built from < 20 cells
Remove genes that are lowly expressed
Correct for composition biases
Test whether the log-fold change between sample groups is significantly different from zero
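The filtering and normalisation steps above can be sketched as follows. This is a simplified stand-in, not edgeR: the gene filter is a plain count threshold rather than edgeR's own filtering rule, and composition bias is handled with median-of-ratios size factors rather than TMM. All function names, thresholds and toy values are assumptions made for the example, and the NB GLM test itself is not implemented here.

```python
import math
from statistics import median

def drop_small_samples(profiles, n_cells, min_cells=20):
    """Keep pseudo-bulk samples built from at least `min_cells` cells."""
    return {s: p for s, p in profiles.items() if n_cells[s] >= min_cells}

def keep_expressed_genes(sample_counts, min_count=10, min_samples=2):
    """Indices of genes with >= min_count reads in >= min_samples samples."""
    n_genes = len(sample_counts[0])
    return [g for g in range(n_genes)
            if sum(s[g] >= min_count for s in sample_counts) >= min_samples]

def size_factors(sample_counts):
    """Median-of-ratios size factors (a stand-in for TMM, not TMM itself).

    Assumes at least one gene has a non-zero count in every sample.
    """
    n_genes = len(sample_counts[0])
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in sample_counts)]
    # log geometric mean of each usable gene across samples
    log_ref = {g: sum(math.log(s[g]) for s in sample_counts) / len(sample_counts)
               for g in usable}
    return [math.exp(median(math.log(s[g]) - log_ref[g] for g in usable))
            for s in sample_counts]

# Toy pseudo-bulk profiles (hypothetical): 4 samples x 4 genes
profiles = {"s1": [10, 0, 20, 30], "s2": [20, 1, 40, 60],
            "s3": [15, 0, 30, 45], "s4": [1, 0, 2, 1]}
n_cells = {"s1": 150, "s2": 210, "s3": 95, "s4": 8}

kept_profiles = drop_small_samples(profiles, n_cells, min_cells=20)
sample_counts = list(kept_profiles.values())
kept = keep_expressed_genes(sample_counts, min_count=10, min_samples=2)
filtered = [[s[g] for g in kept] for s in sample_counts]
factors = size_factors(filtered)
print(sorted(kept_profiles), kept, [round(f, 3) for f in factors])
```

The size factors would then enter the model as offsets, so that the remaining differences between sample groups are not driven by library size or by a handful of highly expressed genes.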
Aim: test for significant changes in per-cluster cell abundance across conditions
Example: which cell types are depleted or enriched upon treatment?
The methods were originally developed for flow cytometry.
Steps:
Count cells assigned to each label, i.e. cluster or cell type
Apply the same workflow as for differential expression above
Share information across labels
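The counting step can be sketched in plain Python. The function name `abundance_table` and the toy assignments are assumptions made for illustration; the resulting table has one row per label and one column per sample, so it can be treated like a small count matrix in which labels take the place of genes.

```python
from collections import Counter

def abundance_table(samples, labels):
    """Count cells per (label, sample); return sample names and a label -> counts table."""
    cell_counts = Counter(zip(labels, samples))
    sample_names = sorted(set(samples))
    label_names = sorted(set(labels))
    # Counter returns 0 for label/sample combinations with no cells
    table = {lab: [cell_counts[(lab, s)] for s in sample_names]
             for lab in label_names}
    return sample_names, table

# Toy assignments (hypothetical): 6 cells from 2 samples, 2 labels
samples = ["s1", "s1", "s2", "s2", "s2", "s1"]
labels = ["A", "B", "A", "A", "B", "A"]

sample_names, table = abundance_table(samples, labels)
print(sample_names)
for lab, row in table.items():
    print(lab, row)
```

In this framing the total number of cells per sample acts as the library size for the abundance analysis.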