1 Normalisation - Exercises

Exercise: apply the deconvolution normalisation method to a single sample: ETV6-RUNX1_1 (aka GSM3872434).

library(scater)
library(scran)
library(tidyverse)
library(BiocSingular)
library(BiocParallel)

bpp <- MulticoreParam(7) # number of worker processes; adjust to your machine

1.1 Load object

We will load the R object created after QC.

# Read object in:
# remember getwd() and dir()
sce <- readRDS("../Robjects/Caron_filtered_genes.rds")
colData(sce)$SampleName <- colData(sce)$Sample

Select cells for ETV6-RUNX1_1 sample:

# keep the full object, then get the cell barcodes for the ETV6-RUNX1_1 sample
sce.master <- sce
vec.bc <- colData(sce.master) %>%
    data.frame() %>%
    filter(SampleName == "ETV6-RUNX1_1") %>%
    pull(Barcode)

Check the number of cells in the sample:

table(colData(sce.master)$Barcode %in% vec.bc)

Subset the cells from the SCE object:

tmpInd <- which(colData(sce.master)$Barcode %in% vec.bc)
sce <- sce.master[,tmpInd]
sce

Check the column data:

head(colData(sce))
table(colData(sce)$SampleName)

1.2 Exercise 1: Deconvolution

Clusters of cells are first identified to help form sensible pools of cells.

1.2.1 Cluster cells

set.seed(100) # quickCluster() uses an approximate PCA (irlba), so set the seed for reproducibility
clust <- quickCluster(sce) # slow with all cells
table(clust)

1.2.2 Compute size factors

Size factors are then computed using the identified clusters.

# deconvolve


# set size factors


# size factors distribution summary

Check the relationship between the deconvolution size factors and the library size factors. To do this, first compute the library size factors:

# compute library size factors


# make data frame keeping library and deconvolution size factors for plotting

Generate a scatter plot of the library size factors against the deconvolution size factors:

# plot deconv.sf against lib.sf

# colour by library size
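
A sketch of one way to approach the comparison steps above, assuming the deconvolution size factors have already been stored in `sce` with `sizeFactors()`. The names `lib.sf`, `sf.df` and `sf.plot` are illustrative only, and the log-scaled axes are a presentation choice rather than a requirement.

```r
library(scater)   # attaches scuttle: librarySizeFactors(), perCellQCMetrics()
library(ggplot2)

# deconvolution size factors, assumed to have been set earlier with sizeFactors()
deconv.sf <- sizeFactors(sce)
stopifnot(!is.null(deconv.sf))

# compute library size factors (library size scaled to a mean of 1 across cells)
lib.sf <- librarySizeFactors(sce)

# make data frame keeping library and deconvolution size factors for plotting
sf.df <- data.frame(
    lib.sf    = lib.sf,
    deconv.sf = deconv.sf,
    lib.size  = perCellQCMetrics(sce)$sum  # total counts per cell
)

# plot deconv.sf against lib.sf, coloured by library size
sf.plot <- ggplot(sf.df, aes(x = lib.sf, y = deconv.sf, colour = lib.size)) +
    geom_point() +
    geom_abline(slope = 1, intercept = 0) +  # line where both factors agree
    scale_x_log10() +
    scale_y_log10()
sf.plot
```

Points lying away from the diagonal are cells for which the two methods disagree.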

1.2.3 Apply size factors

Apply the deconvolution size factors to the dataset:
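
A minimal sketch of this last step, assuming the deconvolution size factors are already stored in `sce`. `logNormCounts()` uses the size factors found in the object; if none are set, it falls back to library size factors.

```r
library(scater)  # attaches scuttle, which provides logNormCounts()

# divide counts by the stored size factors and log2-transform;
# the result is added to 'sce' as a new assay named "logcounts"
sce <- logNormCounts(sce)

# check that the normalised assay is now present
assayNames(sce)
```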