Acknowledgments: much of the material in this section has been derived from the chapters on differential expression and abundance in the OSCA book and the Hemberg Group course materials. Additional material concerning miloR has been based on the demonstration from the Marioni Lab.

0.1 Introduction

In the previous section we discussed how we can perform differential expression analysis using pseudo-bulk samples, when multiple biological replicates are available in a multi-condition experiment. Differential expression analysis addresses the question of whether a gene has a different average expression between the two groups being compared. However, another useful question to address is whether the cell composition also differs between conditions. We could imagine, for example, that natural killer cells are more abundant in leukemia samples than in healthy samples.

This type of analysis is referred to as differential abundance analysis.

0.2 Setup

We will continue working with our PBMMC and ETV6-RUNX1 samples, as we did in the differential expression section. Here is a reminder of the code to load the packages and read the data in.

# load packages
library(BiocParallel)
library(scran)
library(scater)
library(miloR)
library(tidyverse)
library(patchwork)

# load the SCE object
sce <- readRDS("R_objects/Caron_clustered.PBMMCandETV6RUNX1.rds")

# check the contents of the object
sce
## class: SingleCellExperiment 
## dim: 33102 30461 
## metadata(1): Samples
## assays(3): counts logcounts reconstructed
## rownames(33102): MIR1302-2HG ENSG00000238009 ... ENSG00000276017
##   ENSG00000278817
## rowData names(4): ID Symbol Type Chromosome
## colnames(30461): 1_AAACCTGAGACTTTCG-1 1_AAACCTGGTCTTCAAG-1 ...
##   11_TTTGTCATCAGTTGAC-1 11_TTTGTCATCTCGTTTA-1
## colData names(13): Sample Barcode ... k.60_cluster.fun.leiden label
## reducedDimNames(3): corrected TSNE_corrected UMAP_corrected
## mainExpName: NULL
## altExpNames(0):
# plot UMAP done on the batch-corrected data
plotReducedDim(sce, dimred = "UMAP_corrected", 
               colour_by = "label", 
               text_by = "label")

0.3 Differential abundance between conditions

# Differential abundance with Milo ----

One method to perform DA analysis is to simply count how many cells occur in each label + condition combination and then test for differences in the counts between the two conditions.

table(sce$label, sce$SampleName)
##                     
##                      ETV6-RUNX1_1 ETV6-RUNX1_2 ETV6-RUNX1_3 ETV6-RUNX1_4
##   B (c1)                     1761         4939         1791         2331
##   B (c2)                      650         1023          251         1274
##   B (c3)                      244           97           17           27
##   T (c4)                       15          193         1897          283
##   Erythrocytes (c5)             1            0           71          178
##   CD20+ B (c6)                  6          115          545          139
##   B (c7)                       83          389           91           95
##   NK T (c8)                     9           44           96           59
##   Erythrocytes (c9)             3           13           95          159
##   Mono (c10)                    1            6           19           43
##   B (c11)                       0           17            9            5
##   CD20+ B (c12)                 0            2            0            0
##   CD20+ B (c13)                 0            1            0            0
##   T (c14)                       0            5           23            4
##   Erythrocytes (c15)            0            1           89         1019
##   Erythrocytes (c16)            0            0           26           79
##   Mono (c17)                    0            0            0            0
##                     
##                      PBMMC_1 PBMMC_2 PBMMC_3
##   B (c1)                  40      69     187
##   B (c2)                  16      39     113
##   B (c3)                  31     110     144
##   T (c4)                 129    1213    1143
##   Erythrocytes (c5)        7     483      18
##   CD20+ B (c6)            33     418     584
##   B (c7)                   6      21       5
##   NK T (c8)               26     388     280
##   Erythrocytes (c9)       26     541     119
##   Mono (c10)             140     405     690
##   B (c11)                  4      32      28
##   CD20+ B (c12)          124      80     253
##   CD20+ B (c13)          251     211     780
##   T (c14)                  0       1       5
##   Erythrocytes (c15)       0     788      39
##   Erythrocytes (c16)       0      11       5
##   Mono (c17)              13      42      37

As these are count data, statistical methods used in flow-cytometry have been adapted to test for differences in cell abundance from such a matrix of counts. This is the approach used in the Differential Abundance Analysis chapter in the OSCA book.

However, this approach relies on a fixed set of clusters determined by us beforehand, which can be limiting in cases where the changes in abundance are more continuous (e.g. along a developmental trajectory, or where cell states are not completely discrete). To address this limitation, Dann et al. (2022) developed a method where cell abundance differences are tested along neighbourhoods of cells on a KNN graph. This means that the results aren’t dependent on our clustering results, and are instead treated in a more “continuous” way.

This method has been implemented by the authors in the R/Bioconductor package MiloR, which we cover in this section. Starting from a graph that faithfully recapitulates the biology of the cell population, the Milo analysis consists of 3 steps:

  • Building a k-nearest neighbours (KNN) graph
  • Sampling representative neighbourhoods in the graph (for computational efficiency)
  • Testing for differential abundance of conditions in all neighbourhoods
  • Accounting for multiple hypothesis testing using a weighted FDR procedure that accounts for the overlap of neighbourhoods

The first step in the analysis is to turn our single cell object into a Milo object. This is very similar to the SingleCellExperiment object we’ve been working with so far, but also includes information about the neighbourhood graphs that are built during the analysis.

# create the Milo object can be simply converted from a SCE
milo <- Milo(sce)

milo
## class: Milo 
## dim: 33102 30461 
## metadata(1): Samples
## assays(3): counts logcounts reconstructed
## rownames(33102): MIR1302-2HG ENSG00000238009 ... ENSG00000276017
##   ENSG00000278817
## rowData names(4): ID Symbol Type Chromosome
## colnames(30461): 1_AAACCTGAGACTTTCG-1 1_AAACCTGGTCTTCAAG-1 ...
##   11_TTTGTCATCAGTTGAC-1 11_TTTGTCATCTCGTTTA-1
## colData names(13): Sample Barcode ... k.60_cluster.fun.leiden label
## reducedDimNames(3): corrected TSNE_corrected UMAP_corrected
## mainExpName: NULL
## altExpNames(0):
## nhoods dimensions(2): 1 1
## nhoodCounts dimensions(2): 1 1
## nhoodDistances dimension(1): 0
## graph names(0):
## nhoodIndex names(1): 0
## nhoodExpression dimension(2): 1 1
## nhoodReducedDim names(0):
## nhoodGraph names(0):
## nhoodAdjacency dimension(2): 1 1
##             used   (Mb) gc trigger   (Mb)  max used   (Mb)
## Ncells   7583827  405.1   12955232  691.9  12955232  691.9
## Vcells 170568604 1301.4  240681272 1836.3 180286371 1375.5

Notice how there are now several slots with prefix nhoods, which we will explore as we go through the analysis.

0.4 Construct KNN graph

# Build KNN graph ----

The first step in our analysis is to build a KNN graph from our cells. This is very similar to what we did earlier in the clustering session, except now the graph will be stored inside the object:

# add KNN graph to Milo object
milo <- buildGraph(milo, 
                   k = 60, 
                   d = 50, 
                   reduced.dim = "corrected", 
                   BPPARAM = MulticoreParam(7))

A couple of things to note about this:

  • k is the number of nearest neighbours to build the graph. This can be adjusted depending on how much of a fine resolution you want. If you use too high a number, you will end up loosing resolution as a higher diversity of cells will be connected in the graph. On the other hand, if you use too low a number, you may increase noise in the data and loose statistical power, as only very few cells will be connected to each other in a neighbourhood. The author’s recommendation is to use a value of k as you did for clustering and UMAP visualisation - this is what we’ve done in this case.
  • d is the number of dimensions from our dimensionality reduction matrix to use. In this case we’re using the MNN-corrected matrix, which contains 50 dimensions, and we use all of them (50 is also the default, so we could have left this option out).

The object now has a new object inside the graph slot, which a standard object type from the igraph package:

# the graph is stored as a standard igraph object
graph(milo)
## IGRAPH 8fbcb03 U--- 30461 1412281 -- 
## + edges from 8fbcb03:
##  [1] 1--  138 1--  351 1--  462 1--  536 1--  569 1--  597 1--  600 1--  606
##  [9] 1--  760 1--  780 1--  990 1-- 1542 1-- 1557 1-- 1674 1-- 1711 1-- 1936
## [17] 1-- 1964 1-- 1973 1-- 2013 1-- 2277 1-- 2538 1-- 2634 1-- 2817 1-- 3123
## [25] 1-- 3483 1-- 3854 1-- 5903 1-- 6022 1-- 6184 1-- 7528 1-- 7925 1-- 8095
## [33] 1-- 9222 1-- 9538 1--10135 1--11901 1--12550 1--12612 1--12783 1--14466
## [41] 1--15031 1--15089 1--15163 1--15209 1--15249 1--15611 1--15614 1--15698
## [49] 1--16125 1--16332 1--16552 1--16795 1--17338 1--17623 1--18057 1--18718
## [57] 1--18745 1--19261 1--19340 1--19630 2--   27 2--   85 2--  216 2--  239
## [65] 2--  308 2--  310 2--  320 2--  333 2--  393 2--  411 2--  412 2--  457
## + ... omitted several edges

0.5 Define neighbourhoods

The next step in the analysis is to define cell neighbourhoods. This essentially consists of picking a focal cell and counting how many other cells it is connected to in the graph (and how many come from each group under comparison). However, if we did this for every single cell in the data, it could get computationally quite intractable. Instead, Milo implements a sampling step, where so-called “index cells” are sampled from the larger graph, and those will be the neighbourhoods used in the DA analysis.

This is done with the makeNhodds() function:

# sample index cells to define neighbourhoods
milo <- makeNhoods(milo, 
                   prop = 0.1, 
                   k = 60, 
                   d = 50, 
                   reduced_dims = "corrected")

# check our object again
milo
## class: Milo 
## dim: 33102 30461 
## metadata(1): Samples
## assays(3): counts logcounts reconstructed
## rownames(33102): MIR1302-2HG ENSG00000238009 ... ENSG00000276017
##   ENSG00000278817
## rowData names(4): ID Symbol Type Chromosome
## colnames(30461): 1_AAACCTGAGACTTTCG-1 1_AAACCTGGTCTTCAAG-1 ...
##   11_TTTGTCATCAGTTGAC-1 11_TTTGTCATCTCGTTTA-1
## colData names(13): Sample Barcode ... k.60_cluster.fun.leiden label
## reducedDimNames(3): corrected TSNE_corrected UMAP_corrected
## mainExpName: NULL
## altExpNames(0):
## nhoods dimensions(2): 30461 1575
## nhoodCounts dimensions(2): 1 1
## nhoodDistances dimension(1): 0
## graph names(1): graph
## nhoodIndex names(1): 1575
## nhoodExpression dimension(2): 1 1
## nhoodReducedDim names(0):
## nhoodGraph names(0):
## nhoodAdjacency dimension(2): 1 1

Some things to note:

  • prop is the proportion of cells to sample (the authors advise 0.1 - 0.2).
  • k and d should be set the same as the k value used to build the original KNN graph. These values will be used for the graph-sampling algorithm used behind the scenes.

We can also see that our Milo object now has the nhoods slots populated. In this case, it indicates that the sampling algorithm picked 1575 index cells to form neighbourhoods for DA analysis.

One good QC metric to check at this stage is to create a histogram of cell counts per neighbourhood. Like we said earlier, when we set the value of k to define our KNN graph, we want to make sure the value is not too low, such that most neighbourhoods have very few cells, nor for it to be so high that we have very large (and presumably heterogenous) neighbourhoods.

Conveniently, MiloR has a plotting function just for this purpose:

# distribution of neighbourhood sizes
plotNhoodSizeHist(milo) +
  geom_vline(xintercept = 100, col = "salmon")

The authors of Milo have stated several different parameters of deciding if your histogram is correct. Either ‘peaking above 20’, ‘peaking between 50 and 100’ or ‘an average neighbourhood size over 5 x N_samples’. Realistically, all of these statements give numbers in the same ballpark and so we can make a judgement on our data. In our case, we can see that our histogram peaks at ~100, suggesting a good neighbourhood size. If this was not the case, we could re-run the analysis from the start, making a KNN graph with a higher/lower k value.

0.6 Counting cells

After defining our sample of neighbourhoods, the next step consists of counting how many cells there are in each neighbourhood for each sample replicate. In this step we need to define which column of our metadata we want to use to do the counting. We should count cells at the biological replicate leve, which in our case is stored in the SampleName column.

# count cells in each neighbourhood
milo <- countCells(milo, 
                   meta.data = colData(milo),
                   samples = "SampleName")

# Milo now has a counts matrix
head(nhoodCounts(milo))
## 6 x 7 sparse Matrix of class "dgCMatrix"
##   ETV6-RUNX1_1 ETV6-RUNX1_2 ETV6-RUNX1_3 ETV6-RUNX1_4 PBMMC_1 PBMMC_2 PBMMC_3
## 1            .            .            2           93       .     103       .
## 2            .            .            .            .      32       9      70
## 3            .            .            3           10       .      67       4
## 4            .           12          208           15       1       3      36
## 5            .            .            .            .      29      15      49
## 6           94            9            2           11       .       1       2

The dimensions of the counts matrix corresponds to the number of neighbourhoods (rows) and samples (columns), in our case a 1575 by 7 matrix.

0.7 Neighbourhood connectivity

There is one more step that we need to do before we are ready to run our DA analysis. It consists of calculating the distances between each neighbourhood. As we said, the neighbourhoods on our graph may partially overlap, so when Milo corrects our p-values for multiple testing, it takes into account the spatial dependence of our tests. For example, neighbourhoods that are closer to each other may have similar p-values, and so we should avoid penalising them too much, otherwise we will sacrifice statistical power.

# calculate distances between neighbourhoods - for p-value correction
milo <- calcNhoodDistance(milo, d = 50, reduced.dim = "corrected")

As before, the value of d (number of dimensions to consider from our MNN-corrected matrix) should be the same that was used for building our initial graph.

0.8 Running DA tests

Finally, we are ready to run the actual differential abundance step. Similarly to the pseudo-bulk appproach for differential expression, MiloR takes advantage of the edgeR package and its negative binomial linear model implementation. This provides a suitable statistical model to account for over-dispersed count data, as is common with these kind of datasets.

First, we need to define a simple table with information about our samples - we will use this table to define our model formula (similarly to what we did in the differential expression step). In this case, we want to detect DA between our PBMMC and ETV6-RUNX1 sample groups, so we define a table based on those two groups. We could also include a batch column in this table, but to keep the demonstration simple we will skip this. We would normally do this if we know there is a batch effect that we want to account for it in DA testing.

# define a table for our model design
sample_info <- unique(colData(milo)[,c("SampleName", "SampleGroup")])
rownames(sample_info) <- sample_info$SampleName

sample_info
## DataFrame with 7 rows and 2 columns
##                SampleName SampleGroup
##               <character> <character>
## ETV6-RUNX1_1 ETV6-RUNX1_1  ETV6-RUNX1
## ETV6-RUNX1_2 ETV6-RUNX1_2  ETV6-RUNX1
## ETV6-RUNX1_3 ETV6-RUNX1_3  ETV6-RUNX1
## ETV6-RUNX1_4 ETV6-RUNX1_4  ETV6-RUNX1
## PBMMC_1           PBMMC_1       PBMMC
## PBMMC_2           PBMMC_2       PBMMC
## PBMMC_3           PBMMC_3       PBMMC

Now we can do the DA test, explicitly defining our experimental design. In this case as discussed we will test the difference between sample groups. The testNhoods function calculates a Fold-change and corrected P-value for each neighbourhood, which indicates whether there is significant differential abundance between sample groups.

# run DA test
da_results <- testNhoods(milo, 
                         design = ~ SampleGroup, 
                         design.df = sample_info, 
                         reduced.dim = "corrected")

# results are returned as a data.frame
da_results %>%
  arrange(SpatialFDR) %>%
  head()
##       logFC   logCPM        F       PValue        FDR Nhood  SpatialFDR
## 2  10.44828 10.36934 23.16635 0.0003797705 0.00677564     2 0.008332264
## 5  10.24767 10.16906 24.44186 0.0003026752 0.00677564     5 0.008332264
## 35 10.81204 10.73030 22.49446 0.0004294659 0.00677564    35 0.008332264
## 53 10.21916 10.13825 24.19410 0.0003161089 0.00677564    53 0.008332264
## 61 10.75541 10.67494 22.14380 0.0004583873 0.00677564    61 0.008332264
## 75 10.39109 10.31169 23.87804 0.0003342642 0.00677564    75 0.008332264

0.9 Inspecting DA results

A good diagnostic plot to make after running our analysis is a histogram of p-values. We should expect a distribution with a peak close to 0 (corresponding to differentially abundant neighbourhoods) and tailing off towards 1. This blog article explains the different p-value histogram profiles you may see and what they can mean.

# p-value histogram
ggplot(da_results, aes(PValue)) + 
  geom_histogram(bins = 50)

Next, we can get an overview of our results as a volcano plot, marking a significance threshold of 10% FDR:

# volcano plot
# each point in this plot corresponds to a neighbourhood (not a cell)
ggplot(da_results, aes(logFC, -log10(SpatialFDR))) + 
  geom_point(aes(colour = FDR < 0.1)) +
  geom_hline(yintercept = 1) 

As we can see, several neighbourhoods fall below our FDR threshold, indicating significant differential abundance of cells between PBMMC and ETV6-RUNX1 samples.

There is an unsual discontinuity in our logFC axis, suggesting a sudden change in abundance in some of our neighbourhoods. We can get a better idea of the fold changes by visualising them as a graph for our neighbourhoods (rather than the original single-cell graph, which would be too big to display). We can then super-impose this graph on our original UMAP (or t-SNE), to compare with our original analysis.

# build neighbourhood graph embedding
milo <- buildNhoodGraph(milo)
# our original UMAP with our previously annotated cell labels
umap_plot <- plotReducedDim(milo, 
                            dimred = "UMAP_corrected", 
                            colour_by = "label", 
                            text_by = "label")

# the neighbourhood map adjusted to match UMAP embedding
nh_graph_plot <- plotNhoodGraphDA(milo, 
                                  da_results, 
                                  layout = "UMAP_corrected",
                                  alpha = 0.05)

# the two plots together side-by-side
umap_plot + nh_graph_plot +
  plot_layout(guides="collect")

On the left we have our original UMAP, with cell/cluster annotations we did previously (by standard clustering and using known marker genes to manually annotate our cells). On the right we have the Milo neighbourhood graph, where each node represents a neighbourhood, coloured by the log fold-change in abundance between PBMMC and ETV6-RUNX1 samples (positive values represent higher abundance in ETV6-RUNX1, and vice-versa). We can see, for example, a cluster of cells with negative log fold changes around our large B cell cluster, which likely explains the discontinuity in values we saw earlier in our volcano plot.

Although we have annotations for our cells, these annotations at the moment are not present in the differential abundance table:

head(da_results)
##       logFC    logCPM            F       PValue        FDR Nhood  SpatialFDR
## 1  4.117087 10.710946  2.859558485 0.1110087119 0.18860703     1 0.196064468
## 2 10.448283 10.369340 23.166345664 0.0003797705 0.00677564     2 0.008332264
## 3  5.983684 10.072605  7.705068618 0.0138890153 0.05976830     3 0.064812658
## 4 -0.112015 10.105450  0.003484294 0.9536906783 0.95868169     4 0.958488172
## 5 10.247670 10.169059 24.441857897 0.0003026752 0.00677564     5 0.008332264
## 6 -2.817131  8.778342  2.700531085 0.1206144892 0.19922183     6 0.206753593

We can transfer our cell labels to the neighbourhood DA results, by simply counting how many cells within that neighbourhood share the same label, and assign the label with the highest counts.

# annotate our neighbourhood DA results with our cell labels
da_results <- annotateNhoods(milo, da_results, coldata_col = "label")
head(da_results)
##       logFC    logCPM            F       PValue        FDR Nhood  SpatialFDR
## 1  4.117087 10.710946  2.859558485 0.1110087119 0.18860703     1 0.196064468
## 2 10.448283 10.369340 23.166345664 0.0003797705 0.00677564     2 0.008332264
## 3  5.983684 10.072605  7.705068618 0.0138890153 0.05976830     3 0.064812658
## 4 -0.112015 10.105450  0.003484294 0.9536906783 0.95868169     4 0.958488172
## 5 10.247670 10.169059 24.441857897 0.0003026752 0.00677564     5 0.008332264
## 6 -2.817131  8.778342  2.700531085 0.1206144892 0.19922183     6 0.206753593
##                label label_fraction
## 1 Erythrocytes (c15)      1.0000000
## 2      CD20+ B (c12)      0.9819820
## 3  Erythrocytes (c9)      1.0000000
## 4             T (c4)      1.0000000
## 5      CD20+ B (c12)      1.0000000
## 6             B (c1)      0.9495798

The result includes the fraction of cells in the neighbourhood that shared that label. We can look at the distribution of this fraction as a quality check:

# histogram of fraction of cells in the neighbourhood with the same label
ggplot(da_results, aes(label_fraction)) + 
  geom_histogram(bins = 50)

We should expect the plot to be very biased towards 1, as is seen. This makes sense, since Milo’s neighbourhoods should be very similar to our previously-defined clusters. After all, both our clustering and Milo used KNN graphs as the starting point, and we’ve used the same settings for the k and d parameters in both steps of our analysis.

Despite most neighbourhoods being homogenous, some seem to have mixed labels. We can highlight these in our results table:

# add "mixed" label to neighbourhoods with less 70% consistency
da_results$label <- ifelse(da_results$label_fraction < 0.7, 
                           "Mixed", 
                           da_results$label)

head(da_results)
##       logFC    logCPM            F       PValue        FDR Nhood  SpatialFDR
## 1  4.117087 10.710946  2.859558485 0.1110087119 0.18860703     1 0.196064468
## 2 10.448283 10.369340 23.166345664 0.0003797705 0.00677564     2 0.008332264
## 3  5.983684 10.072605  7.705068618 0.0138890153 0.05976830     3 0.064812658
## 4 -0.112015 10.105450  0.003484294 0.9536906783 0.95868169     4 0.958488172
## 5 10.247670 10.169059 24.441857897 0.0003026752 0.00677564     5 0.008332264
## 6 -2.817131  8.778342  2.700531085 0.1206144892 0.19922183     6 0.206753593
##                label label_fraction
## 1 Erythrocytes (c15)      1.0000000
## 2      CD20+ B (c12)      0.9819820
## 3  Erythrocytes (c9)      1.0000000
## 4             T (c4)      1.0000000
## 5      CD20+ B (c12)      1.0000000
## 6             B (c1)      0.9495798

Finally, we can visualise the distribution of DA fold changes in different labels.

# distribution of logFC across neighbourhood labels
plotDAbeeswarm(da_results, group.by = "label")

We can see that several clusters are enriched and some are depleted between our sample groups. There are not many DA neighbourhoods with a mixed label either. And we can see that B cells cluster 1 has quite a discontinuity in log fold changes - this is what was causing the odd distribution in the volcano plot, and we saw earlier is likely due to a sub-population of B cells with different representation across our samples.

It is also worth noting that this analysis may also reveal uncorrected batch effects, which would lead to an artificial separation of our samples according to their group. As always, we should always proceed in our analysis with care and back our conclusions with further confirmatory experiments.

There are further downstream analysis possible to identify genes that differentiate between our different neighbourhoods. This is similar in spirit to the marker gene analysis we did earlier, but based on the neighbourhood graph constructed by Milo. You can consult the package’s main vignette to see examples of these downstream analysis.

0.10 Session information

sessionInfo()
## R version 4.2.2 Patched (2022-11-10 r83330)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] patchwork_1.1.2             forcats_0.5.2              
##  [3] stringr_1.4.1               dplyr_1.0.10               
##  [5] purrr_0.3.4                 readr_2.1.3                
##  [7] tidyr_1.2.1                 tibble_3.1.8               
##  [9] tidyverse_1.3.2             miloR_1.4.0                
## [11] edgeR_3.38.4                limma_3.52.3               
## [13] scater_1.24.0               ggplot2_3.3.6              
## [15] scran_1.24.1                scuttle_1.6.3              
## [17] SingleCellExperiment_1.18.1 SummarizedExperiment_1.26.1
## [19] Biobase_2.56.0              GenomicRanges_1.48.0       
## [21] GenomeInfoDb_1.32.4         IRanges_2.30.1             
## [23] S4Vectors_0.34.0            BiocGenerics_0.42.0        
## [25] MatrixGenerics_1.8.1        matrixStats_0.62.0         
## [27] BiocParallel_1.30.4        
## 
## loaded via a namespace (and not attached):
##   [1] googledrive_2.0.0         ggbeeswarm_0.6.0         
##   [3] colorspace_2.0-3          ellipsis_0.3.2           
##   [5] bluster_1.6.0             XVector_0.36.0           
##   [7] BiocNeighbors_1.14.0      fs_1.5.2                 
##   [9] farver_2.1.1              graphlayouts_0.8.4       
##  [11] ggrepel_0.9.1             lubridate_1.8.0          
##  [13] fansi_1.0.3               xml2_1.3.3               
##  [15] codetools_0.2-18          sparseMatrixStats_1.8.0  
##  [17] cachem_1.0.6              knitr_1.40               
##  [19] polyclip_1.10-4           jsonlite_1.8.0           
##  [21] broom_1.0.1               cluster_2.1.4            
##  [23] dbplyr_2.2.1              ggforce_0.4.1            
##  [25] httr_1.4.4                compiler_4.2.2           
##  [27] dqrng_0.3.0               backports_1.4.1          
##  [29] assertthat_0.2.1          Matrix_1.5-3             
##  [31] fastmap_1.1.0             gargle_1.2.1             
##  [33] cli_3.4.1                 tweenr_2.0.2             
##  [35] BiocSingular_1.12.0       htmltools_0.5.4          
##  [37] tools_4.2.2               rsvd_1.0.5               
##  [39] igraph_1.3.5              gtable_0.3.1             
##  [41] glue_1.6.2                GenomeInfoDbData_1.2.8   
##  [43] Rcpp_1.0.9                cellranger_1.1.0         
##  [45] jquerylib_0.1.4           vctrs_0.4.1              
##  [47] DelayedMatrixStats_1.18.0 ggraph_2.1.0             
##  [49] xfun_0.33                 rvest_1.0.3              
##  [51] beachmat_2.12.0           lifecycle_1.0.2          
##  [53] irlba_2.3.5.1             gtools_3.9.4             
##  [55] statmod_1.5.0             googlesheets4_1.0.1      
##  [57] zlibbioc_1.42.0           MASS_7.3-58              
##  [59] scales_1.2.1              tidygraph_1.2.2          
##  [61] hms_1.1.2                 parallel_4.2.2           
##  [63] RColorBrewer_1.1-3        yaml_2.3.5               
##  [65] gridExtra_2.3             sass_0.4.2               
##  [67] stringi_1.7.8             highr_0.9                
##  [69] ScaledMatrix_1.4.1        rlang_1.0.6              
##  [71] pkgconfig_2.0.3           bitops_1.0-7             
##  [73] evaluate_0.16             lattice_0.20-45          
##  [75] labeling_0.4.2            cowplot_1.1.1            
##  [77] tidyselect_1.1.2          magrittr_2.0.3           
##  [79] R6_2.5.1                  generics_0.1.3           
##  [81] metapod_1.4.0             DelayedArray_0.22.0      
##  [83] DBI_1.1.3                 haven_2.5.1              
##  [85] pillar_1.8.1              withr_2.5.0              
##  [87] RCurl_1.98-1.9            crayon_1.5.1             
##  [89] modelr_0.1.9              utf8_1.2.2               
##  [91] tzdb_0.3.0                rmarkdown_2.16           
##  [93] viridis_0.6.2             readxl_1.4.1             
##  [95] locfit_1.5-9.6            grid_4.2.2               
##  [97] reprex_2.0.2              digest_0.6.29            
##  [99] munsell_0.5.0             beeswarm_0.4.0           
## [101] viridisLite_0.4.1         vipor_0.4.5              
## [103] bslib_0.4.0