9 Cluster Marker Genes

In order to aid the interpretation of the clustering results that we covered in the previous section, it is helpful to identify genes that contribute to the separation of cells into those clusters.

The main approach to achieve this, is to identify genes that are differently expressed between clusters. These may be, for example, exclusively expressed in a single cluster or perhaps differentiate between a few different clusters. There are different methods to identify expression differences between clusters: using mean expression level, or the ranking of the gene by its expression, or the proportions of cells that express the gene.

Our main objective in this section is to cover some of the methods that can be used to achieve this goal, and obtain a summary table of results. As always, the OSCA chapter on marker detection contains additional detail and advice.

9.1 Load Data

# load the libraries
library(Seurat)
library(sctransform)
library(glmGamPoi)
library(tidyverse)
library(patchwork)

First we will load in a version of our data that has been processed up until this point in the course. QCed, normalised, dimensionality reduced and clustered.

# load in the data
seurat_object <- readRDS("RObjects/Clustered.500.rds")

The cluster information is stored in our objects meta.data slot and is called “leiden_cluster”.

To remind ourselves, we can visualise the clusters on a UMAP:

# visualise UMAP with cluster labels
DimPlot(seurat_object,
        reduction = "umap",
        group.by = "leiden_cluster",
        label = TRUE,
        label.size = 5) + 
  NoLegend()

Our objective is to identify genes that distinguish these clusters from one another - “cluster marker genes”. Intuitively we hope that the clusters relate to specific cell populations, and therefore we are trying to find genes that will allow us to identify the cell types for each cluster.

For example genes such as the “CST3” gene, which is a known monocyte marker:

# visualise known monocyte marker gene CST3
FeaturePlot(seurat_object,
            reduction = "umap",
            features = "CST3")

9.2 Identifying cluster marker genes

Although we have defined our clusters based on the batch-corrected expression values, these should not be used for for gene-based analyses like marker gene detection. Instead, we should use the uncorrected (normalised) expression values for differential expression between clusters. This is because data integration algorithms bring cells together based on their overall gene expression, but for each gene individually the data transformation may introduce artificial agreement between batches, which is not ideal for gene-level differential analysis. Furthermore, the severity of these biases is dependent on the parameters used for data integration.

Valid assays to use in gene based differential analysis tests are the normalised counts obtained from the SCTransform method and stored in the “SCT” assay.

Seurat has two functions we can use for this, FindMarkers() and FindAllMarkers(). The former allows us to compare two clusters at a time, in a pairwise manner. The latter automates this for all clusters but can also be used to test groups of clusters against other clusters or against all cells.

Prior to performing differential expression, we first run PrepSCTFindMarkers().

“This allows us to use the”corrected counts" that are stored in the data slot of the the SCT assay. Corrected counts are obtained by setting the sequencing depth for all the cells to a fixed value and reversing the learned regularized negative-binomial regression model. PrepSCTFindMarkers() ensures that the fixed value is set properly."

– Seurat Vignette

# prepare the data for marker gene detection
seurat_object <- PrepSCTFindMarkers(seurat_object,
                                    assay = "SCT")

## Found 4 SCT models. Recorrecting SCT counts using minimum median counts: 3423.5

9.2.1 Finding markers of a single cluster

To find markers of a single cluster, we can use FindMarkers(). For example, to find genes that are differentially expressed in cluster 2 compared to all other clusters:

# Compare cells in cluster 2 against all other cells
markers_cluster2 <- FindMarkers(seurat_object,
                                ident.1 = 2,
                                ident.2 = NULL,
                                assay = "SCT",
                                group.by = "leiden_cluster")
head(markers_cluster2)

##                    p_val avg_log2FC pct.1 pct.2     p_val_adj
## IFITM1      0.000000e+00   2.877734 0.905 0.199  0.000000e+00
## SKAP1       0.000000e+00   3.145875 0.755 0.092  0.000000e+00
## TC2N        0.000000e+00   3.787626 0.503 0.036  0.000000e+00
## NELL2       0.000000e+00   5.667427 0.423 0.012  0.000000e+00
## PCED1B-AS1 4.632024e-318   3.495943 0.666 0.099 9.732808e-314
## RPS27      1.291867e-313   1.390475 1.000 0.988 2.714471e-309

The output is a data frame with the following columns: - p_val: p-value of the test - avg_log2FC: average log2 fold change between the two groups - pct.1: percentage of cells in cluster 2 that express the gene - pct.2: percentage of cells in the other clusters that express the gene - p_val_adj: adjusted p-value for multiple testing

Using pvalues is inappropriate for marker gene detection. This is because the differential expression is performed on the same data used to obtain the clusters and would therefore be “data dredging”. If we have used gene expression to separate the cells into clusters we would expect thre to be a difference in expression between clusters. Admitedly, if simply used to identify possible marker genes by ranking this is largely harmless but it becuse problematic when the pvalues are used to claim statistically significant seperation between clusters. There is also an issue of pseudoreplication where in this type of differential application cells from the same cluster are treated as replicates when infact cells from the same sample are not independent. There is a more detailed discussion of these issues in the OSCA book.

Instead, we should use the log fold change and percentage of cells expressing the gene to identify marker genes. For example, we could filter for genes that are expressed in at least 70% of cells in cluster 2 and have a log fold change greater than 1:

# filter for genes that:
#  - are expressed in at least 70% of cells in cluster 2
#  - and have a log fold change greater than 1
markers_cluster2_filtered <- markers_cluster2 %>%
  filter(pct.1 > 0.70 & abs(avg_log2FC) > 1)
  
head(markers_cluster2_filtered)

##                  p_val avg_log2FC pct.1 pct.2     p_val_adj
## IFITM1    0.000000e+00   2.877734 0.905 0.199  0.000000e+00
## SKAP1     0.000000e+00   3.145875 0.755 0.092  0.000000e+00
## RPS27    1.291867e-313   1.390475 1.000 0.988 2.714471e-309
## RPL21    5.810414e-307   1.433676 1.000 0.992 1.220884e-302
## LEPROTL1 4.938806e-304   3.088247 0.747 0.152 1.037742e-299
## RPS29    5.575920e-281   1.269738 1.000 0.978 1.171612e-276

This leaves us with 33 genes that are potential markers of cluster 2. We can plot some of these marker genes on the UMAP.

# visualise the expression of one of the marker genes
# On the UMAP
fplot <- FeaturePlot(seurat_object, 
                     features = "SKAP1",
                     label = TRUE,
                     reduction = "umap")

# As a violin plot
vplot <- VlnPlot(seurat_object, features = "SKAP1",
                 group.by = "leiden_cluster",
                 pt.size = 0) + 
  NoLegend()

# SKAP1 is a TCR gene
# It also appears in cluster 11 next to it
fplot + vplot + plot_layout(ncol = 2)

9.2.2 Finding markers of all clusters

To do the same process but for all clusters we use FindAllMarkers().

# find markers for all clusters automatically
markers_all <- FindAllMarkers(seurat_object,
                              assay = "SCT",
                              group.by = "leiden_cluster")

## Calculating cluster 1

## Calculating cluster 2

## Calculating cluster 3

## Calculating cluster 4

## Calculating cluster 5

## Calculating cluster 6

## Calculating cluster 7

## Calculating cluster 8

## Calculating cluster 9

## Calculating cluster 10

## Calculating cluster 11

## Calculating cluster 12

## Calculating cluster 13

## Calculating cluster 14

## Calculating cluster 15

## Calculating cluster 16

head(markers_all)

##               p_val avg_log2FC pct.1 pct.2     p_val_adj cluster  gene
## PCDH9  0.000000e+00   3.197622 0.868 0.220  0.000000e+00       1 PCDH9
## PDE4D 3.848952e-272   2.776117 0.865 0.290 8.087417e-268       1 PDE4D
## BACH2 1.638420e-267   2.755084 0.954 0.494 3.442648e-263       1 BACH2
## CD79B 4.453671e-251   2.459362 0.916 0.389 9.358053e-247       1 CD79B
## IGHM  3.056034e-227   1.762352 0.935 0.418 6.421339e-223       1  IGHM
## TCL1A 8.412701e-199   1.850403 0.957 0.467 1.767677e-194       1 TCL1A

The output is a data frame with the following columns:

gene: gene name
cluster: cluster number
p_val: p-value of the test
avg_log2FC: average log2 fold change between the cluster and all other clusters
pct.1: percentage of cells in the cluster that express the gene
pct.2: percentage of cells in the other clusters that express the gene
p_val_adj: adjusted p-value for multiple testing

We can filter the dataframe to get the same information as we did before. To get the marker genes for cluster 2:

# filter the results for cluster 2
# we should get the same results as before
markers_all_cluster2 <- markers_all %>%
  filter(cluster == 2 & pct.1 > 0.70 & abs(avg_log2FC) > 1)

head(markers_all_cluster2)

##                    p_val avg_log2FC pct.1 pct.2     p_val_adj cluster     gene
## IFITM1.1    0.000000e+00   2.877734 0.905 0.199  0.000000e+00       2   IFITM1
## SKAP1.1     0.000000e+00   3.145875 0.755 0.092  0.000000e+00       2    SKAP1
## RPS27.1    1.291867e-313   1.390475 1.000 0.988 2.714471e-309       2    RPS27
## RPL21.1    5.810414e-307   1.433676 1.000 0.992 1.220884e-302       2    RPL21
## LEPROTL1.1 4.938806e-304   3.088247 0.747 0.152 1.037742e-299       2 LEPROTL1
## RPS29.1    5.575920e-281   1.269738 1.000 0.978 1.171612e-276       2    RPS29

We can see we have the same 33 genes.

9.3 Exercise 1

We saw earlier that known monocyte marker CST3 was expressed in cluster 13. Use the FindMarkers() function or the FindAllMarkers() function to investigate what other potentially good marker genes would be for that cluster.

Answer

# using FindMarkers()
markers_cluster13 <- FindMarkers(seurat_object,
                                 ident.1 = 13,
                                 ident.2 = NULL,
                                 assay = "SCT",
                                 group.by = "leiden_cluster")

markers_cluster13_filtered <- markers_cluster13 %>%
  filter(pct.1 > 0.70 & abs(avg_log2FC) > 1)

head(markers_cluster13_filtered)

##         p_val avg_log2FC pct.1 pct.2 p_val_adj
## TYROBP      0   6.901316 0.981 0.039         0
## LYZ         0   9.032923 0.947 0.045         0
## FCER1G      0   6.230357 0.927 0.030         0
## S100A11     0   6.149912 0.937 0.041         0
## LGALS1      0   5.466743 0.961 0.076         0
## CST3        0   5.898520 0.956 0.092         0

We could also filter our markers_all dataframe to get the same information:

# filter the markers_all dataframe for cluster 13
markers_cluster13_all <- markers_all %>%
  filter(cluster == 13 & pct.1 > 0.70 & abs(avg_log2FC) > 1)

head(markers_cluster13_all)

##           p_val avg_log2FC pct.1 pct.2 p_val_adj cluster    gene
## TYROBP.11     0   6.901316 0.981 0.039         0      13  TYROBP
## LYZ.7         0   9.032923 0.947 0.045         0      13     LYZ
## FCER1G.8      0   6.230357 0.927 0.030         0      13  FCER1G
## S100A11.9     0   6.149912 0.937 0.041         0      13 S100A11
## LGALS1.8      0   5.466743 0.961 0.076         0      13  LGALS1
## CST3.6        0   5.898520 0.956 0.092         0      13    CST3

This gives us 91 potential marker genes for cluster 13.

If we think these are too many, we could be more stringent with our filtering criteria, for example by increasing the log fold change threshold:

# very high log fold change threshold
markers_cluster13_stricter <- markers_cluster13 %>%
  filter(pct.1 > 0.70 & abs(avg_log2FC) > 4)

# results in only 20 genes
nrow(markers_cluster13_stricter)

## [1] 20

9.4 Heatmap of marker genes

We have already seen how we can use the VlnPlot() function to visualise the distribution of expression in our data between clusters. We have also seen how to use FeaturePlot() to visualise a gene’s expression on the projected reduced dimensionality space.

Another useful type of visualisation is to use heatmaps to show the expression of these genes of interest. We will demonstrate this using the top marker genes for cluster 13. Lets take the top 20 genes from our filtered markers.

# top 20 marker genes for cluster 13
top_markers_cluster13 <- markers_cluster13_filtered %>%
  slice_max(n = 20, order_by = abs(avg_log2FC)) %>%
  rownames()

# heatmap of expression for these genes
# we change default colour scale
# and remove colour legend (clusters)
DoHeatmap(seurat_object,
          features = top_markers_cluster13,
          group.by = "leiden_cluster") +
  scale_fill_viridis_c() +
  guides(colour = "none")

## Warning in DoHeatmap(seurat_object, features = top_markers_cluster13, group.by = "leiden_cluster"): The following features were
## omitted as they were not found in the scale.data layer for the SCT assay: FCN1, CSTA, VCAN

## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.

Another useful visualisation is to use dot plots of expression that show both the average gene expression (as a colour scale) and the number of cells in which the gene is detected (as the size of the points). We can generate such a plot using the plotDots() function:

# dot plot of expression for these genes
# we change default colour scale
# and rotate x-axis labels for better visibility
DotPlot(seurat_object, 
        features = top_markers_cluster13,
        group.by = "leiden_cluster") +
  scale_colour_viridis_c() +
  theme(axis.text.x = element_text(angle = 45, 
                                   hjust = 1))

## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.

9.5 Cell Type Labelling

One of the main tasks we often want to perform is annotating our cells as known types of cells occurring in our sampled tissue. This requires prior knowledge of cell transcriptomic states and therefore becomes easier if there are well-curated resources available. However, for less well-studied tissues these may not be available and so cell type annotation may rely on “manual annotation” using a small set of genes with known cell-specific expression (e.g. from microscopy data, qPCR on cell-sorted samples, etc.).

In this section we will do a very simple manual labelling of our clusters, based on known genes expressed in different blood cell types. However, there are more systematic methods for cell type annotation, in particular when prior information is available for those cells:

The SingleR package uses previously labelled bulk or single-cell datasets to annotate a new dataset.
The AUCcell package classifies cells into types based on user-provided lists of “signature” genes for each type. These lists can be generated from literature, or also be based on prior RNA-seq datasets.
Another strategy is to perform a standard gene set enrichment analysis on the top marker genes for each cluster.

Any of these approaches is suitable, and they can often be done in conjunction. To learn more about these approaches, read the chapter on cell type annotation in the OSCA book.

9.6 Manual annotation

A lot is known about immune cell markers, in particular as many surface markers have been identified as useful for immunophenotyping.

To help us in our annotation, we start by retrieving the top-ranked genes from each cluster into a list. We do this by looping through the list using the lapply() function and in each case picking the genes with rank < 10 for Cohen’s D statistic:

# loop through all marker genes 
# and extract top-ranked gene names for each cluster
top_markers_all <- lapply(unique(markers_all$cluster), function(x) {
  markers_all %>%
    filter(cluster == x) %>%
    filter(pct.1 > 0.70, abs(avg_log2FC) > 1) %>%
    slice_max(n = 15, order_by = abs(avg_log2FC)) %>%
    pull(gene)
})

# we get several known markers of immune cells
top_markers_all

## [[1]]
##  [1] "PCDH9"    "PDE4D"    "BACH2"    "CD79B"    "AFF3"     "TCL1A"    "PAX5"     "CD37"     "IGHM"     "CD24"     "ZEB2"    
## [12] "CMTM7"    "TMEM131L" "YBX3"     "VPREB3"  
## 
## [[2]]
##  [1] "HBB"      "SKAP1"    "LEPROTL1" "IFITM1"   "CAMK4"    "JUNB"     "SARAF"    "MAML2"    "PRKCH"    "TXNIP"    "FYB1"    
## [12] "BCL11B"   "RPL21"    "HMGB1"    "RPS27"   
## 
## [[3]]
##  [1] "ENSG00000287092" "STK32B"          "ARPP21"          "ERG"             "TCF4"            "XBP1"            "FHIT"           
##  [8] "VPREB1"          "NR3C1"           "MME"             "VPREB3"          "EBF1"            "MEF2C"           "LAPTM5"         
## [15] "AUTS2"          
## 
## [[4]]
##  [1] "HLA-DRB6"        "CACNB2"          "MSR1"            "HLA-DRB5"        "ENSG00000229839" "ENSG00000290032" "CALN1"          
##  [8] "HLA-DPB1"        "ERG"             "ALOX5"           "MGMT"            "HSPB1"           "DNTT"            "PSD3"           
## [15] "LST1"           
## 
## [[5]]
##  [1] "ENSG00000259097" "CHI3L2"          "TOX"             "FXYD2"           "TRBC1"           "CCDC26"          "MIR181A1HG"     
##  [8] "SCAI"            "ELOVL5"          "BCL11B"          "GNAQ"            "CASC15"          "CD2AP"           "PARD3"          
## [15] "JARID2"         
## 
## [[6]]
##  [1] "TCL1B"     "LINC02580" "HBB"       "HBA1"      "HBA2"      "AKAP12"    "UHRF1"     "RPS4Y2"    "TERF2"     "CD24"     
## [11] "PEBP1"     "CD79A"     "MME"       "ARPP21"    "MARCKSL1" 
## 
## [[7]]
##  [1] "UBE2C"  "AURKB"  "TOP2A"  "MKI67"  "H4C3"   "CENPF"  "BIRC5"  "NUSAP1" "PTTG1"  "RRM2"   "CKS1B"  "CKS2"   "HMGB2"  "TACC3" 
## [15] "TUBB4B"
## 
## [[8]]
##  [1] "TYMS"   "PCLAF"  "PCNA"   "MCM3"   "DUT"    "MCM7"   "DNMT1"  "ATAD2"  "H2AZ1"  "SLBP"   "RANBP1" "SIVA1"  "HSPD1"  "HMGA1" 
## [15] "DEK"   
## 
## [[9]]
##  [1] "ANGPTL1"  "MS4A1"    "HBB"      "RALGPS2"  "HBA2"     "HBA1"     "IGHM"     "LYN"      "BANK1"    "CD37"     "HLA-DQA1"
## [12] "HLA-DQB1" "CD74"     "CD52"     "HLA-DRB1"
## 
## [[10]]
##  [1] "EPHB6"           "TMEM132D"        "RCBTB2"          "ENSG00000259345" "HDAC4"           "FKBP5"           "TRBC2"          
##  [8] "ITGAE"           "RAP1GAP2"        "MAL"             "DLEU1"           "PLCB1"           "CASC15"          "SCN3A"          
## [15] "DDIT4"          
## 
## [[11]]
##  [1] "HBB"       "IL32"      "SKAP1"     "CD247"     "IFITM1"    "FYN"       "HCST"      "CD74"      "PRKCH"     "SRGN"     
## [11] "PITPNC1"   "JUNB"      "PPP2R5C"   "PTPRC"     "LINC-PINT"
## 
## [[12]]
##  [1] "PLXDC2"    "IL3RA"     "ALOX5AP"   "MAML3"     "FLT3"      "CDK14"     "ANKRD28"   "LINC-PINT" "STIM2"     "SFMBT2"   
## [11] "DGKD"      "TCF4"      "SRGN"      "FRMD4B"    "PHLPP1"   
## 
## [[13]]
##  [1] "VCAN"    "CSTA"    "S100A8"  "LYZ"     "S100A9"  "FCN1"    "MNDA"    "CFD"     "TYROBP"  "FCER1G"  "S100A11" "CST3"   
## [13] "LGALS1"  "TYMP"    "IFI30"  
## 
## [[14]]
##  [1] "HBA2"   "HBA1"   "HBB"    "BPGM"   "GMPR"   "MALAT1" "IFI27"  "ALAS2"  "SNCA"   "STRADB" "DCAF12" "NCOA4"  "TMSB4X" "RPS27" 
## [15] "BNIP3L"
## 
## [[15]]
##  [1] "RHAG"   "GYPA"   "CA2"    "SOX6"   "RHCE"   "ANK1"   "GYPE"   "KLF1"   "CA1"    "SLC4A1" "EPB42"  "GYPB"   "AHSP"   "RFESD" 
## [15] "HBD"   
## 
## [[16]]
##  [1] "C1QBP"  "NME2"   "SNHG25" "SNHG29" "SNHG8"  "HSPD1"  "TXN"    "MIF"    "EEF1B2" "ENO1"   "GSTP1"  "OSTC"   "LDHB"   "RPLP0" 
## [15] "NPM1"

If we took some time to examine this list (which would benefit from knowing about the underlying biology of the immune system), we can start to see some genes known to differentiate different types of cells:

HBA1 and HBA2 → expressed in red blood cells (erythrocytes)
CST3 → specific to monocytes
CD3E and CD3D → specific to T cells
NKG7 → specific to natural killer (NK) T cells
CD79A and CD24 → specific to B cells
MS4A1 (CD20) → a clinically-important antigen used as a target to treat B cell-related diseases, such as leukemia. Cells with this antigen are referred to as CD20+ B cells.

Let’s visualise these markers’ expression in our clusters:

# cell type specific genes
known_genes <- c(
  "HBA1", # erythrocytes
  "CST3", # monocytes
  "CD3E", # T cells
  "NKG7", # NK T cells
  "CD79A", # B cells
  "MS4A1" # CD20 B cells
)

# violin plot
VlnPlot(seurat_object,
        features = known_genes,
        group.by = "leiden_cluster",
        pt.size = 0) + 
  NoLegend() +
  theme(axis.text.x = element_text(angle = 45, 
                                   hjust = 1))

Based on the expression patterns for these cells, we classify these clusters as follows:

Cluster	Labelled Cell Type	Evidence
1	B cells	CD79A
2	T cells	CD79A
3	B cells	CD79A
4	B cells	CD79A
5	T cells	CD3D
6	B cells	CD79A
7	B cells	CD79A
8	B cells	CD79A
9	CD20 B cells	CD79A + MS4A1
10	T cells	CD3D
11	NK T cells	CD3D + NKG7
12	B cells	CD79A
13	Monocytes	CD79A
14	Erythrocytes	HBA1
15	Erythrocytes	HBA1
16	Unknown

Note: cluster 16 has not been identified but its top markers seem to be tumour- or cancer-related genes, so it may be a cluster of tumour cells.

Now that we have a more meaningful annotation for our clusters, let’s add this to our Seurat object. We will also add the original cluster ID in parenthesis to remind ourselves that this annotation was done based on the clusters.

When you have come to a decision about which clustering to use it is convenient to set Idents to the column name of the clustering you have chosen. This means downstream code does not need to be changed should you later decide to switch to a different clustering, you’d simply need to change the contents of the Idents slot. This also makes the code easily re-usable for different analyses.

# add new labels to the Idents column in the metadata
seurat_object$Idents <- recode_values(
  seurat_object$leiden_cluster,
  "1" ~ "B (c1)",
  "2" ~ "T (c2)",
  "3" ~ "B (c3)",
  "4" ~ "B (c4)",
  "5" ~ "T (c5)",
  "6" ~ "B (c6)",
  "7" ~ "B (c7)",
  "8" ~ "B (c8)",
  "9" ~ "CD20+ B (c9)",
  "10" ~ "T (c10)",
  "11" ~ "NK T (c11)",
  "12" ~ "B (c12)",
  "13" ~ "Monocytes (c13)",
  "14" ~ "Erythrocytes (c14)",
  "15" ~ "Erythrocytes (c15)",
  "16" ~ "Unknown (c16)"
)

# set the Idents to the new labels
Idents(seurat_object) <- "Idents"

Now, when we label our UMAP, we can see the new names, which are more intutitive to interpret:

# visualise UMAP with new labels
DimPlot(seurat_object,
        reduction = "umap",
        group.by = "Idents",
        label = TRUE,
        label.size = 5) +
  NoLegend()

9.7 (Bonus) Further exploration

9.7.1 Erythrocytes

Looking at the expression of HBA1 and HBA2:

# visualise the expression of HBA1 and HBA2
FeaturePlot(seurat_object,
            features = c("HBA1", "HBA2"),
            reduction = "umap") +
  scale_x_discrete(guide = guide_axis(angle = 45))

## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.

You may notice that there is a lot of background expression across several cell clusters. HBA1 and HBA2 are common components of the “soup” or ambient RNA in scRNA-seq experiments involving blood cells. Hemoglobin chains, such as HBA1 and HBA2, are very highly expressed in red blood cells and burst red blood cells will contribute their RNA to the medium from which the cells are loaded into the 10X Chromium machine. Some of this medium, or “soup”, containing ambient RNA is captured with each cell. There are methods available to correct for soup in your analysis such as SoupX.

sessionInfo()

## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.5 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0  LAPACK version 3.10.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8   
##  [6] LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C        
## [11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## time zone: Europe/London
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.3.2    lubridate_1.9.5    forcats_1.0.1      stringr_1.6.0      dplyr_1.2.0        purrr_1.2.1       
##  [7] readr_2.1.6        tidyr_1.3.2        tibble_3.3.1       ggplot2_4.0.2      tidyverse_2.0.0    glmGamPoi_1.22.0  
## [13] sctransform_0.4.2  Seurat_5.3.1       SeuratObject_5.2.0 sp_2.2-0          
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3          jsonlite_2.0.0              magrittr_2.0.4              ggbeeswarm_0.7.2           
##   [5] spatstat.utils_3.2-0        farver_2.1.2                rmarkdown_2.30              vctrs_0.7.1                
##   [9] ROCR_1.0-11                 spatstat.explore_3.5-3      htmltools_0.5.8.1           S4Arrays_1.10.0            
##  [13] SparseArray_1.10.1          sass_0.4.10                 parallelly_1.45.1           KernSmooth_2.23-26         
##  [17] bslib_0.9.0                 htmlwidgets_1.6.4           ica_1.0-3                   plyr_1.8.9                 
##  [21] plotly_4.11.0               zoo_1.8-14                  cachem_1.1.0                igraph_2.2.1               
##  [25] mime_0.13                   lifecycle_1.0.5             pkgconfig_2.0.3             Matrix_1.7-4               
##  [29] R6_2.6.1                    fastmap_1.2.0               MatrixGenerics_1.22.0       fitdistrplus_1.2-4         
##  [33] future_1.67.0               shiny_1.11.1                digest_0.6.38               S4Vectors_0.48.0           
##  [37] rprojroot_2.1.1             tensor_1.5.1                RSpectra_0.16-2             irlba_2.3.5.1              
##  [41] GenomicRanges_1.62.0        beachmat_2.26.0             labeling_0.4.3              progressr_0.18.0           
##  [45] timechange_0.4.0            spatstat.sparse_3.1-0       httr_1.4.8                  polyclip_1.10-7            
##  [49] abind_1.4-8                 compiler_4.5.1              here_1.0.2                  withr_3.0.2                
##  [53] S7_0.2.1                    fastDummies_1.7.5           MASS_7.3-65                 DelayedArray_0.36.0        
##  [57] tools_4.5.1                 vipor_0.4.7                 lmtest_0.9-40               otel_0.2.0                 
##  [61] beeswarm_0.4.0              httpuv_1.6.16               future.apply_1.20.0         goftest_1.2-3              
##  [65] glue_1.8.0                  nlme_3.1-168                promises_1.5.0              grid_4.5.1                 
##  [69] Rtsne_0.17                  cluster_2.1.8.1             reshape2_1.4.5              generics_0.1.4             
##  [73] gtable_0.3.6                spatstat.data_3.1-9         tzdb_0.5.0                  hms_1.1.4                  
##  [77] data.table_1.17.8           XVector_0.50.0              BiocGenerics_0.56.0         spatstat.geom_3.6-0        
##  [81] RcppAnnoy_0.0.22            ggrepel_0.9.6               RANN_2.6.2                  pillar_1.11.1              
##  [85] limma_3.66.0                spam_2.11-1                 RcppHNSW_0.6.0              later_1.4.4                
##  [89] splines_4.5.1               lattice_0.22-7              survival_3.8-3              deldir_2.0-4               
##  [93] tidyselect_1.2.1            miniUI_0.1.2                pbapply_1.7-4               knitr_1.50                 
##  [97] gridExtra_2.3               IRanges_2.44.0              Seqinfo_1.0.0               SummarizedExperiment_1.40.0
## [101] scattermore_1.2             stats4_4.5.1                xfun_0.54                   Biobase_2.70.0             
## [105] statmod_1.5.1               matrixStats_1.5.0           stringi_1.8.7               lazyeval_0.2.2             
## [109] yaml_2.3.10                 evaluate_1.0.5              codetools_0.2-20            cli_3.6.5                  
## [113] uwot_0.2.4                  xtable_1.8-4                reticulate_1.44.0           jquerylib_0.1.4            
## [117] dichromat_2.0-0.1           Rcpp_1.1.0                  globals_0.18.0              spatstat.random_3.4-2      
## [121] png_0.1-8                   ggrastr_1.0.2               spatstat.univar_3.1-4       parallel_4.5.1             
## [125] presto_1.0.0                dotCall64_1.2               listenv_0.10.0              viridisLite_0.4.2          
## [129] scales_1.4.0                ggridges_0.5.7              rlang_1.1.7                 cowplot_1.2.0

Introduction to single-cell RNA-seq analysis