Introduction to single-cell RNA-seq analysis

To use functions from the Bioconductor ecosystem, it is often necessary to convert a Seurat object to a SingleCellExperiment (SCE) object. This can be done with the as.SingleCellExperiment() function from the Seurat package.

library(Seurat)
library(SingleCellExperiment)
library(tidyverse)

We will load our Seurat object. It contains all the cells from the ETV6-RUNX1 and PBMMC sample groups. The data have been processed as described in the previous sections (QC, normalisation, feature selection, dimensionality reduction, harmony integration and clustering).

# Load the Seurat object
seurat_object <- readRDS("../RObjects/Annotated.full.ETV6.PBMMC.rds")

Now we can convert the Seurat object to a SingleCellExperiment object.

# Convert the Seurat object to a SingleCellExperiment object
sce_object <- as.SingleCellExperiment(seurat_object)

We need to make sure all the information has been transferred correctly to the SCE object. The SingleCellExperiment object is structured differently from the Seurat object, but it similarly stores each type of information in its own slot.

sce_object

## class: SingleCellExperiment 
## dim: 28343 30167 
## metadata(0):
## assays(2): counts logcounts
## rownames(28343): ENSG00000238009 ENSG00000241860 ... FAM41AY1 FAM224B
## rowData names(0):
## colnames(30167): ETV6RUNX1-1_AAACCTGAGACTTTCG-1
##   ETV6RUNX1-1_AAACCTGGTCTTCAAG-1 ... PBMMC-3_TTTGTCATCAGTTGAC-1
##   PBMMC-3_TTTGTCATCTCGTTTA-1
## colData names(11): orig.ident nCount_RNA ... Idents ident
## reducedDimNames(3): PCA HARMONY UMAP
## mainExpName: SCT
## altExpNames(1): RNA
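A quick programmatic check can complement reading the printout. The sketch below assumes the seurat_object and sce_object created above are in the session; same_cells() is a hypothetical helper written for this illustration, not part of Seurat or SingleCellExperiment.

```r
# Hypothetical helper: TRUE when both objects list the same cells in the same order.
same_cells <- function(cells_a, cells_b) {
  identical(as.character(cells_a), as.character(cells_b))
}

# Only runs when the tutorial objects exist in the current session.
if (exists("seurat_object") && exists("sce_object")) {
  # Number of cells in each object
  print(c(seurat = ncol(seurat_object), sce = ncol(sce_object)))
  # Cell barcodes should match one to one
  print(same_cells(colnames(seurat_object), colnames(sce_object)))
}
```

If the second value printed is FALSE, the cell barcodes differ between the two objects and the conversion should be investigated before going further.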

It looks like everything has been transferred, but we should check. The colData slot of the SCE object should contain all the relevant metadata from the Seurat object.

head(colData(sce_object))

## DataFrame with 6 rows and 11 columns
##                                 orig.ident nCount_RNA nFeature_RNA SampleGroup
##                                   <factor>  <numeric>    <numeric> <character>
## ETV6RUNX1-1_AAACCTGAGACTTTCG-1 ETV6RUNX1-1       8354         2935   ETV6RUNX1
## ETV6RUNX1-1_AAACCTGGTCTTCAAG-1 ETV6RUNX1-1      14974         4341   ETV6RUNX1
## ETV6RUNX1-1_AAACCTGGTGTTGAGG-1 ETV6RUNX1-1      10468         3636   ETV6RUNX1
## ETV6RUNX1-1_AAACCTGTCCCAAGTA-1 ETV6RUNX1-1      10437         3340   ETV6RUNX1
## ETV6RUNX1-1_AAACCTGTCGAATGCT-1 ETV6RUNX1-1       2453         1392   ETV6RUNX1
## ETV6RUNX1-1_AAACGGGCACCATCCT-1 ETV6RUNX1-1       3351         1729   ETV6RUNX1
##                                 SampleName percent.mt nCount_SCT nFeature_SCT
##                                   <factor>  <numeric>  <numeric>    <integer>
## ETV6RUNX1-1_AAACCTGAGACTTTCG-1 ETV6RUNX1-1    3.50730       5697         2881
## ETV6RUNX1-1_AAACCTGGTCTTCAAG-1 ETV6RUNX1-1    3.80660       4877         2100
## ETV6RUNX1-1_AAACCTGGTGTTGAGG-1 ETV6RUNX1-1    4.08865       5964         3195
## ETV6RUNX1-1_AAACCTGTCCCAAGTA-1 ETV6RUNX1-1    5.04934       5762         2839
## ETV6RUNX1-1_AAACCTGTCGAATGCT-1 ETV6RUNX1-1    3.66898       4322         1411
## ETV6RUNX1-1_AAACGGGCACCATCCT-1 ETV6RUNX1-1    3.04387       4228         1728
##                                leiden_cluster      Idents    ident
##                                      <factor> <character> <factor>
## ETV6RUNX1-1_AAACCTGAGACTTTCG-1              3      B (c3)   B (c3)
## ETV6RUNX1-1_AAACCTGGTCTTCAAG-1              6      B (c6)   B (c6)
## ETV6RUNX1-1_AAACCTGGTGTTGAGG-1              6      B (c6)   B (c6)
## ETV6RUNX1-1_AAACCTGTCCCAAGTA-1              3      B (c3)   B (c3)
## ETV6RUNX1-1_AAACCTGTCGAATGCT-1              3      B (c3)   B (c3)
## ETV6RUNX1-1_AAACGGGCACCATCCT-1              3      B (c3)   B (c3)
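Rather than comparing the columns by eye, we could list any Seurat metadata columns that did not make it into colData. This is a minimal sketch assuming seurat_object and sce_object from above; missing_columns() is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper: names present in `expected` but absent from `observed`.
missing_columns <- function(expected, observed) {
  setdiff(expected, observed)
}

# Only runs when the tutorial objects exist in the current session.
if (exists("seurat_object") && exists("sce_object")) {
  # seurat_object[[]] returns the cell-level metadata data frame
  seurat_cols <- colnames(seurat_object[[]])
  sce_cols <- colnames(colData(sce_object))
  print(missing_columns(seurat_cols, sce_cols))
}
```

An empty result means every metadata column in the Seurat object has a counterpart in the SCE object.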

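The assays slot of the SCE object should contain the count data and the normalised data: the counts assay should hold the count data and the logcounts assay the log-normalised data. As the printout above shows, the main experiment is SCT, so these assays come from the SCT assay of the Seurat object, while the RNA assay is kept as an alternative experiment (see altExpNames).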
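The assay names can be checked directly. The sketch below assumes sce_object from above and that the alternative experiment is called "RNA", as in the printout; has_assays() is a hypothetical helper for illustration only.

```r
# Hypothetical helper: TRUE when all required assay names are available.
has_assays <- function(available, required = c("counts", "logcounts")) {
  all(required %in% available)
}

# Only runs when the tutorial object exists in the current session.
if (exists("sce_object")) {
  # Assays of the main experiment
  print(assayNames(sce_object))
  print(has_assays(assayNames(sce_object)))
  # Assays stored in the alternative experiment (assumed to be named "RNA")
  print(assayNames(altExp(sce_object, "RNA")))
}
```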

The reducedDims slot of the SCE object should contain the dimensionality reduction results. The PCA entry should contain the PCA results, the UMAP entry the UMAP results and the HARMONY entry the harmony results.
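One way to confirm this numerically is to compare the coordinates stored in the two objects. This is a sketch assuming seurat_object and sce_object from above, with the UMAP stored as "umap" in Seurat and "UMAP" in the SCE object; same_embedding() is a hypothetical helper, and it ignores row and column names because these may be labelled differently in each object.

```r
# Hypothetical helper: TRUE when two embeddings have the same shape and values,
# ignoring row and column names.
same_embedding <- function(a, b, tolerance = 1e-8) {
  identical(dim(a), dim(b)) &&
    isTRUE(all.equal(unname(a), unname(b), tolerance = tolerance))
}

# Only runs when the tutorial objects exist in the current session.
if (exists("seurat_object") && exists("sce_object")) {
  print(reducedDimNames(sce_object))
  print(same_embedding(
    reducedDim(sce_object, "UMAP"),
    Embeddings(seurat_object, reduction = "umap")
  ))
}
```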

To test that the dimensionality reduction results have been correctly transferred, we can plot the UMAP using the plotReducedDim() function from the scater package.

library(scater)

## Loading required package: scuttle

plotReducedDim(sce_object, "UMAP", colour_by = "leiden_cluster")

DimPlot(seurat_object, reduction = "umap", group.by = "leiden_cluster")

This looks the same as our UMAP plot from the Seurat object, so we can be confident that the conversion was successful.

There is one more step. Just as Seurat uses Idents to store the cell identities (in our case, cell types), the SingleCellExperiment object uses column labels. We can transfer our identity information from the Seurat object to the SCE object using the colLabels() function.

colLabels(sce_object) <- Idents(seurat_object)
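To confirm that the labels were stored as intended, we could compare them with the Seurat identities. This is a minimal sketch assuming seurat_object and sce_object from above; same_labels() is a hypothetical helper for illustration.

```r
# Hypothetical helper: TRUE when two label vectors agree element by element,
# whether they are stored as factors or as character vectors.
same_labels <- function(a, b) {
  identical(as.character(a), as.character(b))
}

# Only runs when the tutorial objects exist in the current session.
if (exists("seurat_object") && exists("sce_object")) {
  print(same_labels(colLabels(sce_object), Idents(seurat_object)))
  # Number of cells per label
  print(table(colLabels(sce_object)))
}
```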

We can now use this SCE object for downstream analysis using Bioconductor packages.

Conversion of a Seurat object to a SingleCellExperiment object