Introduction to Single-cell RNA-seq Analysis

University of Cambridge / CRUK

Introduction to single-cell RNA-seq data analysis

12, 16, 19 September 2022, 09:30 - 17:30

Taught in person

Bioinformatics Training Facility, University of Cambridge

Instructors

Abigail Edwards - Bioinformatics Core, Cancer Research UK Cambridge Institute
Adam Reid - The Gurdon Institute, University of Cambridge
Ashley Sawle - Bioinformatics Core, Cancer Research UK Cambridge Institute
Chandra Chilamakuri - Bioinformatics Core, Cancer Research UK Cambridge Institute
Katarzyna Kania - Genomics Core, Cancer Research UK Cambridge Institute
Roderik Kortlever - Dept. Biochemistry, University of Cambridge

Helpers:

Hugo Tavares - Bioinformatics Training Facility, University of Cambridge
Stephane Ballereau - Cellular Genetics programme, Wellcome Sanger Institute

Outline

This workshop is aimed at biologists interested in learning how to perform standard single-cell RNA-seq analyses.

The course focuses on the droplet-based assay from 10x Genomics. It covers running the accompanying cellranger pipeline to align reads to a genome reference and count the number of reads per gene, reading the count data into R, quality control, normalisation, data set integration, clustering and identification of cluster marker genes, as well as differential expression and abundance analyses. You will also learn how to generate common plots for analysis and visualisation of gene expression data, such as t-SNE, UMAP and violin plots.
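
As a rough illustration of how these steps fit together in R, the sketch below runs a stripped-down version of the workflow with scater and scran. It is not the course code. It uses simulated counts from scuttle's `mockSCE()` (an assumption made so that the example runs without the course data; scuttle is installed as a dependency of scater), keeps default parameters almost everywhere, and skips data set integration and the differential expression and abundance analyses.

```r
# Sketch only: simulated counts stand in for the cellranger output.
library(scuttle)   # dependency of scater; provides mockSCE()
library(scater)
library(scran)

set.seed(123)

# With real data the object would instead be read from the cellranger output,
# e.g. DropletUtils::read10xCounts("path/to/filtered_feature_bc_matrix")
# (the path is hypothetical).
sce <- mockSCE(ncells = 200, ngenes = 1000)

# Quality control: compute per-cell metrics and drop outlier cells.
sce <- addPerCellQC(sce)
qc <- quickPerCellQC(colData(sce))
sce <- sce[, !qc$discard]

# Normalisation: library-size factors and log-transformed counts.
sce <- logNormCounts(sce)

# Feature selection and dimensionality reduction.
dec <- modelGeneVar(sce)
hvgs <- getTopHVGs(dec, n = 500)
sce <- runPCA(sce, subset_row = hvgs, ncomponents = 10)
sce <- runUMAP(sce, dimred = "PCA")

# Graph-based clustering on the PCA coordinates, then marker statistics
# for each cluster against the others.
colLabels(sce) <- clusterCells(sce, use.dimred = "PCA")
markers <- findMarkers(sce, groups = colLabels(sce))

# UMAP plot coloured by cluster; print(p) draws it.
p <- plotReducedDim(sce, dimred = "UMAP", colour_by = "label")
```

Because the simulated counts contain no real biological structure, the resulting clusters and markers are meaningless; the point is only the order of the steps and the objects passed between them.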

Prerequisites

**Some basic experience of using a UNIX/LINUX command line is assumed**

**Some R knowledge is assumed and essential. Without it, you will struggle on this course.** If you are not familiar with the R statistical programming language, we strongly encourage you to work through an introductory R course before attempting these materials. We recommend our Introduction to R course.

Data set

The course data is based on ‘CaronBourque2020’ relating to pediatric leukemia, with four sample types, including:
- pediatric Bone Marrow Mononuclear Cells (PBMMCs)
- three tumour types: ETV6-RUNX1, HHD, PRE-T
The data used in the course can be downloaded from Dropbox (~2.6GB). Please note that:
- these data have been processed for teaching purposes and are therefore not suitable for research use;
- all the data is provided on our training machines, so you don’t need to download it to attend the course.

Schedule

Please note that this is our first time teaching these materials back in person, so we may adjust these times as the pace requires.

Day 1

09:30 - 09:40 Welcome
09:40 - 10:00 Introduction - Hugo Tavares
- Slides
10:00 - 10:10 Preamble: data set and workflow - Adam Reid
- Slides
10:40 - 12:00 Library structure, cellranger for alignment and cell calling - Adam Reid
- Slides (pdf)
- Demonstration
12:00 - 12:30 Loupe browser demo - Roderik Kortlever
- Slides
12:30 - 13:30 lunch break
13:30 - 16:00 QC and exploratory analysis - Ashley Sawle
- Slides (pdf)
- Demonstration
- Practical
16:00 - 17:00 Introduction to Single Cell Technologies - Katarzyna Kania
- Slides

Day 2

09:30 - 09:40 Recap - Chandra Chilamakuri
- Slides
09:40 - 12:30 Normalisation - Chandra Chilamakuri
- Slides (pdf)
- Demonstration
- Practical
12:30 - 13:30 lunch break
13:30 - 15:25 Feature selection and dimensionality reduction - Abigail Edwards
- Slides (pdf)
- Demonstration
15:25 - 15:35 10 min break
15:35 - 17:30 Batch correction and data set integration - Ashley Sawle
- Slides (pdf)
- Demonstration

Day 3

09:30 - 09:40 Recap
09:40 - 11:05 Clustering - Ashley Sawle
- Slides
- Demonstration
11:05 - 11:15 10 min break
11:15 - 12:30 Identification of cluster marker genes - Hugo Tavares
- Slides
- Demonstration
12:30 - 13:30 lunch break
13:30 - 17:30 Differential Expression and Abundance Analysis - Abigail Edwards
- Slides
- Demonstration

Software Installation

We will give you access to training computers with all the necessary software installed. However, if you want to run the analysis on your own computer, you can follow these instructions.

Download and install R: https://cloud.r-project.org/
- (Windows users only): Download and install RTools: https://cran.r-project.org/bin/windows/Rtools/
Download and install RStudio: https://www.rstudio.com/products/rstudio/download/#download

Open RStudio and run the following commands from the console:

  install.packages("BiocManager")
  BiocManager::install(c("AnnotationHub", "BiocParallel", "BiocSingular", "DropletUtils", "PCAtools", "batchelor", "bluster", "cluster", "clustree", "dynamicTreeCut", "edgeR", "ensembldb", "ggplot2", "igraph", "patchwork", "pheatmap", "scater", "scran", "tidyverse"))
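
Once the installation has finished, a quick way to confirm that everything is in place is to check each package in turn. The snippet below is a minimal sketch using base R only; the variable names (`course_pkgs`, `is_installed`, `missing_pkgs`) are ours and not part of any package.

```r
# Packages named in the install command above.
course_pkgs <- c("AnnotationHub", "BiocParallel", "BiocSingular", "DropletUtils",
                 "PCAtools", "batchelor", "bluster", "cluster", "clustree",
                 "dynamicTreeCut", "edgeR", "ensembldb", "ggplot2", "igraph",
                 "patchwork", "pheatmap", "scater", "scran", "tidyverse")

# requireNamespace() returns FALSE, rather than raising an error,
# when a package is not installed or cannot be loaded.
is_installed <- vapply(course_pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- course_pkgs[!is_installed]
if (length(missing_pkgs) == 0) {
  message("All course packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package reported as missing can be passed to `BiocManager::install()` again.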

To run cellranger, you will need a Linux machine. See the installation instructions from 10x Genomics.

Acknowledgments:

Much of the material in this course has been derived from the demonstrations found in the OSCA book and the Hemberg Group course materials. Additional material concerning miloR has been based on the demonstration from the Marioni Lab.

The materials have been contributed to by many individuals over the last 2 years, including:

Abigail Edwards, Ashley D Sawle, Chandra Chilamakuri, Kamal Kishore, Stephane Ballereau, Zeynep Kalendar Atak, Hugo Tavares, Jon Price, Katarzyna Kania, Roderik Kortlever, Adam Reid, Tom Smith

Apologies if we have missed anyone!