CRUK Bioinformatics Summer School 2020 - single-cell RNA-seq analysis
Heterogeneity in childhood acute lymphoblastic leukemia with droplet-based 10X Chromium assay.
2021-05-28
Chapter 1 Preamble
1.1 The study
“Childhood acute lymphoblastic leukemia (cALL) is the most common pediatric cancer. It is characterized by bone marrow lymphoid precursors that acquire genetic alterations, resulting in disrupted maturation and uncontrollable proliferation.” (Caron et al. 2020). Nowadays, up to 85–90% of patients are cured, but the others either do not respond to treatment or relapse and die. The aim of the study is to characterise the heterogeneity of gene expression at the cell level, within and between patients.
Four types of samples are considered:
- eight patients:
  - six B-ALL:
    - four ‘t(12;21)’ or ‘ETV6-RUNX1’
    - two ‘High hyperdiploid’ or ‘HHD’
  - two T-ALL (‘PRE-T’)
- three healthy pediatric controls
- eight healthy adult controls, publicly available
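The sample numbers listed above can be tallied as a quick sanity check. The following is a minimal Python sketch; the labels used for the two control groups and the variable names are chosen here for illustration only and are not taken from the study.

```python
# Number of samples per group, as listed above.
# The two control-group keys are illustrative labels, not the study's own names.
samples_per_group = {
    "ETV6-RUNX1": 4,         # B-ALL, t(12;21)
    "HHD": 2,                # B-ALL, high hyperdiploid
    "PRE-T": 2,              # T-ALL
    "pediatric control": 3,  # healthy pediatric controls
    "adult control": 8,      # healthy adult controls, publicly available
}

# B-ALL patients are the sum of the two B-ALL subtypes.
b_all = samples_per_group["ETV6-RUNX1"] + samples_per_group["HHD"]
# All patients: B-ALL plus T-ALL.
patients = b_all + samples_per_group["PRE-T"]
# All samples: patients plus both sets of healthy controls.
total = sum(samples_per_group.values())

print(f"B-ALL: {b_all}, patients: {patients}, all samples: {total}")
# prints "B-ALL: 6, patients: 8, all samples: 19"
```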
As the study aims at identifying cell populations, large numbers of cells were sequenced with the droplet-based 10X Chromium assay.
1.2 The plan
We will follow several steps:
- sequencing quality check (see 2)
- alignment of reads to the human genome (GRCh38) with 10X software cellranger (see 3)
- quality control (cell calling, cell and gene filtering) (see 20 for the ‘all-cells’ analysis and 4 for the downsampled data set used in the course)
- count normalisation (see 21 for the ‘all-cells’ analysis and 5 for the downsampled data set used in the course)
- data set integration (see 28 and 12, and 29 and 13)
- feature selection (see 8)
- dimensionality reduction (see 6 for visualisation and 9 for analysis)
- clustering (see the clustering chapter)
- marker gene identification (see 15)
- cell cycle assignment (see 30)
- cell type annotation (see 17.0.5)
- trajectory analysis (see 17)
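To make the logic of three of these steps concrete (quality control, count normalisation and feature selection), here is a toy sketch in plain Python. It is not the workflow used in the course: the count matrix, gene and cell names, the UMI threshold and the scaling constant are all made up for illustration. Ranking genes by raw variance is a deliberate simplification of highly variable gene selection, which in practice usually models the mean-variance relationship.

```python
import math
from statistics import pvariance

# Toy UMI count matrix: one row per cell, one column per gene (made-up numbers).
genes = ["geneA", "geneB", "geneC", "geneD"]
counts = {
    "cell1": [10, 0, 5, 5],
    "cell2": [20, 2, 9, 9],
    "cell3": [0, 1, 0, 0],   # very few UMIs: treated as a low-quality barcode
    "cell4": [2, 30, 4, 4],
}

# 1. Quality control: keep cells with at least MIN_UMIS counts (arbitrary threshold).
MIN_UMIS = 5
kept = {cell: row for cell, row in counts.items() if sum(row) >= MIN_UMIS}

# 2. Normalisation: divide by the cell's library size, rescale, then log-transform.
SCALE = 100  # arbitrary target library size, assumed for this example


def log_normalise(row):
    total = sum(row)
    return [math.log2(1 + SCALE * x / total) for x in row]


lognorm = {cell: log_normalise(row) for cell, row in kept.items()}

# 3. Feature selection: rank genes by the variance of their log-expression across cells.
variances = {
    gene: pvariance([lognorm[cell][j] for cell in lognorm])
    for j, gene in enumerate(genes)
}
top_genes = sorted(variances, key=variances.get, reverse=True)[:2]

print("cells kept:", sorted(kept))
print("top variable genes:", top_genes)
```

Here the cell with a single UMI is removed at the quality control step, and the two genes whose normalised expression differs most between the remaining cells are retained for downstream steps such as dimensionality reduction and clustering.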
1.3 Abbreviations
- ALL: Acute Lymphoblastic Leukemia
- BMMC: Bone Marrow Mononuclear Cell
- cALL: childhood ALL
- HCA: Human Cell Atlas
- HVG: Highly Variable Genes
- MNN: Mutual Nearest Neighbors
- PCA: Principal Component Analysis
- UMI: Unique Molecular Identifier
- SCE: SingleCellExperiment