Preamble
The ‘Caron’ data set
We will illustrate the main steps of a scRNA-seq analysis using data from a recently published study of heterogeneity in childhood acute lymphoblastic leukemia, obtained with the droplet-based 10X Chromium assay.
“Childhood acute lymphoblastic leukemia (cALL) is the most common pediatric cancer. It is characterized by bone marrow lymphoid precursors that acquire genetic alterations, resulting in disrupted maturation and uncontrollable proliferation.” (Caron et al. 2020). Nowadays, up to 85–90% of patients are cured, but the others either do not respond to treatment or relapse and die. The aim of the study is to characterise the heterogeneity of gene expression at the cell level, within and between patients, using pediatric Bone Marrow Mononuclear Cells (BMMCs).
The adult Bone Marrow Mononuclear Cells data set
As in the ‘Caron’ study, we also include adult BMMCs. For technical reasons, we will use a different data set: adult BMMCs from eight healthy donors in the ‘Census of Immune Cells’ study, available from the Human Cell Atlas data portal.
Samples
Four types of sample are considered:
- eight patients:
  - six B-ALL
  - two T-ALL
- three healthy pediatric controls
- eight healthy adult controls, publicly available
As the study aims to identify cell populations, large numbers of cells were sequenced with the droplet-based 10X Chromium assay.
The plan
The aim of this material is to illustrate an analysis workflow for a droplet-based study, not to reproduce the published analysis. We will follow these steps:
- sequencing quality check
- alignment of reads to the human genome (GRCh38) with the 10X software cellranger
- quality control (cell calling, cell and gene filtering)
- count normalisation
- data set integration
- feature selection
- dimensionality reduction
- clustering
- marker gene identification
- cell type annotation
- cell cycle assignment
- trajectory analysis