- Use
pathview
to export a figure for “mmu04060”, but this time only use genes that are statistically significant at FDR < 0.01
logFC <- shrinkLvV %>%
drop_na(FDR, Entrez) %>%
filter(FDR < 0.01) %>%
select(Entrez, logFC) %>%
deframe()
pathview(gene.data = logFC,
pathway.id = "mmu04060",
species = "mmu",
limit = list(gene=5, cpd=1))
## Info: Downloading xml files for mmu04060, 1/1 pathways..
## Info: Downloading png files for mmu04060, 1/1 pathways..
## 'select()' returned 1:1 mapping between keys and columns
## Info: Working in directory /home/cri.camres.org/sawle01/Documents/training/RNAseq_November_2020_remote/Course_Materials/solutions
## Info: Writing image file mmu04060.pathview.png
mmu04060.pathview.png:
clusterProfiler
can also perform over-representation analysis on GO terms. using the commmandenrichGO
. Look at the help page for the commandenrichGO
(?enrichGO
) and have a look at the instructions in the clusterProfiler book.
- Run the over-representation analysis for GO terms
- Use genes that have an adjusted p-value (FDR) of less than 0.01 and an absolute fold change greater than 2.
- For this analysis you can use Ensembl IDs rather then Entrez
- You’ll need to provide the background (
universe
) genes, this should be all the genes in our analysis.- The mouse database package is called
org.Mm.eg.db
. You’ll need to load it usinglibrary
before running the analysis.
- As we are using Ensembl IDs, you’ll need to set the
keyType
parameter in theenrichGO
command to indicate this.- Only test terms in the “Molecular Function” ontology
- Use the
dotplot
function to visualise the results.
library(org.Mm.eg.db)
sigGenes <- shrinkLvV %>%
drop_na(FDR) %>%
filter(FDR < 0.01 & abs(logFC) > 1) %>%
pull(GeneID)
universe <- shrinkLvV$GeneID
ego <- enrichGO(gene = sigGenes,
universe = universe,
OrgDb = org.Mm.eg.db,
keyType = "ENSEMBL",
ont = "MF",
pvalueCutoff = 0.01,
readable = TRUE)
dotplot(ego)
Another common way to rank the genes is to order by pvalue, but also, sorting so that upregulated genes are at start and downregulated at the other - you can do this combining the sign of the fold change and the pvalue.
- Rank the genes by statisical significance - you will need to create a new ranking value using
-log10({p value}) * sign({Fold Change})
- Load the “C2” pathways from the the
Robjects/mouse_c2_v5.RData
file
- Run
fgsea
using the new ranked genes and the C2 pathways
ranks <- -log10(gseaDat$pvalue) * sign(gseaDat$logFC) # or gseaDat$stat
names(ranks) <- gseaDat$Entrez
load("Robjects/mouse_c2_v5.RData")
fgseaResC2 <- fgsea(Mm.c2, ranks, minSize=15, maxSize = 500)
head(fgseaResC2[order(padj, -abs(NES)), ], n=10)
## Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (0.08% of the list).
## The order of those tied genes will be arbitrary, which may produce unexpected results.
## Warning in fgseaMultilevel(...): For some pathways, in reality P-values are less
## than 1e-10. You can set the `eps` argument to zero for better estimation.
## # A tibble: 10 x 8
## pathway pval padj log2err ES NES size leadingEdge
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <list>
## 1 REACTOME_INFLUENZA_… 1.00e-10 2.15e-8 NA -0.859 -2.70 128 <chr [80]>
## 2 REACTOME_NONSENSE_M… 1.00e-10 2.15e-8 NA -0.887 -2.70 100 <chr [74]>
## 3 REACTOME_3_UTR_MEDI… 1.00e-10 2.15e-8 NA -0.886 -2.70 100 <chr [68]>
## 4 REACTOME_PEPTIDE_CH… 1.00e-10 2.15e-8 NA -0.908 -2.70 79 <chr [68]>
## 5 REACTOME_INFLUENZA_… 1.00e-10 2.15e-8 NA -0.884 -2.69 95 <chr [69]>
## 6 KEGG_RIBOSOME 1.00e-10 2.15e-8 NA -0.897 -2.66 81 <chr [66]>
## 7 REACTOME_METABOLISM… 1.00e-10 2.15e-8 NA -0.774 -2.57 202 <chr [87]>
## 8 REACTOME_TRANSLATION 1.00e-10 2.15e-8 NA -0.800 -2.56 141 <chr [71]>
## 9 REACTOME_SRP_DEPEND… 1.00e-10 2.15e-8 NA -0.843 -2.55 103 <chr [67]>
## 10 REACTOME_METABOLISM… 1.00e-10 2.15e-8 NA -0.755 -2.54 244 <chr [97]>
- Run
fgsea
using the new ranked genes and the H pathways. How do these results differ from the ones we got when ranking by the fold change alone?
fgseaResH <- fgsea(Mm.H, ranks, minSize=15, maxSize = 500)
head(fgseaResH[order(padj, -abs(NES)), ], n=20)
## Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (0.08% of the list).
## The order of those tied genes will be arbitrary, which may produce unexpected results.
## Warning in fgseaMultilevel(...): For some pathways, in reality P-values are less
## than 1e-10. You can set the `eps` argument to zero for better estimation.
## # A tibble: 20 x 8
## pathway pval padj log2err ES NES size leadingEdge
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <list>
## 1 HALLMARK_MYC_TARG… 1.00e-10 5.00e-9 NA -0.683 -2.24 201 <chr [56]>
## 2 HALLMARK_OXIDATIV… 5.12e- 9 1.28e-7 0.761 0.648 2.07 196 <chr [128]>
## 3 HALLMARK_E2F_TARG… 8.51e- 7 1.42e-5 0.659 -0.562 -1.85 201 <chr [49]>
## 4 HALLMARK_TGF_BETA… 2.25e- 6 2.82e-5 0.627 -0.742 -2.05 54 <chr [15]>
## 5 HALLMARK_ADIPOGEN… 5.70e- 6 5.70e-5 0.611 0.573 1.83 199 <chr [79]>
## 6 HALLMARK_MYC_TARG… 1.05e- 5 8.76e-5 0.593 -0.701 -1.96 58 <chr [25]>
## 7 HALLMARK_CHOLESTE… 3.10e- 5 2.21e-4 0.557 0.701 1.94 73 <chr [29]>
## 8 HALLMARK_FATTY_AC… 4.63e- 5 2.57e-4 0.557 0.576 1.78 155 <chr [54]>
## 9 HALLMARK_MTORC1_S… 4.52e- 5 2.57e-4 0.557 0.550 1.76 203 <chr [47]>
## 10 HALLMARK_G2M_CHEC… 7.54e- 5 3.77e-4 0.538 -0.506 -1.66 201 <chr [29]>
## 11 HALLMARK_ESTROGEN… 4.44e- 4 2.02e-3 0.498 -0.482 -1.57 193 <chr [54]>
## 12 HALLMARK_PEROXISO… 9.93e- 4 4.14e-3 0.455 0.609 1.75 95 <chr [39]>
## 13 HALLMARK_ALLOGRAF… 1.08e- 3 4.14e-3 0.455 -0.460 -1.52 202 <chr [41]>
## 14 HALLMARK_WNT_BETA… 2.16e- 3 7.70e-3 0.432 -0.658 -1.73 40 <chr [18]>
## 15 HALLMARK_NOTCH_SI… 2.48e- 3 7.76e-3 0.432 -0.696 -1.73 31 <chr [15]>
## 16 HALLMARK_TNFA_SIG… 2.44e- 3 7.76e-3 0.432 -0.449 -1.47 199 <chr [60]>
## 17 HALLMARK_DNA_REPA… 2.99e- 3 8.78e-3 0.432 -0.459 -1.46 150 <chr [27]>
## 18 HALLMARK_BILE_ACI… 5.90e- 3 1.64e-2 0.407 0.548 1.57 98 <chr [37]>
## 19 HALLMARK_ESTROGEN… 1.09e- 2 2.86e-2 0.381 -0.416 -1.36 197 <chr [39]>
## 20 HALLMARK_PROTEIN_… 1.71e- 2 4.27e-2 0.352 0.499 1.44 96 <chr [56]>