Challenge 1
Obtain results for luminal vs basal and find the top 200 genes (call the new results object
resLvB
.
resLvB <- results(ddsObj, name="CellType_luminal_vs_basal", alpha=0.05)
topGenesLvB <- as.data.frame(resLvB) %>%
rownames_to_column("GeneID") %>%
arrange(padj) %>%
head(200)
topGenesLvB
Challenge 2
When we looked at the PCA it did seem that an interaction model might be warranted. Let’s test that. 1.Fit a model with interaction: What is the rationale to include the interaction?
2. Use the LRT to compare the two models. 3. Is the number of replicates good enough to include the interaction?
4. Is the interaction needed in the model?
designI <- as.formula(~ CellType * Status)
# Build model
ddsObjI <- DESeqDataSetFromMatrix(countData = countdata,
colData = sampleinfo,
design = designI)
some variables in design formula are characters, converting to factors
# Run DESeq
ddsObjI <- DESeq(ddsObjI)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
# Compare the models
ddsObjI <- DESeq(ddsObjI, test="LRT", reduced=design)
using pre-existing size factors
estimating dispersions
found already estimated dispersions, replacing these
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
resIvA <- results(ddsObjI)
head(resIvA)
log2 fold change (MLE): CellTypeluminal.Statuslactate
LRT p-value: '~ CellType * Status' vs '~ CellType + Status'
DataFrame with 6 rows and 6 columns
baseMean log2FoldChange lfcSE stat
<numeric> <numeric> <numeric> <numeric>
ENSMUSG00000051951 193.628316332322 3.42805158084046 2.25405051312001 2.61930258508055
ENSMUSG00000102331 0.552133378452342 -1.17247770601742 6.53741820376331 0.0217322297523772
ENSMUSG00000025900 2.09442070179461 -1.9755970656928 3.43884201149684 1.50879333755253
ENSMUSG00000025902 52.4208304161579 0.414078278824817 0.831984392507072 0.247059069140377
ENSMUSG00000098104 0.684213614702782 -2.54414197763085 4.50935416136949 0.478381585698944
ENSMUSG00000103922 27.5871076674379 3.55643619940112 0.882991779252022 16.4215487583443
pvalue padj
<numeric> <numeric>
ENSMUSG00000051951 0.269914161057679 0.383078204268457
ENSMUSG00000102331 0.98919270809821 NA
ENSMUSG00000025900 0.470294272371044 NA
ENSMUSG00000025902 0.883795538337252 0.925466294570668
ENSMUSG00000098104 0.787264663574355 NA
ENSMUSG00000103922 0.000271710233055865 0.000853060524695032
length(which(resIvA$padj<=0.05))
[1] 8691