First we will recapitulate the above steps to generate a new DESeq2 object with the additive model. Then we will extract the results table for the contrast for “lactate v virgin”.
design <- as.formula(~ CellType + Status)
ddsObj.raw <- DESeqDataSetFromMatrix(countData = countdata,
colData = sampleinfo,
design = design)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
ddsObj <- DESeq(ddsObj.raw)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
resLvV <- results(ddsObj, alpha=0.05)
resLvB
.resultsNames(ddsObj)
## [1] "Intercept" "CellType_luminal_vs_basal"
## [3] "Status_pregnant_vs_lactate" "Status_virgin_vs_lactate"
resLvB <- results(ddsObj, alpha=0.05, name="CellType_luminal_vs_basal")
resLvB
## log2 fold change (MLE): CellType luminal vs basal
## Wald test p-value: CellType luminal vs basal
## DataFrame with 25369 rows and 6 columns
## baseMean log2FoldChange lfcSE stat pvalue
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSMUSG00000051951 194.697167 -9.84263 0.940542 -10.464855 1.25268e-25
## ENSMUSG00000102331 0.557693 -2.17264 2.244934 -0.967796 3.33146e-01
## ENSMUSG00000025900 3.838125 4.16201 1.057538 3.935563 8.30020e-05
## ENSMUSG00000025902 52.523551 -2.67254 0.341351 -7.829312 4.90547e-15
## ENSMUSG00000098104 0.686876 -1.34807 1.458074 -0.924552 3.55199e-01
## padj
## <numeric>
## ENSMUSG00000051951 1.92854e-24
## ENSMUSG00000102331 NA
## ENSMUSG00000025900 2.25328e-04
## ENSMUSG00000025902 3.85988e-14
## ENSMUSG00000098104 NA
## [ reached getOption("max.print") -- omitted 6 rows ]
sum(resLvB$padj < 0.05 & resLvB$log2FoldChange > 0, na.rm = TRUE)
## [1] 6410
Suppose we want to find genes that are differentially expressed between pregnant and lactate. We don’t have a parameter that explicitly will allow us to test that hypothesis. We need to provide a contrast. Look at the help page for results
(?results
) and read about the contrast
argument (also look at Example 1 at the bottom of the help page)
Use the contrast
argument to extract the results table for “pregnant v lactate”.
resPvL <- results(ddsObj,
alpha=0.05,
contrast = c("Status", "pregnant", "lactate"))
resPvL
## log2 fold change (MLE): Status pregnant vs lactate
## Wald test p-value: Status pregnant vs lactate
## DataFrame with 25369 rows and 6 columns
## baseMean log2FoldChange lfcSE stat pvalue
## <numeric> <numeric> <numeric> <numeric> <numeric>
## ENSMUSG00000051951 194.697167 -1.660985 0.758521 -2.189767 0.0285411
## ENSMUSG00000102331 0.557693 -1.059737 2.714947 -0.390334 0.6962893
## ENSMUSG00000025900 3.838125 2.280314 1.306913 1.744810 0.0810180
## ENSMUSG00000025902 52.523551 0.305865 0.420506 0.727375 0.4669962
## ENSMUSG00000098104 0.686876 -1.698857 1.839139 -0.923724 0.3556298
## padj
## <numeric>
## ENSMUSG00000051951 0.101341
## ENSMUSG00000102331 NA
## ENSMUSG00000025900 0.214589
## ENSMUSG00000025902 0.669987
## ENSMUSG00000098104 NA
## [ reached getOption("max.print") -- omitted 6 rows ]
When we looked at the PCA it did seem that an interaction model might be warranted. Let’s test that.
~ CellType * Status
)design <- as.formula(~ CellType + Status + CellType:Status)
ddsObj2.raw <- DESeqDataSetFromMatrix(countData = countdata,
colData = sampleinfo,
design = design)
## converting counts to integer mode
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
~CellType + Status
)# create the simpler model
design.reduced <- as.formula(~ CellType + Status)
# Compare the two designs
ddsObjC2 <- DESeq(ddsObj2.raw, test="LRT", reduced=design.reduced)
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
resCSvCxS <- results(ddsObjC2)
sum(resCSvCxS$padj < 0.05, na.rm=TRUE)
## [1] 9119