In this practical session, we will familiarize ourselves with IGV (Integrative Genomics Viewer) to assess ChIP-seq quality. IGV is used to visualize and explore next-generation sequencing data and annotations. Open the virtual desktop and type `igv` in the terminal. An IGV window should open.
This material for IGV was adapted from the practical material by Shoko Hirosue in 2020.
Now click File > Load from File… and open the BAM files and peak files. Let’s open `tp53_r2.fastq_trimmed.fastq_sorted.bam` and its input file `input.fastq_trimmed.fastq_sorted.bam`, as well as the narrowPeak file `tp53_r2.fastq_trimmed.fastq_sorted_peaks.narrowPeak` and the summits BED file `tp53_r2.fastq_trimmed.fastq_sorted_summits.bed`, in the data directory (`/home/ubuntu/Course_Materials/ChIPSeq/practicals/data` or `/home/ubuntu/Course_Materials/ChIPSeq/practicals/output/macs_output`).
Zoom into one of the peak regions.
Check tp53 target genes listed in TRRUST: OGG1, XPC, SLC6A6, CTNNB1, MAP4, RASSF1.
Select the BAM files you have loaded, right-click them and select “Group Autoscale”.
Bookmark this region: go to Regions > Region Navigator. Click Add, and give your region a name (e.g. MyFirstRegion) in the “Description” field. Click “View”. This way, if you navigate somewhere else on the genome, you can always easily return to this region from Regions > Region Navigator.
Remove all the files (just so it’s easier to see). Load the BigWig files (`tp53_r2.fastq_trimmed.fastq_sorted_standard_treat_pileup.bw` and `tp53_r2.fastq_trimmed.fastq_sorted_standard_control_lambda.bw`).
Group autoscale the two tracks so they are comparable.
Set a different colour for each track (right-click the track name and choose Change Track Colour (Positive values)…).
Export an image from File > Save Image and have a look at the saved file.
Exercise 1
- Explore IGV
- Load tp53 or p73 data
Now let’s have a quick look at some of the metrics we learnt about in the lecture. The Bioconductor package we are using here is ChIC.
# get the working directory
getwd()
# load the library
library(ChIC)
# load the data
chipName <- 'practicals/data/tp53_r2.fastq_trimmed.fastq_sorted'
chipBam <- readBamFile(chipName)
inputName <- 'practicals/data/input.fastq_trimmed.fastq_sorted'
inputBam <- readBamFile(inputName)
Select the chromosome of interest, here chromosome 3:
subset_chromosomes <- c("chr3")
chipSubset <- lapply(chipBam, FUN = function(x) {x[subset_chromosomes]})
inputSubset <- lapply(inputBam, FUN = function(x) {x[subset_chromosomes]})
str(chipSubset)
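To see what the `lapply` call does, here is a minimal sketch on a toy object. The layout of `toyBam` (a list with `tags` and `quality` elements, each of them a list keyed by chromosome name) is an assumption made for illustration and may differ from what `readBamFile` actually returns; all values are invented.

```r
# Toy stand-in for a loaded BAM object (assumed layout, invented values):
# every top-level element is itself a list keyed by chromosome name.
toyBam <- list(
  tags    = list(chr1 = c(101, -250), chr3 = c(15, -80, 300)),
  quality = list(chr1 = c(30, 42),    chr3 = c(37, 40, 12))
)

keep <- c("chr3")

# Single-bracket indexing by name keeps only the requested chromosomes
# and preserves the list structure of every top-level element.
toySubset <- lapply(toyBam, FUN = function(x) x[keep])

str(toySubset)
```

Because the subsetting uses single brackets, adding more names to `keep` (e.g. `c("chr1", "chr3")`) would keep several chromosomes at once.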
Now we can generate quality scores and a strand cross-correlation plot using `qualityScores_EM` from the ChIC package.
# If you get a plotting error mentioning /home/ubuntu/.local/share/rstudio/notebooks,
# change the permissions of that directory from a terminal:
#   chmod +x /home/ubuntu/.local/share/rstudio/notebooks
# or add savePlotPath='/home/ubuntu/Course_Materials/ChIPSeq/practicals/output/'
# to the call below.
EM_Results <- qualityScores_EM(
  chipName=chipName,
  inputName=inputName,
  chip.data=chipSubset,
  input.data=inputSubset,
  annotationID="hg38",
  read_length=32,
  mc=8)
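The idea behind a strand cross-correlation profile can be sketched with an idealised, noise-free example in base R. This is not how ChIC computes its profile; the binding-site positions, the 150 bp fragment length and the helper `crossCor` are all hypothetical and chosen only to show why the correlation peaks at the fragment length.

```r
# Idealised toy genome: 5' read ends pile up upstream of each binding site
# on the plus strand and downstream of it on the minus strand.
genomeLen <- 5000
sites     <- c(500, 1300, 2100, 3400, 4200)  # hypothetical binding sites
fragLen   <- 150                             # hypothetical fragment length

offsets <- -4:4
weights <- c(1, 2, 3, 4, 5, 4, 3, 2, 1)      # read counts around each pile-up

plus  <- numeric(genomeLen)
minus <- numeric(genomeLen)
for (s in sites) {
  plus[s - fragLen / 2 + offsets]  <- weights
  minus[s + fragLen / 2 + offsets] <- weights
}

# Pearson correlation between the plus-strand profile and the minus-strand
# profile shifted upstream by k bases.
crossCor <- function(k) {
  n <- length(plus)
  cor(plus[1:(n - k)], minus[(1 + k):n])
}

shifts <- 0:300
cc <- vapply(shifts, crossCor, numeric(1))

# The shift with the highest correlation estimates the fragment length.
bestShift <- shifts[which.max(cc)]
bestShift
```

In this toy setting `bestShift` recovers the 150 bp used to build the data; `plot(shifts, cc, type = "l")` draws the corresponding profile. Real data are noisier and usually show an additional, smaller peak near the read length.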
Recall the metrics we covered before lunch. You can check the PBC, RSC and NSC metrics below:
# extract the scores from the ChIP results
PBC <- EM_Results$QCscores_ChIP$CC_PBC
RSC <- EM_Results$QCscores_ChIP$CC_RSC
NSC <- EM_Results$QCscores_ChIP$CC_NSC
# print them
PBC
RSC
NSC
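As a reminder of what these three numbers measure, here is a small base R sketch using the commonly quoted definitions: PBC is the number of positions covered by exactly one read divided by the number of positions covered by at least one read; NSC is the cross-correlation at the fragment-length peak divided by the minimum cross-correlation; RSC compares the fragment-length peak with the read-length ("phantom") peak after subtracting the minimum. The helper names and all numbers are made up, and ChIC's internal implementation may differ in detail.

```r
# PBC: positions covered by exactly one read / positions covered by >= 1 read.
pbc <- function(readStarts) {
  counts <- table(readStarts)
  sum(counts == 1) / length(counts)
}

# NSC: cross-correlation at the fragment-length peak divided by the minimum.
nsc <- function(ccFragment, ccMin) ccFragment / ccMin

# RSC: fragment-length peak relative to the read-length ("phantom") peak,
# both measured above the minimum.
rsc <- function(ccFragment, ccPhantom, ccMin) {
  (ccFragment - ccMin) / (ccPhantom - ccMin)
}

# Invented read start positions on one strand of one chromosome.
readStarts <- c(100, 100, 250, 300, 300, 300, 420, 515)
toyPBC <- pbc(readStarts)

# Invented cross-correlation values.
toyNSC <- nsc(ccFragment = 0.030, ccMin = 0.010)
toyRSC <- rsc(ccFragment = 0.030, ccPhantom = 0.020, ccMin = 0.010)

c(PBC = toyPBC, NSC = toyNSC, RSC = toyRSC)
```

In the toy data, three of the five distinct positions carry a single read, so the PBC is 0.6; the invented cross-correlation values give an NSC of about 3 and an RSC of about 2.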
Exercise 2
- Try to play around with it yourself (it may take a few minutes to generate the plot)
- (Optional) Generate a strand cross-correlation plot for `p73`.
Joanna Krupka
Shoko Hirosue
Dora Bihary