1 Calling ChIP-seq peaks using MACS2

1.1 Assess the quality of the aligned datasets

  • Strand cross-correlation. This metric is based on the fact that a high-quality ChIP-seq experiment produces significant clustering of enriched DNA sequence tags at locations bound by the protein of interest, and that the sequence tag density accumulates on the forward and reverse strands, centered around the binding site. The cross-correlation is computed as Pearson’s linear correlation between the tag densities on the Crick strand and the Watson strand, after shifting Watson by k base pairs. Plotting the cross-correlation against the shift value typically produces two peaks.
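The shift-and-correlate computation described above can be illustrated with a minimal Python sketch. It is not the implementation used by any particular package; the function names, the toy tag counts and the 120 bp fragment length are all assumptions made for illustration. Because the toy data contain no mappability effects, only the fragment-length peak appears here, not the “phantom” peak.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's linear correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def strand_cross_correlation(watson, crick, max_shift):
    """Correlate per-base tag counts of the two strands for shifts 0..max_shift.

    For a shift k, Watson position i is compared with Crick position i + k,
    i.e. the Watson profile is moved k base pairs downstream.
    """
    n = len(watson)
    return [pearson(watson[: n - k], crick[k:]) for k in range(max_shift + 1)]


# Toy data (hypothetical): four binding sites on a 3000 bp region.
# Watson tags pile up 60 bp upstream of each site and Crick tags 60 bp
# downstream, so the two strand profiles are 120 bp apart.
region_length = 3000
half_fragment = 60
sites = [400, 1100, 1900, 2600]
profile = [1, 2, 3, 2, 1]  # small pile-up of tag 5' ends

watson = [0] * region_length
crick = [0] * region_length
for site in sites:
    for offset, count in zip(range(-2, 3), profile):
        watson[site - half_fragment + offset] += count
        crick[site + half_fragment + offset] += count

cc = strand_cross_correlation(watson, crick, max_shift=250)

# The shift with the highest correlation estimates the fragment length.
best_shift = max(range(len(cc)), key=cc.__getitem__)
print("estimated fragment length:", best_shift)
```

On real data the per-base counts would come from the 5' ends of the aligned reads on each strand, and the curve would be much noisier than in this toy example.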

The spp R package can be used to compute the strand cross-correlation (not run in this practical). Two cross-correlation peaks are usually observed in a ChIP experiment: one at a shift equal to the read length (the “phantom” peak) and one at a shift equal to the average fragment length of the library.

See Landt et al., 2012, Genome Res., “ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia”, for more details.

The absolute and relative heights of the two peaks are useful indicators of the success of a ChIP-seq experiment. A high-quality immunoprecipitation is characterized by a ChIP peak (at the fragment length) that is much higher than the “phantom” peak, whereas in failed experiments the ChIP peak is often very small or absent.
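One common way to put numbers on these peak heights is through two ratios, usually called the normalized strand coefficient (NSC) and the relative strand correlation (RSC). The sketch below shows how they could be derived from a cross-correlation curve. The function name and the example values (36 bp reads, 150 bp fragments, the correlation heights) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def peak_height_ratios(cc, read_length, fragment_length):
    """Summarize a cross-correlation curve indexed by shift (in bp).

    NSC: fragment-length peak relative to the background (minimum) level.
    RSC: fragment-length peak relative to the "phantom" peak, both measured
         above the background level.
    Assumes the background is positive and the phantom peak rises above it.
    """
    background = min(cc)
    cc_fragment = cc[fragment_length]
    cc_phantom = cc[read_length]
    nsc = cc_fragment / background
    rsc = (cc_fragment - background) / (cc_phantom - background)
    return nsc, rsc


# Hypothetical curve: flat background with a small phantom peak at the
# read length and a much higher ChIP peak at the fragment length.
cc_example = [0.10] * 301
cc_example[36] = 0.12   # "phantom" peak
cc_example[150] = 0.20  # fragment-length (ChIP) peak

nsc, rsc = peak_height_ratios(cc_example, read_length=36, fragment_length=150)
print(f"NSC = {nsc:.2f}, RSC = {rsc:.2f}")
```

Larger values of both ratios correspond to a ChIP peak that stands out more clearly from the background and from the “phantom” peak; in a failed experiment both ratios shrink as the ChIP peak disappears.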