September 2019

Counting

  • Use the featureCounts program [@Liao2014] from the Subread package

  • Need to provide featureCounts with an annotation file.

GTF File Format

GTF/GFF files define genomic regions covered by different types of genomic features, e.g. genes, transcripts, exons, or UTRs.
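
As a rough illustration of the layout, the sketch below writes a tiny made-up GTF excerpt and tallies the feature types in it. The file name demo.gtf and its contents (geneA, txA1) are invented for the example; a GTF line has nine tab-separated columns, with the feature type in column 3 and the attributes in column 9.

```shell
# Write a tiny, invented GTF excerpt (tab-separated, 9 columns) for illustration only.
printf '1\tdemo\tgene\t100\t900\t.\t+\t.\tgene_id "geneA";\n' > demo.gtf
printf '1\tdemo\ttranscript\t100\t900\t.\t+\t.\tgene_id "geneA"; transcript_id "txA1";\n' >> demo.gtf
printf '1\tdemo\texon\t100\t300\t.\t+\t.\tgene_id "geneA"; transcript_id "txA1";\n' >> demo.gtf
printf '1\tdemo\texon\t600\t900\t.\t+\t.\tgene_id "geneA"; transcript_id "txA1";\n' >> demo.gtf

# Column 3 holds the feature type; skip comment lines and tally each type.
feature_types=$(grep -v '^#' demo.gtf | cut -f 3 | sort | uniq -c | awk '{print $2 "=" $1}' | paste -sd ' ' -)
echo "$feature_types"
```

The same pipeline can be pointed at a real annotation file to see which feature types it contains.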

Using featureCounts

When using a GTF/GFF file we need to tell featureCounts:

  • what feature type to use to count reads
  • what attribute type to summarise the results with

For RNA-seq we most commonly want to count reads aligning to exons and then summarise the counts at the gene level.
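
To make the two choices concrete, the sketch below mimics the grouping step on an invented annotation: it keeps only exon records (the feature type) and groups them by the gene_id attribute. It counts exons per gene rather than reads, so it only illustrates how features are collected into genes, not how featureCounts assigns reads. The file name demo_genes.gtf and the gene names are hypothetical.

```shell
# Invented annotation: geneA has two exons, geneB has one.
printf '1\tdemo\tgene\t100\t900\t.\t+\t.\tgene_id "geneA";\n' > demo_genes.gtf
printf '1\tdemo\texon\t100\t300\t.\t+\t.\tgene_id "geneA"; transcript_id "txA1";\n' >> demo_genes.gtf
printf '1\tdemo\texon\t600\t900\t.\t+\t.\tgene_id "geneA"; transcript_id "txA1";\n' >> demo_genes.gtf
printf '2\tdemo\tgene\t50\t250\t.\t-\t.\tgene_id "geneB";\n' >> demo_genes.gtf
printf '2\tdemo\texon\t50\t250\t.\t-\t.\tgene_id "geneB"; transcript_id "txB1";\n' >> demo_genes.gtf

# Keep exon records only (column 3), pull the gene_id value out of column 9,
# and count how many exons are grouped under each gene.
exons_per_gene=$(awk -F '\t' '
  $3 == "exon" && match($9, /gene_id "[^"]+"/) {
    id = substr($9, RSTART + 9, RLENGTH - 10)   # strip the leading gene_id " and the closing quote
    n[id]++
  }
  END { for (id in n) print id "\t" n[id] }
' demo_genes.gtf | sort)
echo "$exons_per_gene"
```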

Running featureCounts

The code below uses featureCounts to count reads in a BAM file against a GTF for the mouse GRCm38 genome assembly.

  mkdir -p counts

  featureCounts \
      -t exon \
      -g gene_id \
      --primary \
      -a references/Mus_musculus.GRCm38.97.gtf \
      -o counts/MCL1.DL.featureCounts \
      bam/MCL1.DL.sorted.bam

  • -t exon - the feature type to count reads against, in this case exons
  • -g gene_id - the attribute used to summarise the counts, in this case the gene ID
  • --primary - count only primary alignments
  • -a - the gene annotation reference file
  • -o - the path of the output counts table; featureCounts also writes a summary file alongside it with the suffix .summary
  • bam/MCL1.DL.sorted.bam - the input BAM file
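
Once the run finishes, the counts table can be inspected from the shell. The sketch below uses a mock file that imitates the table layout (a comment line, a header row, then one row per gene with the read count for the first BAM file in column 7). The file name demo.featureCounts and every value in it are invented for illustration, not real results.

```shell
# Mock file imitating the layout of a featureCounts counts table (all values invented).
printf '# Program:featureCounts; Command:(omitted)\n' > demo.featureCounts
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tbam/sample.bam\n' >> demo.featureCounts
printf 'geneA\t1;1\t100;600\t300;900\t+;+\t502\t42\n' >> demo.featureCounts
printf 'geneB\t2\t50\t250\t-\t201\t0\n' >> demo.featureCounts

# Drop the comment line and the header row, then keep the gene ID (column 1)
# and the read count for the first BAM file (column 7).
gene_counts=$(grep -v '^#' demo.featureCounts | tail -n +2 | cut -f 1,7)
echo "$gene_counts"
```

Replacing demo.featureCounts with counts/MCL1.DL.featureCounts would apply the same extraction to the output of the command above.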