October 2024

HTS Applications - Overview

DNA Sequencing

  • Genome Assembly

  • SNPs/SVs/CNVs

  • DNA methylation

  • DNA-protein interactions (ChIPseq)

  • Chromatin Modification (ATAC-seq/ChIPseq)

RNA Sequencing

  • Transcriptome Assembly

  • Differential Gene Expression

  • Fusion Genes

  • Splice variants

Single-Cell

  • RNA/DNA

  • Low-level RNA/DNA detection

  • Cell-type classification

  • Dissection of heterogeneous cell populations

RNAseq Workflow

Experimental Design

Library Preparation

Sequencing

Bioinformatics Analysis

Image adapted from: Wang, Z., et al. (2009), Nature Reviews Genetics, 10, 57–63.

Practical considerations for RNAseq

  • Coverage: how many reads?

  • Read length & structure: Long or short reads? Paired or Single end?

  • Library preparation method: Poly-A, Ribominus, other?

How many reads do we need?


The coverage is defined as:

\(\text{Coverage} = \frac{\text{Read length} \times \text{Number of reads}}{\text{Length of target sequence}}\)

  • For a general view of differential expression: 5–25 million reads per sample
  • For alternative splicing and lowly expressed genes: 30–60 million reads per sample.
  • In-depth view of the transcriptome/assemble new transcripts: 100–200 million reads
  • Targeted RNA expression requires fewer reads.
  • miRNA-Seq or Small RNA Analysis require even fewer reads.
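
As a rough illustration of the coverage formula above, the following Python sketch computes coverage from a read count and inverts the formula to estimate how many reads a target coverage would need. The function names and the 100 Mb target length are assumptions made for this example only, not values from the course material. For paired-end data, count both mates as reads.

```python
import math


def coverage(read_length: int, n_reads: int, target_length: int) -> float:
    """Coverage = (read length x number of reads) / length of target sequence."""
    return read_length * n_reads / target_length


def reads_needed(target_coverage: float, read_length: int, target_length: int) -> int:
    """Invert the coverage formula to get the minimum number of reads."""
    return math.ceil(target_coverage * target_length / read_length)


if __name__ == "__main__":
    # Hypothetical target of 100 Mb of transcribed sequence (assumption).
    target = 100_000_000
    print(coverage(read_length=75, n_reads=25_000_000, target_length=target))
    print(reads_needed(target_coverage=30, read_length=75, target_length=target))
```

Doubling the read length or the number of reads doubles the coverage, so the two can be traded against each other within the limits of the sequencing platform.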

Designing the right experiment - Read length

Long or short reads? Paired or Single end?

The answer depends on the experiment:

  • Gene expression – typically just a short read e.g. 50/75 bp; SE or PE.
  • k-mer-based quantification of gene expression (Salmon etc.) – benefits from PE.
  • Transcriptome Analysis – longer paired-end reads (such as 2 x 75 bp).
  • Small RNA Analysis – short single read, e.g. SE50 - will need trimming.
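
To show why small RNA reads need trimming: a 50 bp read is longer than a typical small RNA insert, so the sequencer reads through into the 3' adapter. The Python sketch below removes an exact-match adapter, or a partial adapter at the end of the read. It is a simplified illustration only. The function name, the `min_overlap` parameter, and both sequences are placeholders invented for this example, and dedicated trimming tools also handle mismatches and base qualities.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter (exact match only) from a read."""
    # Case 1: the full adapter occurs inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends partway through the adapter.
    # Try the longest adapter prefix first, down to min_overlap bases.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


if __name__ == "__main__":
    insert = "ACGTACGTACGTACGTACGTAC"  # placeholder 22 nt insert
    adapter = "TGGAATTCTCGG"           # placeholder adapter sequence
    read = (insert + adapter + "A" * 20)[:50]  # simulated SE50 read
    print(trim_adapter(read, adapter))
```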

Library preparation

- Ribosomal RNA

- Poly-A transcripts

- Other RNAs e.g. tRNA, miRNA etc.

Total RNA extraction

Library preparation

Poly-A Selection

Poly-A transcripts e.g.:

  • mRNAs
  • immature miRNAs
  • snoRNA

Ribominus selection

Poly-A transcripts + other RNAs e.g.:

  • tRNAs
  • mature miRNAs
  • piRNAs

Sequencing by Synthesis

A complementary strand is synthesized using the cDNA fragment as template.

Each nucleotide carries a fluorescent tag, so as the new strand is synthesized, the colour of the fluorescence indicates which base is being added.

The sequencer records the order of these flashes of light and translates them to a base sequence.

Sequencing errors cause uncertainty in calling the nucleotide at a given position. This reduced confidence is reflected in the quality scores in your FASTQ output.
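
To make the link between quality scores and base-calling confidence concrete, here is a minimal Python sketch that decodes a FASTQ quality string into Phred scores and error probabilities. It assumes the Phred+33 ASCII encoding used by current Illumina output. The function names and the example quality string are made up for illustration.

```python
def decode_quality(quality_string: str, offset: int = 33) -> list[int]:
    """Convert FASTQ quality characters to Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def error_probability(phred_score: float) -> float:
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-phred_score / 10)


if __name__ == "__main__":
    quality_line = "IIII5555!"  # hypothetical quality line from a FASTQ record
    for score in decode_quality(quality_line):
        print(score, error_probability(score))
```

A Phred score of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a score of 30 to a 1 in 1,000 chance.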

Case Study

Differential Gene Expression Analysis Workflow