Exercise

    1. Check the location of the current directory using the command pwd
    2. If the current directory is not Course_Materials, then navigate to the Course_Materials directory using the cd (change directory) command:
cd ~/Course_Materials
    3. Use ls to list the contents of the directory. There should be a directory called fastq.
    4. Use ls to list the contents of the fastq directory:
ls fastq

SRR7657883.sra_1.fastq.gz SRR7657883.subset_2M.sra_1.fastq.gz
SRR7657883.sra_2.fastq.gz Test_adapter_contamination.gq.gz.
SRR7657883.subset_2M.sra_2.fastq.gz

Among these you should see two fastq files called SRR7657883.sra_1.fastq.gz and SRR7657883.sra_2.fastq.gz. These are the files for read 1 and read 2 of one of the samples we will be working with.
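
If you want to confirm that every read 1 file has a matching read 2 file, a loop like the one below can do the check for you. This is only a minimal sketch: check_pairs is a hypothetical helper name, and it assumes the files follow the _1.fastq.gz / _2.fastq.gz naming seen in the listing above.

```shell
# check_pairs is a hypothetical helper, not part of any installed tool.
# For each *_1.fastq.gz file in a directory, report whether the
# matching *_2.fastq.gz file is also present.
check_pairs() {
    dir="$1"
    for r1 in "$dir"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue            # skip if the pattern matched nothing
        r2="${r1%_1.fastq.gz}_2.fastq.gz"   # swap the _1 suffix for _2
        if [ -e "$r2" ]; then
            echo "paired: $(basename "$r1")"
        else
            echo "missing read 2 for: $(basename "$r1")"
        fi
    done
}

check_pairs fastq
```

Run it from the Course_Materials directory so that the relative path fastq resolves correctly.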

  5. Run fastqc on one of the fastq files:
fastqc fastq/SRR7657883.sra_1.fastq.gz

\(\Rightarrow\) SRR7657883.sra_1_fastqc.html
\(\Rightarrow\) SRR7657883.sra_1_fastqc.zip
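
The same command works for read 2. The sketch below runs fastqc on both files and collects the reports in one place using the -o option, which needs the output directory to exist already. The directory name fastqc_reports and the helper report_name are assumptions made for this example. The helper simply follows the naming pattern shown above (drop .fastq.gz, add _fastqc.html).

```shell
# report_name is a hypothetical helper: it predicts the report filename
# from the input filename, following the pattern shown in the exercise.
report_name() {
    base=$(basename "$1" .fastq.gz)
    echo "${base}_fastqc.html"
}

# fastqc_reports is an assumed directory name; fastqc -o needs it to exist.
mkdir -p fastqc_reports

for fq in fastq/SRR7657883.sra_1.fastq.gz fastq/SRR7657883.sra_2.fastq.gz; do
    # Only run fastqc if it is installed and the input file is present.
    if command -v fastqc >/dev/null 2>&1 && [ -e "$fq" ]; then
        fastqc -o fastqc_reports "$fq"
    fi
    echo "expected report: fastqc_reports/$(report_name "$fq")"
done
```

Without -o, fastqc writes the reports next to the input file, so in the exercise they would land in the fastq directory.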

  6. Open the HTML report in a browser and see if you can answer these questions:
    A) What is the read length?
    150 bases.
    B) Does the quality score vary through the read length?
    Yes, the first few bases and the last few bases are typically of lower quality.
    C) How is the data’s quality?
    Overall, pretty good.
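
You can cross-check the read length from the command line without opening the report. The sketch below is one possible approach, not part of fastqc: first_read_length is a hypothetical helper that decompresses the file and measures the sequence line of the first record only (line 2 of a fastq file). The report summarises all reads, so treat this as a quick sanity check.

```shell
# first_read_length is a hypothetical helper.
# A fastq record has four lines; line 2 holds the sequence,
# so the length of line 2 is the length of the first read.
first_read_length() {
    gzip -dc "$1" | awk 'NR == 2 { print length($0); exit }'
}

if [ -e fastq/SRR7657883.sra_1.fastq.gz ]; then
    first_read_length fastq/SRR7657883.sra_1.fastq.gz
fi
```

If the number printed matches the answer to question A, the report and the raw file agree, at least for the first read.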