4  Galaxy Tutorial

Author

Adrien Osakwe

4.1 Galaxy Tutorial

In this section we will briefly try out the online tool https://usegalaxy.org. Make sure you create an account first (increases your disk quota).

4.2 Uploading Data

To upload data, select the “upload” icon in the top left of the webpage. Then select “Paste/Fetch data” and paste the following:

https://zenodo.org/record/6457007/files/GSM461177_1_subsampled.fastqsanger 
https://zenodo.org/record/6457007/files/GSM461177_2_subsampled.fastqsanger 
https://zenodo.org/record/6457007/files/GSM461180_1_subsampled.fastqsanger 
https://zenodo.org/record/6457007/files/GSM461180_2_subsampled.fastqsanger

Then hit “Start”. This should complete fairly quickly. You can then find the uploaded files in the right-hand side of the webpage under “History”.

4.2.1 Creating Paired Lists

As the data we are using contained paired-end reads, we need to create a ‘paired collection’ for Galaxy to treat the files correctly. In the history tab, click the “select items” box and select all the files.

After selecting the pair, you will see a blue button that says “4 of 4 selected”. Click it and select the “Build List of Dataset Pairs” option.

4.3

This will open up a window showing the paired files. Make sure the correct GSM IDs are paired!

Give your collection a name and click “create collection”.

4.4 Running FastQC

Click on the “Tools” tab on the left and search for ”Flatten Collection”.

Choose the collection we created as the input and hit “Run Tool”. This will create a new object which we can use with FastQC.

Next, select FastQC through the search bar:

In the “tool parameters” section, click the “Dataset Collection” tab to select our flattened collection.

And hit Run Tool.

This will generate two objects: one with raw data and another with html files providing an easy-to-interpret report of the sequencing quality. Try downloading one and having a look!

4.5 MultiQC

We can also use MultiQC to compress the four reports into one. Select MultiQC from the search bar and choose FastQC as the tool that generated the logs. You can then select the RawData object we got from FastQC as the input.

You can now view the QC metrics for the 4 FASTQ files in one report!

Question: Do you notice anything weird with any of the files? What metrics are you using to support your claim?

4.6 Running Trimmomatic

If we feel that the quality of our reads drops too much at the 3’ end, we can trim the reads. In the search bar, look for the “Trimmomatic” tool. Then, select the following options:

  • Set “Single-end or paired-end reads?” to paired-end (as collection)

  • Select our original paired end collection (before flattening!)

  • Select the “Output trimmomatic log messages?” option as well.

With this, you can use MultiQC on the log files to generate a post-trimming report and continue your analysis.