
Creates four ggplot2 figures from MultiQC results, and returns the data along with the plot objects. The plots are for FastQC (read counts and Phred scores), STAR, and HTSeq. See Details for more information.

Usage

tr_qc_plots(
  directory,
  type = "bar",
  hide_samples = FALSE,
  col_width = 0.9,
  add_points = TRUE,
  font_size = 18,
  threshold_line = 1e+07,
  threshold_line_colour = "#EE2C2C",
  threshold_line_type = "dashed",
  threshold_line_size = 1,
  limits = NULL
)

Arguments

directory

Folder containing all the data files generated by MultiQC, e.g. "multiqc_data/"

type

Type of plot to produce, either "bar" (default) or "box"

hide_samples

Logical: For type = "bar", remove sample names from the y axis, which is useful when there are many samples. Defaults to FALSE.

col_width

Width of bars when type = "bar", which can be reduced when plotting many samples. Defaults to 0.9

add_points

Logical: When making a box plot, should individual samples be plotted as points? Defaults to TRUE.

font_size

Base font size (defaults to 18)

threshold_line

Provide a number to draw a line at the indicated number of reads on the FastQC read count, STAR, and HTSeq plots. Defaults to 1e7 (10 million reads); set to NULL to disable.

threshold_line_colour

Colour for the threshold line ("#EE2C2C")

threshold_line_type

Type of threshold line to draw ("dashed"). See ?aes_linetype_size_shape for available options.

threshold_line_size

Size of threshold line (1)

limits

Override the upper limit of the FastQC read count, STAR, and HTSeq bar plots. Supply a single number to give all three plots the same limit, or a vector of three values to set each individually. Defaults to NULL, which sets automatic limits.

Value

A list with elements "plot" containing the ggplot objects, and "data" containing all the underlying data

Details

For the Phred score plot, open the MultiQC HTML report and export the data for "fastqc_per_base_sequence_quality_plot" as a tab-delimited file (TSV), placing it in the same directory as the other MultiQC data files. If there are too many samples, the data is saved in "mqc_fastqc_per_base_sequence_quality_plot_1.txt"; this file is also checked for automatically.
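Before calling the function, it can help to confirm that a Phred score file is actually present in the folder. The following is a minimal sketch in base R, not the package's own lookup logic: find_phred_files() is a hypothetical helper, and a temporary folder stands in for "multiqc_data/".

```r
# Hypothetical helper (not part of the package): list files in a MultiQC data
# folder whose names contain the per-base sequence quality plot identifier.
find_phred_files <- function(directory) {
  list.files(
    directory,
    pattern = "fastqc_per_base_sequence_quality_plot",
    full.names = TRUE
  )
}

# Temporary folder standing in for a real "multiqc_data/" directory
demo_dir <- file.path(tempdir(), "multiqc_data_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(
  file.path(demo_dir, "mqc_fastqc_per_base_sequence_quality_plot_1.txt")
)

# A non-empty result means a candidate Phred score file was found
phred_files <- find_phred_files(demo_dir)
length(phred_files) > 0
```

Matching on the shared identifier rather than an exact file name picks up both the manually exported file and the automatically saved one.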

Note that the "limits" argument only applies to bar plots; it has no effect on box plots.
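As a rough illustration of how a single number versus a vector of three values could be interpreted for "limits", consider the sketch below. resolve_limits() and the plot labels are hypothetical names chosen for this example; the function's internal handling may differ.

```r
# Hypothetical helper (not part of the package): map a `limits` value onto
# the three bar plots it applies to.
resolve_limits <- function(limits) {
  # Illustrative labels only; these are not the package's element names
  plot_names <- c("fastqc_reads", "star", "htseq")

  if (is.null(limits)) {
    return(NULL) # NULL is documented as meaning automatic limits
  }

  stopifnot(is.numeric(limits), length(limits) %in% c(1, 3))

  # A single number is recycled to all three plots; three values are kept as-is
  stats::setNames(rep_len(limits, 3), plot_names)
}

same_limit <- resolve_limits(5e7)
own_limits <- resolve_limits(c(5e7, 4e7, 3e7))
```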

If the data is paired-end (i.e. there is an R1 and an R2 for each sample), each read will be plotted separately in the FastQC Phred score and read count plots. If the samples are named with "R1" but no "R2" is found, the "R1" will be removed from the sample names.
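The naming rule for unpaired "R1" samples can be sketched in base R as follows. This is only an illustration of the behaviour described: strip_unpaired_r1() is a hypothetical helper, and the "_R1" suffix pattern is an assumption about how samples are named.

```r
# Hypothetical helper (not part of the package): drop the "R1" label from
# sample names only when no "R2" samples are present.
strip_unpaired_r1 <- function(sample_names) {
  has_r1 <- any(grepl("R1", sample_names, fixed = TRUE))
  has_r2 <- any(grepl("R2", sample_names, fixed = TRUE))

  if (has_r1 && !has_r2) {
    # Assumed naming pattern: the read label is attached as "_R1"
    sample_names <- sub("_R1", "", sample_names, fixed = TRUE)
  }
  sample_names
}

# Single-end style names lose the label; paired-end names are left alone
single_end <- strip_unpaired_r1(c("sampleA_R1", "sampleB_R1"))
paired_end <- strip_unpaired_r1(c("sampleA_R1", "sampleA_R2"))
```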

Examples

if (FALSE) tr_qc_plots("multiqc_data")
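A few further calls, using only the arguments documented above, might look like the following. Like the example above, they are wrapped in if (FALSE) so they are not run, and "multiqc_data" is assumed to be a folder of MultiQC output.

```r
if (FALSE) {
  # Box plots without the individual sample points
  tr_qc_plots("multiqc_data", type = "box", add_points = FALSE)

  # Bar plots for many samples: hide sample names and use narrower bars
  tr_qc_plots("multiqc_data", hide_samples = TRUE, col_width = 0.7)

  # Move the threshold line to 2e7 reads and give each bar plot its own limit
  tr_qc_plots(
    "multiqc_data",
    threshold_line = 2e7,
    limits = c(5e7, 4e7, 3e7)
  )

  # Disable the threshold line entirely
  tr_qc_plots("multiqc_data", threshold_line = NULL)
}
```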