Creates four ggplot2 figures from MultiQC results, and returns the data along with the plot objects. The plots are for FastQC (read counts and Phred scores), STAR, and HTSeq. See Details for more information.
Usage
tr_qc_plots(
directory,
type = "bar",
hide_samples = FALSE,
col_width = 0.9,
add_points = TRUE,
font_size = 18,
threshold_line = 1e+07,
threshold_line_colour = "#EE2C2C",
threshold_line_type = "dashed",
threshold_line_size = 1,
limits = NULL
)
Arguments
- directory
Folder containing all the data files generated by MultiQC, e.g. "multiqc_data/"
- type
Type of plot to produce, either "bar" (default) or "box"
- hide_samples
Logical: For type = "bar", remove sample names from the y axis when there are many samples. Defaults to FALSE.
- col_width
Width of bars when type = "bar", which can be reduced when plotting many samples. Defaults to 0.9.
- add_points
Logical: When making a box plot, should individual samples be plotted as points? Defaults to TRUE.
- font_size
Base font size (defaults to 18)
- threshold_line
Number of reads at which to draw a threshold line on the FastQC read, STAR, and HTSeq plots. Defaults to 1e7 (10 million reads); set to NULL to disable.
- threshold_line_colour
Colour for the threshold line ("#EE2C2C")
- threshold_line_type
Type of threshold line to draw ("dashed"). See ?aes_linetype_size_shape for available options.
- threshold_line_size
Size of threshold line (1)
- limits
Override the upper limit of FastQC read, STAR, and HTSeq bar plots. Supply a single number to give all three plots the same limit, or a vector of three values to modify each individually. Defaults to NULL, which sets automatic limits.
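The two accepted shapes of "limits" can be illustrated with a small sketch. This is not the package's own code: the helper name resolve_limits() and the plot labels are hypothetical, and NA is used here only to stand for "automatic", following the ggplot2 convention that an NA in a scale's limits means the value is computed from the data.

```r
# Hypothetical helper, not part of the package: turn `limits` into one upper
# limit per plot (FastQC reads, STAR, HTSeq).
resolve_limits <- function(limits = NULL) {
  plot_names <- c("fastqc_reads", "star", "htseq")  # illustrative labels only

  if (is.null(limits)) {
    # NULL: automatic limits, represented here as NA
    upper <- rep(NA_real_, 3)
  } else if (is.numeric(limits) && length(limits) == 1) {
    # A single number is shared by all three plots
    upper <- rep(limits, 3)
  } else if (is.numeric(limits) && length(limits) == 3) {
    # Three numbers set each plot individually
    upper <- limits
  } else {
    stop("`limits` must be NULL, a single number, or a vector of three numbers")
  }

  stats::setNames(upper, plot_names)
}

resolve_limits(2e7)                    # same limit for every plot
resolve_limits(c(2e7, 1.5e7, 1e7))     # one limit per plot
```

Anything other than NULL, one number, or three numbers is rejected in this sketch; how the real function handles such input is not documented here.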
Value
A list with elements "plot" containing the ggplot objects, and "data" containing all the underlying data.
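A minimal usage sketch, assuming the package that provides tr_qc_plots() is already attached and that a MultiQC output folder named "multiqc_data/" exists in the working directory. The guard simply skips the call when either assumption does not hold.

```r
# Assumptions: tr_qc_plots() is available in the session, and "multiqc_data/"
# is the MultiQC output directory (the example path from the Arguments section).
mqc_dir <- "multiqc_data/"

if (exists("tr_qc_plots") && dir.exists(mqc_dir)) {
  qc <- tr_qc_plots(
    directory = mqc_dir,
    type = "box",           # box plots instead of the default bars
    add_points = TRUE,      # overlay individual samples as points
    threshold_line = NULL   # disable the read-count threshold line
  )

  print(names(qc))            # the Value section describes "plot" and "data"
  str(qc$data, max.level = 1) # overview of the underlying data
  print(qc$plot)              # draw the ggplot objects
}
```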
Details
For the Phred scores, open the MultiQC HTML report and export the data for "fastqc_per_base_sequence_quality_plot" as a tab-delimited file (TSV), placing it in the same directory as the other MultiQC data files. If there are too many samples, the data is saved in "mqc_fastqc_per_base_sequence_quality_plot_1.txt"; this file is also checked for automatically.
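To confirm ahead of time that a Phred score file is present, a check along the following lines may help. It is only a sketch: find_phred_file() is a hypothetical helper, and the first candidate file name is an assumption about what the exported TSV might be called, since only the second name is given above.

```r
# Hypothetical helper, not part of the package: return the first Phred score
# file found in `directory`, or NULL if none is present.
find_phred_file <- function(
    directory,
    candidates = c(
      "fastqc_per_base_sequence_quality_plot.tsv",       # assumed export name
      "mqc_fastqc_per_base_sequence_quality_plot_1.txt"  # name given in Details
    )) {
  paths <- file.path(directory, candidates)
  found <- paths[file.exists(paths)]
  if (length(found) == 0) {
    return(NULL)
  }
  found[1]
}

# Demonstration in a temporary folder so the sketch runs anywhere
demo_dir <- file.path(tempdir(), "multiqc_data_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "mqc_fastqc_per_base_sequence_quality_plot_1.txt"))

phred_file <- find_phred_file(demo_dir)
basename(phred_file)
```

If the exported file has a different name, pass it through the "candidates" argument of this helper.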
Note that the "limits" argument only applies to bar plots - it has no effect on box plots.
If the data is paired-end (i.e. there is an R1 and an R2 for each sample), each read will be plotted separately in the FastQC Phred score and read number plots. If the samples are named with "R1" but no "R2" is found, the "R1" will be removed from the sample names.
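The sample name clean-up described above can be sketched as follows. The helper strip_unpaired_r1() is hypothetical, and the matching rule (a literal "R1", optionally preceded by ".", "_" or "-") is an assumption; the function's actual pattern may differ.

```r
# Hypothetical helper, not part of the package: drop the "R1" tag from sample
# names when no name contains "R2" (i.e. the data looks single-end).
strip_unpaired_r1 <- function(sample_names) {
  has_r2 <- any(grepl("R2", sample_names, fixed = TRUE))
  if (has_r2) {
    # Paired-end data: keep the names so R1 and R2 stay distinguishable
    return(sample_names)
  }
  # Remove the first "R1" in each name, plus one preceding separator if present
  sub("[._-]?R1", "", sample_names)
}

strip_unpaired_r1(c("sampleA_R1", "sampleB_R1"))  # "sampleA" "sampleB"
strip_unpaired_r1(c("sampleA_R1", "sampleA_R2"))  # unchanged
```

A pattern this loose would also alter a name such as "R10_liver", so a stricter rule anchored to the end of the name may be preferable in practice.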