This vignette serves as a quickstart guide for R users to create and save an mzQC document.
Target Audience: R users
library(rmzqc)
data = readMZQC(system.file("./testdata/test.mzQC", package = "rmzqc", mustWork = TRUE))
cat("This file has ", length(data$runQualities), " runqualities\n")
## This file has 1 runqualities
cat(" - file: ", data$runQualities[[1]]$metadata$inputFiles[[1]]$name, "\n")
## - file: special.raw
cat(" - # of metrics: ", length(data$runQualities[[1]]$qualityMetrics), "\n")
## - # of metrics: 1
cat(" - metric #1 name: ", data$runQualities[[1]]$qualityMetrics[[1]]$name, "\n")
## - metric #1 name: number of MS1 spectra
cat(" - metric #1 value: ", data$runQualities[[1]]$qualityMetrics[[1]]$value, "\n")
## - metric #1 value: 13405
Hint: if you receive an error such as
Error in parse_con(txt, bigint_as_char) :
lexical error: invalid char in json text.
cursor_int": [ NaN,NaN,NaN,NaN,825282.0,308263
(right here) ------^
when calling readMZQC, this indicates that the mzQC is not valid JSON, since NaN values should be quoted ("NaN") or replaced by null (unquoted), depending on context. In short: null may become an NA in R if part of an array, see https://github.com/jeroen/jsonlite/issues/70#issuecomment-431433773.
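The behaviour described in the hint can be reproduced with jsonlite alone. The following is a minimal sketch, not part of rmzqc: the JSON fragment and the variable names are made up for illustration, and the textual replacement is deliberately naive (it only skips NaN tokens that are directly enclosed in double quotes), so treat it as a starting point for repairing a file rather than a general solution.

```r
library(jsonlite)

# Hypothetical JSON fragment with bare NaN values, similar to the error above.
bad_json <- '{"value": [NaN, 1.5, NaN]}'

# Bare NaN is not valid JSON, so parsing fails; capture the error instead of stopping.
parse_failed <- inherits(try(fromJSON(bad_json), silent = TRUE), "try-error")

# Naive repair: replace NaN tokens that are not wrapped in double quotes with null.
fixed_json <- gsub('(?<!")\\bNaN\\b(?!")', "null", bad_json, perl = TRUE)

# null inside a numeric array is read back as NA.
parsed <- fromJSON(fixed_json)
print(parsed$value)
```

After the repair, the two null entries show up as NA in the numeric vector, which matches the note about arrays above.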
library(rmzqc)
## we need a proper URI (i.e. no backslashes and a scheme, e.g. 'file:')
## otherwise writing will fail
raw_file = localFileToURI("c:\\data\\special.raw", FALSE)
file_format = getCVTemplate(accession = filenameToCV(raw_file))
## Downloading obo from 'https://github.com/HUPO-PSI/psi-ms-CV/releases/download/v4.1.194/psi-ms.obo' ...
ptxqc_software = toAnalysisSoftware(id = "MS:1003162", version = "1.0.13") ## you could use 'version = packageVersion("PTXQC")' to automate further
run1_qc = MzQCrunQuality$new(metadata = MzQCmetadata$new(label = raw_file,
inputFiles = list(MzQCinputFile$new(basename(raw_file),
raw_file,
file_format)),
analysisSoftware = list(ptxqc_software)),
qualityMetrics = list(toQCMetric(id = "MS:4000059", value = 13405), ## number of MS1 scans
toQCMetric(id = "MS:4000063", value = list("MS:1000041" = 1:3, "UO:0000191" = c(0.1, 0.8, 0.1))) # MS2 known precursor charges fractions
)
)
mzQC_document = MzQCmzQC$new(version = "1.0.0",
creationDate = MzQCDateTime$new(),
contactName = Sys.info()["user"],
contactAddress = "test@user.info",
description = "A minimal mzQC test document with bogus data",
runQualities = list(run1_qc),
setQualities = list(),
controlledVocabularies = list(getCVInfo()))
## write it out
mzqc_filename = paste0(getwd(), "/test.mzQC")
writeMZQC(mzqc_filename, mzQC_document)
cat(mzqc_filename, "written to disk!\n")
## C:/Users/bielow/AppData/Local/Temp/RtmpCWj8M7/Rbuild890019086f21/rmzqc/vignettes/test.mzQC written to disk!
## read it again
mq = readMZQC(mzqc_filename)
## print some basic stats
gettextf("This mzQC was created on %s and has %d run/set quality object(s) in total.", dQuote(mq$creationDate$datetime), length(mq$runQualities) + length(mq$setQualities))
## [1] "This mzQC was created on \"2025-06-03T00:00:00Z\" and has 1 run/set quality object(s) in total."
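Note that length(mq$runQualities) + length(mq$setQualities) counts run and set quality entries, not the individual metrics inside them. A sketch of counting the metrics themselves is shown below. To keep it self-contained it parses a trimmed, hypothetical mzQC-like JSON string with jsonlite instead of using rmzqc objects; the helper count_metrics is an illustrative name and the fragment is not a complete, valid mzQC document.

```r
library(jsonlite)

# Trimmed mzQC-like fragment (illustration only; required fields are omitted).
mzqc_txt <- '{"mzQC": {
  "runQualities": [
    {"qualityMetrics": [
      {"accession": "MS:4000059", "value": 13405},
      {"accession": "MS:4000063",
       "value": {"MS:1000041": [1, 2, 3], "UO:0000191": [0.1, 0.8, 0.1]}}
    ]}
  ],
  "setQualities": []
}}'

# Keep the nested list structure instead of simplifying to data frames.
doc <- fromJSON(mzqc_txt, simplifyVector = FALSE)$mzQC

# Hypothetical helper: sum the number of metrics over a list of run/set qualities.
count_metrics <- function(qualities) {
  sum(vapply(qualities, function(q) length(q$qualityMetrics), integer(1)))
}

n_metrics <- count_metrics(doc$runQualities) + count_metrics(doc$setQualities)
print(n_metrics)
```

With objects returned by readMZQC the same idea should apply, since the fields runQualities, setQualities and qualityMetrics are accessed the same way in the examples above.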
mzQC allows for 4 different kinds of value types for a qualityMetric:
- Single value
- n-tuple
- table
- matrix
See toQCMetric() for more examples.
Below we create an example for each of them, so you know which R data structures to employ. Using a data.frame instead of a list will give you the wrong JSON formatting and lead to validation failures.
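To see why the choice between a list and a data.frame matters, the sketch below serializes the same two columns both ways with plain jsonlite. It does not go through toQCMetric, and the column names charge and fraction are placeholders chosen for readability.

```r
library(jsonlite)

charges   <- 1:3
fractions <- c(0.1, 0.8, 0.1)

# A data.frame is serialized row by row: an array of small objects.
as_df <- toJSON(data.frame(charge = charges, fraction = fractions))

# A named list is serialized column by column: one object holding arrays.
as_list <- toJSON(list(charge = charges, fraction = fractions))

print(as_df)
print(as_list)
```

The list form yields one JSON object whose members are column arrays, which is the layout shown for the table metric further down; the data.frame form yields an array of row objects instead.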
A single value (MS:4000003) is the easiest. Just assign a string or a number to the value attribute:
qualityMetric_singleValue = toQCMetric(id = "MS:4000059", value = 13405) ## number of MS1 scans
## let's look at how this looks in JSON
jsonlite::toJSON(qualityMetric_singleValue, pretty = TRUE, auto_unbox = TRUE)
## {
## "accession": "MS:4000059",
## "name": "number of MS1 spectra",
## "description": "\"The number of MS1 events in the run.\" [PSI:MS]",
## "value": 13405
## }
An n-tuple (MS:4000004) is just a vector in R:
qualityMetric_tuple = toQCMetric(id = "MS:4000061", value = c(0.45, 0.76, 0.23)) ## MS1 density quantiles
## let's look at how this looks in JSON
jsonlite::toJSON(qualityMetric_tuple, pretty = TRUE, auto_unbox = TRUE)
## {
## "accession": "MS:4000061",
## "name": "MS1 density quantiles",
## "description": "\"The first to n-th quantile of MS1 peak density (scan peak counts). A value triplet represents the original QuaMeter metrics, the quartiles of MS1 density. The number of values in the tuple implies the quantile mode.\" [PSI:MS]",
## "value": [0.45, 0.76, 0.23]
## }
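One detail to be aware of when serializing with auto_unbox = TRUE, as done above: jsonlite turns a vector of length one into a scalar, so a tuple with a single element would no longer be written as an array. The sketch below shows this with jsonlite only; wrapping the value in I() is jsonlite's documented way to prevent unboxing. Whether toQCMetric passes an I() wrapper through unchanged has not been checked here, so treat that part as an assumption.

```r
library(jsonlite)

# A length-one vector is unboxed to a scalar when auto_unbox = TRUE ...
unboxed <- toJSON(list(value = c(0.45)), auto_unbox = TRUE)

# ... unless it is protected with I(), which keeps it as a one-element array.
kept_array <- toJSON(list(value = I(c(0.45))), auto_unbox = TRUE)

print(unboxed)
print(kept_array)
```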
A table (MS:4000005) is a list (not a data.frame!) of columns.
The example below shows a table with two columns and three rows:
qualityMetric_table = # MS2 known precursor charges fractions
toQCMetric(id = "MS:4000063", value = list("MS:1000041" = 1:3, ## charge state
"UO:0000191" = c(0.1, 0.8, 0.1))) ## fraction of precursors with that charge
## let's look at how this looks in JSON
jsonlite::toJSON(qualityMetric_table, pretty = TRUE, auto_unbox = TRUE)
## {
## "accession": "MS:4000063",
## "name": "MS2 known precursor charges fractions",
## "description": "\"The fraction of MS/MS precursors of the corresponding charge. The fractions [0,1] are given in the 'Fraction' column, corresponding charges in the 'Charge state' column. The highest charge state is to be interpreted as that charge state or higher.\" [PSI:MS]",
## "value": {
## "MS:1000041": [1, 2, 3],
## "UO:0000191": [0.1, 0.8, 0.1]
## }
## }
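Before handing a table to toQCMetric, it can be worth checking that the value has the expected shape. The helper below is a hypothetical base R sketch, not part of rmzqc: it assumes that a table should be a plain named list whose columns all have the same length, which follows from the description above but is not taken from the mzQC specification text.

```r
# Hypothetical helper: TRUE if x looks like a table value as described above,
# i.e. a named list (not a data.frame) with equally long columns.
is_table_value <- function(x) {
  is.list(x) && !is.data.frame(x) &&
    !is.null(names(x)) && all(nzchar(names(x))) &&
    length(unique(lengths(x))) == 1L
}

good_table  <- list("MS:1000041" = 1:3, "UO:0000191" = c(0.1, 0.8, 0.1))
ragged      <- list("MS:1000041" = 1:3, "UO:0000191" = c(0.1, 0.9))
as_df_table <- data.frame(charge = 1:3, fraction = c(0.1, 0.8, 0.1))

print(is_table_value(good_table))
print(is_table_value(ragged))
print(is_table_value(as_df_table))
```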
A matrix (MS:4000006) is an R matrix.
At the time of writing, there is no matrix CV term yet, so we just make one up:
qualityMetric_matrix = toQCMetric(id = "MS:40000??", value = matrix(1:9, 3, 3), allow_unknown_id = TRUE)
## Warning in CV$byID(id): Could not find id 'MS:40000??' in CV list (length:
## 6807)
qualityMetric_matrix$name = "unknown metric"
## let's look at how this looks in JSON
jsonlite::toJSON(qualityMetric_matrix, pretty = TRUE, auto_unbox = TRUE)
## {
## "accession": "MS:40000??",
## "name": "unknown metric",
## "value": [
## [1, 4, 7],
## [2, 5, 8],
## [3, 6, 9]
## ]
## }
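The JSON above lists the matrix row by row, even though R stores matrices column by column. The short sketch below, using jsonlite only, shows that layout and confirms that reading the JSON back returns the same matrix; the variable names are placeholders.

```r
library(jsonlite)

m <- matrix(1:9, 3, 3)   # filled column-wise: the first column is 1, 2, 3

# Each inner JSON array corresponds to one row of the R matrix.
json <- toJSON(m)
print(json)

# Reading it back simplifies the nested arrays into a matrix again.
back <- fromJSON(json)
print(back)
```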