Welcome to RawHummus

Robust and reproducible data is essential to ensure high-quality analytical results and is particularly important for large-scale metabolomics studies where detector sensitivity drifts, retention time and mass accuracy shifts frequently occur. Therefore, raw data need to be inspected before data processing to detect measurement bias and verify system consistency.

RawHummus is an R Shiny app designed for an automated raw data quality control (QC) in metabolomics studies. It produces a comprehensive QC report, which contains interactive plots and tables, summary statistics and detailed explanations.

If you are using Thermo Q-Exactive series instrument, RawHummus also allows visualizing the log files (they are usually stored at C:\Xcalibur\system\Exactive\log folder), which contains over 40 different instrument metrics, such as ambient temperature and ambient humidity.

Installation

You need to have R and optionally RStudio Installed. You can download them from this link.

Next you can run this code to install RawHummus.

install.packages('RawHummus')

Usage

Once RawHummus is installed. You can run the following code to start the web app.

library(RawHummus)
run_app()

Workflow and Features

Figure 1. Overview of the main functions in RawHummus


workflow


Table 1. Overview of quality metrics used in RawHummus report

Section Metric Explanation
Chromatogram
TIC plot It is total ion current at each scan plotted as an intensity point for each raw file. Overlaid TIC plot is used for rapid inspection of RT and ion intensity fluctuations.
Summed TIC bar plot It is summed TIC of all scans in a raw file. It is used to check global ion intensity variations among raw files.
TIC correlation analysis Pairwise Pearson correlation analysis of raw files. It is used to evaluate chromatogram similarity, i.e., peak shape similarity and RT shift. Pearson correlation coefficient above 0.85 indicates that the two raw files are similar.
Max. mass difference (ppm) It is maximum m/z variation among each selected mass feature across all the raw files. It is used to evaluate the mass accuracy. If the max. mass difference is over 5 ppm, this value will be highlighted in red.
MS1
Max. RT difference (min) It is maximum retention time variation among each selected mass feature across all the raw files. It used to evaluate RT shifts. If the max. RT difference is over 1 min, this value will be highlighted in red.
Max. intensity fold change It is the maximum intensity fold change among each selected mass feature across all the raw files. It is used to evaluate ion intensity variation. If max. intensity fold change is over 1.5, this value will be highlighted in red.
Intensity CV (%) It is intensity coefficient of variance (or relative standard deviation, RSD) of each selected mass feature across all the raw files. It is used to evaluate intensity variation. If intensity CV is over 30%, this value will be highlighted in red.
MS2
No. of MS2 events It is number of triggered MS/MS spectra per file. It is used to evaluate MS2 event
Precursor ion distribution across mass plot It is density plot of the precursor ion across mass range based on the triggered MS/MS events.
Precursor ion distribution across RT plot It is density plot of the precursor ion across RT range based on the triggered MS/MS events.
Cosine similarity of precursor ion distribution across mass It measures the similarity of precursor ion distribution across mass. Cosine similarity score above 0.85 indicates that the precursor distributions across mass are similar between two files.
Cosine similarity of precursor ion distribution across RT It measures the similarity of precursor ion distribution across mass. Cosine similarity score above 0.85 indicates that the precursor distributions across RT are similar between two files.

Data Preparation

RawHummus supports mzXML and mzML formats for analysis, meaning that you need to convert your raw data into one of the format.

It’s pretty easy to convert your raw data to the required format. MSConvert is recommended for data conversion. ProteoWizard msConvert can be download from this link.

GNPS has provided a detailed explanation regarding file conversion. You can read the instruction here

Note that: For Waters MSe data, the precursor ion and fragments are stored in two files, and RawHummus will only evaluate the precursor ion.

Demo Data

A list of demo files have been provided, including log files and raw data files. Please use this link to download the demo data.

Citation

If you find this shiny App usful, please consider citing it:

Dong, Y., Kazachkova, Y., Gou, M., Morgan, L., Wachsman, T., Gazit, E. and Birkler, R.I.D., 2022. RawHummus: an R Shiny App for Automated Raw Data Quality Control in Metabolomics. Bioinformatics.