| Title: | Power Analysis for Phosphopeptide Abundance Hypothesis Test |
| Version: | 0.1.0 |
| Author: | Dan MacLean [aut, cre] |
| Maintainer: | Dan MacLean <dan.maclean@tsl.ac.uk> |
| Description: | Estimate best fit distributions and do power analysis for hypothesis tests on phosphopeptide abundance data. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| URL: | https://teammaclean.github.io/peppwR/, https://github.com/TeamMacLean/peppwR |
| BugReports: | https://github.com/TeamMacLean/peppwR/issues |
| Depends: | R (≥ 4.1.0) |
| Imports: | cowplot, dplyr, fitdistrplus, ggplot2, methods, purrr, RColorBrewer, scales, tibble, tidyr, univariateML |
| Suggests: | bench, knitr, pkgdown, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-03-25 10:46:38 UTC; macleand |
| Repository: | CRAN |
| Date/Publication: | 2026-03-30 09:20:08 UTC |
Apply missingness to a vector
Description
Apply missingness to a vector
Usage
apply_missingness(x, na_rate, mnar_score = 0)
Arguments
x |
Numeric vector |
na_rate |
Proportion to make NA |
mnar_score |
MNAR intensity (0 = MCAR) |
Value
Vector with some values replaced by NA
JZS Bayes factor approximation
Description
JZS Bayes factor approximation
Usage
bf_jzs(t, n1, n2)
Arguments
t |
t-statistic |
n1 |
Sample size group 1 |
n2 |
Sample size group 2 |
Value
Bayes factor
Compute dataset-level MNAR metric
Description
Calculates Spearman correlation between log(mean_abundance) and NA rate across peptides to detect whether low-abundance peptides have more missing values than high-abundance peptides.
Usage
compute_dataset_mnar(missingness)
Arguments
missingness |
A tibble with columns na_rate and mean_abundance (as produced by fit_distributions) |
Value
A list with:
correlation: Spearman correlation coefficient (negative = MNAR pattern)
p_value: p-value for the correlation
n_peptides: Number of peptides with missing data used in calculation
interpretation: Human-readable interpretation string
MNAR Detection
MNAR (Missing Not At Random) in mass spectrometry typically manifests as low-abundance peptides having higher rates of missing values due to detection limits. This function detects this pattern by correlating mean abundance with NA rate across all peptides.
A negative correlation indicates that low-abundance peptides have more missing values - the hallmark of detection-limit-driven MNAR.
Interpretation
| Correlation (r) | Interpretation |
| r > -0.1 | No evidence of MNAR |
| -0.3 < r < -0.1 | Weak evidence |
| -0.5 < r < -0.3 | Moderate evidence |
| r < -0.5 | Strong evidence of MNAR |
Compute fitted density values
Description
Compute fitted density values
Usage
compute_fitted_density(raw_vals, dist_name, x_range)
Arguments
raw_vals |
Raw observed values |
dist_name |
Distribution name |
x_range |
X values to compute density at |
Value
Density values or NULL on failure
Compute missingness statistics for a vector of values
Description
Calculates the number and proportion of missing (NA) values.
Usage
compute_missingness(values)
Arguments
values |
Numeric vector (may contain NAs) |
Value
List with:
n_total: Total number of values
n_missing: Number of NA values
na_rate: Proportion missing (0-1)
See Also
compute_dataset_mnar() for dataset-level MNAR detection
Compute per-peptide power (helper for plot_power_vs_effect)
Description
Compute per-peptide power (helper for plot_power_vs_effect)
Usage
compute_perpeptide_power(
fits_data,
n_per_group,
effect_size,
alpha,
test,
n_sim
)
Compute QQ points for a distribution fit
Description
Compute QQ points for a distribution fit
Usage
compute_qq_points(raw_vals, dist_name)
Arguments
raw_vals |
Raw observed values |
dist_name |
Distribution name |
Value
List with theoretical and sample quantiles, or NULL
Compute t-statistic for two groups
Description
Compute t-statistic for two groups
Usage
compute_t_stat(x, y)
Arguments
x |
First group |
y |
Second group |
Value
t-statistic
Find minimum detectable effect size
Description
Find minimum detectable effect size
Usage
find_effect_size(
distribution,
params,
n_per_group,
target_power,
alpha,
test,
n_sim
)
Arguments
distribution |
Distribution name |
params |
Distribution parameters |
n_per_group |
Sample size per group |
target_power |
Target power level |
alpha |
Significance level |
test |
Statistical test |
n_sim |
Number of simulations per effect size |
Value
List with effect_size and effect_curve
Find required sample size for target power
Description
Find required sample size for target power
Usage
find_sample_size(
distribution,
params,
effect_size,
target_power,
alpha,
test,
n_sim
)
Arguments
distribution |
Distribution name |
params |
Distribution parameters |
effect_size |
Effect size to detect |
target_power |
Target power level |
alpha |
Significance level |
test |
Statistical test |
n_sim |
Number of simulations per sample size |
Value
List with n and power_curve
Fit distributions to peptide abundance data
Description
Fits candidate distributions to each peptide's abundance values and selects the best fit by AIC. Also computes missingness statistics including dataset-level MNAR detection.
Usage
fit_distributions(data, id, group, value, distributions = "continuous")
Arguments
data |
A data frame containing peptide abundance data |
id |
Column name for peptide identifier |
group |
Column name for group/condition |
value |
Column name for abundance values |
distributions |
Which distributions to fit: "continuous" (default), "counts", "all", or a character vector of distribution names |
Value
A peppwr_fits object containing:
-
$data: Nested tibble with original data and fit results -
$best: Best-fitting distribution for each peptide -
$missingness: Per-peptide missingness statistics -
$dataset_mnar: Dataset-level MNAR correlation and interpretation
Missingness Tracking
The returned object includes:
Per-peptide NA rates (in
$missingness)Dataset-level MNAR correlation (in
$dataset_mnar)
The dataset-level MNAR metric correlates log(mean_abundance) with NA rate across peptides. A negative correlation indicates low-abundance peptides have more missing values - typical of detection-limit-driven MNAR.
Print the result to see both metrics. For small sample sizes (N < 15), the dataset-level correlation is more reliable than per-peptide scores.
See Also
compute_dataset_mnar() for dataset-level MNAR details,
plot_missingness() to visualize missingness patterns
Format answer based on question type
Description
Format answer based on question type
Usage
format_answer(answer, question)
Arguments
answer |
The answer value |
question |
The question type |
Value
Formatted string
Get density function for a distribution
Description
Get density function for a distribution
Usage
get_dfunc(distribution)
Arguments
distribution |
Distribution name |
Value
Density function
Get distribution parameters from fitted results
Description
Get distribution parameters from fitted results
Usage
get_dist_params(dist_name, fit_df, raw_data)
Arguments
dist_name |
Distribution name |
fit_df |
Fit results data frame |
raw_data |
Raw data values |
Value
List of parameters
Get distributions for a preset
Description
Get distributions for a preset
Usage
get_distribution_preset(preset = "continuous")
Arguments
preset |
One of "continuous" (default), "counts", or "all" |
Value
Character vector of distribution names
Get quantile function for a distribution
Description
Get quantile function for a distribution
Usage
get_qfunc(distribution)
Arguments
distribution |
Distribution name |
Value
Quantile function
Get the random generation function for a distribution
Description
Get the random generation function for a distribution
Usage
get_rfunc(distribution)
Arguments
distribution |
Distribution name |
Value
Random generation function
Get the test function for a given test name
Description
Get the test function for a given test name
Usage
get_test_func(test)
Arguments
test |
Test name |
Value
Test function
Interpret dataset-level MNAR result
Description
Interpret dataset-level MNAR result
Usage
interpret_dataset_mnar(correlation, p_value, n_peptides)
Arguments
correlation |
Spearman correlation coefficient |
p_value |
p-value for the correlation |
n_peptides |
Number of peptides in calculation |
Value
Interpretation string
Check if data appears to be count data (non-negative integers)
Description
Check if data appears to be count data (non-negative integers)
Usage
is_count_data(x)
Arguments
x |
Numeric vector |
Value
TRUE if data looks like counts, FALSE otherwise
Map distribution name to R function prefix
Description
Map distribution name to R function prefix
Usage
map_dist_name(dist_name)
Arguments
dist_name |
Distribution name |
Value
R function name for random generation
Create a new peppwr_fits object
Description
Create a new peppwr_fits object
Usage
new_peppwr_fits(
data,
fits,
best,
call,
missingness = NULL,
dataset_mnar = NULL
)
Arguments
data |
Original data (nested tibble) |
fits |
Fit results per peptide (list of tibbles with dist, loglik, aic) |
best |
Best-fitting distribution per peptide (character vector) |
call |
Original function call |
missingness |
Tibble with missingness statistics per peptide (optional) |
dataset_mnar |
Dataset-level MNAR metric (optional, from compute_dataset_mnar) |
Value
A peppwr_fits object
Create a new peppwr_power object
Description
Create a new peppwr_power object
Usage
new_peppwr_power(mode, question, answer, simulations, params, call)
Arguments
mode |
Either "aggregate" or "per_peptide" |
question |
What was solved for: "power", "sample_size", or "effect_size" |
answer |
The computed answer |
simulations |
List of simulation details |
params |
List of input parameters used |
call |
Original function call |
Value
A peppwr_power object
Plot method for peppwr_fits
Description
Creates a bar chart showing the count of best-fit distributions
Usage
## S3 method for class 'peppwr_fits'
plot(x, ...)
Arguments
x |
A peppwr_fits object |
... |
Additional arguments (ignored) |
Value
A ggplot object
Plot method for peppwr_power
Description
Creates power curves or % peptides at threshold plots
Usage
## S3 method for class 'peppwr_power'
plot(x, ...)
Arguments
x |
A peppwr_power object |
... |
Additional arguments (ignored) |
Value
A ggplot object
Plot density overlay: observed histogram with fitted density curve
Description
Plot density overlay: observed histogram with fitted density curve
Usage
plot_density_overlay(fits, peptide_id = NULL, n_overlay = 6)
Arguments
fits |
A peppwr_fits object |
peptide_id |
Specific peptide to plot (NULL for multiple) |
n_overlay |
Number of peptides to overlay when peptide_id is NULL |
Value
A ggplot object
Plot missingness statistics
Description
Creates a visualization of missing data patterns with two panels:
Usage
plot_missingness(fits)
Arguments
fits |
A peppwr_fits object |
Details
-
Panel 1: Distribution of NA rates across peptides
-
Panel 2: Mean abundance vs NA rate scatter plot showing the dataset-level MNAR correlation. The subtitle displays the Spearman correlation coefficient and p-value. A negative correlation indicates that low-abundance peptides have more missing values - the hallmark of detection-limit-driven MNAR.
Value
A ggplot object or gtable (combined panels)
MNAR Detection
MNAR (Missing Not At Random) in mass spectrometry typically manifests as low-abundance peptides having higher rates of missing values due to detection limits. Panel 2 visualizes this relationship and reports the correlation coefficient.
See Also
compute_dataset_mnar() for the underlying correlation calculation
Plot distribution of fitted parameters across peptidome
Description
Plot distribution of fitted parameters across peptidome
Usage
plot_param_distribution(fits)
Arguments
fits |
A peppwr_fits object |
Value
A ggplot object
Plot power curve for aggregate mode
Description
Plot power curve for aggregate mode
Usage
plot_power_aggregate(x)
Arguments
x |
A peppwr_power object |
Value
A ggplot object
Plot power heatmap: N x effect size grid
Description
Plot power heatmap: N x effect size grid
Usage
plot_power_heatmap(
distribution,
params,
n_range,
effect_range,
n_steps = 6,
n_sim = 100,
test = "wilcoxon",
alpha = 0.05
)
Arguments
distribution |
Distribution name |
params |
Distribution parameters |
n_range |
Range of sample sizes (vector of length 2) |
effect_range |
Range of effect sizes (vector of length 2) |
n_steps |
Number of grid points per dimension |
n_sim |
Number of simulations per grid cell |
test |
Statistical test to use |
alpha |
Significance level |
Value
A ggplot object
Plot % peptides at threshold for per-peptide mode
Description
Plot % peptides at threshold for per-peptide mode
Usage
plot_power_perpeptide(x)
Arguments
x |
A peppwr_power object |
Value
A ggplot object
Plot power vs effect size at fixed N
Description
Plot power vs effect size at fixed N
Usage
plot_power_vs_effect(power_result, effect_range, n_steps = 10, n_sim = NULL)
Arguments
power_result |
A peppwr_power object |
effect_range |
Range of effect sizes to explore |
n_steps |
Number of effect size values to compute |
n_sim |
Number of simulations per point |
Value
A ggplot object
Plot QQ plots for goodness-of-fit
Description
Plot QQ plots for goodness-of-fit
Usage
plot_qq(fits, peptide_id = NULL, n_plots = 6)
Arguments
fits |
A peppwr_fits object |
peptide_id |
Specific peptide to plot (NULL for multiple) |
n_plots |
Number of peptides to plot when peptide_id is NULL |
Value
A ggplot object
Power analysis for peptide abundance data
Description
Power analysis for peptide abundance data
Usage
power_analysis(distribution, ...)
Arguments
distribution |
Distribution name (character) or peppwr_fits object for per-peptide mode |
... |
Additional arguments |
Value
A peppwr_power object
Power analysis with specified distribution (aggregate mode)
Description
Power analysis with specified distribution (aggregate mode)
Default power analysis method (aggregate mode with defaults)
Usage
## S3 method for class 'character'
power_analysis(
distribution,
params,
effect_size = NULL,
n_per_group = NULL,
target_power = NULL,
alpha = 0.05,
test = "wilcoxon",
find = "power",
n_sim = 1000,
...
)
## Default S3 method:
power_analysis(distribution, ...)
Arguments
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
effect_size |
Fold change to detect |
n_per_group |
Sample size per group (required for find="power") |
target_power |
Target power (required for find="sample_size" or find="effect_size") |
alpha |
Significance level (default 0.05) |
test |
Statistical test to use (default "wilcoxon") |
find |
What to solve for: "power", "sample_size", or "effect_size" |
n_sim |
Number of simulations (default 1000) |
... |
Additional arguments (ignored) |
Value
A peppwr_power object
Power analysis for per-peptide mode using fitted distributions
Description
Power analysis for per-peptide mode using fitted distributions
Usage
## S3 method for class 'peppwr_fits'
power_analysis(
distribution,
effect_size = NULL,
n_per_group = NULL,
target_power = NULL,
alpha = 0.05,
test = "wilcoxon",
find = "power",
n_sim = 1000,
on_fit_failure = "exclude",
proportion_threshold = 0.5,
include_missingness = FALSE,
apply_fdr = FALSE,
prop_null = 0.9,
fdr_threshold = 0.05,
...
)
Arguments
distribution |
A peppwr_fits object from fit_distributions() |
effect_size |
Fold change to detect |
n_per_group |
Sample size per group (required for find="power") |
target_power |
Target power (required for find="sample_size") |
alpha |
Significance level (default 0.05) |
test |
Statistical test to use (default "wilcoxon") |
find |
What to solve for: "power" or "sample_size" |
n_sim |
Number of simulations per peptide (default 1000) |
on_fit_failure |
How to handle failed fits: "exclude", "empirical", or "lognormal" |
proportion_threshold |
Proportion of peptides that must reach target_power (default 0.5) |
include_missingness |
If TRUE, incorporate peptide-specific NA rates into simulations |
apply_fdr |
If TRUE, use FDR-aware simulation with Benjamini-Hochberg correction.
Note: not compatible with |
prop_null |
Proportion of true null peptides (default 0.9 = 90% unchanged) |
fdr_threshold |
FDR threshold for calling discoveries (default 0.05) |
... |
Additional arguments (ignored) |
Value
A peppwr_power object
Print method for peppwr_fits
Description
Print method for peppwr_fits
Usage
## S3 method for class 'peppwr_fits'
print(x, ...)
Arguments
x |
A peppwr_fits object |
... |
Additional arguments (ignored) |
Value
The object invisibly
Print method for peppwr_power
Description
Print method for peppwr_power
Usage
## S3 method for class 'peppwr_power'
print(x, ...)
Arguments
x |
A peppwr_power object |
... |
Additional arguments (ignored) |
Value
The object invisibly
Print method for summary.peppwr_fits
Description
Print method for summary.peppwr_fits
Usage
## S3 method for class 'summary.peppwr_fits'
print(x, ...)
Arguments
x |
A summary.peppwr_fits object |
... |
Additional arguments (ignored) |
Value
The object invisibly
Print method for summary.peppwr_power
Description
Print method for summary.peppwr_power
Usage
## S3 method for class 'summary.peppwr_power'
print(x, ...)
Arguments
x |
A summary.peppwr_power object |
... |
Additional arguments (ignored) |
Value
The object invisibly
Run power simulation
Description
Run power simulation
Usage
run_power_sim(
distribution,
params,
n_per_group,
effect_size,
alpha = 0.05,
test = "wilcoxon",
n_sim = 1000
)
Arguments
distribution |
Distribution name |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
alpha |
Significance level |
test |
Statistical test to use ("wilcoxon", "bootstrap_t", "bayes_t", "rankprod") |
n_sim |
Number of simulations |
Value
Power estimate (proportion of significant results)
Run power simulation using empirical bootstrap
Description
Estimates power by repeatedly resampling from observed data rather than sampling from fitted distributions.
Usage
run_power_sim_empirical(
raw_data,
n_per_group,
effect_size,
alpha = 0.05,
test = "wilcoxon",
n_sim = 1000
)
Arguments
raw_data |
Numeric vector of observed values |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
alpha |
Significance level |
test |
Statistical test to use |
n_sim |
Number of simulations |
Value
Power estimate (proportion of significant results)
Run FDR-aware power simulation for whole peptidome
Description
Simulates an entire peptidome experiment with a mix of true nulls and true alternatives, then applies Benjamini-Hochberg correction to compute FDR-adjusted power.
Usage
run_power_sim_fdr(
fits,
effect_size,
n_per_group,
prop_null = 0.9,
fdr_threshold = 0.05,
alpha = 0.05,
test = "wilcoxon",
n_sim = 1000
)
Arguments
fits |
A peppwr_fits object |
effect_size |
Fold change for treatment |
n_per_group |
Number of samples per group |
prop_null |
Proportion of peptides that are true nulls (no effect). Default 0.9 (90% unchanged). |
fdr_threshold |
FDR threshold for calling discoveries. Default 0.05. |
alpha |
Nominal significance level (used for simulation). Default 0.05. |
test |
Statistical test to use. Default "wilcoxon". |
n_sim |
Number of simulation iterations. Default 1000. |
Value
FDR-adjusted power estimate (proportion of true alternatives detected)
Run power simulation with missingness
Description
Estimates power accounting for realistic missing data patterns.
Usage
run_power_sim_with_missingness(
distribution,
params,
n_per_group,
effect_size,
na_rate = 0,
mnar_score = 0,
alpha = 0.05,
test = "wilcoxon",
n_sim = 1000
)
Arguments
distribution |
Distribution name |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment |
na_rate |
Proportion of values that will be NA |
mnar_score |
MNAR intensity (0 = MCAR) |
alpha |
Significance level |
test |
Statistical test to use |
n_sim |
Number of simulations |
Value
Power estimate (proportion of significant results)
Simulate experiment using empirical bootstrap
Description
Resamples from observed data instead of using parametric distributions. Useful when distribution fitting fails or for non-parametric analysis.
Usage
simulate_empirical(raw_data, n_per_group, effect_size)
Arguments
raw_data |
Numeric vector of observed values |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
Value
List with control and treatment vectors
Simulate an experiment with control and treatment groups
Description
Simulate an experiment with control and treatment groups
Usage
simulate_experiment(distribution, params, n_per_group, effect_size)
Arguments
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
Value
List with control and treatment vectors
Simulate experiment with realistic missingness
Description
Generates control and treatment samples from a distribution, then introduces missing values according to the specified rate and MNAR pattern.
Usage
simulate_with_missingness(
distribution,
params,
n_per_group,
effect_size,
na_rate = 0,
mnar_score = 0
)
Arguments
distribution |
Distribution name (e.g., "norm", "gamma", "lnorm") |
params |
List of distribution parameters |
n_per_group |
Number of samples per group |
effect_size |
Fold change for treatment (multiplicative effect) |
na_rate |
Proportion of values to make NA (0-1) |
mnar_score |
MNAR intensity: 0 = MCAR, positive = low values more likely to be missing. Typical values: 0-3. |
Value
List with control and treatment vectors (may contain NAs)
Summary method for peppwr_fits
Description
Summary method for peppwr_fits
Usage
## S3 method for class 'peppwr_fits'
summary(object, ...)
Arguments
object |
A peppwr_fits object |
... |
Additional arguments (ignored) |
Value
A list with summary statistics
Summary method for peppwr_power
Description
Summary method for peppwr_power
Usage
## S3 method for class 'peppwr_power'
summary(object, ...)
Arguments
object |
A peppwr_power object |
... |
Additional arguments (ignored) |
Value
A list with summary statistics
Bayes factor t-test
Description
Computes Bayes factor for difference between two groups
Usage
test_bayes_t(control, treatment)
Arguments
control |
Control group values |
treatment |
Treatment group values |
Value
Bayes factor (BF10: evidence for alternative over null)
Bootstrap-t test
Description
Performs a bootstrap-t test comparing two groups
Usage
test_bootstrap_t(control, treatment, n_boot = 1000)
Arguments
control |
Control group values |
treatment |
Treatment group values |
n_boot |
Number of bootstrap iterations (default 1000) |
Value
p-value from the test
Rank Products test
Description
Performs rank products test comparing two groups Simplified implementation for two-group comparison
Usage
test_rankprod(control, treatment, n_perm = 1000)
Arguments
control |
Control group values |
treatment |
Treatment group values |
n_perm |
Number of permutations for p-value estimation (default 1000) |
Value
p-value from the test
Wilcoxon rank-sum test
Description
Wilcoxon rank-sum test
Usage
test_wilcoxon(control, treatment)
Arguments
control |
Control group values |
treatment |
Treatment group values |
Value
p-value from the test
Validate a peppwr_fits object
Description
Validate a peppwr_fits object
Usage
validate_peppwr_fits(x)
Arguments
x |
Object to validate |
Value
The validated object (invisibly)
Validate a peppwr_power object
Description
Validate a peppwr_power object
Usage
validate_peppwr_power(x)
Arguments
x |
Object to validate |
Value
The validated object (invisibly)