
Coverage tells you which lines ran. It says nothing about whether
your tests would catch a bug. You can delete every assertion, run
covr, and still see 100%.
{muttest} measures the quality of your tests — not just how much code they execute.
covr tells you which
lines were executed. It cannot tell you whether your assertions are
strong enough to catch a real bug. A test suite full of
expect_true(is.numeric(x)) checks will reach 100% coverage
while missing every meaningful failure.
Mutation testing addresses this gap by asking a harder question: if this code were subtly wrong, would your tests notice?
Many teams now use LLMs to write their tests. LLMs are good at producing syntactically correct, passing tests quickly — but they might cover only the obvious cases and miss the boundaries:
# What an LLM may write for is_adult():
test_that("is_adult works", {
  expect_true(is.logical(is_adult(25))) # checks return type, not logic
  expect_true(is_adult(25))             # clearly an adult
  expect_false(is_adult(10))            # clearly a minor
})
# What actually catches the >= vs > boundary bug:
test_that("is_adult handles the boundary age", {
  expect_true(is_adult(18)) # kills the >= → > mutant
})

Both test suites pass. Both have 100% coverage. Only one would catch
a developer accidentally writing age > 18 instead of
age >= 18.
Mutation testing gives you a score that reflects assertion quality, not just execution. It gives you a concrete way to understand the real strength — and the real gaps — in an LLM-generated test suite.
The score reveals whether your tests are asserting the right things.
{muttest} reports the score and also tells you which files need stronger assertions.
Suppose our codebase is:
#' R/is_adult.R
is_adult <- function(age) {
  age >= 18
}

And our tests are:
#' tests/testthat/test-is_adult.R
test_that("is_adult returns TRUE for adults", {
  expect_true(is_adult(25))
})
test_that("is_adult returns FALSE for minors", {
  expect_false(is_adult(10))
})

Running muttest::muttest() produces a report of the mutation score:
withr::with_dir(system.file("examples", "boundary", package = "muttest"), {
  plan <- muttest::muttest_plan(
    mutators = muttest::comparison_operators()
  )
  muttest::muttest(plan)
})
#> ℹ Mutation Testing
#> | K | S | E | T | % | Mutator | File
#> ✔ | 1 | 0 | 0 | 1 | 100 | >= → <= | is_adult.R
#> x | 1 | 1 | 0 | 2 | 50 | >= → > | is_adult.R
#>
#> Duration: 1.99 s
#>
#> ── Survived Mutants ────────────────────────────────────────────────────────────
#> is_adult.R >= → >
#> 2- age >= 18
#> 2+ age > 18
#>
#> ── Results ─────────────────────────────────────────────────────────────────────
#> [ KILLED 1 | SURVIVED 1 | ERRORS 0 | TOTAL 2 | SCORE 50.0% ]

The mutation score is \(\text{Mutation Score} = \frac{\text{Killed Mutants}}{\text{Total Mutants}} \times 100\%\), where a mutant is a variant of the original code used to test the robustness of the test suite.
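As a quick arithmetic check of the formula, the sketch below recomputes the score from the counts in the report above. The helper name `mutation_score()` is hypothetical and is not part of {muttest}; it only restates the formula in base R.

```r
# Hypothetical helper (not a {muttest} function): the score formula in base R.
mutation_score <- function(killed, total) {
  stopifnot(total > 0, killed >= 0, killed <= total)
  killed / total * 100
}

# Counts taken from the report above: KILLED 1, TOTAL 2.
mutation_score(killed = 1, total = 2)
#> [1] 50
```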
comparison_operators() generates mutants by swapping
each comparison operator for related alternatives. For
>= it produces two mutants:
#' R/is_adult.R, mutant 1: ">=" -> ">"
is_adult <- function(age) {
  age > 18
}
#' R/is_adult.R, mutant 2: ">=" -> "<="
is_adult <- function(age) {
  age <= 18
}

Tests are run against both mutants.
Mutant 2 (>= → <=) is killed: is_adult(25) now returns FALSE, which fails the first test.
Mutant 1 (>= → >) survives: is_adult(25) still returns TRUE and is_adult(10) still returns FALSE. The boundary value 18 is never tested, so the test suite cannot tell >= from >.
#' tests/testthat/test-is_adult.R
test_that("is_adult returns TRUE for adults", {
  # ✔ Kills mutant 2 (<=): is_adult(25) returns FALSE
  # 🟢 Passes under mutant 1 (>), so does not kill it: is_adult(25) still returns TRUE
  expect_true(is_adult(25))
})
test_that("is_adult returns FALSE for minors", {
  # 🟢 Passes under mutant 1 (>), so does not kill it: is_adult(10) still returns FALSE
  # ✔ Also fails under mutant 2 (<=): is_adult(10) returns TRUE
  expect_false(is_adult(10))
})

We have killed 1 mutant out of 2, so the mutation score is 50%. The survivor tells us exactly what to fix. Add a test at the boundary:
test_that("is_adult returns TRUE at the boundary age", {
  expect_true(is_adult(18)) # kills mutant 1: age > 18 returns FALSE for age = 18
})

With this test added, the score reaches 100%.
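The kill/survive reasoning in this example can be reproduced by hand in base R. The sketch below is an illustration only and makes no claim about how {muttest} works internally: the two mutants are written out as plain functions, each "suite" is a function that returns TRUE when all of its checks pass, and every name in it (`mutants`, `suite_without_boundary`, `suite_with_boundary`, `killed_by`, `score`) is hypothetical.

```r
# Hand-rolled illustration; not {muttest}'s implementation.
original <- function(age) age >= 18

# The two mutants that comparison_operators() produces for `>=` in this example.
mutants <- list(
  ">= -> >"  = function(age) age > 18,
  ">= -> <=" = function(age) age <= 18
)

# A suite takes an implementation and returns TRUE if every check passes.
suite_without_boundary <- function(is_adult) {
  isTRUE(is_adult(25)) && isFALSE(is_adult(10))
}
suite_with_boundary <- function(is_adult) {
  suite_without_boundary(is_adult) && isTRUE(is_adult(18))
}

# A mutant is killed when the suite fails against it.
killed_by <- function(suite) !vapply(mutants, suite, logical(1))

# Mutation score: share of killed mutants, as a percentage.
score <- function(suite) mean(killed_by(suite)) * 100

# Both suites pass against the original code.
stopifnot(suite_without_boundary(original), suite_with_boundary(original))

score(suite_without_boundary)
#> [1] 50
score(suite_with_boundary)
#> [1] 100
```

Calling `killed_by(suite_without_boundary)` shows which mutant survives: only the `>= -> >` entry is FALSE, which matches the survivor in the report.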
A mutator describes one kind of code change. Pass a list of mutators
to muttest_plan() to control what gets mutated.
Individual mutators
| Function | Description | Example |
|---|---|---|
| `operator()` | Mutate a binary operator | `operator("+", "-")`: `a + b` → `a - b` |
| `boolean_literal()` | Mutate a boolean literal | `boolean_literal("TRUE", "FALSE")`: `TRUE` → `FALSE` |
| `na_literal()` | Mutate an NA or NULL literal | `na_literal("NA", "NULL")`: `NA` → `NULL` |
| `call_name()` | Mutate a function call name | `call_name("any", "all")`: `any(x)` → `all(x)` |
| `string_empty()` | Mutate non-empty string literals to the empty string | `string_empty()`: `"hello"` → `""` |
| `string_fill()` | Mutate the empty string literal to a placeholder string | `string_fill()`: `""` → `"mutant"` |
| `numeric_increment()` | Increment numeric literals | `numeric_increment()`: `5` → `6` |
| `numeric_decrement()` | Decrement numeric literals | `numeric_decrement()`: `5` → `4` |
| `index_increment()` | Increment subscript indices | `index_increment()`: `x[i]` → `x[i + 1L]` |
| `index_decrement()` | Decrement subscript indices | `index_decrement()`: `x[i]` → `x[i - 1L]` |
| `negate_condition()` | Negate the condition of if/while statements | `negate_condition()`: `if (x > 0)` → `if (!(x > 0))` |
| `remove_condition_negation()` | Remove negation from the condition of if/while statements | `remove_condition_negation()`: `if (!done)` → `if (done)` |
| `remove_negation()` | Remove logical negation | `remove_negation()`: `!is.na(x)` → `is.na(x)` |
| `replace_return_value()` | Replace the value in explicit return() calls | `replace_return_value()`: `return(x)` → `return(NULL)` |
Preset collections — return a ready-made list of mutators
| Function | Description | Example |
|---|---|---|
| `arithmetic_operators()` | Arithmetic operator mutators | `+`↔`-`, `*`↔`/`, `^`→`*`, `%%`→`*`, `%/%`→`/` |
| `comparison_operators()` | Comparison operator mutators | `<`↔`>`, `==`↔`!=`, `<`→`<=`, `>`→`>=` … |
| `logical_operators()` | Logical operator mutators | `&&`↔`\|\|`, `&`↔`\|` |
| `boolean_literals()` | Boolean literal mutators | `TRUE`↔`FALSE`, `T`↔`F` |
| `na_literals()` | NA and NULL literal mutators | `NA`↔`NULL`, `NA`↔`NA_real_`, `NA`↔`NA_integer_`, `NA`↔`NA_character_` |
| `numeric_literals()` | Numeric literal mutators | `5`→`6`, `5`→`4` |
| `index_mutations()` | Index mutation mutators | `x[i]`→`x[i + 1L]`, `x[i]`→`x[i - 1L]` |
| `string_literals()` | String literal mutators | `"hello"`→`""`, `""`→`"mutant"` |
| `condition_mutations()` | Condition mutation mutators | `if (x)`→`if (!(x))`, `if (!x)`→`if (x)` |
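
Individual mutators and preset collections can be mixed in one plan. The sketch below assumes, as the tables above describe, that each preset returns a plain list of mutators, so `c()` can concatenate presets with a hand-picked list. It also assumes it is run from the project root, so that `muttest_plan()` finds the source and test files the same way it does in the is_adult example. Only functions named in this document are used; the particular combination is illustrative.

```r
# Sketch only: a plan built from two presets plus two individual mutators.
library(muttest)

mutators <- c(
  comparison_operators(),     # preset: swaps such as >= -> > and >= -> <=
  boolean_literals(),         # preset: TRUE <-> FALSE, T <-> F
  list(
    operator("+", "-"),       # individual: a + b -> a - b
    call_name("any", "all")   # individual: any(x) -> all(x)
  )
)

# Assumption: the working directory is the project root.
plan <- muttest_plan(mutators = mutators)
muttest(plan)
```

Starting with one preset and adding targeted mutators as survivors point to weak spots keeps the number of mutants, and therefore the run time, manageable.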
- vignette("getting-started", package = "muttest"): a full worked example from zero to a mutation score, including how to interpret and improve results.
- vignette("mutation-testing-101", package = "muttest"): conceptual background, the LLM-tests problem in depth, and when mutation testing pays off.
- vignette("mutators", package = "muttest"): all available mutators, when to use each, and how to build custom pairs.
- vignette("interpreting-results", package = "muttest"): how to read surviving mutants and turn them into stronger tests.
- vignette("ci-integration", package = "muttest"): run mutation tests on every push, add a score badge, and enforce thresholds.