The goal of rdecision
is to provide methods for
assessing health care interventions using cohort models (decision trees
and semi-Markov models) which can be constructed using only a few lines
of R code. Mechanisms are provided for associating an uncertainty
distribution with each source variable and for ensuring transparency of
the mathematical relationships between variables. The package
terminology follows Briggs et al “Decision Modelling for Health
Economic Evaluation”1.
You can install the released version of rdecision from CRAN with:
install.packages("rdecision")
Consider the fictitious and idealized decision problem of choosing between providing two forms of lifestyle advice, offered to people with vascular disease, which reduce the risk of needing an interventional procedure. It is assumed that the interventional procedure is the insertion of a stent, that the current standard of care is the provision of dietary advice, and that the new form of care is enrolment on an exercise programme. To assess the decision problem, we construct a model from the perspective of a healthcare provider, with a time horizon of one year, and assume that the utility of both forms of advice is equal. The model evaluates the incremental benefit of the exercise programme as the incremental number of interventional procedures avoided against the incremental cost of the exercise programme.
The cost to a healthcare provider of the interventional procedure (e.g., inserting a stent) is 5000 GBP; the cost of providing the current form of lifestyle advice, an appointment with a dietician (“diet”), is 50 GBP and the cost of providing an alternative form, attendance at an exercise programme (“exercise”), is 750 GBP. None of the costs are subject to uncertainty, and are modelled as constant model variables.
<- ConstModVar$new("Cost of diet programme", "GBP", 50.0)
cost_diet <- ConstModVar$new("Cost of exercise programme", "GBP", 750.0)
cost_exercise <- ConstModVar$new("Cost of stent intervention", "GBP", 5000.0) cost_stent
If an advice programme is successful, there is no need for an interventional procedure. In a small trial of the “diet” programme, 12 out of 68 patients (17.6%) avoided having a procedure, and in a separate small trial of the “exercise” programme 18 out of 58 patients (31.0%) avoided the procedure. It is assumed that the baseline characteristics in the two trials were comparable. The trial results are represented as scalar integers.
<- 12L
s_diet <- 56L
f_diet <- 18L
s_exercise <- 40L f_exercise
The proportions of the two programmes being successful (i.e., avoiding an interventional procedure) are uncertain due to the finite size of each trial and are represented by model variables with uncertainties which follow Beta distributions. Probabilities of the failure of the programmes are calculated using expression model variables to ensure that the total probability associated with each chance node is one.
<- BetaModVar$new(
p_diet alpha = s_diet, beta = f_diet, description = "P(diet)", units = ""
)<- BetaModVar$new(
p_exercise alpha = s_exercise, beta = f_exercise, description = "P(exercise)", units = ""
)
<- ExprModVar$new(
q_diet ::quo(1.0 - p_diet), description = "1 - P(diet)", units = ""
rlang
)<- ExprModVar$new(
q_exercise ::quo(1.0 - p_exercise), description = "1 - P(exercise)", units = ""
rlang )
The decision tree has one decision node, representing the single choice of the decision problem (i.e., between the two advice programmes), two chance nodes, representing whether each programme is a success or failure, and four leaf nodes (intervention or no intervention for each of the two programmes).
<- DecisionNode$new("Programme") decision_node
<- ChanceNode$new("Outcome")
chance_node_diet <- ChanceNode$new("Outcome") chance_node_exercise
<- LeafNode$new("No intervention")
leaf_node_diet_no_stent <- LeafNode$new("Intervention")
leaf_node_diet_stent <- LeafNode$new("No intervention")
leaf_node_exercise_no_stent <- LeafNode$new("Intervention") leaf_node_exercise_stent
There are two action edges emanating from the decision node, which represent the two choices, and four reaction edges, representing the consequences of the success and failure of each programme.
<- Action$new(
action_diet cost = cost_diet, label = "Diet"
decision_node, chance_node_diet,
)<- Action$new(
action_exercise cost = cost_exercise, label = "Exercise"
decision_node, chance_node_exercise, )
<- Reaction$new(
reaction_diet_success
chance_node_diet, leaf_node_diet_no_stent,p = p_diet, cost = 0.0, label = "Success"
)
<- Reaction$new(
reaction_diet_failure
chance_node_diet, leaf_node_diet_stent,p = q_diet, cost = cost_stent, label = "Failure"
)
<- Reaction$new(
reaction_exercise_success
chance_node_exercise, leaf_node_exercise_no_stent,p = p_exercise, cost = 0.0, label = "Success"
)
<- Reaction$new(
reaction_exercise_failure
chance_node_exercise, leaf_node_exercise_stent,p = q_exercise, cost = cost_stent, label = "Failure"
)
The decision tree model is constructed from the nodes and edges.
<- DecisionTree$new(
dt V = list(
decision_node,
chance_node_diet,
chance_node_exercise,
leaf_node_diet_no_stent,
leaf_node_diet_stent,
leaf_node_exercise_no_stent,
leaf_node_exercise_stent
),E = list(
action_diet,
action_exercise,
reaction_diet_success,
reaction_diet_failure,
reaction_exercise_success,
reaction_exercise_failure
) )
The method evaluate
is used to calculate the costs and
utilities associated with the decision problem. By default, it evaluates
it once with all the model variables set to their expected values and
returns a data frame.
<- dt$evaluate() rs
Examination of the results of evaluation shows that the expected per-patient net cost of the diet advice programme is 4,168 GBP and the per-patient net cost of the exercise programme is 4,198 GBP; i.e., the net cost of the exercise programme exceeds the diet programme by 31 GBP per patient. The savings associated with the greater efficacy of the exercise programme do not offset the increased cost of delivering it.
To estimate the uncertainty of the relative effectiveness, the probabilities of the success proportions of the two treatments can be represented as model variables who uncertainty follows a Beta distribution, and the decision tree re-evaluated.
$set_cost(cost_diet)
action_diet$set_cost(cost_exercise) action_exercise
$set_probability(p_diet)
reaction_diet_success
$set_probability(q_diet)
reaction_diet_failure$set_cost(cost_stent)
reaction_diet_failure
$set_probability(p_exercise)
reaction_exercise_success
$set_probability(q_exercise)
reaction_exercise_failure$set_cost(cost_stent) reaction_exercise_failure
<- 1000L
N <- dt$evaluate(setvars = "random", by = "run", N = N) rs
The confidence interval of the cost saving is estimated by repeated
evaluation of the tree, each time sampling from the uncertainty
distribution of the two probabilities using, for example,
DT$evaluate(setvars = "random", N = 1000L)
and inspecting
the resulting data frame. From 1000 runs, the 95% confidence interval of
the per patient cost saving is -763.93 GBP to 662.84 GBP, with 46.3%
being cost saving. Although the exercise programme is more costly to
provide than the dietary advice programme, it is more effective and
leads to saving overall because fewer costly interventional procedures
are needed. However, due to the uncertainties of the effectiveness of
each programme, it can be concluded that more evidence is required to be
confident that the exercise programme is cost saving overall.
The method threshold
is used to find the threshold of
one of the model variables at which the cost difference reaches zero. By
univariate threshold analysis, the exercise program will be cost saving
when its cost of delivery is less than 719 GBP or when its success rate
is greater than 31.74%. These thresholds are also subject to
uncertainty.
Sonnenberg and Beck2 introduced an illustrative example of
a semi-Markov process with three states: “Well”, “Disabled” and “Dead”
and one transition between each state, each with a per-cycle
probability. In rdecision
such a model is constructed as
follows. Note that transitions from a state to itself must be specified
if allowed, otherwise the state would be a temporary state.
# create states
<- MarkovState$new(name = "Well", utility = 1.0)
s.well <- MarkovState$new(name = "Disabled", utility = 0.7)
s.disabled <- MarkovState$new(name = "Dead", utility = 0.0) s.dead
# create transitions leaving rates undefined
<- list(
E $new(s.well, s.well),
Transition$new(s.dead, s.dead),
Transition$new(s.disabled, s.disabled),
Transition$new(s.well, s.disabled),
Transition$new(s.well, s.dead),
Transition$new(s.disabled, s.dead)
Transition )
# create the model
<- SemiMarkovModel$new(V = list(s.well, s.disabled, s.dead), E) M
# create transition probability matrix
<- c("Well", "Disabled", "Dead")
snames <- matrix(
Pt data = c(0.6, 0.2, 0.2, 0.0, 0.6, 0.4, 0.0, 0.0, 1.0),
nrow = 3L, byrow = TRUE,
dimnames = list(source = snames, target = snames)
)# set the transition rates from per-cycle probabilities
$set_probabilities(Pt) M
With a starting population of 10,000, the model can be run for 25 years as follows.
# set the starting populations
$reset(c(Well = 10000.0, Disabled = 0.0, Dead = 0.0)) M
# cycle
<- M$cycles(25L, hcc.pop = FALSE, hcc.cost = FALSE) MT
The output, after rounding, of the cycles
function is
the Markov trace, shown below, which replicates Table 22. In
more recent usage, cumulative utility is normally called incremental
utility, and expressed per patient (i.e., divided by 10,000).
Years | Well | Disabled | Dead | Cumulative Utility |
---|---|---|---|---|
0 | 10000 | 0 | 0 | 0 |
1 | 6000 | 2000 | 2000 | 7400 |
2 | 3600 | 2400 | 4000 | 12680 |
3 | 2160 | 2160 | 5680 | 16352 |
23 | 0 | 1 | 9999 | 23749 |
24 | 0 | 0 | 10000 | 23749 |
25 | 0 | 0 | 10000 | 23750 |
In addition to using base R3, redecision
relies heavily on the R6
implementation of
classes4 and the rlang
package for error
handling and non-standard evaluation used in expression model
variables5. Building the package vignettes and documentation
relies on the testthat
package6, the
devtools
package7 and
rmarkdown
10.
Underpinning graph theory is based on terminology, definitions and algorithms from Gross et al11, the Wikipedia glossary12 and links therein. Topological sorting of graphs is based on Kahn’s algorithm13. Some of the terminology for decision trees was based on the work of Kaminski et al14 and an efficient tree drawing algorithm was based on the work of Walker15. In semi-Markov models, representations are exported in the DOT language16.
Terminology for decision trees and Markov models in health economic evaluation was based on the book by Briggs et al1 and the output format and terminology follows ISPOR recommendations18.
Citations for examples used in vignettes are given in applicable vignette files.