Overview

hesim is an R package that integrates statistical modeling, economic modeling, and decision analysis for the economic evaluation of health technologies. Economic evaluation proceeds in a three-step process as shown in the figure (adapted from BCEAweb) below .




First, a statistical model (or statistical models) are fit to estimate parameters. Second, the parameters are combined to construct an economic model, which, in turn, is used to simulate measures of clinical benefits (e.g., quality-adjusted life-years (QALYs)) and costs. Third, the economic value of a medical technology is evaluated by summarizing clinical benefits and costs. The entire analysis is inherently Bayesian, as uncertainty in the parameters from the statistical models is propagated throughout the economic model and decision analysis with probabilistic sensitivity analysis (PSA). Furthermore, since the statistical and economic models are integrated, patient heterogeneity can be easily introduced with patient level covariates.

hesim focuses on step 2 (economic modeling) but is naturally integrated with step 1 (statistical modeling) and provides support for step 3 (decision analysis). As shown in the figure below, a typical analysis would proceed as follows:

  1. The parameters of a statistical model are estimated using an “estimation” dataset, such as extracted data from published studies for a meta-analysis or multi-state survival data from a clinical trial. Parameters are estimated for disease progression, utility, and costs. Each statistical model is constructed by combining the parameter estimates with information on the treatment strategies, patients, and/or health states of interest for the economic model.
  2. The disease progression, utility, and cost models are combined into a coherent economic model to simulate outcomes of interest such as disease progression, QALYs, and costs.
  3. Relevant outcomes from step 2 are used to perform decision analysis using approaches such as cost-effectiveness analysis (CEA) and multi-criteria decision analysis (MCDA).




hesim supports both cohort-level and individual-level economic models. The cohort-level models simulate disease progression for representative cohorts of patients with similar characteristics while the individual-level models simulate individual patients probabilistically using random number generation. Economic models currently supported include:

  • N-state partitioned survival models (PSMs)
  • Individual-level continuous time state transition models (CTSTMs)

The package is under active development and new features are being added. We expect to release a stable version of the package in the beginning of January and to add cohort-level CTSTMs and cohort-level discrete time state transition models (DTSTMs).

Statistical models

The statistical models in hesim are characterized by a distribution over outcomes, \(y\), which depend on model parameters, \(\alpha_1, \ldots, \alpha_L\),

\[ \begin{aligned} P(y| \alpha_1(x_1), \ldots, \alpha_L(x_L)). \end{aligned} \] The \(l\)th model parameter can depend on a vector of coefficients, \(\beta_l\), and an input vector, \(x_l\), of covariates through linked transformed linear models \(g(\alpha_l(x_l)) = x_l^T \beta_l\). The outcomes, \(y\), can be predicted using the mean of the distribution, \(E(y|\cdot)\), or sampled with a random draw from the probability distribution, \(P(y|\cdot)\).

Parameter estimates

In hesim, the estimated coefficients for each parameter are generally stored in matrices (or a single matrix if the model only has one parameter), where the columns of each matrix are covariates and the rows are random samples from a relevant probability distribution (i.e. the posterior distribution in a Bayesian model). Parameters are stored in objects prefixed by params_ (e.g., params_surv for a survival model, params_lm for a linear model). A params_ prefixed object can be created in one of two ways:

  1. With a function of the same name such as params_surv() or params_lm()
  2. By using the generic function create_params() to create objects from fitted statistical models

This flexibility is provided so that parameters can either be estimated using models fit with R or from an external source. Here we illustrate the second method by fitting a 3 state (4 transition) Weibull multi-state model using flexsurvreg() from the flexsurv package. We randomly draw 3 samples of the coefficients for the shape and scale parameters from a multivariate normal distribution that approximates the Bayesian posterior distribution and store them in a params_surv object (note that hesim accepts Weibull distributions by the name “weibull” or “weibull.quiet”).

library("hesim")
library("flexsurv")
library("data.table")

n_samples <- 3 
mstate_data <- data.table(hesim::ctstm3_exdata$transitions)
mstate_data[, trans := factor(trans)]
fit_wei <- flexsurv::flexsurvreg(Surv(years, status) ~ factor(strategy_id)*trans +
                                                    shape(trans), 
                                                    data = mstate_data, 
                                                    dist = "weibull")
params_wei <- create_params(fit_wei, n = n_samples)
print(params_wei)
## $coefs
## $coefs$shape
##           shape shape(trans2) shape(trans3) shape(trans4)
## [1,] -0.2894870     0.1735798     0.3288783   -0.03941342
## [2,] -0.3546113     0.2071636     0.3953519   -0.09107497
## [3,] -0.2737590     0.1403508     0.3277576    0.10013674
## 
## $coefs$scale
##         scale factor(strategy_id)2   trans2     trans3       trans4
## [1,] 1.618603            0.3414493 1.042368 -1.1619454  0.115441315
## [2,] 1.448825            0.3150508 1.281214 -0.8977643 -0.005684006
## [3,] 1.476267            0.1591093 1.236283 -0.8444758 -0.361579679
##      factor(strategy_id)2:trans2 factor(strategy_id)2:trans3
## [1,]                  -0.1668731                  -0.4332674
## [2,]                  -0.1418561                  -0.5040726
## [3,]                  -0.1412258                  -0.4880143
##      factor(strategy_id)2:trans4
## [1,]                  -0.8940817
## [2,]                  -0.3092824
## [3,]                  -0.5334729
## 
## 
## $dist
## [1] "weibull.quiet"
## 
## $n_samples
## [1] 3
## 
## attr(,"class")
## [1] "params_surv"

Input data

The input vectors for each parameter, \(x_{l,hijk}\), are rows in multidimensional input matrices, \(X_l\), where each row denotes a unique observation. The input matrix for parameter \(l\) is indexed by health-related indices \(h\) and patients \(i\) for a treatment strategy \(k\). Example health-related indices include a health state, a transition between two health states, or a survival endpoint in a partitioned survival model (PSM). In some cases, the health-related index \(h\) can be suppressed and separate models can be fit for each health index. This is, for instance, the case in a PSM where separate models are fit for each survival endpoint.

The data and variables needed for the statistical models are stored in objects of class “expanded_hesim_data”, which are data tables or data frames in long format that contain (i) treatment strategy, patient, and/or health-related ID variables indexing each row, and (ii) the covariates for the statistical model. The names of the ID variables are attributes of the data table.

An “expanded_hesim_data” object can be created directly or by expanding an object of class “hesim_data”, which contains relevant data for the economic model. For instance, suppose we would like to create a dataset for modeling transitions between 3 health states for 2 treatment strategies and 2 representative patients. We first use hesim_data() to store the data for each treatment strategy, representative patient, and transition.

strategies <- data.table(strategy_id = c(1, 2))
patients <- data.table(patient_id = seq(1, 2),
                          age = c(45, 50),
                          female = c(0, 1))
tmat <- rbind(c(NA, 1, 2),
              c(3, NA, 4),
              c(NA, NA, NA))
colnames(tmat) <- rownames(tmat) <- c("Healthy", "Sick", "Dead")
transitions <- create_trans_dt(tmat)
transitions[, trans := factor(transition_id)]
hesim_dat <- hesim_data(strategies = strategies,
                        patients = patients,
                        transitions = transitions)
print(hesim_dat)
## $strategies
##    strategy_id
## 1:           1
## 2:           2
## 
## $patients
##    patient_id age female
## 1:          1  45      0
## 2:          2  50      1
## 
## $transitions
##    transition_id from to from_name to_name trans
## 1:             1    1  2   Healthy    Sick     1
## 2:             2    1  3   Healthy    Dead     2
## 3:             3    2  1      Sick Healthy     3
## 4:             4    2  3      Sick    Dead     4
## 
## attr(,"class")
## [1] "hesim_data"

Then we use expand() to create a dataset in long format where there is one row for each treatment strategy, patient, and transition.

transmod_data <- expand(hesim_dat, 
                        by = c("strategies", "patients", "transitions"))
transmod_data[1:16]
##     strategy_id patient_id transition_id age female from to from_name
##  1:           1          1             1  45      0    1  2   Healthy
##  2:           1          1             2  45      0    1  3   Healthy
##  3:           1          1             3  45      0    2  1      Sick
##  4:           1          1             4  45      0    2  3      Sick
##  5:           1          2             1  50      1    1  2   Healthy
##  6:           1          2             2  50      1    1  3   Healthy
##  7:           1          2             3  50      1    2  1      Sick
##  8:           1          2             4  50      1    2  3      Sick
##  9:           2          1             1  45      0    1  2   Healthy
## 10:           2          1             2  45      0    1  3   Healthy
## 11:           2          1             3  45      0    2  1      Sick
## 12:           2          1             4  45      0    2  3      Sick
## 13:           2          2             1  50      1    1  2   Healthy
## 14:           2          2             2  50      1    1  3   Healthy
## 15:           2          2             3  50      1    2  1      Sick
## 16:           2          2             4  50      1    2  3      Sick
##     to_name trans
##  1:    Sick     1
##  2:    Dead     2
##  3: Healthy     3
##  4:    Dead     4
##  5:    Sick     1
##  6:    Dead     2
##  7: Healthy     3
##  8:    Dead     4
##  9:    Sick     1
## 10:    Dead     2
## 11: Healthy     3
## 12:    Dead     4
## 13:    Sick     1
## 14:    Dead     2
## 15: Healthy     3
## 16:    Dead     4
# These are the ID variables
attributes(transmod_data)$id_vars
## [1] "strategy_id"   "patient_id"    "transition_id"

Creating a statistical model

A statistical model is instantiated by combining the parameter estimates and input data. A statistical model can be created from an “expanded_hesim_data” object and either a parameter object (prefixed by params_) or a fitted statistical model using functions prefixed by create_. For example, we can create a model of health state transitions for an individual-level CTSTM using create_IndivCtstmTrans() from the fit_wei object above, which is a fitted Weibull multi-state model.

transmod <- create_IndivCtstmTrans(fit_wei, transmod_data,
                                   trans_mat = tmat, n = 100)
class(transmod)
## [1] "IndivCtstmTrans" "CtstmTrans"      "R6"

Economic model

An economic model for an individual-level CTSTM would be constructed by combining the transmod object with statistical models for costs and utility. Examples analyses are currently available for PSMs and individual-level CTSTMs.

Decision analysis

Once output has been simulated with an economic model, a CEA or MCDA can be performed. hesim (here) provides support for CEA including summarizing a PSA and individualized cost-effectiveness analysis. Output from hesim economic models can also be easily analyzed using other R packages such as bcea.