Multivariate ordinal response

This vignette demonstrates the win ratio (or net benefit) approach to multivariate ordinal data (Buyse 2010; Bebu and Lachin 2015; Mao 2024), under consensus or prioritized order between components.

Data and Methodology

Let \(Y = (y_1,\ldots, y_K)\) be a \(K\)-vector of ordinal responses, e.g., \(y_k =0,1,\ldots, m_k -1\) for some \(m_k\in\mathbb Z^+\). The components may be rating scores provided by different readers on the same subject’s medical image. Such vector-valued responses can be partially ordered by consensus among components, that is, we say \(Y_i\prec Y_j\) if \(Y_i\leq Y_j\) component-wise, with strict inequality for at least one component. Alternatively, if some reader, say the first one, is more experienced than the rest, their score can be prioritized. Then we say \(Y_i\prec Y_j\) if \(y_{1i} < y_{1j}\), or \(y_{1i} = y_{1j}\) and a consensus order holds on the other components. In either case, we assume that a higher score is better, so that \(Y_i\succ Y_j\) (or \(Y_j\prec Y_i\)) means a “win” for \(Y_i\) and a “loss” for \(Y_j\).

Two-group comparison

Let \(Z = 1, 0\) be a binary indicator for treatment and control groups, respectively. The probability of a treated winning against an untreated is \(P(Y_i\succ Y_j\mid Z_i = 1, Z_j = 0)\). Likewise, that of a treated losing to an untreated is \(P(Y_i\prec Y_j\mid Z_i = 1, Z_j = 0)\). To summarize the relative favorability of treatment against control, consider \[\begin{align}\tag{1} \mbox{Win ratio: } \hspace{1ex}& WR = \frac{P(Y_i\succ Y_j\mid Z_i = 1, Z_j = 0)}{P(Y_i\prec Y_j\mid Z_i = 1, Z_j = 0)}\\ \mbox{Net benefit: } \hspace{1ex}& NB = P(Y_i\succ Y_j\mid Z_i = 1, Z_j = 0) - P(Y_i\prec Y_j\mid Z_i = 1, Z_j = 0). \end{align}\]

Given \((Y_i, Z_i)\) \((i=1,\ldots, n)\), a random \(n\)-sample of \((Y, Z)\), standard two-sample \(U\)-statistics can be used to estimate the win and loss probabilities, which can then replace the target quantities in (1) for nonparametric estimates of WR and NB.

Win ratio regression

If \(Z\) is non-binary but rather a \(p\)-vector of possibly continuous components, the nonparametric approach no longer works. To reduce dimension, we posit a multiplicative win ratio model \[\begin{align}\tag{2} WR(Z_i, Z_j) := \frac{P(Y_i\succ Y_j\mid Z_i, Z_j)}{P(Y_i\prec Y_j\mid Z_i, Z_j)} =\exp\{\beta^{\rm T}(Z_i - Z_j)\}. \end{align}\] This means that unit increases in the covariates lead to win ratios \(\exp(\beta)\) (component-wise). Standard estimators of \(\exp(\beta)\) reduce to the two-sample win ratio when \(Z = 1, 0\).

Code and Example

Load the package:

library(poset)

Basic syntax

For two-sample WR/NB:

wrtest(Y1, Y0, fun = wprod)

where Y1 and Y0 are response matrices in the treatment and control, respectively, each with \(K\) columns for the \(K\) components. fun can be a user-defined function for the partial order that takes two \(K\)-vectors and outputs \(1\), \(-1\), \(0\) if the first wins, loses, or ties with the second, respectively. The default is the function wprod for the consensus (product) order.

For win ratio regression (2):

obj <- wreg(Y, Z, fun = wprod)

where Y is an \(n\times K\) response matrix and \(Z\) is an \(n\times p\) design (covariate) matrix. Again, the win function can be customized in fun. Regression results are summarized by summary(obj).

A data example

A total of 186 patients with non-alcoholic fatty liver disease were recruited at the University of Wisconsin Hospitals in 2017. The patients underwent computed tomography scan of the liver for the presence of non-alcoholic steato-hepatitis, the most severe form of non-alcoholic fatty liver disease. The image was subsequently assessed by two radiologists using a scale of 1 to 5, with higher values indicating greater likelihood of disease. Predictors of rating scores include patient sex, the presence of advanced fibrosis (AF), and quantitative biomarkers such as percent of steatosis, i.e., liver fat content, liver mean gray level intensity (SSF2), and liver surface nodularity (LSN) score.

head(liver)
#>   R1NASH R2NASH Sex    AF Steatosis  SSF2  LSN
#> 1      3      2   M FALSE        30  0.21 2.33
#> 2      1      1   F FALSE         5  0.38 2.86
#> 3      4      2   M FALSE        70  0.58 3.65
#> 4      4      4   F  TRUE        30 -0.08 2.73
#> 5      4      3   M  TRUE        70 -0.04 2.53
#> 6      3      3   M FALSE        10  0.02 2.88

First, compare the bivariate scores between AF and non-AF by win ratio/net benefit:

# lower score is better
Y1 <- 5 - liver[liver$AF, c("R1NASH", "R2NASH")] # advanced
Y0 <- 5 - liver[!liver$AF, c("R1NASH", "R2NASH")] # not advanced
obj <- wrtest(Y1, Y0)
obj
#> Call:
#> wrtest(Y1 = Y1, Y0 = Y0)
#> 
#> Two-sample (Y1 vs Y0) win ratio/net benefit analysis
#> 
#> Number of pairs: N1 x N0 =  69 x 116  =  8004 
#>   Win: 2392 (29.9%)
#>   Loss: 4251 (53.1%)
#>   Tie: 1361 (17%)
#> 
#> Win ratio (95% CI): 0.56 (0.37, 0.86), p-value = 0.00856547
#> Net benefit (95% CI): -0.232 (-0.4, -0.065), p-value = 0.006577537

This shows, in particular, that AF is \(1-0.56=44\%\) less likely than non-AF to have favorable scores by consensus of the two readers.

To regress the scores against other covariates:

Y <- 5 - liver[, c("R1NASH", "R2NASH")] # lower score is better
Z <- cbind("Female" = liver$Sex == "F",
           liver[, c("AF", "Steatosis",   "SSF2",  "LSN")]) # covariates
obj <- wreg(Y, Z) # fit model
obj
#> Call:
#> wreg(Y = Y, Z = Z)
#> 
#> n = 154 subjects with complete data
#> Comparable (win/loss) pairs: 9548/11781 = 81%
#> 
#>    Female         AF   Steatosis         SSF2        LSN
#>  -0.18956 -0.9660827 -0.02779146 -0.007926333 -0.1029914

Some basic information of the model is printed, like the number and percentage of comparable pairs used in the regression, as well as the regression coefficients (log-WR).

For more detailed inference results:

summary(obj)
#> Call:
#> wreg(Y = Y, Z = Z)
#> 
#> n = 154 subjects with complete data
#> Comparable (win/loss) pairs: 9548/11781 = 81%
#> 
#> Newton-Raphson algoritm converged in 7 iterations
#> 
#>                coef exp(coef)  se(coef)      z Pr(>|z|)    
#> Female    -0.189560    0.8273  0.259988 -0.729 0.465934    
#> AF        -0.966083    0.3806  0.280313 -3.446 0.000568 ***
#> Steatosis -0.027791    0.9726  0.005281 -5.262 1.42e-07 ***
#> SSF2      -0.007926    0.9921  0.003953 -2.005 0.044953 *  
#> LSN       -0.102991    0.9021  0.125718 -0.819 0.412657    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#>           exp(coef) exp(-coef) lower .95 upper .95
#> Female      0.82732    1.20872   0.49702    1.3771
#> AF          0.38057    2.62763   0.21970    0.6592
#> Steatosis   0.97259    1.02818   0.96258    0.9827
#> SSF2        0.99210    1.00796   0.98445    0.9998
#> LSN         0.90213    1.10848   0.70512    1.1542
#> 
#> Overall Wald test = 79.129 on 5 df,  p = 1.221245e-15

Advanced fibrosis status and percent of steatosis are strongly and significantly associated with the likelihood of non-alcoholic steato-hepatitis. In particular, AF is \(38.1\%\) times as likely to have favorable reader assessments as non-AF. Furthermore, one percentage-point increase in steatosis results in \(1-0.97259\doteq 2.7\%\) reduction in the likelihood of favorable assessments.

Exercise

Confirm that wreg() with AF as the only covariate in Z produces the same results as wrtest() does.
Try a different win function, e.g., one that prioritizes the score of reader 1, through fun and compare the results with those under the consensus order.