| Title: | Conditional Logistic Regression |
| Version: | 1.0 |
| Author: | Adam Kapelner |
| Maintainer: | Adam Kapelner <kapelner@qc.cuny.edu> |
| Description: | Performs inference for Bayesian conditional logistic regression with informative priors built from the concordant pair data. The package offers several options for building the priors and several options at the inference step for estimation, testing, and confidence set creation. For details, see Kapelner and Tennenbaum (2026) “Improved Conditional Logistic Regression using Information in Concordant Pairs with Software” <doi:10.48550/arXiv.2602.08212>. |
| SystemRequirements: | GNU make |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| LazyData: | true |
| Depends: | R (≥ 4.0), Rcpp (≥ 1.0.14) |
| Imports: | checkmate, coda, fastLogisticRegressionWrap, geepack, glmmTMB, methods, RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), rstantools (≥ 2.6.0) |
| Suggests: | testthat (≥ 3.0.0), survival, ggdist, data.table |
| LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥ 2.18.0) |
| URL: | https://github.com/Tennenbaum-J/bclogit_package_and_paper_repo |
| BugReports: | https://github.com/Tennenbaum-J/bclogit_package_and_paper_repo/issues |
| RoxygenNote: | 7.3.3 |
| Biarch: | true |
| NeedsCompilation: | yes |
| Packaged: | 2026-02-20 00:56:31 UTC; kapelner |
| Repository: | CRAN |
| Date/Publication: | 2026-02-25 10:00:14 UTC |
Bayesian Conditional Logistic Regression with Concordant Pairs
Description
bclogit
Details
Fits a conditional logistic regression model to binary incidence data from matched pairs. The concordant pairs, which the classical conditional likelihood discards, can be used to inform the fit.
Author(s)
Jacob Tennenbaum <Jacob.Tennenbaum51@qmail.cuny.edu>
Adam Kapelner <kapelner@qc.cuny.edu>
References
Jacob Tennenbaum and Adam Kapelner (2026). "Improved Conditional Logistic Regression using Information in Concordant Pairs with Software." arXiv preprint arXiv:2602.08212.
See Also
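Useful links:
- https://github.com/Tennenbaum-J/bclogit_package_and_paper_repo
- Report bugs at https://github.com/Tennenbaum-J/bclogit_package_and_paper_repo/issues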
Initialize a new bclogit model
Description
This function fits a Bayesian conditional logistic regression model, incorporating information from concordant pairs to improve estimation.
Usage
## S3 method for class 'formula'
bclogit(
formula,
data,
treatment = NULL,
strata = NULL,
subset = NULL,
na.action = NULL,
concordant_method = "GLM",
prior_type = "Naive",
chains = 4,
return_raw_stan_output = FALSE,
prior_variance_treatment = 100,
stan_refresh = 0,
...
)
## Default S3 method:
bclogit(
formula = NULL,
data = NULL,
treatment = NULL,
strata = NULL,
subset = NULL,
na.action = NULL,
concordant_method = "GLM",
prior_type = "Naive",
chains = 4,
return_raw_stan_output = FALSE,
prior_variance_treatment = 100,
stan_refresh = 0,
...,
y = NULL,
X = NULL,
treatment_name = NULL,
call = NULL
)
Arguments
formula |
For the formula method, a symbolic description of the model to be fitted. |
data |
A data.frame, data.table, or model.matrix containing the variables (optional for formula method). |
treatment |
Optional vector specifying the treatment variable (required for default method, or can be specified in formula method). |
strata |
Vector specifying the strata (matched pairs). |
subset |
An optional vector specifying a subset of observations. |
na.action |
A function which indicates what should happen when the data contain NAs. |
concordant_method |
The method to use for fitting the concordant pairs and reservoir. Options are "GLM", "GEE", and "GLMM". |
prior_type |
The type of prior to use for the discordant pairs. Options are "Naive", "G prior", "PMP", and "Hybrid". |
chains |
Number of chains for Stan sampling. Default is 4. |
return_raw_stan_output |
Logical; if |
prior_variance_treatment |
Prior variance for the treatment coefficient in the covariance matrix |
stan_refresh |
How often Stan reports sampling progress (in iterations). Default is 0 (silent). Set to a positive integer (e.g., 1 or 100) to see progress. |
... |
Additional arguments passed to |
y |
For the default method, a binary (0,1) vector containing the response of each subject. |
X |
A data.frame, data.table, or model.matrix containing the variables. |
treatment_name |
Optional string name for the treatment variable. |
call |
Optional call object to store in the result. |
Value
An object of class "bclogit".
A list of class "bclogit" containing:
coefficients |
Estimated coefficients (posterior means). |
var |
Variance-covariance matrix of the coefficients (posterior covariance). |
model |
The fitted Stan model object for the discordant pairs. |
posterior_samples |
Raw posterior samples as a 3D array (iterations x chains x parameters) from |
concordant_model |
The fitted model object for the concordant pairs (GLM, GEE, or GLMM). |
matched_data |
The processed matched pairs data from the C++ pre-modeling step. |
prior_info |
A list with elements |
call |
The function call. |
terms |
The model terms. |
xlevels |
Factor level information (always |
n |
Total number of observations. |
num_discordant |
Number of discordant pairs used for fitting. |
num_concordant |
Number of concordant pairs used for the prior. |
X_model_matrix_col_names |
Column names of the covariate model matrix. |
treatment_name |
Name of the treatment variable. |
Methods (by class)
- bclogit(formula): Formula method.
- bclogit(default): Default method for matrix/data input.
See Also
summary.bclogit, confint.bclogit,
vcov.bclogit, coef.bclogit
Examples
# Example usage
data("fhs")
fit <- bclogit(PREVHYP ~ TOTCHOL + CIGPDAY + BMI + HEARTRTE,
data = fhs, treatment = PERIOD, strata = RANDID)
summary(fit)
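The num_discordant and num_concordant values depend on how each matched pair is classified. The base R sketch below shows the usual classification on a toy data set: a pair is discordant when exactly one of its two responses equals 1, and concordant otherwise. The vectors y and strata are made-up illustrations, and the package's own counting (for example, of reservoir entries) may differ.

```r
# Toy matched-pair data: four pairs, two subjects each (hypothetical values).
y      <- c(0, 1,  1, 1,  0, 0,  1, 0)
strata <- c(1, 1,  2, 2,  3, 3,  4, 4)

# Sum the binary responses within each pair: 1 means discordant,
# 0 or 2 means concordant.
pair_sum      <- tapply(y, strata, sum)
is_discordant <- pair_sum == 1

num_discordant <- sum(is_discordant)
num_concordant <- sum(!is_discordant)
```

Here pairs 1 and 4 are discordant and pairs 2 and 3 are concordant.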
Frequentist Conditional Logistic Regression
Description
Fits a conditional logistic regression model for matched pairs using
the discordant-pair GLM trick. This is a fast frequentist alternative
to bclogit.
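For 1:1 matched pairs, the conditional likelihood is known to equal that of a no-intercept logistic regression fitted to within-pair covariate differences (case minus control) over the discordant pairs, with a response that is always 1. The sketch below illustrates that equivalence with base R's glm on simulated data. It is an illustration of the general trick, not the package's implementation; every variable name and simulation setting is an assumption.

```r
set.seed(1)

# Simulate matched pairs with a pair-specific intercept (assumed setup).
n_pairs     <- 500
strata      <- rep(seq_len(n_pairs), each = 2)
treatment   <- rep(c(0, 1), n_pairs)
x1          <- rnorm(2 * n_pairs)
pair_effect <- rep(rnorm(n_pairs), each = 2)
y <- rbinom(2 * n_pairs, 1, plogis(pair_effect + 0.5 * treatment + 0.8 * x1))
d <- data.frame(strata, treatment, x1, y)

# Keep only discordant pairs: exactly one response equals 1 within the pair.
d <- d[ave(d$y, d$strata, FUN = sum) == 1, ]

# Split into the case (y = 1) and control (y = 0) of each pair.
cases    <- d[d$y == 1, ]
controls <- d[d$y == 0, ]
stopifnot(identical(cases$strata, controls$strata))

# Case-minus-control differences; the response is constant at 1.
diffs <- data.frame(
  one       = 1,
  treatment = cases$treatment - controls$treatment,
  x1        = cases$x1 - controls$x1
)

# A no-intercept logistic regression on the differences.
fit <- glm(one ~ 0 + treatment + x1, family = binomial(), data = diffs)
coef(fit)
```

The pair-specific intercepts cancel in the differences, which is why no intercept is fitted and why concordant pairs contribute nothing to this likelihood.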
Usage
## Default S3 method:
clogit(
formula = NULL,
data = NULL,
treatment = NULL,
strata = NULL,
subset = NULL,
na.action = NULL,
do_inference_on_var = "all",
...,
y = NULL,
X = NULL,
treatment_name = NULL,
call = NULL
)
## S3 method for class 'formula'
clogit(
formula,
data,
treatment = NULL,
strata = NULL,
subset = NULL,
na.action = NULL,
do_inference_on_var = "all",
...
)
Arguments
formula |
For the formula method, a symbolic description of the model. |
data |
A data frame containing the variables (for formula method). |
treatment |
Vector specifying the treatment variable. |
strata |
Vector specifying the strata (matched pairs). |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
A function which indicates what should happen when the data contain NAs. |
do_inference_on_var |
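Which variable(s) to compute standard errors for. The default is "all"; the examples pass 1 to restrict inference to the treatment only, which is faster. |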
... |
Additional arguments passed to methods. |
y |
For the default method, a binary (0,1) response vector. |
X |
A data.frame, data.table, or model.matrix containing the variables. |
treatment_name |
Optional string name for the treatment variable. |
call |
Optional call object to store in the result. |
Value
A list of class "clogit_bclogit" containing:
coefficients |
Estimated coefficients (frequentist point estimates from the discordant-pair fit). |
var |
Variance-covariance matrix of the coefficients (diagonal, built from standard errors). |
flr_model |
The fitted fast logistic regression model object returned by |
call |
The function call. |
terms |
The model terms. |
n |
Total number of observations. |
num_discordant |
Number of discordant pairs used for fitting. |
num_concordant |
Number of concordant pairs. |
X_model_matrix_col_names |
Column names of the covariate model matrix. |
treatment_name |
Name of the treatment variable. |
se |
Standard errors of the coefficients. |
z |
Z-statistics for each coefficient. |
pval |
Approximate p-values for each coefficient. |
do_inference_on_var |
The value of the |
An object of class "clogit_bclogit".
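The se, z, pval, and var components are related in the usual Wald fashion. The sketch below shows one plausible construction from a vector of estimates and standard errors. The numbers are hypothetical, and the two-sided normal reference used for the p-values is an assumption, since the source only calls them approximate.

```r
# Hypothetical estimates and standard errors (not output from the package).
est <- c(treatment = 0.42, x1 = -0.10)
se  <- c(treatment = 0.20, x1 = 0.15)

# Wald z-statistics and two-sided normal p-values (assumed reference distribution).
z    <- est / se
pval <- 2 * pnorm(-abs(z))

# Diagonal variance-covariance matrix built from the standard errors.
V <- diag(se^2)
dimnames(V) <- list(names(est), names(est))
```

A diagonal matrix of this kind carries no covariance information between coefficients, so it should not be used for joint tests.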
Methods (by class)
- clogit(default): Default method for matrix/data input.
- clogit(formula): Formula method.
See Also
bclogit, summary.clogit_bclogit
Examples
data("fhs")
fit <- clogit(PREVHYP ~ TOTCHOL + CIGPDAY + BMI + HEARTRTE,
data = fhs, treatment = PERIOD, strata = RANDID)
summary(fit)
n <- 200
dat <- data.frame(
y = rbinom(n, 1, 0.5),
x1 = rnorm(n),
treatment = rep(c(0, 1), n / 2),
strata = rep(1:(n / 2), each = 2)
)
fit <- clogit(y ~ x1, data = dat, treatment = treatment, strata = strata)
# Inference on treatment only (faster):
fit2 <- clogit(y ~ x1, data = dat, treatment = treatment, strata = strata,
do_inference_on_var = 1)
n <- 200
dat <- data.frame(
y = rbinom(n, 1, 0.5),
x1 = rnorm(n),
x2 = rnorm(n),
treatment = rep(c(0, 1), n / 2),
strata = rep(1:(n / 2), each = 2)
)
fit <- clogit(y ~ x1 + x2, data = dat, treatment = treatment, strata = strata)
summary(fit)
coef(fit)
vcov(fit)
Extract coefficients from a bclogit model
Description
Extract coefficients from a bclogit model
Usage
## S3 method for class 'bclogit'
coef(object, ...)
Arguments
object |
A |
... |
Additional arguments. |
Value
Numeric vector of coefficients.
Extract coefficients from a clogit_bclogit model
Description
Extract coefficients from a clogit_bclogit model
Usage
## S3 method for class 'clogit_bclogit'
coef(object, ...)
Arguments
object |
A |
... |
Additional arguments. |
Value
Numeric vector of coefficients.
Examples
n <- 200
dat <- data.frame(
y = rbinom(n, 1, 0.5), x1 = rnorm(n),
treatment = rep(c(0, 1), n / 2),
strata = rep(1:(n / 2), each = 2)
)
fit <- clogit(y ~ x1, data = dat, treatment = treatment, strata = strata)
coef(fit)
Credible Intervals for bclogit Parameters
Description
Computes Bayesian credible intervals for the model parameters.
Usage
## S3 method for class 'bclogit'
confint(object, parm, level = 0.95, type = c("HPD_one", "CR", "HPD_many"), ...)
Arguments
object |
A |
parm |
A specification of which parameters to be given credible intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
The credible level required (default 0.95). |
type |
Type of interval to compute: "HPD_one" (default unimodal HPD interval via coda), "CR" (equal-tailed credible region), "HPD_many" (multimodal HPD interval via ggdist). |
... |
Additional arguments. |
Value
A matrix with columns lower and upper.
For "HPD_many", a parameter may appear on multiple rows when the interval is disjoint.
The matrix has a Probability attribute.
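The difference between the "CR" and "HPD_one" interval types can be seen on any skewed set of draws. The base R sketch below computes an equal-tailed interval from quantiles and a single shortest interval from the sorted draws. The helper shortest_interval is hypothetical and written for illustration only; the package itself is documented as using coda and ggdist for the HPD types. The gamma draws stand in for posterior samples.

```r
set.seed(42)

# Stand-in for posterior draws of one parameter (right-skewed on purpose).
draws <- rgamma(4000, shape = 2, rate = 1)
level <- 0.95

# Equal-tailed credible region: cut (1 - level) / 2 from each tail.
cr <- quantile(draws, probs = c((1 - level) / 2, 1 - (1 - level) / 2),
               names = FALSE)

# Hypothetical helper: the shortest single interval holding about
# `level` of the sorted draws (a unimodal HPD approximation).
shortest_interval <- function(x, level = 0.95) {
  x      <- sort(x)
  n      <- length(x)
  gap    <- max(1, min(n - 1, round(n * level)))
  starts <- seq_len(n - gap)
  widths <- x[starts + gap] - x[starts]
  i      <- which.min(widths)
  c(lower = x[i], upper = x[i + gap])
}

hpd <- shortest_interval(draws, level)
```

For a right-skewed sample like this one, the shortest interval sits closer to zero and is narrower than the equal-tailed interval; for a symmetric unimodal posterior the two nearly coincide.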
Framingham Heart Study Dataset
Description
A subset of the Framingham Heart Study data.
Usage
fhs
Format
A data frame with 5944 rows and 39 variables:
- RANDID
Unique identification number for each participant (this defines the strata for the matched pairs).
- PERIOD
Examination cycle, where 0 = baseline and 1 = endpoint (the treatment variable).
- SEX
Participant sex (1 = Male, 2 = Female).
- TOTCHOL
Total serum cholesterol (mg/dL).
- AGE
Age at exam (years).
- SYSBP
Systolic blood pressure (mmHg).
- DIABP
Diastolic blood pressure (mmHg).
- CURSMOKE
Current smoking status (0 = No, 1 = Yes).
- CIGPDAY
Number of cigarettes smoked per day.
- BMI
Body Mass Index (kg/m^2).
- DIABETES
Diabetes status (0 = No, 1 = Yes).
- BPMEDS
Use of Anti-hypertensive medication (0 = No, 1 = Yes).
- HEARTRTE
Heart rate (beats/minute).
- GLUCOSE
Fast blood glucose (mg/dL).
- educ
Education level.
- PREVCHD
Prevalence of Coronary Heart Disease.
- PREVAP
Prevalence of Angina Pectoris.
- PREVMI
Prevalence of Myocardial Infarction.
- PREVSTRK
Prevalence of Stroke.
- PREVHYP
Prevalence of Hypertension.
- TIME
Number of days since baseline exam.
- HDLC
High Density Lipoprotein Cholesterol.
- LDLC
Low Density Lipoprotein Cholesterol.
- DEATH
Death status.
- ANGINA
Angina Pectoris status.
- HOSPMI
Hospitalized Myocardial Infarction.
- MI_FCHD
Myocardial Infarction or Fatal Coronary Heart Disease.
- ANYCHD
Any Coronary Heart Disease event.
- STROKE
Stroke status.
- CVD
Cardiovascular Disease status.
- HYPERTEN
Hypertension status.
- TIMEAP
Time to Angina Pectoris.
- TIMEMI
Time to Myocardial Infarction.
- TIMEMIFC
Time to Myocardial Infarction or Fatal CHD.
- TIMECHD
Time to Any CHD.
- TIMESTRK
Time to Stroke.
- TIMECVD
Time to Cardiovascular Disease.
- TIMEDTH
Time to Death.
- TIMEHYP
Time to Hypertension.
Details
This dataset was constructed by running the following code:
pacman::p_load(riskCommunicator, data.table)
data("framingham")
D = data.table(framingham)
D = D[!is.na(CIGPDAY)] #we drop missing data in covariates
D = D[!is.na(BMI)]
D = D[!is.na(HEARTRTE)]
D = D[!is.na(TOTCHOL)]
Dba = D[PERIOD %in% c(1,3)] #we drop intermediate periods so we have matched pairs
Dba[PERIOD == 1, PERIOD := 0]
Dba[PERIOD == 3, PERIOD := 1]
Dba[, num_periods_per_id := .N, by = RANDID]
Dba = Dba[num_periods_per_id == 2] #we keep only participants observed in both periods so every pair is complete
Dba[, num_periods_per_id := NULL]
Source
https://biolincc.nhlbi.nih.gov/teaching/
Extract model formula
Description
Extract model formula
Usage
## S3 method for class 'bclogit'
formula(x, ...)
Arguments
x |
A |
... |
Additional arguments. |
Value
The formula used in the model.
Print summary of a bclogit model
Description
Print summary of a bclogit model
Usage
## S3 method for class 'summary.bclogit'
print(x, digits = max(3, getOption("digits") - 3), ...)
Arguments
x |
A |
digits |
Number of significant digits to print. |
... |
Additional arguments. |
Value
Invisibly returns x.
Print summary of a clogit_bclogit model
Description
Print summary of a clogit_bclogit model
Usage
## S3 method for class 'summary.clogit_bclogit'
print(x, digits = max(3, getOption("digits") - 3), ...)
Arguments
x |
A |
digits |
Number of significant digits to print. |
... |
Additional arguments. |
Value
Invisibly returns x.
Examples
n <- 200
dat <- data.frame(
y = rbinom(n, 1, 0.5), x1 = rnorm(n),
treatment = rep(c(0, 1), n / 2),
strata = rep(1:(n / 2), each = 2)
)
fit <- clogit(y ~ x1, data = dat, treatment = treatment, strata = strata)
print(summary(fit))
Summarize a bclogit model
Description
Summarize a bclogit model
Usage
## S3 method for class 'bclogit'
summary(object, level = 0.95, inference_method = "HPD_one", ...)
Arguments
object |
A |
level |
Confidence level for credible intervals (default 0.95). |
inference_method |
Method used for both the displayed confidence set bounds and the p-value (computed via bisection over alpha). Options are: |
... |
Additional arguments (not used). |
Value
A list of class "summary.bclogit" containing:
call |
The original function call. |
coefficients |
A matrix with one row per parameter and columns for the posterior mean estimate, posterior median estimate, standard error, lower and upper credible interval bounds, optionally |
num_discordant |
Number of discordant pairs used for fitting. |
num_concordant |
Number of concordant pairs used for the prior. |
level |
The credible interval level used. |
inference_method |
The inference method used for interval and p-value computation. |
prior_info |
A list with elements |
treatment_name |
Name of the treatment variable. |
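A p-value "computed via bisection over alpha" can be read as the smallest alpha at which the (1 - alpha) interval no longer contains zero. The base R sketch below carries out such a search with an equal-tailed interval on stand-in draws. The interval type, the null value of zero, the iteration count, and the helper interval_excludes_zero are all assumptions made for illustration; the package's actual rule may differ.

```r
set.seed(7)

# Stand-in for posterior draws of one coefficient.
draws <- rnorm(4000, mean = 0.3, sd = 0.15)

# Hypothetical helper: does the equal-tailed (1 - alpha) interval exclude 0?
interval_excludes_zero <- function(alpha) {
  ci <- quantile(draws, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
  ci[1] > 0 || ci[2] < 0
}

# Bisection: the interval shrinks as alpha grows, so exclusion is monotone.
lo <- 0
hi <- 1
for (iter in seq_len(30)) {
  mid <- (lo + hi) / 2
  if (interval_excludes_zero(mid)) hi <- mid else lo <- mid
}
p_value <- hi
```

With an equal-tailed interval, the result agrees closely with twice the smaller tail proportion of draws on either side of zero, which gives a quick sanity check.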
Summarize a clogit_bclogit model
Description
Summarize a clogit_bclogit model
Usage
## S3 method for class 'clogit_bclogit'
summary(object, ...)
Arguments
object |
A |
... |
Additional arguments (not used). |
Value
A list of class "summary.clogit_bclogit" containing:
call |
The original function call. |
coefficients |
A matrix with one row per parameter and columns |
num_discordant |
Number of discordant pairs used for fitting. |
num_concordant |
Number of concordant pairs. |
n |
Total number of observations. |
treatment_name |
Name of the treatment variable. |
do_inference_on_var |
The value of the |
Extract variance-covariance matrix from a bclogit model
Description
Extract variance-covariance matrix from a bclogit model
Usage
## S3 method for class 'bclogit'
vcov(object, ...)
Arguments
object |
A |
... |
Additional arguments. |
Value
A matrix of the estimated covariance of the coefficients.
Extract variance-covariance matrix from a clogit_bclogit model
Description
Extract variance-covariance matrix from a clogit_bclogit model
Usage
## S3 method for class 'clogit_bclogit'
vcov(object, ...)
Arguments
object |
A |
... |
Additional arguments. |
Value
A matrix of the estimated covariance of the coefficients.