integrity

Increasing concerns about the trustworthiness of research have prompted calls to scrutinise studies’ individual participant data (IPD), that is, de-identified raw line-by-line data for each participant in a study.
integrity was developed to support application of the
IPD Integrity Tool (Hunter
et al. 2024A). It enables structured and transparent assessment of
the integrity and trustworthiness of randomised controlled trials (RCTs)
using IPD and informs decisions about whether RCTs should be included in
evidence synthesis or considered suitable for publication. Further
information about the development of the tool can be found in
Hunter et al. (2024B).
If you use our package please cite:
Hunter KE, Aberoumand M, Libesman S, Sotiropoulos JX, Williams JG, Aagerup J, Wang R, Mol BW, Li W, Barba A, Shrestha N, Webster AC, Seidler AL. The Individual Participant Data Integrity Tool for assessing the integrity of randomised trials. Research Synthesis Methods. 2024 Nov;15(6):917-39.
Each step of the workflow is illustrated using a case study on umbilical cord management at preterm birth, based on a de-identified and altered data set from the iCOMP study. The main goal was to determine the optimal umbilical cord management strategy at preterm birth, such as milking or delayed cord clamping.
Load the integrity package into R.
if(requireNamespace("pkgload", quietly = TRUE) && file.exists("../DESCRIPTION")) {
  pkgload::load_all("..")
} else {
  library(integrity)
}
## ℹ Loading integrity
## Warning: package 'testthat' was built under R version 4.5.2
Next, import the data set you wish to examine into R. A variety of functions in base R and CRAN packages can do this:
- read.csv and read.table to import comma-separated and tab-separated text files.
- read_sas for SAS, read_sav for SPSS and read_dta for Stata in the CRAN package haven.
- read_excel for Microsoft Excel in the CRAN package readxl.

Case study: The altered iCOMP case study is bundled with the
integrity package. The data are in a Microsoft Excel
file.
library(readxl)
examplePath <- system.file("extdata", "dataset.xlsx", package = "integrity")
dataset <- read_excel(examplePath, sheet=1)
dataset[1:5, ]
## # A tibble: 5 × 18
## infant_ID rand_date mat_age blood_loss treatment_cat GA_weeks
## <dbl> <dttm> <dbl> <dbl> <dbl> <dbl>
## 1 1 2019-03-21 00:00:00 36 200 2 30
## 2 2 2020-07-17 00:00:00 18 200 1 28
## 3 3 2019-06-14 00:00:00 20 300 1 32
## 4 4 2019-10-08 00:00:00 30 500 2 29
## 5 5 2019-03-02 00:00:00 34 400 1 32
## # ℹ 12 more variables: birthweight <dbl>, sex <dbl>, hospital_days <dbl>,
## # temp <dbl>, inf_transfusion_any <dbl>, Hct <dbl>, CLD <dbl>, IVH <dbl>,
## # NEC <dbl>, inf_death <dbl>, enrol_start <dttm>, enrol_end <dttm>
The tibble above shows the participant identifier
(infant_ID), the date of randomisation
(rand_date) and the first few clinical covariates.
The following elements pair each part of the metadata with the corresponding column names in your data set:
- participantID: the name of the column containing the unique participant identifier (this variable is mandatory).
- enrollment: a list naming three columns: start (date of first participant enrolment), randomisation (date of participant randomisation) and end (date of last participant enrolment).
- baseline: a list with entries named dichotomous, polytomous and numeric, each giving the name(s) of the column(s) that hold baseline measurements of that type.
- intervention: the name of the column specifying the intervention or group allocation for each individual (this variable is mandatory).
- outcome: lists named common and rare, each with sublists named dichotomous, numeric or polytomous that contain the names of the outcome columns of those data types.
- correlated: a named list in which each entry holds two column names that are expected to be correlated.
- unexpected: a named list, keyed by column name, of values that are not expected to be seen. days is a special sublist that applies to date columns, which are converted into days of the week before comparison. It must have two elements: names, the unexpected day names, and locale, the locale in which those day names are written.

Only participantID is strictly required.
enrollment, baseline,
intervention, outcome,
correlated, and unexpected should be supplied
when available; if a section is omitted, the checks that depend on it
will be skipped.
The variable types and expectations need to be defined before running the checks. The package accepts a single metadata structure, which can be supplied in either of two formats depending on your preference: a list written directly in an R script or R Markdown document, or an Excel template workbook.
Coding the list directly in R, as below, is often the simplest option for users already working inside an R script or an R Markdown document, because the metadata sit next to the analysis code. The R code below may be used as a template and adapted to the relevant variables in a new dataset.
dataset_info <- list(
  participantID = "infant_ID",
  enrollment = list(
    start = "enrol_start",
    randomisation = "rand_date",
    end = "enrol_end"
  ),
  baseline = list(
    dichotomous = c("sex"), # add more variables if needed e.g., c("sex", "respiratory_support")
    # polytomous = c("ordinal_or_nominal_variable"), # no polytomous baseline variables in this data set so it's commented out
    numeric = c("mat_age", "GA_weeks", "birthweight")
    # can add polytomous variables if needed
  ),
  intervention = "treatment_cat",
  outcome = list(
    common = list(
      dichotomous = c("IVH", "NEC"),
      polytomous = c("CLD"), # if certain variable types don't exist, just delete the relevant line.
      numeric = c("hospital_days")
    ),
    rare = list(
      dichotomous = c("inf_death") # add more variables if needed e.g., c("inf_death", "severe_IVH")
    )
  ),
  correlated = list(
    timeAndSize = c("GA_weeks", "birthweight")
  ),
  unexpected = list(
    days = list(
      names = c("Saturday", "Sunday"),
      locale = "C"
    ),
    mat_age = c("less than 10", "greater than 50"),
    GA_weeks = c("less than 22", "greater than 37")
  )
)
In the unexpected$days section, locale
controls the language used when R converts dates into weekday names.
locale = "C" is the most likely option to use because it
returns standard English weekday names such as Saturday and
Sunday, which usually match the values entered in
names. Other locale values are possible if your system or
dataset uses a different language or naming convention, but
"C" will usually be the safest default.
An Excel template is also available if users prefer to enter the
metadata in a spreadsheet. The workbook has one row per entry and four
columns named level_1, level_2,
level_3, and value. Repeated values, such as
several numeric baseline variables, are entered as multiple rows.
This workbook can be edited in Microsoft Excel, then imported into R
with read_metadata_excel().
example_excel_path <- system.file("extdata", "variables_template.xlsx", package = "integrity")
dataset_info <- read_metadata_excel(example_excel_path)
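To make the relationship between the two formats concrete, the sketch below folds rows in the four-column layout described above into a nested list. It is a minimal illustration in base R and not the implementation of read_metadata_excel(); `template_rows` and `rows_to_list()` are hypothetical names, and marking unused levels with NA is an assumption about how empty cells arrive in R.

```r
# Hypothetical rows in the level_1 / level_2 / level_3 / value layout.
# NA marks a level that is not used for that entry (an assumption).
template_rows <- data.frame(
  level_1 = c("participantID", "baseline", "baseline", "outcome"),
  level_2 = c(NA, "numeric", "numeric", "rare"),
  level_3 = c(NA, NA, NA, "dichotomous"),
  value   = c("infant_ID", "mat_age", "GA_weeks", "inf_death"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a nested list, appending repeated values.
rows_to_list <- function(rows) {
  info <- list()
  for (i in seq_len(nrow(rows))) {
    keys <- c(rows$level_1[i], rows$level_2[i], rows$level_3[i])
    keys <- keys[!is.na(keys)]
    value <- rows$value[i]
    k1 <- keys[1]
    if (length(keys) == 1) {
      info[[k1]] <- c(info[[k1]], value)
    } else {
      if (is.null(info[[k1]])) info[[k1]] <- list()
      k2 <- keys[2]
      if (length(keys) == 2) {
        info[[k1]][[k2]] <- c(info[[k1]][[k2]], value)
      } else {
        if (is.null(info[[k1]][[k2]])) info[[k1]][[k2]] <- list()
        k3 <- keys[3]
        info[[k1]][[k2]][[k3]] <- c(info[[k1]][[k2]][[k3]], value)
      }
    }
  }
  info
}

nested_info <- rows_to_list(template_rows)
str(nested_info)
```

The two `numeric` rows end up as one character vector under `baseline`, which mirrors how repeated values are entered as multiple rows in the workbook.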
Pass the data frame and the metadata list to
run_checks. The function first performs automated data
checking and cleaning to ensure that all variables defined in
dataset_info are present in the dataset. The function
will also convert columns nominated as factors into factors where
required, and remove any columns containing only missing values.
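If you would like to confirm by hand that every column named in the metadata exists in the data before running the checks, something like the following could be used. This is a sketch in base R under stated assumptions, not the package's own validation code; `missing_columns()`, `toy_data` and `toy_info` are hypothetical names. The values inside `unexpected` are skipped because they hold rules such as "less than 10" rather than column names.

```r
# Hypothetical helper: list metadata columns that are absent from the data.
missing_columns <- function(data, info) {
  # In `unexpected`, the entry names (except the special `days` sublist)
  # are column names; the values are rules and must not be looked up.
  rule_columns <- setdiff(names(info[["unexpected"]]), "days")
  info[["unexpected"]] <- NULL
  named_columns <- unique(c(unlist(info, use.names = FALSE), rule_columns))
  setdiff(named_columns, names(data))
}

# Toy data and metadata; GA_weeks is deliberately missing from the data.
toy_data <- data.frame(
  infant_ID = 1:3,
  sex = c(1, 2, 1),
  mat_age = c(30, 25, 41)
)
toy_info <- list(
  participantID = "infant_ID",
  baseline = list(dichotomous = "sex", numeric = c("mat_age", "GA_weeks")),
  unexpected = list(mat_age = c("less than 10", "greater than 50"))
)

absent <- missing_columns(toy_data, toy_info)
absent  # "GA_weeks"
```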
result <- run_checks(dataset, dataset_info)
names(result)
## [1] "check_table" "detail_tables" "images" "summary_table"
This creates a list of result objects: the overall check table, detailed per-variable tables for selected checks, plots, and a summary table. The output for each item below should be reviewed in order and rated using the decision guide and rating sheet (Hunter et al. 2024A).
The sections below present the output from the integrity
run_checks function, split under each domain and item.
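Each item in the sections that follow is pulled from the check table with the same subsetting expression. A small helper can reduce the repetition. The sketch below uses a made-up stand-in for result[["check_table"]] that only copies its column names; `get_item()` and `mock_check_table` are hypothetical and not part of the package.

```r
# Made-up stand-in for result[["check_table"]], using the same column names.
mock_check_table <- data.frame(
  ItemNumber = c("1.1", "1.2", "6.1", "6.1"),
  `Item description` = c("Check A", "Check B", "Check C", "Check C"),
  Status = c("Skipped", "Pass", "Pass", "Potential integrity issue"),
  Details = c("detail 1", "detail 2", "detail 3", "detail 4"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the display columns for one item number.
get_item <- function(check_table, item_number) {
  keep <- check_table[["ItemNumber"]] == item_number
  check_table[keep, c("Item description", "Status", "Details"), drop = FALSE]
}

item_rows <- get_item(mock_check_table, "6.1")
item_rows
```

With a real result object, `get_item(result[["check_table"]], "1.2")` would return the same rows as the longer expression used for Item 1.2.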
Item 1.1: Repeating patterns within baseline variables
This item is manually performed by sorting and visually inspecting the data to identify repeating patterns within baseline variables, e.g. check whether values appear to repeat at regular intervals, which may indicate rows were copied and pasted. Rare or unusual entries can be particularly useful for detecting such patterns; assess whether these entries recur systematically, such as every 11 rows. Perform these assessments using the original dataset order, randomisation order, and separately within each study group.
item_1_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "1.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_1_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Repeated Baselines Within Variables | Skipped | This step is mannually peformed through visual inspection of the raw data |
Item 1.2: Repeating data patterns across baseline variables
This item looks for duplication across participants, e.g. do all participants with a height of 180 cm have the same weight? Duplicate entries for baseline variables are listed below (if there are none, no output will be printed).
item_1_2 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "1.2", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_1_2, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Repeated Baselines | Potential integrity issue | sex:1, mat_age:30, GA_weeks:33, birthweight:1568 occurs 2 times. |
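The idea behind this item can be reproduced on a small scale with base R. The sketch below flags rows whose full combination of baseline values occurs more than once; the data are made up, and this is an illustration of the concept rather than the package's implementation.

```r
# Made-up baseline data; rows 1 and 3 share every value.
toy_baseline <- data.frame(
  sex = c(1, 2, 1, 2, 1),
  mat_age = c(30, 25, 30, 41, 30),
  birthweight = c(1568, 1900, 1568, 1720, 1600)
)

# Flag every row whose complete combination of values appears more than once.
is_repeated <- duplicated(toy_baseline) | duplicated(toy_baseline, fromLast = TRUE)
toy_baseline[is_repeated, ]

# Count how often each repeated combination occurs.
combination_key <- do.call(paste, c(toy_baseline, sep = ":"))
repeat_counts <- table(combination_key[is_repeated])
repeat_counts
```

A repeated combination is not proof of a problem: with few baseline variables or coarse measurements, identical rows can arise by chance.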
Item 1.3: Repeating data patterns across baseline variables and rare variables.
item_1_3 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "1.3", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_1_3, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Repeated Baselines in Rare Outcomes | Pass | No duplicates found. |
Item 1.4: Bias in the terminal (rightmost) digits.
This item plots the terminal digit for the selected continuous variables (avoid variables that tend to be rounded or that lack precision). Inspect the bar charts for biased or unexpected distributions.
item_1_4 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "1.4", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_1_4, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Terminal Digits | Displayed | Terminal digit plot generated. |
if("Terminal Digits" %in% names(result[["images"]])) {
result[["images"]][["Terminal Digits"]]
}
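For readers who want to see what is being counted, the sketch below tabulates terminal digits for a made-up vector of whole-number birthweights. It assumes integer-valued measurements and is not the package's plotting code.

```r
# Made-up birthweights in grams, recorded as whole numbers.
toy_weights <- c(1568, 1900, 1723, 1845, 2010, 1657, 1432, 1990, 1761, 1880)

# The terminal digit of a whole number is its remainder after division by 10.
terminal_digit <- factor(toy_weights %% 10, levels = 0:9)

# Counts per digit; with precise measurements these should be roughly even.
digit_counts <- table(terminal_digit)
digit_counts
```

Here the digit 0 dominates, which is the kind of pattern that rounding produces. Passing `digit_counts` to `barplot()` would draw a simple bar chart of the counts.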
Item 2.1: Excessively homogeneous distribution of binary baseline variables, i.e. loss of independence between consecutive variables
In RCTs we expect binary baseline data to occur independently of previous values (i.e., randomly). This check examines whether baseline values occur in a random manner based on row order, by comparing the observed number of adjacent rows that share the same value with the number expected by chance using a chi-squared test. A statistically significant result (p < 0.05) may indicate an integrity issue. Note: if rows are not sorted chronologically by randomisation date and time, this test may be invalid.
item_2_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "2.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_2_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Consecutive Baseline Binary | Pass | No significant differences using χ² test. |
if(!is.null(result[["detail_tables"]][["2.1"]])) knitr::kable(result[["detail_tables"]][["2.1"]], row.names = FALSE)
| Variable | TotalAdjacentPairs | ObservedConsecutivePairs | ObservedNonConsecutivePairs | ExpectedConsecutivePairs | ExpectedNonConsecutivePairs | PValue | Significant |
|---|---|---|---|---|---|---|---|
| sex | 119 | 54 | 65 | 60 | 59 | 0.5165 | FALSE |
Item 2.2: Excessive imbalances between groups in continuous baseline variables.
This item evaluates the mean and standard deviation of key continuous prognostic factors, split by treatment group.
item_2_2 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "2.2", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_2_2, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Excessive Imbalances (Numeric) | Pass | No significant differences between groups. |
if(!is.null(result[["detail_tables"]][["2.2"]])) knitr::kable(result[["detail_tables"]][["2.2"]], row.names = FALSE)
| Variable | Group 1 MeanSD | Group 2 MeanSD | PValue |
|---|---|---|---|
| mat_age | 29 (7) | 30 (7) | 0.2887585 |
| GA_weeks | NA | NA | 0.2230359 |
| 28 | 3 (6.0%) | 6 (8.6%) | NA |
| 29 | 3 (6.0%) | 3 (4.3%) | NA |
| 30 | 3 (6.0%) | 10 (14%) | NA |
| 31 | 6 (12%) | 10 (14%) | NA |
| 32 | 19 (38%) | 13 (19%) | NA |
| 33 | 16 (32%) | 28 (40%) | NA |
| birthweight | 1,835 (421) | 1,757 (361) | 0.3928871 |
Item 2.3: Excessive imbalances in baseline categorical variables between groups.
This item assesses whether counts of baseline categorical variables are significantly different (p<0.05) between groups.
item_2_3 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "2.3", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_2_3, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Excessive Imbalances (Categorical) | Pass | No significant differences between groups. |
if(!is.null(result[["detail_tables"]][["2.3"]])) knitr::kable(result[["detail_tables"]][["2.3"]], row.names = FALSE)
| VariableOrLevel | Group 1 | Group 2 | PValue |
|---|---|---|---|
| sex | NA | NA | 0.6655309 |
| 1 | 27 (54%) | 35 (50%) | NA |
| 2 | 23 (46%) | 35 (50%) | NA |
Item 2.4: Significant difference in variance of continuous baseline variables between groups.
This item uses Levene’s test, which checks whether there is a significant difference in variability between groups.
item_2_4 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "2.4", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_2_4, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Differential Variability | Pass | No significant differences using Levene test. |
if(!is.null(result[["detail_tables"]][["2.4"]])) knitr::kable(result[["detail_tables"]][["2.4"]], row.names = FALSE)
| Variable | DF1 | DF2 | FStatistic | PValue | Significant |
|---|---|---|---|---|---|
| mat_age | 1 | 118 | 0.05673 | 0.8122 | FALSE |
| GA_weeks | 1 | 118 | 2.45300 | 0.1200 | FALSE |
| birthweight | 1 | 118 | 0.34280 | 0.5593 | FALSE |
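Levene's test can be written out in a few lines of base R: take the absolute deviation of each value from its group centre, then run a one-way ANOVA on those deviations. The sketch below centres on the group median (the Brown-Forsythe variant); which centring the package uses is an assumption here, and the data are made up.

```r
# Made-up data: group 2 is far more variable than group 1.
toy <- data.frame(
  group = rep(c(1, 2), each = 5),
  value = c(10, 12, 11, 13, 9, 20, 5, 30, 1, 14)
)

# Absolute deviation of each value from its own group's median.
group_median <- ave(toy$value, toy$group, FUN = median)
abs_dev <- abs(toy$value - group_median)

# One-way ANOVA on the deviations: a large F suggests unequal variability.
levene_fit <- anova(lm(abs_dev ~ factor(toy$group)))
levene_fit
```

With two groups the first degrees of freedom value is 1 and the second is the number of participants minus 2, which is why the table above reports 1 and 118 for 120 participants.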
Item 3.1: No association between variables known to be highly correlated.
This item plots the correlation between selected continuous variables and calculates the Pearson correlation coefficient (R) and associated p value. Check whether expected correlations are present.
item_3_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "3.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_3_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Unexpectedly Uncorrelated | Potential integrity issue | GA_weeks, birthweight |
correlation_plots <- setdiff(names(result[["images"]]), c("Terminal Digits", "Cumulative Allocation", "Days"))
for(plot_name in correlation_plots) {
print(result[["images"]][[plot_name]])
}
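The coefficient and p value reported for this item can be understood through base R's `cor.test()`. The sketch below uses made-up gestational ages and birthweights; whether the package calls `cor.test()` in exactly this way is an assumption.

```r
# Made-up gestational ages (weeks) and birthweights (grams).
toy_ga <- c(28, 29, 30, 31, 32, 33, 28, 30, 32, 33)
toy_bw <- c(1100, 1250, 1400, 1600, 1800, 2000, 1050, 1450, 1750, 2050)

# Pearson correlation with its test of the null hypothesis R = 0.
pearson <- cor.test(toy_ga, toy_bw, method = "pearson")
pearson$estimate  # correlation coefficient R
pearson$p.value   # associated p value
```

For a pair of variables that should be strongly related, a coefficient near zero is the signal of concern, not a large one.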
Item 4.1: Individual enrolment dates do not fit within study start and end dates.
This item examines whether randomisation dates for each individual fall within the enrolment period.
item_4_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "4.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_4_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Implausible Randomisation Date | Potential integrity issue | Participants 38, 49 |
item_4_1_dates <- result[["detail_tables"]][["4.1"]]
if(!is.null(item_4_1_dates)) {
knitr::kable(item_4_1_dates, row.names = FALSE)
}
| Study Start Date | Minimum Randomisation Date | Study End Date | Maximum Randomisation Date |
|---|---|---|---|
| 2019-03-02 | 2019-03-02 | 2020-08-09 | 2020-08-20 |
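The comparison behind this item is a simple range check on dates. The following sketch uses made-up participants together with the study start and end dates shown in the table above; the variable names are illustrative only.

```r
# Made-up randomisation dates for four participants.
toy_dates <- data.frame(
  id = 1:4,
  rand_date = as.Date(c("2019-03-02", "2019-12-24", "2020-08-20", "2019-02-27"))
)
enrol_start <- as.Date("2019-03-02")
enrol_end <- as.Date("2020-08-09")

# A date is implausible if it falls before the start or after the end.
outside_window <- toy_dates$rand_date < enrol_start |
  toy_dates$rand_date > enrol_end
flagged_ids <- toy_dates$id[outside_window]
flagged_ids  # participants 3 and 4 in this toy example
```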
Item 4.2: Dates (or visits) are not in logical order.
This item requires study-specific repeated-visit or event-date variables, and flags cases such as a follow-up date occurring before enrolment.
item_4_2 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "4.2", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_4_2, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Logical Date Order | Skipped | This item needs to be checked mannually. Study-specific repeated visit or event-date variables are required. |
Item 5.1: Non-random allocation patterns: plot.
The plot below shows the cumulative number of allocated participants to each treatment arm by date of randomisation. We expect the cumulative number of randomised participants in each group to be similar if 1:1 allocation is used. Assess whether cumulative lines for treatment groups deviate from each other drastically. Note: the graphs will only appear when the date of randomisation is provided.
item_5_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "5.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_5_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Cumulative Allocation | Displayed | Cumulative allocation plot generated. |
if("Cumulative Allocation" %in% names(result[["images"]])) {
result[["images"]][["Cumulative Allocation"]]
}
Item 5.2: Non-random allocation patterns: statistical test
This item evaluates randomness of allocation using two approaches: a runs test and a chi-squared test comparing observed adjacent intervention runs with the expected number under random allocation. A statistically significant result (p<0.05) from either test may be indicative of an issue with randomisation.
item_5_2 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "5.2", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_5_2, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Allocation Pattern | Pass | Intervention treatment_cat has no statistically significant result using the adjacent-pairs chi-squared test or runs test. Allocation order was evaluated after sorting by randomisation date. |
item_5_2_test <- result[["detail_tables"]][["5.2"]]
if(!is.null(item_5_2_test)) {
knitr::kable(item_5_2_test, row.names = FALSE)
}
| Test | Variable | Statistic | PValue | Significant | Details | OrderBasis |
|---|---|---|---|---|---|---|
| Adjacent-pairs chi-squared test | treatment_cat | NA | 1.0000 | FALSE | Observed consecutive pairs: 61; expected consecutive pairs: 61; total adjacent pairs: 119 | Sorted by randomisation date |
| Runs test | treatment_cat | -0.06288 | 0.9499 | FALSE | Observed runs: 59; expected runs: 59.33 | Sorted by randomisation date |
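The runs test row can be followed by hand using the standard Wald-Wolfowitz formulas. The sketch below is a base R illustration with a hypothetical helper, `runs_test()`, and is not the package's code. For group sizes of 50 and 70, the expected-runs formula gives 2 × 50 × 70 / 120 + 1 = 59.33, which agrees with the value in the table above.

```r
# Hypothetical helper: Wald-Wolfowitz runs test for a two-group sequence.
# The sequence should already be sorted by randomisation date.
runs_test <- function(allocation) {
  n <- length(allocation)
  group_sizes <- table(allocation)
  stopifnot(length(group_sizes) == 2)
  n1 <- as.numeric(group_sizes[1])
  n2 <- as.numeric(group_sizes[2])

  # A new run starts whenever the allocation differs from the previous row.
  observed <- 1 + sum(allocation[-1] != allocation[-n])

  # Mean and variance of the number of runs under random ordering.
  expected <- 2 * n1 * n2 / n + 1
  variance <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  z <- (observed - expected) / sqrt(variance)

  list(observed = observed, expected = expected, z = z,
       p_value = 2 * pnorm(-abs(z)))
}

# Perfectly alternating allocation: far more runs than chance would produce.
alternating <- runs_test(rep(c(1, 2), times = 10))
alternating
```

Both too many runs (strict alternation) and too few runs (long blocks of one arm) give a small p value, since the test is two-sided.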
Item 5.3: Unexpected imbalance in randomisation day of week.
The table below reports two chi-squared tests: one assessing whether randomisation is distributed evenly across weekdays overall, and one assessing whether randomisation day differs by intervention group. The graph below shows the number of participants randomised on each day of the week by group. We expect numbers to be balanced between groups for each weekday, and fewer enrolments on the weekend for non-urgent interventions. Note: the graph will only appear when the date of randomisation is provided.
item_5_3 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "5.3", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_5_3, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Allocation | Pass | No significant difference of allocations on days using Simulated chi-squared test used because expected counts were sparse (10000 replicates) . |
item_5_3_test <- result[["detail_tables"]][["5.3"]]
if(!is.null(item_5_3_test)) {
knitr::kable(item_5_3_test, row.names = FALSE)
}
| Test | Method | Statistic | DF | PValue | Significant |
|---|---|---|---|---|---|
| Chi-squared goodness-of-fit test of randomisation day overall | Pearson’s chi-squared test | 8.100 | 6 | 0.2309 | FALSE |
| Chi-squared test of randomisation day by intervention group | Simulated chi-squared test used because expected counts were sparse (10000 replicates) | 9.118 | NA | 0.1727 | FALSE |
if("Days" %in% names(result[["images"]])) {
result[["images"]][["Days"]]
}
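Both tests in the table above can be illustrated with base R's `chisq.test()`. The counts below are made up, and the exact arguments the package passes are an assumption. The second call uses `simulate.p.value = TRUE`, for which R reports no degrees of freedom, consistent with the NA in the DF column.

```r
# Made-up counts of participants randomised on each day of the week.
day_counts <- c(Monday = 20, Tuesday = 22, Wednesday = 18, Thursday = 21,
                Friday = 19, Saturday = 11, Sunday = 9)

# Goodness of fit against an even spread over seven days (df = 7 - 1 = 6).
overall <- chisq.test(day_counts)
overall$statistic
overall$parameter
overall$p.value

# Made-up day-by-group table with some small cells, so the p value is
# obtained by Monte Carlo simulation instead of the chi-squared approximation.
by_group <- rbind(group_1 = c(9, 10, 8, 10, 9, 2, 2),
                  group_2 = c(11, 12, 10, 11, 10, 9, 7))
set.seed(1)  # makes the simulated p value reproducible
simulated <- chisq.test(by_group, simulate.p.value = TRUE, B = 10000)
simulated$p.value
```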
Item 6.1: Inconsistent or illogical values across variables within individual participants.
Derive logic rules for each variable collected, e.g. date of hospital discharge = date of admission + days in hospital; if number of transfusions ≥ 1, then any transfusion = yes. Incorporate these rules into the package so that any breaches are displayed in the output.
item_6_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "6.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_6_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Implausible Values | Pass | No values of mat_age less than 10 |
| Implausible Values | Pass | No values of mat_age greater than 50 |
| Implausible Values | Pass | No values of GA_weeks less than 22 |
| Implausible Values | Pass | No values of GA_weeks greater than 37 |
| Implausible Values | Potential integrity issue | 5 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 8 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 11 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 20 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 21 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 28 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 29 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 34 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 35 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 42 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 45 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 46 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 49 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 54 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 60 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 61 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 63 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 72 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 79 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 83 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 86 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 111 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 114 randomisation date on Sunday |
| Implausible Values | Potential integrity issue | 118 randomisation date on Saturday |
| Implausible Values | Potential integrity issue | 119 randomisation date on Saturday |
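Logic rules of this kind translate directly into logical vectors in R. The sketch below applies three rules to made-up records: the transfusion rule from the text, a range rule for maternal age, and a weekend rule for randomisation dates. `n_transfusions` and `any_transfusion` are hypothetical column names, and the weekend rule uses the weekday number from `as.POSIXlt()`, which does not depend on the session locale. This is an illustration, not the package's rule engine.

```r
# Made-up records; 2019-03-02 was a Saturday.
toy_records <- data.frame(
  id = 1:4,
  n_transfusions = c(0, 2, 1, 0),
  any_transfusion = c("no", "yes", "no", "no"),
  mat_age = c(30, 8, 41, 55),
  rand_date = as.Date(c("2019-03-04", "2019-03-02", "2019-03-05", "2019-03-06"))
)

# Rule 1: one or more transfusions implies any_transfusion = "yes".
rule_transfusion <- toy_records$n_transfusions >= 1 &
  toy_records$any_transfusion != "yes"

# Rule 2: maternal age below 10 or above 50 is unexpected.
rule_age <- toy_records$mat_age < 10 | toy_records$mat_age > 50

# Rule 3: weekend randomisation (wday: 0 = Sunday, 6 = Saturday).
rule_weekend <- as.POSIXlt(toy_records$rand_date)$wday %in% c(0, 6)

breaches <- data.frame(
  id = toy_records$id,
  transfusion_mismatch = rule_transfusion,
  age_out_of_range = rule_age,
  weekend_randomisation = rule_weekend
)
any_breach <- rule_transfusion | rule_age | rule_weekend
breaches[any_breach, ]
```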
Item 7.1: IPD do not correspond to publications or reports.
The table below shows summary statistics for each variable provided in the IPD dataset, e.g. mean, median, range, etc. Manually cross‐check these against any published trial reports, including appendices and supplements. Record any inconsistencies identified, for example, discrepancies in summary variable values between the IPD and the publication, inclusion of participants in the IPD who do not meet the eligibility criteria in the publication, or published variables that are missing from the IPD dataset. If data are provided for excluded participants, check whether the reasons for exclusion are consistent with the publication.
item_7_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "7.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_7_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| External Consistency | Displayed | Clinical summary table generated for comparison with publications or reports. |
if(!is.null(result[["summary_table"]])) {
result[["summary_table"]]
}
| Characteristic | 1 (N = 50)¹ | 2 (N = 70)¹ |
|---|---|---|
| sex | ||
| 1 | 27 (54%) | 35 (50%) |
| 2 | 23 (46%) | 35 (50%) |
| mat_age | 29 (7) | 30 (7) |
| GA_weeks | ||
| 28 | 3 (6.0%) | 6 (8.6%) |
| 29 | 3 (6.0%) | 3 (4.3%) |
| 30 | 3 (6.0%) | 10 (14%) |
| 31 | 6 (12%) | 10 (14%) |
| 32 | 19 (38%) | 13 (19%) |
| 33 | 16 (32%) | 28 (40%) |
| birthweight | 1,835 (421) | 1,757 (361) |
| IVH | ||
| 0 | 35 (73%) | 52 (74%) |
| 1 | 13 (27%) | 18 (26%) |
| Unknown | 2 | 0 |
| NEC | ||
| 0 | 46 (92%) | 67 (96%) |
| 1 | 4 (8.0%) | 3 (4.3%) |
| CLD | ||
| 0 | 40 (80%) | 45 (64%) |
| 1 | 1 (2.0%) | 16 (23%) |
| 2 | 9 (18%) | 7 (10%) |
| 3 | 0 (0%) | 2 (2.9%) |
| hospital_days | 30 (20) | 36 (24) |
| inf_death | ||
| 0 | 49 (98%) | 65 (94%) |
| 1 | 1 (2.0%) | 4 (5.8%) |
| Unknown | 0 | 1 |
| ¹ n (%); Mean (SD) | ||
Item 8.1: Too few missing data or missing data are overly similar between groups.
The table below shows missingness by intervention group for outcome variables, including the percentage missing in each group.
item_8_1 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "8.1", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_8_1, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Missing Values by Intervention | Pass | No significant difference of missing values between allocations using χ² test. |
if(!is.null(result[["detail_tables"]][["8.1"]])) knitr::kable(result[["detail_tables"]][["8.1"]], row.names = FALSE)
| Variable | Missing Count 1 | Total 1 | Missing Percent 1 | Missing Count 2 | Total 2 | Missing Percent 2 | PValue | Significant |
|---|---|---|---|---|---|---|---|---|
| IVH | 2 | 50 | 4 | 0 | 70 | 0.0 | 0.3349 | FALSE |
| NEC | 0 | 50 | 0 | 0 | 70 | 0.0 | NA | NA |
| CLD | 0 | 50 | 0 | 0 | 70 | 0.0 | NA | NA |
| hospital_days | 0 | 50 | 0 | 0 | 70 | 0.0 | NA | NA |
| inf_death | 0 | 50 | 0 | 1 | 70 | 1.4 | 1.0000 | FALSE |
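The count and percentage columns in this table can be derived with `tapply()`. The sketch below uses a made-up outcome with missing values in one group only; it covers the descriptive columns and leaves out the significance test.

```r
# Made-up outcome data: 5 participants in group 1 and 7 in group 2.
toy_outcomes <- data.frame(
  group = rep(c(1, 2), times = c(5, 7)),
  IVH = c(NA, 0, 1, NA, 0, 0, 1, 0, 0, 1, 0, 0)
)

# Missing count, total and percentage per group.
missing_count <- tapply(is.na(toy_outcomes$IVH), toy_outcomes$group, sum)
group_total <- tapply(toy_outcomes$IVH, toy_outcomes$group, length)
missing_percent <- round(100 * missing_count / group_total, 1)

missing_summary <- data.frame(
  group = names(group_total),
  missing_count = as.vector(missing_count),
  group_total = as.vector(group_total),
  missing_percent = as.vector(missing_percent)
)
missing_summary
```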
Item 8.2: Implausible event rates: outcomes and demographics.
The table below shows events and totals for dichotomous baseline variables and dichotomous common and rare outcomes, split by intervention group.
item_8_2 <- result[["check_table"]][result[["check_table"]][["ItemNumber"]] == "8.2", c("Item description", "Status", "Details"), drop = FALSE]
knitr::kable(item_8_2, row.names = FALSE)
| Item description | Status | Details |
|---|---|---|
| Implausible Event Rates | Displayed | Events and totals table generated for dichotomous baseline and outcome variables by intervention. |
if(!is.null(result[["detail_tables"]][["8.2"]])) knitr::kable(result[["detail_tables"]][["8.2"]], row.names = FALSE)
| Variable | EventLevel | Events 1 | Total 1 | Percent 1 | Events 2 | Total 2 | Percent 2 |
|---|---|---|---|---|---|---|---|
| sex | 2 | 23 | 50 | 46.0 | 35 | 70 | 50.0 |
| IVH | 1 | 13 | 48 | 27.1 | 18 | 70 | 25.7 |
| NEC | 1 | 4 | 50 | 8.0 | 3 | 70 | 4.3 |
| inf_death | 1 | 1 | 50 | 2.0 | 4 | 69 | 5.8 |
This vignette was executed on the following computing system:
sessionInfo()
## R version 4.5.0 (2025-04-11)
## Platform: aarch64-apple-darwin20
## Running under: macOS 26.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: Australia/Sydney
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] readxl_1.4.5 integrity_1.0.1 testthat_3.3.2
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 xfun_0.56 bslib_0.10.0 ggplot2_4.0.2
## [5] rstatix_0.7.3 lattice_0.22-9 vctrs_0.7.1 tools_4.5.0
## [9] generics_0.1.4 tibble_3.3.1 pkgconfig_2.0.3 Matrix_1.7-3
## [13] RColorBrewer_1.1-3 S7_0.2.1 desc_1.4.3 gt_1.3.0
## [17] lifecycle_1.0.5 compiler_4.5.0 farver_2.1.2 stringr_1.6.0
## [21] brio_1.1.5 janitor_2.2.1 carData_3.0-6 snakecase_0.11.1
## [25] litedown_0.9 htmltools_0.5.9 sass_0.4.10 yaml_2.3.12
## [29] Formula_1.2-5 pillar_1.11.1 car_3.1-5 ggpubr_0.6.3
## [33] jquerylib_0.1.4 tidyr_1.3.2 cachem_1.1.0 abind_1.4-8
## [37] nlme_3.1-168 commonmark_2.0.0 tidyselect_1.2.1 digest_0.6.39
## [41] stringi_1.8.7 gtsummary_2.5.0 dplyr_1.2.0 purrr_1.2.1
## [45] labeling_0.4.3 splines_4.5.0 rprojroot_2.1.1 fastmap_1.2.0
## [49] grid_4.5.0 cli_3.6.6 magrittr_2.0.4 cards_0.7.1
## [53] dichromat_2.0-0.1 pkgbuild_1.4.8 broom_1.0.12 withr_3.0.2
## [57] scales_1.4.0 backports_1.5.0 cardx_0.3.2 lubridate_1.9.5
## [61] timechange_0.4.0 rmarkdown_2.30 otel_0.2.0 ggsignif_0.6.4
## [65] cellranger_1.1.0 evaluate_1.0.5 knitr_1.51 markdown_2.0
## [69] mgcv_1.9-4 rlang_1.2.0 glue_1.8.0 xml2_1.5.2
## [73] pkgload_1.5.2 rstudioapi_0.18.0 jsonlite_2.0.0 R6_2.6.1
## [77] fs_2.1.0