In this example we’re going to summarise cohort diagnostics results for cohorts of individuals with an ankle sprain, ankle fracture, forearm fracture, or hip fracture, using the Eunomia synthetic data.
Again, we’ll begin by creating our study cohorts.

```r
library(CDMConnector)
library(CohortConstructor)
library(CodelistGenerator)
library(CohortCharacteristics)
library(CohortSurvival)
library(PhenotypeR)
library(dplyr)
library(ggplot2)

# Connect to the Eunomia synthetic dataset through duckdb
con <- DBI::dbConnect(duckdb::duckdb(),
                      CDMConnector::eunomiaDir("synpuf-1k", "5.3"))

# Create a cdm reference from the connection
cdm <- CDMConnector::cdmFromCon(con = con,
                                cdmName = "Eunomia Synpuf",
                                cdmSchema = "main",
                                writeSchema = "main",
                                achillesSchema = "main")

# Build one cohort per injury, each defined by a single concept ID
cdm$injuries <- conceptCohort(cdm = cdm,
  conceptSet = list(
    "ankle_sprain" = 81151,
    "ankle_fracture" = 4059173,
    "forearm_fracture" = 4278672,
    "hip_fracture" = 4230399
  ),
  name = "injuries")
```

We can now run cohort diagnostics analyses for each of our cohorts with cohortDiagnostics().
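
The call itself might look like the sketch below. It assumes that the cdm$injuries cohort table created above is passed as the first argument and that survival = TRUE is accepted as described in this section. The object name cohort_diag is our own choice for illustration.

```r
library(PhenotypeR)

# Assumes `cdm$injuries` was created with conceptCohort() as shown above.
# `cohort_diag` is a name chosen for illustration only.
cohort_diag <- cohortDiagnostics(
  cdm$injuries,    # cohort table, passed positionally (assumed first argument)
  survival = TRUE  # also summarise survival until death
)
```

Setting survival = TRUE only makes sense when a death table is present in the cdm.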
Cohort diagnostics builds on the CohortCharacteristics and CohortSurvival R packages to run a set of analyses on our cohorts. When survival = TRUE, the cohort survival analysis summarises survival until the event of death (if a death table is present in the cdm) using the CohortSurvival package.

The cohort characteristics, cohort age distribution, cohort large scale characteristics, and cohort survival analyses will also be performed (by default) in a matched cohort. The matched cohort is created based on year of birth and sex (see the matchCohorts() function in the CohortConstructor package). This lets us compare the results in our cohorts with those obtained in the matched cohort, which represents the general population. Note that the analyses are performed in: (1) the original cohort, (2) individuals in the original cohort who have a match (named the sampled cohort), and (3) the matched cohort.

As the matching process can be computationally expensive, especially
when the cohorts are very large, we can restrict the matching analysis to a
subset of participants from the original cohort using the
matchedSample parameter. Alternatively, if we do not want
to create the matched cohorts, we can set
matchedSample = 0.
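
The two options can be sketched as follows. This assumes the cdm$injuries cohort table from above and that matchedSample takes a number of participants; the value 100 and the object names diag_subset and diag_unmatched are arbitrary choices for illustration.

```r
library(PhenotypeR)

# Restrict the matching analysis to a sample of participants
# (100 is an arbitrary value chosen for this sketch).
diag_subset <- cohortDiagnostics(cdm$injuries, matchedSample = 100)

# Skip the creation of matched cohorts altogether.
diag_unmatched <- cohortDiagnostics(cdm$injuries, matchedSample = 0)
```
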
The output of cohortDiagnostics() will be a summarised
result table.
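
Before plotting anything, it can help to look at what the result contains. The sketch below assumes the output was stored in an object called cohort_diag (a hypothetical name) and that the omopgenerics package, which defines the summarised result format, is installed.

```r
library(dplyr)

# Assumes `cohort_diag` holds the output of cohortDiagnostics().
# A summarised result is a tibble, so the usual dplyr tools apply.
glimpse(cohort_diag)

# The settings table records which analysis and which package
# produced each set of results.
omopgenerics::settings(cohort_diag) |>
  distinct(result_type, package_name)
```
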
We will now use different functions to visualise the results generated by cohortDiagnostics(). Note that these functions come from the CohortCharacteristics and CohortSurvival R packages.

### Cohort counts