Background on Intersectionality
Disaggregation and disproportionate impact (DI) analysis allows
analysts to identify student groups in need of support, helping the
institution prioritize resources in order to close equity gaps. As can
be seen in the Scaling DI vignette, one could repeat DI
calculations over various success variables, group (disaggregation)
variables, and cohort variables using the di_iterate
function from the DisImpact
package. For example, one could
choose to repeat the disaggregation by multiple demographic variables
(e.g., ethnicity, gender, low-income status, foster youth status,
undocumented status, and LGBTQIA+ status) and, for each
disaggregation, identify the groups that are disproportionately impacted
on each outcome.
Conducting a DI analysis as described is a good first step in understanding student needs. However, it ignores the concept of intersectionality: considering each demographic variable individually leaves out the intersections of identity, where the level of disproportionate impact may be compounded. For example, “men of color” and “African American LGBTQIA+” communities may be even more disproportionately impacted on outcomes than what is reported when each variable (ethnicity, gender, and LGBTQIA+ status) is disaggregated on its own.
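To make the idea of a compounded gap concrete, here is a minimal sketch in base R. The data frame toy, its columns, and the success rates are hypothetical values chosen only for illustration; they are not drawn from the student_equity data set and do not use any DisImpact function.

```r
# Hypothetical toy data: 400 students, two demographic variables, one outcome
toy <- data.frame(
  ethnicity = rep(c('A', 'A', 'B', 'B'), each = 100),
  gender    = rep(c('F', 'M', 'F', 'M'), each = 100),
  success   = c(rep(1, 90), rep(0, 10),   # A, F: 90% succeed
                rep(1, 80), rep(0, 20),   # A, M: 80% succeed
                rep(1, 80), rep(0, 20),   # B, F: 80% succeed
                rep(1, 50), rep(0, 50))   # B, M: 50% succeed
)

# Success rate when each variable is disaggregated on its own
rate_ethnicity <- tapply(toy$success, toy$ethnicity, mean)
rate_gender    <- tapply(toy$success, toy$gender, mean)

# Success rate by the intersection of the two variables
toy$intersection  <- paste0(toy$ethnicity, ', ', toy$gender)
rate_intersection <- tapply(toy$success, toy$intersection, mean)

rate_ethnicity     # A = 0.85, B = 0.65
rate_gender        # F = 0.85, M = 0.65
rate_intersection  # 'B, M' = 0.50, lower than either B or M alone
```

In this made-up example, the single-variable views each report a rate of 0.65 for the lower group, while the intersection shows a rate of 0.50 for one subgroup.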
This vignette describes how one might account for intersectionality
using the DisImpact
package.
Intersectionality Using DisImpact
First, let’s conduct a DI analysis on the student_equity
data set using a few demographic variables, as described in the Scaling
DI vignette.
# Load some necessary packages
library(dplyr)
library(stringr)
library(ggplot2)
library(scales)
library(forcats)
library(DisImpact)
# Load student equity data set
data(student_equity)
# Calculate DI over several scenarios
df_di_summary <- di_iterate(data=student_equity
  , success_vars=c('Math', 'English', 'Transfer')
  , group_vars=c('Ethnicity', 'Gender')
  , cohort_vars=c('Cohort_Math', 'Cohort_English', 'Cohort')
  , scenario_repeat_by_vars=c('Ed_Goal', 'College_Status')
)
Incorporating intersectionality is quite straightforward
using the DisImpact
package. First, create a new variable
that captures the intersection of interest. Then pass it like any other
demographic variable to the group_vars
argument of
di_iterate. The following code illustrates the intersection
of ethnicity and gender.
# Create new variable
student_equity_intersection <- student_equity %>%
  mutate(`Ethnicity + Gender`=paste0(Ethnicity, ', ', Gender))
# Check
table(student_equity_intersection$`Ethnicity + Gender`, useNA='ifany')
##
##           Asian, Female             Asian, Male            Asian, Other
##                    2936                    2950                     114
##           Black, Female             Black, Male            Black, Other
##                    1021                     953                      26
##        Hispanic, Female          Hispanic, Male         Hispanic, Other
##                    2002                    1920                      78
## Multi-Ethnicity, Female   Multi-Ethnicity, Male  Multi-Ethnicity, Other
##                     509                     467                      24
## Native American, Female   Native American, Male  Native American, Other
##                     105                      91                       4
##           White, Female             White, Male            White, Other
##                    3285                    3385                     130
# Run DI, then select rows of interest (for Ethnicity + Gender, remove the Other gender)
df_di_summary_intersection <- di_iterate(data=student_equity_intersection # Specify new data set
  , success_vars=c('Math', 'English', 'Transfer')
  , group_vars=c('Ethnicity', 'Gender', 'Ethnicity + Gender') # Add new column name
  , cohort_vars=c('Cohort_Math', 'Cohort_English', 'Cohort')
  , scenario_repeat_by_vars=c('Ed_Goal', 'College_Status')
) %>%
  filter(!(disaggregation=='Ethnicity + Gender') | !str_detect(group, ', Other')) # Remove Ethnicity + Gender groups that correspond to the Other gender
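The row filter above can be read as "keep a row unless it is both an Ethnicity + Gender row and an Other-gender group." The following base R sketch checks that reading on a small hypothetical data frame (toy_summary is made up for illustration and only mimics the disaggregation and group columns); grepl() with fixed=TRUE stands in for str_detect() so that the example needs no extra packages.

```r
# Hypothetical rows mimicking the 'disaggregation' and 'group' columns
toy_summary <- data.frame(
  disaggregation = c('Gender', 'Gender', 'Ethnicity + Gender', 'Ethnicity + Gender'),
  group          = c('Female', 'Other', 'Asian, Female', 'Asian, Other'),
  stringsAsFactors = FALSE
)

# The two conditions used in the filter
is_intersection <- toy_summary$disaggregation == 'Ethnicity + Gender'
is_other        <- grepl(', Other', toy_summary$group, fixed = TRUE)

# Same logic as the filter() call: !A | !B, which is equivalent to !(A & B)
keep <- !is_intersection | !is_other
toy_summary[keep, ]
```

Only the 'Asian, Other' row is dropped. The plain 'Other' row under the Gender disaggregation is retained because its group label does not contain ', Other'.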
Visualizing in Dashboard Platform
Once a DI summary data set with the intersections of interest is available, it can be used for dashboard development, as described in the Scaling DI vignette.
# Disaggregation: Ethnicity
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame
## cohort group n pct di_indicator_ppg
## 1 2017 Asian 1406 0.8968706 0
## 2 2017 Black 421 0.7862233 1
## 3 2017 Hispanic 815 0.7325153 1
## 4 2017 Multi-Ethnicity 211 0.8293839 0
## 5 2017 Native American 45 0.9333333 0
## 6 2017 White 1500 0.8773333 0
## 7 2018 Asian 2212 0.9235986 0
## 8 2018 Black 684 0.7441520 1
## 9 2018 Hispanic 1386 0.7366522 1
## 10 2018 Multi-Ethnicity 369 0.7940379 1
## 11 2018 Native American 68 0.8088235 0
## 12 2018 White 2576 0.8819876 0
## 13 2019 Asian 1429 0.9083275 0
## 14 2019 Black 411 0.7834550 1
## 15 2019 Hispanic 786 0.7404580 1
## 16 2019 Multi-Ethnicity 225 0.8000000 0
## 17 2019 Native American 47 0.8297872 0
## 18 2019 White 1558 0.8896021 0
## 19 2020 Asian 573 0.9301920 0
## 20 2020 Black 180 0.7333333 1
## 21 2020 Hispanic 304 0.7171053 1
## 22 2020 Multi-Ethnicity 99 0.7575758 0
## 23 2020 Native American 14 0.6428571 0
## 24 2020 White 610 0.8819672 0
## di_indicator_prop_index di_indicator_80_index
## 1 0 0
## 2 0 0
## 3 0 1
## 4 0 0
## 5 0 0
## 6 0 0
## 7 0 0
## 8 0 0
## 9 0 1
## 10 0 0
## 11 0 0
## 12 0 0
## 13 0 0
## 14 0 0
## 15 0 0
## 16 0 0
## 17 0 0
## 18 0 0
## 19 0 0
## 20 0 1
## 21 0 1
## 22 0 0
## 23 1 1
## 24 0 0
# Disaggregation: Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame
# Disaggregation: Ethnicity + Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity + Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  as.data.frame
## cohort group n pct di_indicator_ppg
## 1 2017 Asian, Female 660 0.9075758 0
## 2 2017 Asian, Male 721 0.8862691 0
## 3 2017 Black, Female 212 0.7594340 1
## 4 2017 Black, Male 202 0.8168317 0
## 5 2017 Hispanic, Female 413 0.7312349 1
## 6 2017 Hispanic, Male 385 0.7272727 1
## 7 2017 Multi-Ethnicity, Female 95 0.8526316 0
## 8 2017 Multi-Ethnicity, Male 112 0.8125000 0
## 9 2017 Native American, Female 28 0.8928571 0
## 10 2017 Native American, Male 14 1.0000000 0
## 11 2017 White, Female 739 0.8646820 0
## 12 2017 White, Male 734 0.8869210 0
## 13 2018 Asian, Female 1111 0.9225923 0
## 14 2018 Asian, Male 1056 0.9270833 0
## 15 2018 Black, Female 340 0.7500000 1
## 16 2018 Black, Male 334 0.7395210 1
## 17 2018 Hispanic, Female 681 0.7577093 1
## 18 2018 Hispanic, Male 673 0.7161961 1
## 19 2018 Multi-Ethnicity, Female 195 0.8358974 0
## 20 2018 Multi-Ethnicity, Male 163 0.7423313 1
## 21 2018 Native American, Female 33 0.9090909 0
## 22 2018 Native American, Male 34 0.7058824 0
## 23 2018 White, Female 1234 0.8800648 0
## 24 2018 White, Male 1285 0.8840467 0
## 25 2019 Asian, Female 704 0.9232955 0
## 26 2019 Asian, Male 698 0.8954155 0
## 27 2019 Black, Female 213 0.7934272 0
## 28 2019 Black, Male 194 0.7731959 1
## 29 2019 Hispanic, Female 386 0.7616580 1
## 30 2019 Hispanic, Male 390 0.7230769 1
## 31 2019 Multi-Ethnicity, Female 111 0.8018018 0
## 32 2019 Multi-Ethnicity, Male 109 0.7889908 0
## 33 2019 Native American, Female 22 0.8181818 0
## 34 2019 Native American, Male 25 0.8400000 0
## 35 2019 White, Female 753 0.9003984 0
## 36 2019 White, Male 782 0.8785166 0
## 37 2020 Asian, Female 283 0.9363958 0
## 38 2020 Asian, Male 280 0.9214286 0
## 39 2020 Black, Female 90 0.6222222 1
## 40 2020 Black, Male 87 0.8390805 0
## 41 2020 Hispanic, Female 147 0.7278912 1
## 42 2020 Hispanic, Male 149 0.7248322 1
## 43 2020 Multi-Ethnicity, Female 59 0.7966102 0
## 44 2020 Multi-Ethnicity, Male 39 0.6923077 0
## 45 2020 Native American, Female 8 0.6250000 0
## 46 2020 Native American, Male 6 0.6666667 0
## 47 2020 White, Female 295 0.8677966 0
## 48 2020 White, Male 304 0.8947368 0
## di_indicator_prop_index di_indicator_80_index
## 1 0 0
## 2 0 0
## 3 0 1
## 4 0 0
## 5 0 1
## 6 0 1
## 7 0 0
## 8 0 0
## 9 0 0
## 10 0 0
## 11 0 0
## 12 0 0
## 13 0 0
## 14 0 0
## 15 0 1
## 16 0 1
## 17 0 1
## 18 0 1
## 19 0 0
## 20 0 1
## 21 0 0
## 22 0 1
## 23 0 0
## 24 0 0
## 25 0 0
## 26 0 0
## 27 0 1
## 28 0 1
## 29 0 1
## 30 0 1
## 31 0 0
## 32 0 1
## 33 0 0
## 34 0 0
## 35 0 0
## 36 0 0
## 37 0 0
## 38 0 0
## 39 1 1
## 40 0 0
## 41 0 1
## 42 0 1
## 43 0 1
## 44 0 1
## 45 1 1
## 46 1 1
## 47 0 0
## 48 0 0
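As a small worked comparison, the sketch below takes three rates from the 2020 Math rows printed above and computes how far each intersection group sits from its parent ethnicity group. This simple difference is only an illustration of what the intersection reveals; it is not the calculation DisImpact uses for di_indicator_ppg, and the variable names are hypothetical.

```r
# Rates copied from the 2020 Math rows of the output above
rate_black_2020 <- 0.7333333   # Disaggregation: Ethnicity, group 'Black'
rate_intersection_2020 <- c(
  'Black, Female' = 0.6222222, # Disaggregation: Ethnicity + Gender
  'Black, Male'   = 0.8390805
)

# Difference (in proportion units) between each intersection group
# and the parent 'Black' group
gap_vs_parent <- rate_intersection_2020 - rate_black_2020
round(gap_vs_parent, 3)
# Black, Female   Black, Male
#        -0.111         0.106
```

The single 'Black' rate of about 73% for 2020 sits between a rate of about 62% for Black female students and about 84% for Black male students, a difference that the Ethnicity disaggregation alone does not show.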
# Disaggregation: Ethnicity
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>%
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#1b9e77', '#d95f02', '#7570b3', '#e7298a', '#66a61e', '#e6ab02'), name='Ethnicity') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Ethnicity'"))
## Warning: Using size for a discrete variable is not advised.
# Disaggregation: Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>%
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#e7298a', '#66a61e', '#e6ab02'), name='Gender') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Gender'"))
## Warning: Using size for a discrete variable is not advised.
# Disaggregation: Ethnicity + Gender
df_di_summary_intersection %>%
  filter(Ed_Goal=='- All', College_Status=='- All', success_variable=='Math', disaggregation=='Ethnicity + Gender') %>%
  select(cohort, group, n, pct, di_indicator_ppg, di_indicator_prop_index, di_indicator_80_index) %>%
  mutate(group=factor(group) %>% fct_reorder(desc(pct))) %>%
  ggplot(data=., mapping=aes(x=factor(cohort), y=pct, group=group, color=group)) +
  geom_point(aes(size=factor(di_indicator_ppg, levels=c(0, 1), labels=c('Not DI', 'DI')))) +
  geom_line() +
  xlab('Cohort') +
  ylab('Rate') +
  theme_bw() +
  scale_color_manual(values=c('#a6cee3', '#1f78b4', '#b2df8a', '#33a02c', '#fb9a99', '#e31a1c', '#fdbf6f', '#ff7f00', '#cab2d6', '#6a3d9a', '#ffff99', '#b15928'), name='Ethnicity + Gender') +
  labs(size='Disproportionate Impact') +
  scale_y_continuous(labels = percent, limits=c(0, 1)) +
  ggtitle('Dashboard drop-down selections:', subtitle=paste0("Ed Goal = '- All' | College Status = '- All' | Outcome = 'Math' | Disaggregation = 'Ethnicity + Gender'"))
## Warning: Using size for a discrete variable is not advised.
Appendix: R and R Package Versions
This vignette was generated using an R session with the following packages. Results may differ slightly when the code is run with different package versions.
sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19044)
##
## Matrix products: default
##
## locale:
## [1] LC_COLLATE=C
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] forcats_0.5.0 scales_1.1.1 ggplot2_3.3.2 stringr_1.4.0
## [5] knitr_1.39 dplyr_1.0.8 DisImpact_0.0.21
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8.3 highr_0.9 pillar_1.7.0 bslib_0.3.1
## [5] compiler_4.0.2 jquerylib_0.1.4 sets_1.0-21 prettydoc_0.4.1
## [9] tools_4.0.2 digest_0.6.25 gtable_0.3.0 jsonlite_1.7.0
## [13] evaluate_0.15 lifecycle_1.0.1 tibble_3.1.6 fstcore_0.9.12
## [17] pkgconfig_2.0.3 rlang_1.0.1 DBI_1.1.0 cli_3.2.0
## [21] parallel_4.0.2 yaml_2.3.5 xfun_0.30 fastmap_1.1.0
## [25] withr_2.5.0 duckdb_0.5.0 generics_0.1.2 vctrs_0.3.8
## [29] sass_0.4.1 grid_4.0.2 tidyselect_1.1.2 data.table_1.14.3
## [33] glue_1.6.1 R6_2.3.0 fansi_1.0.2 rmarkdown_2.14
## [37] farver_2.0.3 tidyr_1.2.0 purrr_0.3.4 blob_1.2.1
## [41] magrittr_2.0.2 htmltools_0.5.2 ellipsis_0.3.2 fst_0.9.8
## [45] assertthat_0.2.1 colorspace_1.4-1 collapse_1.8.8 labeling_0.3
## [49] utf8_1.2.2 stringi_1.4.6 munsell_0.5.0 crayon_1.5.0
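Readers who want to check their own setup against the versions listed above could use something like the following sketch. The helper has_min_version is hypothetical (it is not part of DisImpact); it relies only on requireNamespace() and utils::packageVersion() from base R.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer,
# FALSE if it is older or not installed
has_min_version <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= min_version
}

# Versions taken from the sessionInfo() output above
has_min_version('DisImpact', '0.0.21')
has_min_version('dplyr', '1.0.8')
```

A FALSE result does not necessarily mean the code will fail, only that the installed version is older than the one used to generate this vignette or that the package is missing.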