This vignette demonstrates the causaldef framework on two famous datasets in causal inference:

1. Lalonde’s Job Training Data: validating the deficiency proxy against an experimental benchmark.
2. Right Heart Catheterization (RHC): handling high-dimensional confounding and uncertainty quantification.
library(causaldef)
library(stats)
# Backport of deparse1() so the vignette also runs on R < 4.0.0
if (!exists("deparse1", envir = baseenv())) {
  deparse1 <- function(expr, collapse = " ", width.cutoff = 500L, ...) {
    paste(deparse(expr, width.cutoff, ...), collapse = collapse)
  }
}

The Lalonde dataset allows us to verify our methods because it contains both an experimental control group and observational comparison groups.
We define the “True” experiment (\(E_{target}\)) using the randomized data, and the “Observational” experiment (\(E_{obs}\)) using the CPS controls.
data("nsw_benchmark")
head(nsw_benchmark)
#> treat age education black hispanic married nodegree re74 re75 re78
#> 1 1 37 11 1 0 1 1 0 0 9930.0459
#> 2 1 22 9 0 1 0 1 0 0 3595.8940
#> 3 1 30 12 1 0 0 0 0 0 24909.4492
#> 4 1 27 11 1 0 0 1 0 0 7506.1460
#> 5 1 33 8 1 0 0 1 0 0 289.7899
#> 6 1 22 9 1 0 0 1 0 0 4056.4939
#> sample_id
#> 1 nsw_treated
#> 2 nsw_treated
#> 3 nsw_treated
#> 4 nsw_treated
#> 5 nsw_treated
#> 6 nsw_treated
# 1. The Experimental Benchmark (Gold Standard)
nsw_exp <- subset(nsw_benchmark, sample_id %in% c("nsw_treated", "nsw_control"))
# True Experimental Estimate (Unadjusted difference in means is valid due to randomization)
true_att <- mean(nsw_exp$re78[nsw_exp$treat == 1]) - mean(nsw_exp$re78[nsw_exp$treat == 0])
cat("True Experimental ATT:", round(true_att, 2), "\n")
#> True Experimental ATT: 1794.34
# 2. The Observational Challenge (NSW Treated + CPS Control)
nsw_obs <- subset(nsw_benchmark, sample_id %in% c("nsw_treated", "cps_control"))
# Naive Observational Estimate (Unadjusted)
naive_est <- mean(nsw_obs$re78[nsw_obs$treat == 1]) - mean(nsw_obs$re78[nsw_obs$treat == 0])
cat("Naive Observational Estimate:", round(naive_est, 2), "\n")
#> Naive Observational Estimate: -8497.52
cat("Bias:", round(naive_est - true_att, 2), "\n")
#> Bias: -10291.86

The massive bias (a negative estimate where the true effect is positive) confirms the difficulty of this problem.
We now calculate a deficiency proxy \(\delta(E_{obs}, E_{target})\) using the available covariates.
covariates <- c("age", "education", "black", "hispanic", "married", "nodegree", "re74", "re75")
# Specification
spec_lalonde <- causal_spec(
data = nsw_obs,
treatment = "treat",
outcome = "re78",
covariates = covariates
)
#> Warning: 14510 observations have extreme propensity scores
#> ✔ Created causal specification: n=16177, 8 covariate(s)
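One way to see where that warning comes from is to fit a simple logistic propensity model and inspect the score distribution. This sketch reuses `nsw_obs` and `covariates` from above; causaldef’s internal propensity model may differ, and the 0.01/0.99 cutoffs are chosen purely for illustration.

```r
# Where does the extreme-propensity warning come from? Fit a simple logistic
# propensity model (illustration; causaldef's internal model may differ).
ps_fit <- glm(reformulate(covariates, response = "treat"),
              data = nsw_obs, family = binomial())
ps <- fitted(ps_fit)

by(ps, nsw_obs$treat, summary)   # score distributions by group
mean(ps < 0.01 | ps > 0.99)      # share with near-0/1 scores (cutoffs illustrative)
```

With the CPS comparison group vastly outnumbering the NSW treated units, most controls have scores near zero, which is exactly the overlap problem the warning flags.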
# Estimate deficiency using Propensity Score Weighting (IPTW)
res_lalonde <- estimate_deficiency(
spec_lalonde,
methods = c("unadjusted", "iptw"),
n_boot = 50 # Kept low for vignette speed
)
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw
print(res_lalonde)
#>
#> -- Deficiency Proxy Estimates (PS-TV) ------
#>
#> Method Delta SE CI Quality
#> unadjusted 0.7985 0.0266 [0.7408, 0.844] Insufficient (Red)
#> iptw 0.0270 0.0046 [0.0213, 0.0382] Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#>
#> Best method: iptw (delta = 0.027 )
plot(res_lalonde)

The proxy \(\delta\) summarizes the distance between the reweighted observational distribution and the target randomized experiment. A lower \(\delta\) indicates that the reweighting more closely reconstructed the experimental conditions under the PS-TV diagnostic.
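The reweighting behind the `iptw` row can be reproduced in spirit with the standard ATT-weighted estimator. This is a sketch, not necessarily causaldef’s internals; it reuses `nsw_obs` and `covariates` from above.

```r
# Standard IPTW sketch for the ATT (illustration; causaldef's internals may differ).
# ATT weights: 1 for treated units, e(x) / (1 - e(x)) for controls.
ps_fit <- glm(reformulate(covariates, response = "treat"),
              data = nsw_obs, family = binomial())
e <- fitted(ps_fit)
w <- ifelse(nsw_obs$treat == 1, 1, e / (1 - e))

t1 <- nsw_obs$treat == 1
att_iptw <- mean(nsw_obs$re78[t1]) -
  weighted.mean(nsw_obs$re78[!t1], w[!t1])
att_iptw  # compare with true_att (~1794) and naive_est (~-8498) above
```

In practice this raw estimator is sensitive to the extreme scores flagged earlier, which is why a diagnostic like \(\delta\) is worth reporting alongside the point estimate.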
This dataset involves high-dimensional confounding (50+ covariates). We use it to demonstrate the Confounding Frontier and Policy Regret Bounds.
data("rhc")
# Convert treatment to binary (0/1) for causaldef.
# 'swang1' is the treatment column ("RHC" vs "No RHC"); comparing against the
# "RHC" label is safer than relying on factor level order.
if (is.factor(rhc$swang1)) {
  rhc$treat_bin <- as.integer(rhc$swang1 == "RHC")
} else {
  rhc$treat_bin <- rhc$swang1
}
# Outcome: 30-day mortality, dth30 (binary; "Yes" = death within 30 days).
if (is.factor(rhc$dth30)) {
  rhc$outcome_bin <- as.integer(rhc$dth30 == "Yes")
} else {
  rhc$outcome_bin <- rhc$dth30
}
# Select a subset of covariates for demonstration (to keep it fast)
# Real analysis would use all 50+.
rhc_covars <- c("age", "sex", "race", "aps1", "cat1")
# Note: 'aps1' is APACHE III score
spec_rhc <- causal_spec(
data = rhc,
treatment = "treat_bin",
outcome = "outcome_bin",
covariates = rhc_covars
)
#> ✔ Created causal specification: n=5735, 5 covariate(s)
res_rhc <- estimate_deficiency(spec_rhc, methods = "iptw", n_boot = 0)
#> ℹ Estimating deficiency: iptw
print(res_rhc)
#>
#> -- Deficiency Proxy Estimates (PS-TV) ------
#>
#> Method Delta SE Quality
#> iptw 0.0184 - Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#>
#> Best method: iptw (delta = 0.0184 )

The “Safety Floor” (the minimax floor in the output below) is the worst-case regret we cannot rule out when acting on this imperfect observational evidence.
# Utility: Let's say preventing death has utility 1, death has utility 0.
# The outcome is death (1) or survival (0).
# We want to minimize outcome (death).
# This is equivalent to utility range [0, 1].
bounds_rhc <- policy_regret_bound(res_rhc, utility_range = c(0, 1), method = "iptw")
#> ℹ Transfer penalty: 0.0184 (delta = 0.0184)
print(bounds_rhc)
#>
#> -- Policy Regret Bounds -------------------------------------------------
#>
#> * Deficiency delta: 0.0184
#> * Delta mode: point
#> * Delta method: iptw
#> * Delta selection: pre-specified method
#> * Utility range: [0, 1]
#> * Transfer penalty: 0.0184 (additive regret upper bound)
#> * Minimax floor: 0.0092 (worst-case lower bound)
#>
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#>
#> Interpretation: Transfer penalty is 1.8 % of utility range given delta

The result typically shows a low safety floor (e.g., < 0.05), suggesting that the observational findings are actionable unless the decision hinges on a very small utility difference.
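The printed numbers follow simple plug-in arithmetic; the formulas here are inferred from the output above, and `policy_regret_bound()` may incorporate more than this sketch shows.

```r
# Plug-in arithmetic behind the printed bounds (inferred from the output above).
delta   <- 0.0184   # deficiency proxy from res_rhc
u_range <- c(0, 1)  # utility range passed to policy_regret_bound()

transfer_penalty <- delta * diff(u_range)  # additive regret upper bound: 0.0184
minimax_floor    <- transfer_penalty / 2   # worst-case lower bound: 0.0092
transfer_penalty / diff(u_range) * 100     # penalty as percent of utility range
```

Because the penalty scales with the utility range, rescaling utilities rescales the bound; the percentage interpretation is what stays invariant.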
Sensitivity analysis: If we missed a confounder \(U\) correlated with treatment by \(\alpha\) and outcome by \(\gamma\), how much would our deficiency increase?
frontier <- confounding_frontier(spec_rhc, grid_size = 30)
#> ℹ Computing benchmarks for observed covariates...
#> ✔ Computed confounding frontier: 30x30 grid
plot(frontier)

The plot shows the “safe” region (low confounding) versus “unsafe” region. If we suspect unmeasured confounders (like specific genetic factors) have strength \(|\alpha \gamma| > 0.1\), the yellow/red zones indicate high deficiency.
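The \(\alpha\), \(\gamma\) parametrization can be made concrete with a tiny simulation. This is an illustration under hypothetical confounder strengths, not how `confounding_frontier()` computes its grid.

```r
# Hand-rolled sensitivity check (illustration only; not confounding_frontier()'s
# method). Simulate an unmeasured confounder U with hypothetical strengths
# alpha (association with treatment) and gamma (association with outcome).
set.seed(1)
n     <- 5000
alpha <- 0.3
gamma <- 0.3
U     <- rnorm(n)
treat <- rbinom(n, 1, plogis(alpha * U))
y     <- gamma * U + rnorm(n)  # true treatment effect is exactly zero

# Omitting U leaves a nonzero naive contrast: pure confounding bias
mean(y[treat == 1]) - mean(y[treat == 0])
```

Rerunning with larger `alpha` and `gamma` grows the spurious contrast, which is the intuition the frontier plot encodes as its yellow/red zones.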