Introduction to causaldef

causaldef implements Le Cam deficiency theory to provide decision-theoretic diagnostics for causal inference.

Unlike standard sensitivity analysis tools that output abstract bias parameters (like E-values or partial R²), causaldef quantifies the information gap (\(\delta\)) between the data you have throughout your observational study and the data you wish you had (a perfect randomized trial).

The Core Workflow

Specify: Define your causal problem using causal_spec().
Estimate: Calculate a deficiency proxy \(\delta\) for your adjustment strategy using estimate_deficiency().
Diagnose: Test assumptions with nc_diagnostic() or confounding_frontier().
Decide: Compute safety bounds for decision making using policy_regret_bound() with a pre-specified adjustment method.

Example: Gene Perturbation Analysis

We’ll use a dataset of gene expression outcomes from a CRISPR knockout experiment. This case study illustrates how to detect unobserved confounding using negative controls.

library(causaldef)
library(ggplot2) # Ensure ggplot2 is loaded

# DAG Helper (using ggplot2 only)
plot_dag <- function(coords, edges, title = NULL) {
  edges_df <- merge(edges, coords, by.x = "from", by.y = "name")
  colnames(edges_df)[c(3,4)] <- c("x_start", "y_start")
  edges_df <- merge(edges_df, coords, by.x = "to", by.y = "name")
  colnames(edges_df)[c(5,6)] <- c("x_end", "y_end")
  
  ggplot2::ggplot(coords, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_segment(data = edges_df, ggplot2::aes(x = x_start, y = y_start, xend = x_end, yend = y_end),
                 arrow = ggplot2::arrow(length = ggplot2::unit(0.3, "cm"), type = "closed"), 
                 color = "gray40", size = 1, alpha = 0.8) +
    ggplot2::geom_point(size = 14, color = "white", fill = "#8FBC8F", shape = 21, stroke = 1.5) + # DarkSeaGreen
    ggplot2::geom_text(ggplot2::aes(label = name), fontface = "bold", size = 2.5, color = "white") +
    ggplot2::ggtitle(title) + 
    ggplot2::theme_void(base_size = 14) +
    ggplot2::theme(plot.title = ggplot2::element_text(hjust = 0.5, face = "bold", margin = ggplot2::margin(b = 10))) +
    ggplot2::coord_fixed()
}

data(gene_perturbation)

head(gene_perturbation)
#>   sample_id batch library_size knockout_status target_expression
#> 1        S1     3       948877        Knockout             8.643
#> 2        S2     4       984900         Control            11.545
#> 3        S3     4       868332         Control            12.325
#> 4        S4     2      1052105         Control            12.941
#> 5        S5     2       880271         Control             8.289
#> 6        S6     2      1001246        Knockout             9.400
#>   housekeeping_gene
#> 1             6.594
#> 2             9.424
#> 3             8.568
#> 4             9.224
#> 5             6.380
#> 6             6.706

# Visualize the Structural Assumption
coords <- data.frame(
  name = c("Unobserved", "Knockout", "TargetExpr", "Housekp"),
  x = c(0, -1.5, 1.5, 0),
  y = c(1, 0, 0, -1)
)
edges <- data.frame(
  from = c("Unobserved", "Unobserved", "Unobserved", "Knockout"),
  to =   c("Knockout", "TargetExpr", "Housekp", "TargetExpr")
)
plot_dag(coords, edges, title = "Gene Perturbation DAG")

1. Specification

We are interested in the effect of knockout_status on target_expression, adjusting for technical confounders batch and library_size.

spec <- causal_spec(
  data = gene_perturbation,
  treatment = "knockout_status",
  outcome = "target_expression",
  covariates = c("batch", "library_size"),
  negative_control = "housekeeping_gene"
)
#> ✔ Created causal specification: n=500, 2 covariate(s)

print(spec)
#> 
#> -- Causal Specification --------------------------------------------------
#> 
#> * Treatment: knockout_status ( binary )
#> * Outcome: target_expression ( continuous )
#> * Covariates: batch, library_size 
#> * Sample size: 500 
#> * Estimand: ATE 
#> * Negative control: housekeeping_gene

2. Deficiency Estimation

How close can we get to a randomized experiment using these covariates? The following analysis compares the distributional distance (\(\delta\)) of the unadjusted observational data against the reweighted data using Inverse Probability Weighting (IPTW). Compare the deficiency values to determine the information gap.

# Compare unadjusted vs IPTW
results <- estimate_deficiency(
  spec,
  methods = c("unadjusted", "iptw"),
  n_boot = 50  # Low for vignette speed
)
#> ℹ Inferred treatment value: Knockout
#> ℹ Estimating deficiency: unadjusted
#> ℹ Estimating deficiency: iptw

print(results)
#> 
#> -- Deficiency Proxy Estimates (PS-TV) ------
#> 
#>      Method  Delta     SE               CI            Quality
#>  unadjusted 0.1424 0.0303 [0.1399, 0.2473] Insufficient (Red)
#>        iptw 0.0368 0.0112  [0.031, 0.0703]  Excellent (Green)
#> Note: delta is a propensity-score TV proxy (overlap/balance diagnostic).
#> 
#> Best method: iptw (delta = 0.0368 )
plot(results, type = "bar")

Interpreting the PS-TV Deficiency Proxy (\(\delta\)):

The current proxy \(\delta\) quantifies how far the reweighted propensity-score distribution is from the target population:

\(\delta\) = 0: Perfect identification – The weighted propensity distribution matches the target population perfectly (TV distance = 0).
\(\delta\) < 0.05: Excellent – The distribution is indistinguishable from a randomized trial for most practical purposes.
0.05 \(\leq\) \(\delta\) < 0.15: Good – Reasonable approximation, but some distributional mismatch exists.
\(\delta\) \(\geq\) 0.15: Caution – Substantial information gap (TV distance > 0.15). The “safety floor” for decision making will be significant.

What to do with these results:

Choose the method with the lowest \(\delta\) (here, IPTW is better than unadjusted)
Check the confidence interval (if bootstrap was used) to assess uncertainty
If \(\delta\) is large, consider: (a) adding more covariates, (b) using a different adjustment method, or (c) acknowledging the limitation

3. Diagnose with Negative Controls

Is the adjustment sufficient? We use the housekeeping gene (which shouldn’t change) to test for residual confounding.

nc_test <- nc_diagnostic(spec, method = "iptw")
#> ℹ Using kappa = 1 (conservative). Consider domain-specific estimation or sensitivity analysis via kappa_range.
#> ✔ No evidence against causal assumptions (p = 0.099502 )
print(nc_test)
#> 
#> -- Negative Control Diagnostic ----------------------------------------
#> 
#> * screening statistic (weighted corr): 0.0821 
#> * delta_NC (association proxy): 0.0821 
#> * delta bound (under kappa alignment): 0.0821 (kappa = 1 )
#> * screening p-value: 0.099502 
#> * screening method: weighted_permutation_correlation 
#> 
#> RESULT: NOT REJECTED. This is a screening result, not proof that confounding is absent.
#> NOTE: Your effect estimate must exceed the Noise Floor (delta_bound) to be meaningful.

Interpreting Negative Control Results:

The negative control diagnostic tests whether your adjustment removes ALL confounding:

p-value > 0.05 (Not Falsified): [PASS] – No evidence of residual confounding. Your adjustment appears adequate.
p-value \(\leq\) 0.05 (Falsified): [FAIL] – Evidence of residual confounding detected. Your covariates may not capture all confounders.
delta_nc: Measures residual association with the negative control. Smaller is better.
delta_bound: Upper bound on your true deficiency given the negative control results.

If the test fails:

Add more covariates that might capture hidden confounding
Try different adjustment methods (e.g., AIPW instead of IPTW)
Consider sensitivity analysis to quantify the impact of unmeasured confounding
Acknowledge the limitation in your conclusions

4. Decision Making

Suppose we need to decide whether to proceed with this gene target. What is the risk of being wrong due to remaining confounding?

Using policy_regret_bound(), we can compute two decision-theoretic quantities from a pre-specified deficiency proxy:

a transfer penalty (upper bound term), and
a minimax safety floor (lower bound).

# Utility scale: 0 to 10 (e.g., normalized expression fold-change value)
bounds <- policy_regret_bound(results, utility_range = c(0, 10), method = "iptw")
#> ℹ Transfer penalty: 0.3676 (delta = 0.0368)

print(bounds)
#> 
#> -- Policy Regret Bounds -------------------------------------------------
#> 
#> * Deficiency delta: 0.0368 
#> * Delta mode: point 
#> * Delta method: iptw 
#> * Delta selection: pre-specified method 
#> * Utility range: [0, 10]
#> * Transfer penalty: 0.3676 (additive regret upper bound)
#> * Minimax floor: 0.1838 (worst-case lower bound)
#> 
#> Note: this is a plug-in bound using a deficiency proxy rather than an identified exact deficiency.
#> 
#> Interpretation: Transfer penalty is 3.7 % of utility range given delta

Interpreting Policy Regret Bounds:

The bounds answer: “How much regret could confounding add to my decision, and what regret is unavoidable?”

Transfer penalty = \(M\cdot\delta\): additive worst-case inflation term in the regret transfer bound
- If transfer penalty = 0.2 on a 0–10 utility scale, confounding could add up to 0.2 units of regret in the worst case.
Minimax safety floor = \((M/2)\cdot\delta\): irreducible worst-case regret when \(\delta>0\)
- Even the best possible algorithm cannot guarantee worst-case regret smaller than this floor without stronger assumptions or randomized data.

When results comes from estimate_deficiency(), these are plug-in bounds based on the PS-TV proxy rather than an identified exact deficiency.

Decision Rule:

If transfer penalty < acceptable risk threshold: Proceed (conservative guarantee).
If minimax safety floor > acceptable risk threshold: Stop — provably unsafe without stronger assumptions or randomized data.
Otherwise: Borderline — improve adjustment (reduce \(\delta\)), run sensitivity analysis, or randomize.

Example: If your gene target could save 5 units of benefit, but the transfer penalty is 0.5 units (and the minimax safety floor is 0.25 units), you have roughly a 10:1 (and 20:1) benefit-to-risk ratio — often acceptable.

5. Effect Estimation

Finally, once we are confident in our adjustment strategy (low deficiency, passed diagnostics), we can estimate the causal effect.

# Estimate ATE using the best performing method (IPTW)
effect <- estimate_effect(results, target_method = "iptw")
print(effect)
#> 
#> -- Causal Effect Estimate ----------------------
#> Method:    iptw 
#> Type:      ATE 
#> Contrast:  Knockout vs Control 
#> Estimate:  -1.7085

Interpreting the Effect Estimate:

estimate: The average treatment effect (ATE) – the causal impact of the treatment
mu1: Expected outcome under treatment
mu0: Expected outcome under control
type: Type of estimand (ATE, RMST difference, etc.)

Reporting Guidelines:

Report the effect with the method used: “Using IPTW adjustment, the estimated ATE is…”
Reference the proxy: “The adjustment achieved \(\delta\) = 0.XX, indicating [excellent/good/moderate] approximation to an RCT under the PS-TV diagnostic”
Mention diagnostic results: “The negative control diagnostic [passed/failed], suggesting [adequate/inadequate] control of confounding”
State the bounds: “Transfer penalty is XX; minimax floor is YY on the chosen utility scale”

Complete Statement Example:
“Using IPTW adjustment (deficiency \(\delta\) = 0.01), we estimate the knockout reduces target expression by 2.3 units (95% CI: [1.8, 2.8]). The negative control diagnostic passed (p = 0.24). On a 0–10 utility scale (M = 10), the transfer penalty is 0.10 units and the minimax safety floor is 0.05 units, well below our risk tolerance threshold.”