Complete Workflow: From Data to Decision

Deniz Akdemir

2026-03-26

Overview

This vignette demonstrates the complete causaldef workflow, from data specification through to policy decision-making. We show how Le Cam deficiency theory translates abstract statistical concepts into actionable clinical insights.

The workflow consists of four stages:

  1. Specify → Define the causal problem
  2. Estimate → Compute deficiency for different adjustment strategies
  3. Diagnose → Validate assumptions using negative controls and sensitivity analysis
  4. Decide → Compute policy regret bounds and make informed decisions

Part 1: Gene Perturbation Study (Continuous Outcome)

We begin with the gene_perturbation dataset, which simulates a CRISPR knockout experiment. This illustrates the core workflow for continuous outcomes.

1.1 Data Description

Variables:

Causal Structure:

      [Batch, Library Size]
             |
             v
   [Knockout] -----> [Target Expression]
             \
              \---X--> [Housekeeping Gene]  (no causal effect)

The housekeeping gene is affected by the same technical variations but NOT by the knockout, making it an ideal negative control.

1.2 Step 1: Specification

1.3 Step 2: Deficiency Estimation

We compare three adjustment strategies:

  1. Unadjusted: Ignores technical confounders
  2. IPTW: Reweights samples to balance batch and library size
  3. AIPW: Augmented IPTW (doubly robust)

Interpretation:

1.4 Step 3: Diagnose with Negative Control

The negative control diagnostic tests whether our adjustment removes ALL confounding, not just the measured confounders.

Decision Logic:

Result Interpretation Action
falsified = FALSE No evidence of residual confounding Proceed with analysis
falsified = TRUE Residual confounding detected Add covariates or acknowledge limitations

1.5 Step 4: Policy Decision

Suppose we’re deciding whether to pursue this gene target for drug development. The utility is measured on a scale where: - 0 = no promise (no effect on expression) - 10 = maximum promise (strong effect)

Regret Bounds:

policy_regret_bound() reports:

Decision Rule: - If transfer_penalty < acceptable risk threshold → Proceed with confidence - If transfer_penalty > acceptable risk threshold → Seek more evidence

1.6 Effect Estimation

Finally, we estimate the causal effect using the best-performing method:

Complete Report:

Gene Perturbation Analysis Report

Treatment Effect (IPTW-adjusted): -1.71 log2 expression units Deficiency (δ): 0.037 Negative Control Test: PASSED (p = 0.109) Transfer Penalty: 0.368 on [0, 10] scale Minimax Safety Floor: 0.184 on [0, 10] scale

Conclusion: Strong evidence supporting treatment effect.


Part 2: Hematopoietic Cell Transplantation (Survival Outcome)

Next, we analyze the hct_outcomes dataset, which mimics a retrospective registry study comparing conditioning regimens in HCT.

2.1 Data Description

Clinical Context:

The key challenge is confounding by indication: doctors assign treatment based on patient status, making naive comparisons biased.

2.2 Step 1: Survival Specification

Executable survival-effect code in this section requires a compatible survival runtime. In the current support matrix, that means R >= 4.0.

2.3 Step 2: Deficiency Estimation

Clinical Interpretation:

The deficiency tells us how much our observational evidence differs from what an RCT would provide:

2.4 Step 3: Confounding Frontier

Beyond point estimates, we can map a sensitivity analysis showing how deficiency varies with hypothetical unmeasured confounding:

Reading the Frontier Map:

If an unmeasured confounder would need extreme strength (beyond observed benchmarks) to substantially increase δ, conclusions are robust.

2.5 Step 4: Policy Regret and RMST Effect

Clinical Regret Bounds:

If δ = 0.05 with M = 24 months: - Transfer penalty = 24 × 0.05 = 1.2 months - Minimax safety floor = 0.5 × 24 × 0.05 = 0.6 months

This means even with perfect decision-making, the observational evidence can inflate regret by up to about 1.2 months on the 0–24 month utility scale, and there is an unavoidable worst-case floor of about 0.6 months when \(\delta>0\).

2.6 Complete Decision Framework

delta_iptw <- deficiency_hct$estimates["iptw"]
transfer_penalty <- bounds_hct$transfer_penalty
minimax_floor <- bounds_hct$minimax_floor
rmst_diff <- effect_hct$estimate

# Decision logic
if (delta_iptw < 0.05) {
    evidence_quality <- "HIGH (approximates RCT)"
} else if (delta_iptw < 0.15) {
    evidence_quality <- "MODERATE (acceptable with caveats)"
} else {
    evidence_quality <- "LOW (substantial bias risk)"
}

# Benefit-to-risk ratio
if (!is.na(rmst_diff) && !is.na(transfer_penalty) && transfer_penalty > 0) {
    benefit_to_risk <- abs(rmst_diff) / transfer_penalty
    recommendation <- ifelse(benefit_to_risk > 2,
        "Recommend treatment with stronger effect",
        "Evidence inconclusive; consider shared decision-making"
    )
} else {
    benefit_to_risk <- NA
    recommendation <- "Unable to calculate benefit-to-risk ratio"
}

cat(sprintf(
    "
## HCT Treatment Decision Report

**RMST Difference (IPTW):** %.2f months (%s favored)
**Deficiency:** %.3f
**Evidence Quality:** %s
**Transfer Penalty:** %.2f months
**Minimax Safety Floor:** %.2f months

**Benefit-to-Risk Ratio:** %.1f:1
**Recommendation:** %s

### Clinical Translation

The observational evidence suggests %s conditioning provides approximately
%.1f additional months of relapse-free survival within the first 2 years.

However, the transfer penalty is %.1f months and the minimax safety floor is %.1f months
on the 0--24 month utility scale. Clinicians should weigh this against individual patient factors.
",
    abs(rmst_diff),
    ifelse(rmst_diff > 0, "Myeloablative", "Reduced"),
    delta_iptw,
    evidence_quality,
    transfer_penalty,
    minimax_floor,
    benefit_to_risk,
    recommendation,
    ifelse(rmst_diff > 0, "myeloablative", "reduced-intensity"),
    abs(rmst_diff),
    transfer_penalty,
    minimax_floor
))

Part 3: Comparative Analysis Across Studies

3.1 When Is Observational Evidence Sufficient?

Study δ (IPTW) Transfer Penalty Evidence Quality
Gene Perturbation Low (~0.01) Very Low Excellent
HCT Outcomes Moderate (~0.05-0.15) 1-4 months Acceptable with caveats

3.2 General Workflow Summary

┌─────────────────────────────────────────────────────────────────┐
│  SPECIFY: causal_spec() / causal_spec_survival()                │
│           ↓ Define treatment, outcome, covariates, NC           │
├─────────────────────────────────────────────────────────────────┤
│  ESTIMATE: estimate_deficiency()                                │
│            ↓ Compare unadjusted, IPTW, AIPW, TMLE, etc.        │
│            ↓ Select method with lowest δ                        │
├─────────────────────────────────────────────────────────────────┤
│  DIAGNOSE: nc_diagnostic() + confounding_frontier()             │
│            ↓ Test whether assumptions are falsified             │
│            ↓ Map sensitivity to unmeasured confounding          │
├─────────────────────────────────────────────────────────────────┤
│  DECIDE: policy_regret_bound() + estimate_effect()              │
│          ↓ Compute transfer penalty / minimax floor             │
│          ↓ Report effect with uncertainty qualification         │
└─────────────────────────────────────────────────────────────────┘

References

  1. Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison: A Framework for Identification, Transportability, and Policy Learning. DOI: 10.5281/zenodo.18367347

  2. Le Cam, L., & Yang, G. L. (2000). Asymptotics in Statistics: Some Basic Concepts. Springer.

  3. VanderWeele, T. J., & Ding, P. (2017). Sensitivity Analysis in Observational Research: Introducing the E-value. Annals of Internal Medicine.