This vignette demonstrates how to use negative control outcomes to screen for residual confounding and compute a corresponding sensitivity bound using the causaldef package. Negative controls provide an empirical diagnostic for whether your adjustment strategy may have failed to remove confounding.
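A negative control outcome (\(Y'\)) is a variable that is not causally affected by the treatment \(A\) but is subject to the same confounding as the outcome \(Y\).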
The key insight is:
If your adjustment strategy correctly removes confounding, then the residual association between \(A\) and \(Y'\) should be zero.
If you observe a non-zero association between \(A\) and \(Y'\) after adjustment, this indicates that confounding remains and your causal estimates may be biased.
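The logic can be checked on simulated data with base R alone. The sketch below is an illustration under assumed coefficients, not the causaldef implementation: it regresses a simulated negative control on treatment with no adjustment, then adjusting for a noisy proxy W, then adjusting for the true confounder U (which would be unobserved in practice).

```r
set.seed(1)
n <- 5000

# Simulated confounder, noisy proxy, treatment, and negative control.
# All coefficients are illustrative assumptions.
U    <- rnorm(n)
W    <- 0.7 * U + rnorm(n, sd = 0.5)
A    <- rbinom(n, 1, plogis(0.3 + 0.8 * U))
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)  # depends on U only, never on A

# Coefficient on A under three adjustment sets
coef_unadjusted <- unname(coef(lm(Y_nc ~ A))["A"])
coef_proxy      <- unname(coef(lm(Y_nc ~ A + W))["A"])
coef_oracle     <- unname(coef(lm(Y_nc ~ A + U))["A"])

round(c(unadjusted = coef_unadjusted,
        proxy      = coef_proxy,
        oracle     = coef_oracle), 3)
```

Only the oracle adjustment should drive the coefficient on A towards zero; adjusting for the proxy shrinks the association but leaves a residual, which is the signal a negative control diagnostic looks for.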
The causaldef package combines two ingredients: a screening test for residual association between \(A\) and \(Y'\) after adjustment, and a sensitivity bound (thm:nc_bound):
\[\delta(\hat{K}) \leq \kappa \cdot \delta_{NC}(\hat{K})\]
where:
- \(\delta(\hat{K})\) is the true deficiency (what we want to know)
- \(\delta_{NC}(\hat{K})\) is a negative-control association proxy (what we can measure)
- \(\kappa\) is an alignment constant reflecting how well \(Y'\) proxies for \(Y\)’s confounding
Let’s create a dataset where we have:
- An unmeasured confounder \(U\)
- An observed covariate \(W\) (correlated with \(U\))
- Binary treatment \(A\)
- Outcome \(Y\) affected by \(A\) and \(U\)
- Negative control \(Y'\) affected only by \(U\) (not \(A\))
library(causaldef)
set.seed(42)
n <- 500
# Unmeasured confounder
U <- rnorm(n)
# Observed covariate (partially captures U)
W <- 0.7 * U + rnorm(n, sd = 0.5)
# Treatment assignment (driven by U directly; W is only a noisy proxy for U)
ps_true <- plogis(0.3 + 0.8 * U)
A <- rbinom(n, 1, ps_true)
# True causal effect
beta_true <- 2.0
# Outcome (affected by A and U)
Y <- 1 + beta_true * A + 1.5 * U + rnorm(n)
# Negative control outcome (affected by U only, NOT by A)
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)
# Create data frame
df <- data.frame(W = W, A = A, Y = Y, Y_nc = Y_nc)
We specify the causal problem including the negative control:
spec <- causal_spec(
  data = df,
  treatment = "A",
  outcome = "Y",
  covariates = "W",
  negative_control = "Y_nc"
)
#> ✔ Created causal specification: n=500, 1 covariate(s)
print(spec)
#>
#> -- Causal Specification --------------------------------------------------
#>
#> * Treatment: A ( binary )
#> * Outcome: Y ( continuous )
#> * Covariates: W
#> * Sample size: 500
#> * Estimand: ATE
#> * Negative control: Y_nc
Now we test whether our IPTW adjustment successfully removes confounding:
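As a rough, self-contained illustration of what such a screening step involves, the base-R sketch below computes an IPTW-weighted difference in mean \(Y'\) between treatment groups and a permutation p-value for it. It is not the causaldef implementation: the choice of statistic, the decision to shuffle \(Y'\), the helper name weighted_diff, and the larger sample size are all assumptions made for illustration.

```r
set.seed(42)
n <- 5000  # larger than the vignette's n = 500 so the signal is easy to see

# Same data-generating process as the simulation above
U    <- rnorm(n)
W    <- 0.7 * U + rnorm(n, sd = 0.5)
A    <- rbinom(n, 1, plogis(0.3 + 0.8 * U))
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)

# Inverse probability of treatment weights from a logistic model on W only
ps <- fitted(glm(A ~ W, family = binomial()))
w  <- ifelse(A == 1, 1 / ps, 1 / (1 - ps))

# Hypothetical helper: weighted difference in mean outcome between arms
weighted_diff <- function(y, a, w) {
  weighted.mean(y[a == 1], w[a == 1]) - weighted.mean(y[a == 0], w[a == 0])
}
stat_obs <- weighted_diff(Y_nc, A, w)

# Permutation null: shuffling Y' breaks any association with A
n_perm    <- 499
stat_null <- replicate(n_perm, weighted_diff(sample(Y_nc), A, w))
p_value   <- (1 + sum(abs(stat_null) >= abs(stat_obs))) / (n_perm + 1)

c(statistic = stat_obs, p_value = p_value)
```

Because W only partially captures U, the weighted association is expected to stay clearly away from zero here. The add-one correction in the p-value keeps it strictly positive, a common convention for Monte Carlo permutation tests.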
The diagnostic returns:
- screening$statistic: Weighted residual association between \(A\) and \(Y'\) after adjustment
- p_value: Permutation p-value for that residual association
- delta_nc: The observed negative-control association proxy
- delta_bound: Upper bound on true deficiency (\(\kappa \times \delta_{NC}\))
- falsified: Whether the residual-association screening test rejects
If \(W\) fully captures \(U\), the negative control test will NOT falsify:
When \(W\) is a poor proxy for \(U\), falsification occurs:
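Both scenarios can be sketched without the package by varying how noisy the proxy is. In the base-R example below, residual_assoc is a hypothetical helper (not a causaldef function) that simulates data with \(W = U + \text{noise}\), fits IPTW weights on \(W\), and returns the weighted difference in mean \(Y'\). The noise levels and coefficients are assumptions chosen to make the contrast visible.

```r
set.seed(7)

# Hypothetical helper: IPTW-weighted residual association between A and Y'
# when the observed covariate is W = U + noise with sd `proxy_sd`.
residual_assoc <- function(proxy_sd, n = 5000) {
  U    <- rnorm(n)
  W    <- U + rnorm(n, sd = proxy_sd)
  A    <- rbinom(n, 1, plogis(0.3 + 0.8 * U))
  Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)

  # Propensity model uses the observed proxy only
  ps <- fitted(glm(A ~ W, family = binomial()))
  w  <- ifelse(A == 1, 1 / ps, 1 / (1 - ps))

  weighted.mean(Y_nc[A == 1], w[A == 1]) -
    weighted.mean(Y_nc[A == 0], w[A == 0])
}

assoc_strong <- residual_assoc(proxy_sd = 0.05)  # W almost equals U
assoc_poor   <- residual_assoc(proxy_sd = 2)     # W is mostly noise

round(c(strong_proxy = assoc_strong, poor_proxy = assoc_poor), 3)
```

With a near-perfect proxy the residual association should sit close to zero, so a screening test would have little to reject; with a poor proxy a sizeable association survives weighting.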
The best negative control outcomes have:
| Domain | Treatment | Outcome | Possible Negative Control |
|---|---|---|---|
| Cardiovascular | Statin use | CVD events | Accidental injuries |
| Oncology | Chemotherapy | Tumor response | Hospital-acquired infections |
| Economics | Job training | Earnings in 1978 | Earnings in 1974 (pre-treatment) |
| Epidemiology | Vaccination | Flu incidence | Unrelated disease incidence |
The negative control diagnostic complements deficiency estimation:
# Step 1: Estimate deficiency
def_results <- estimate_deficiency(
  spec,
  methods = c("unadjusted", "iptw", "aipw"),
  n_boot = 100
)
print(def_results)
# Step 2: Run negative control diagnostic on best method
best_method <- names(which.min(def_results$estimates))
nc_check <- nc_diagnostic(spec, method = best_method, n_boot = 100)
# Step 3: Compute policy bounds if assumptions not falsified
if (!nc_check$falsified) {
  bounds <- policy_regret_bound(
    def_results,
    utility_range = c(-5, 10),
    method = best_method
  )
  print(bounds)
} else {
  warning("Causal assumptions falsified. Consider additional covariates.")
}
The alignment constant \(\kappa\) affects the bound’s tightness. The default \(\kappa = 1\) is conservative. You can estimate \(\kappa\) from domain knowledge:
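Because the bound \(\kappa \cdot \delta_{NC}\) is linear in \(\kappa\), one simple way to reason about the choice is to tabulate the bound over a range of plausible values. The sketch below does this by hand in base R and calls no causaldef function; the value delta_nc = 0.05 and the grid of \(\kappa\) values are hypothetical and stand in for a measured proxy and for domain judgment.

```r
# Hypothetical observed proxy; in practice this would be read from the
# diagnostic output rather than typed in
delta_nc <- 0.05

# Candidate alignment constants (illustrative values only)
kappa_grid <- c(0.5, 1, 2, 4)

# Bound on the true deficiency for each candidate kappa
sensitivity <- data.frame(
  kappa       = kappa_grid,
  delta_bound = kappa_grid * delta_nc
)
sensitivity
```

Reading the table row by row shows how quickly the bound loosens as \(Y'\) becomes a weaker stand-in for the confounding of \(Y\).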
| Function or output field | Purpose |
|---|---|
| nc_diagnostic() | Screen for residual association and compute a sensitivity bound |
| delta_nc | Observable negative-control association proxy |
| delta_bound | Upper bound on true deficiency |
| falsified | Screening rejection of residual association |
Negative control diagnostics provide a data-driven way to assess causal assumptions. Use them alongside deficiency estimation for robust causal inference.
Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison. DOI: 10.5281/zenodo.18367347. See thm:nc_bound (Negative Control Sensitivity Bound).
Lipsitch, M., Tchetgen, E., & Cohen, T. (2010). Negative controls: A tool for detecting confounding and bias. Epidemiology, 21(3), 383-388.
Shi, X., Miao, W., & Tchetgen Tchetgen, E. (2020). A selective review of negative control methods. Current Epidemiology Reports, 7, 190-202.