rdhte: Empirical Illustration

Overview

rdhte estimates heterogeneous treatment effects in sharp Regression Discontinuity (RD) designs along one or more pretreatment covariates. The package exposes three main commands:

rdhte() – point estimation and robust bias-corrected inference of group-conditional RD effects.
rdbwhte() – automatic per-cell bandwidth selection.
rdhte_lincom() – linear-combination tests over the estimated CATEs.

This vignette illustrates each feature on a real-data extract from Granzier, Pons, and Tricaud (2023, American Economic Journal: Applied), bundled with the package as rdhte_dataset.

The data and the design

Granzier, Pons, and Tricaud study coordination and bandwagon effects in French two-round elections, where candidates must clear a qualifying-vote threshold in the first round to advance to the runoff. The institutional rule creates a sharp regression-discontinuity design on every candidate’s first-round margin against that threshold: candidates just above the cutoff advance, those just below do not, and small differences in first-round support produce a discrete change in runoff participation. The authors use the design to ask whether being just-qualified causally affects subsequent voter and candidate behavior, and document substantive heterogeneity across ideology and candidate-strength dimensions.

The bundled extract has 39,534 candidate-race observations with the following variables:

y – 0/1 indicator for advancing to the runoff (the outcome).
x – first-round margin against the qualifying threshold (the running variable; cutoff at zero).
cluster_var – district identifier, used for cluster-robust inference.
w_left – 0/1 indicator for left-of-center parties.
w_ideology – unordered party-ideology bucket (4 categories).
w_strength – continuous proxy for ex-ante candidate strength (average prior national-level vote share).
w_strong – 0/1 indicator for above-median strength.
w_strength_qrt – ordered quartile of w_strength (4 levels).

These covariates support every covariate-incorporation pattern rdhte exposes: binary cells, multi-level (unordered and ordered) factor cells, factor-by-factor interactions, continuous slopes, and binary x continuous interactions.

library(rdhte)
library(rdrobust)
library(broom)
data(rdhte_dataset)
attach(rdhte_dataset)

Single binary variable

When covs.hte is binary (or a factor with two levels), rdhte reports one CATE per cell.

summary(rd_left <- rdhte(y = y, x = x, covs.hte = factor(w_left),
                          cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ==================================================================================================================
#>                      Point        Robust Inference
#> factor(w_left)    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ------------------------------------------------------------------------------------------------------------------
#>              0       0.021       1.523       0.128  [-0.003 , 0.027]            4352      4422     0.077     0.077
#>              1       0.089       6.821       0.000   [0.062 , 0.112]            5036      4840     0.117     0.117
#> ==================================================================================================================

How large is the advantage for left-of-center candidates? Test the difference with rdhte_lincom:

rdhte_lincom(model = rd_left,
             linfct = c("`factor(w_left)1` - `factor(w_left)0` = 0"))
#> $individual
#>                              hypothesis estimate z_stat p_value conf.low
#> 1 `factor(w_left)1` - `factor(w_left)0`    0.068  5.056       0    0.046
#>   conf.high
#> 1     0.105
#> 
#> $joint
#>   statistic X1L p_value
#> 1    25.563   1       0

Forcing a common bandwidth across cells (bw.joint = TRUE) can distort the cell-specific inference – here it makes the right-of-center effect (incorrectly) statistically significant:

summary(rdhte(y = y, x = x, covs.hte = w_left, cluster = cluster_var,
              bw.joint = TRUE))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ================================================================================================================
#>                    Point        Robust Inference
#>       w_left    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ----------------------------------------------------------------------------------------------------------------
#>            0       0.023       2.272       0.023   [0.002 , 0.029]            5208      5364     0.097     0.097
#>            1       0.088       6.437       0.000   [0.062 , 0.116]            4325      4183     0.097     0.097
#> ================================================================================================================

A 0/1 numeric variable is auto-promoted to a factor; coefficient names then carry the variable name as a prefix:

summary(rd_left2 <- rdhte(y = y, x = x, covs.hte = w_left,
                           cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ================================================================================================================
#>                    Point        Robust Inference
#>       w_left    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ----------------------------------------------------------------------------------------------------------------
#>            0       0.021       1.523       0.128  [-0.003 , 0.027]            4352      4422     0.077     0.077
#>            1       0.089       6.821       0.000   [0.062 , 0.112]            5036      4840     0.117     0.117
#> ================================================================================================================
rdhte_lincom(model = rd_left2,
             linfct = c("`w_left1` - `w_left0` = 0"))
#> $individual
#>          hypothesis estimate z_stat p_value conf.low conf.high
#> 1 w_left1 - w_left0    0.068  5.056       0    0.046     0.105
#> 
#> $joint
#>   statistic X1L p_value
#> 1    25.563   1       0

Single categorical variable – unordered

A multi-level factor produces one CATE per level. The reference category is the factor’s first level.

summary(rd_ideology <- rdhte(y = y, x = x,
                              covs.hte = factor(w_ideology),
                              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ======================================================================================================================
#>                          Point        Robust Inference
#> factor(w_ideology)    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ----------------------------------------------------------------------------------------------------------------------
#>                  1       0.089       6.819       0.000   [0.062 , 0.112]            5036      4840     0.117     0.117
#>                  2      -0.024      -0.853       0.393  [-0.195 , 0.077]             181       156     0.056     0.056
#>                  3       0.026       1.897       0.058  [-0.001 , 0.032]            3816      4202     0.086     0.086
#>                  4       0.007       0.884       0.377  [-0.012 , 0.032]             713       384     0.090     0.090
#> ======================================================================================================================

Joint test that the three non-reference ideology cells are all indistinguishable from zero:

rdhte_lincom(model = rd_ideology,
             linfct = c("`factor(w_ideology)2` = 0",
                        "`factor(w_ideology)3` = 0",
                        "`factor(w_ideology)4` = 0"))
#> $individual
#>            hypothesis estimate z_stat p_value conf.low conf.high
#> 1 factor(w_ideology)2   -0.024 -0.853   0.393   -0.195     0.077
#> 2 factor(w_ideology)3    0.026  1.897   0.058   -0.001     0.032
#> 3 factor(w_ideology)4    0.007  0.884   0.377   -0.012     0.032
#> 
#> $joint
#>   statistic X3L p_value
#> 1     5.116   3   0.163

Single categorical variable – ordered

For an ordered categorical variable, wrapping in factor() keeps the per-level CATE interpretation:

summary(rdhte(y = y, x = x, covs.hte = factor(w_strength_qrt),
              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ==========================================================================================================================
#>                              Point        Robust Inference
#> factor(w_strength_qrt)    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> --------------------------------------------------------------------------------------------------------------------------
#>                      0       0.016       1.803       0.071  [-0.001 , 0.032]            2583      2283     0.094     0.094
#>                      1       0.039       2.219       0.026   [0.003 , 0.055]            2227      2300     0.085     0.085
#>                      2       0.054       3.490       0.000   [0.025 , 0.090]            2040      2329     0.105     0.105
#>                      3       0.094       6.652       0.000   [0.069 , 0.127]            3436      3486     0.143     0.143
#> ==========================================================================================================================

The per-quartile CATE increases monotonically: candidates with higher average strength have larger treatment effects.

A bare numeric variable (no factor()) is treated as continuous – useful when the ordering is meaningful but you want the linear-in-W parametrization:

summary(rdhte(y = y, x = x, covs.hte = w_strength_qrt,
              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9533         9547
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.097        0.097
#> 
#> ==========================================================================
#>                      Point        Robust Inference
#>                   Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> --------------------------------------------------------------------------
#>              T       0.014       1.389       0.165  [-0.005 , 0.029]      
#> T:w_strength_qrt       0.026       3.997       0.000   [0.013 , 0.037]      
#> ==========================================================================

Two binary variables – interaction

Interactions of two factors produce one CATE per joint cell. The ordering of cells follows R’s interaction() convention.

summary(rdhte(y = y, x = x,
              covs.hte = factor(w_left):factor(w_strong),
              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ===================================================================================================================================
#>                                       Point        Robust Inference
#> factor(w_left):factor(w_strong)    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> -----------------------------------------------------------------------------------------------------------------------------------
#>                             0:1       0.003       0.004       0.997  [-0.016 , 0.016]            2348      2257     0.070     0.070
#>                             0:2       0.044       3.014       0.003   [0.014 , 0.064]            2316      2643     0.104     0.104
#>                             1:1       0.061       3.838       0.000   [0.029 , 0.090]            2066      1960     0.097     0.097
#>                             1:2       0.114       6.462       0.000   [0.082 , 0.153]            3107      3055     0.143     0.143
#> ===================================================================================================================================

The same model can be expressed by pre-building the joint cell variable:

interactions <- 1*(w_left==0)*(w_strong==1) +
                2*(w_left==0)*(w_strong==2) +
                3*(w_left==1)*(w_strong==1) +
                4*(w_left==1)*(w_strong==2)
summary(rdhte(y = y, x = x, covs.hte = factor(interactions),
              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ========================================================================================================================
#>                            Point        Robust Inference
#> factor(interactions)    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ------------------------------------------------------------------------------------------------------------------------
#>                    1       0.003       0.004       0.997  [-0.016 , 0.016]            2348      2257     0.070     0.070
#>                    2       0.044       3.014       0.003   [0.014 , 0.064]            2316      2643     0.104     0.104
#>                    3       0.061       3.838       0.000   [0.029 , 0.090]            2066      1960     0.097     0.097
#>                    4       0.114       6.462       0.000   [0.082 , 0.153]            3107      3055     0.143     0.143
#> ========================================================================================================================

Single continuous variable

When covs.hte is continuous, rdhte switches to a linear-in-W parametrization of the CATE function:

\[ \tau(w) = \beta_T + \beta_{T:W}\,w. \]

summary(rd_continuous <- rdhte(y = y, x = x, covs.hte = w_strength,
                                kernel = "uni",
                                cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                      Uniform
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9514         9529
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.096        0.096
#> 
#> ========================================================================
#>                    Point        Robust Inference
#>                 Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> ------------------------------------------------------------------------
#>            T      -0.055      -2.507       0.012  [-0.129 , -0.016]     
#> T:w_strength       0.262       3.927       0.000   [0.151 , 0.451]      
#> ========================================================================

To aid interpretation, the slope coefficient is precisely a partial-linear regression slope. With the uniform kernel and a fixed bandwidth, the rdhte point estimates match plain lm() on the same in-bandwidth subset:

trt <- (x > 0)
new.data <- data.frame(y, x, w_strength, trt)
using.lm <- coef(lm(y ~ trt * x * w_strength,
                    data = new.data[abs(new.data$x) < rd_continuous$h[1], ]))
rd_continuous$Estimate[1] - using.lm["trtTRUE"]
#> T 
#> 0
rd_continuous$Estimate[2] - using.lm["trtTRUE:w_strength"]
#> T:w_strength 
#>            0

(Inference, however, requires the robust bias correction in rdhte and cannot be obtained from the lm() fit alone.)

Interaction effect: binary x continuous

A fully interacted specification gives a separate intercept and slope for each level of the categorical covariate.

summary(rd_interaction <- rdhte(y = y, x = x,
                                 covs.hte = "w_left*w_strength",
                                 cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9533         9547
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.097        0.097
#> 
#> ===============================================================================
#>                           Point        Robust Inference
#>                        Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> -------------------------------------------------------------------------------
#>                   T      -0.033      -1.364       0.173  [-0.094 , 0.017]      
#>            T:w_left      -0.059      -0.275       0.783  [-0.306 , 0.230]      
#>        T:w_strength       0.141       1.779       0.075  [-0.014 , 0.295]      
#> T:w_left:w_strength       0.274       0.742       0.458  [-0.395 , 0.876]      
#> ===============================================================================

Wrapping the binary in factor() is also allowed but shifts the baseline category and therefore the reported intercepts:

summary(rdhte(y = y, x = x, covs.hte = "factor(w_left)*w_strength",
              cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9533         9547
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.097        0.097
#> 
#> =======================================================================================
#>                                   Point        Robust Inference
#>                                Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> ---------------------------------------------------------------------------------------
#>                           T      -0.092      -0.571       0.568  [-0.338 , 0.186]      
#>           T:factor(w_left)0       0.059       0.275       0.783  [-0.230 , 0.306]      
#>                T:w_strength       0.141       1.779       0.075  [-0.014 , 0.295]      
#> T:factor(w_left)1:w_strength       0.274       0.742       0.458  [-0.395 , 0.876]      
#> =======================================================================================

Each cell-specific coefficient may look insignificant individually while the joint test still rejects:

rdhte_lincom(model = rd_interaction,
             linfct = c("`T` = 0",
                        "`T:w_left` = 0",
                        "`T:w_strength` = 0",
                        "`T:w_left:w_strength` = 0"))
#> $individual
#>            hypothesis estimate z_stat p_value conf.low conf.high
#> 1                   T   -0.033 -1.364   0.173   -0.094     0.017
#> 2            T:w_left   -0.059 -0.275   0.783   -0.306     0.230
#> 3        T:w_strength    0.141  1.779   0.075   -0.014     0.295
#> 4 T:w_left:w_strength    0.274  0.742   0.458   -0.395     0.876
#> 
#> $joint
#>   statistic X4L p_value
#> 1    47.392   4       0

With the bandwidth fixed, the fully interacted model matches results from category-specific estimation. Use rdhte_lincom to read off the per-cell intercept and slope:

summary(rd_interaction <- rdhte(y = y, x = x,
                                 covs.hte = "w_left*w_strength",
                                 h = 0.1, cluster = cluster_var))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                39534
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9829         9843
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.100        0.100
#> 
#> ===============================================================================
#>                           Point        Robust Inference
#>                        Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> -------------------------------------------------------------------------------
#>                   T      -0.031      -1.434       0.152  [-0.095 , 0.015]      
#>            T:w_left      -0.061      -0.266       0.791  [-0.303 , 0.231]      
#>        T:w_strength       0.138       1.875       0.061  [-0.007 , 0.297]      
#> T:w_left:w_strength       0.281       0.727       0.467  [-0.397 , 0.866]      
#> ===============================================================================
summary(rdhte(y = y[w_left == 0], x = x[w_left == 0],
              covs.hte = w_strength[w_left == 0], h = 0.1,
              cluster = cluster_var[w_left == 0]))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                21938
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                10287        11651
#> Eff. Number of Obs.            5371         5530
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.100        0.100
#> 
#> ===================================================================================
#>                               Point        Robust Inference
#>                            Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> -----------------------------------------------------------------------------------
#>                       T      -0.031      -1.434       0.152  [-0.095 , 0.015]      
#> T:w_strength[w_left == 0]       0.138       1.875       0.061  [-0.007 , 0.297]      
#> ===================================================================================
summary(rdhte(y = y[w_left == 1], x = x[w_left == 1],
              covs.hte = w_strength[w_left == 1], h = 0.1,
              cluster = cluster_var[w_left == 1]))
#> Sharp RD Heterogeneous Treatment Effects: Continuous., Cluster-Adjusted
#> 
#> Number of Obs.                17596
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      CR1
#> 
#> Number of Obs.                 9436         8160
#> Eff. Number of Obs.            4458         4313
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.100        0.100
#> 
#> ===================================================================================
#>                               Point        Robust Inference
#>                            Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> -----------------------------------------------------------------------------------
#>                       T      -0.093      -0.571       0.568  [-0.337 , 0.185]      
#> T:w_strength[w_left == 1]       0.419       1.213       0.225  [-0.233 , 0.992]      
#> ===================================================================================
rdhte_lincom(model = rd_interaction,
             linfct = c("`T` + `T:w_left` = 0",
                        "`T:w_strength` + `T:w_left:w_strength` = 0"))
#> $individual
#>                               hypothesis estimate z_stat p_value conf.low
#> 1                         T + `T:w_left`   -0.093 -0.571   0.568   -0.337
#> 2 `T:w_strength` + `T:w_left:w_strength`    0.419  1.213   0.225   -0.233
#>   conf.high
#> 1     0.185
#> 2     0.992
#> 
#> $joint
#>   statistic X2L p_value
#> 1     42.13   2       0

Standalone bandwidth selection (`rdbwhte`)

rdbwhte runs the same data-driven bandwidth selectors as rdhte but returns the bandwidths without estimating the CATEs. Useful when you want to inspect or fix h and then explore alternative variance estimators or compare specifications at a common bandwidth.

bw <- rdbwhte(y = y, x = x, covs.hte = factor(w_ideology),
              cluster = cluster_var)
bw$h
#>            [,1]       [,2]
#> [1,] 0.11696532 0.11696532
#> [2,] 0.05645238 0.05645238
#> [3,] 0.08564972 0.08564972
#> [4,] 0.09037549 0.09037549

bw.joint = TRUE forces a single shared bandwidth across cells:

bw_joint <- rdbwhte(y = y, x = x, covs.hte = factor(w_ideology),
                    cluster = cluster_var, bw.joint = TRUE)
bw_joint$h
#>            [,1]       [,2]
#> [1,] 0.09650206 0.09650206
#> [2,] 0.09650206 0.09650206
#> [3,] 0.09650206 0.09650206
#> [4,] 0.09650206 0.09650206

Efficiency-improving covariates (`covs.eff`)

covs.eff adds covariates to the local-polynomial regression to shrink standard errors without changing the identification of the CATE. They enter additively (and as covs.eff x W interactions in the heterogeneity-aware paths) but never with the treatment indicator or the running-variable polynomial. Useful when you have strong pretreatment predictors of the outcome.

m_no_eff <- rdhte(y = y, x = x, covs.hte = factor(w_ideology),
                   cluster = cluster_var)
m_eff    <- rdhte(y = y, x = x, covs.hte = factor(w_ideology),
                   covs.eff = w_strength, cluster = cluster_var)
data.frame(cell       = m_no_eff$W.lev,
           se_no_eff  = round(m_no_eff$se.rb, 4),
           se_eff     = round(m_eff$se.rb,    4),
           pct_change = round(100 * (m_eff$se.rb - m_no_eff$se.rb)
                              / m_no_eff$se.rb, 1))
#>                     cell se_no_eff se_eff pct_change
#> factor(w_ideology)1    1    0.0128 0.0129        0.6
#> factor(w_ideology)2    2    0.0691 0.0689       -0.3
#> factor(w_ideology)3    3    0.0083 0.0084        0.6
#> factor(w_ideology)4    4    0.0113 0.0117        3.0

Plotting

plot() is a post-estimation method for categorical covs.hte: one point per cell at the conventional point estimate, with the robust bias-corrected CI as an error bar.

plot(rd_ideology)

rdhte forest plot, ideology buckets

sort = TRUE reorders cells along the x-axis by their point estimate – often the more useful default for ranking-style narratives:

plot(rd_ideology, sort = TRUE)

sorted forest plot

The returned object is a ggplot, so additional layers compose naturally. For example, a custom title with horizontal orientation:

p <- plot(rd_ideology, sort = TRUE,
          title = "Heterogeneity by ideology bucket",
          ylab  = "Sharp RD ITT")

custom-themed forest plot

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p + ggplot2::theme_minimal(base_size = 11) +
      ggplot2::coord_flip()
}

custom-themed forest plot

Building publication-ready tables

rdhte exposes the per-cell results through tidy() and a one-row summary through glance() (both broom-compatible). Combined with knitr::kable() / xtable::xtable() the same numbers can be turned into Markdown, HTML, or LaTeX tables for papers.

(A) per-cell table from a single call

tidy() returns one row per cell with the conventional point estimate, robust BC standard error, z / p-value, BC confidence interval, and per-side h and Nh. Renaming or formatting columns before kable() gives a clean publication panel:

tab_A <- tidy(rd_ideology)
tab_A
#>   term     estimate  std.error  statistic      p.value      conf.low  conf.high
#> 1    1  0.088652653 0.01279375  6.8186555 9.189647e-12  0.0621608620 0.11231142
#> 2    2 -0.024026038 0.06914491 -0.8533574 3.934611e-01 -0.1945268631 0.07651622
#> 3    3  0.025632358 0.00831174  1.8974042 5.777461e-02 -0.0005199808 0.03206144
#> 4    4  0.007008009 0.01133009  0.8840432 3.766729e-01 -0.0121902781 0.03222286
#>   estimate.bc     h.left    h.right n.eff.left n.eff.right
#> 1  0.08723614 0.11696532 0.11696532       5036        4840
#> 2 -0.05900532 0.05645238 0.05645238        181         156
#> 3  0.01577073 0.08564972 0.08564972       3816        4202
#> 4  0.01001629 0.09037549 0.09037549        713         384

knitr::kable(
  tab_A[, c("term", "estimate", "std.error", "conf.low", "conf.high",
            "n.eff.left", "n.eff.right", "h.left", "h.right")],
  digits = c(NA, 3, 3, 3, 3, 0, 0, 3, 3),
  col.names = c("Cell", "Estimate", "SE", "CI lo", "CI hi",
                "Nh-", "Nh+", "h-", "h+"),
  caption = "Per-cell RD effects by ideology bucket"
)

Per-cell RD effects by ideology bucket
Cell	Estimate	SE	CI lo	CI hi	Nh-	Nh+	h-	h+
1	0.089	0.013	0.062	0.112	5036	4840	0.117	0.117
2	-0.024	0.069	-0.195	0.077	181	156	0.056	0.056
3	0.026	0.008	-0.001	0.032	3816	4202	0.086	0.086
4	0.007	0.011	-0.012	0.032	713	384	0.090	0.090

glance() complements tidy() with a one-row description of the fit (N, polynomial orders, kernel, VCE, bandwidth selector):

glance(rd_ideology)
#>       n n.left n.right cutoff p q     kernel vce vce_select bwselect level
#> 1 39534  19723   19811      0 1 2 Triangular CR1        cr1    mserd    95
#>   n.terms covs.continuous
#> 1       4           FALSE

(B) cell x specification comparison

Fix the covs.hte argument and vary vce across columns to see how the choice of variance estimator moves the per-cell standard errors.

specs <- list(HC0 = list(vce = "hc0"),
              HC2 = list(vce = "hc2"),
              HC3 = list(vce = "hc3"),
              CR1 = list(vce = "cr1", cluster = cluster_var))

fit_one <- function(args) {
  args <- c(list(y = y, x = x, covs.hte = factor(w_ideology)), args)
  do.call(rdhte, args)
}

cells  <- sort(unique(w_ideology))
mat_pt <- sapply(specs, function(s) tidy(fit_one(s))$estimate)
mat_se <- sapply(specs, function(s) tidy(fit_one(s))$std.error)
rownames(mat_pt) <- rownames(mat_se) <- as.character(cells)

mat_pt
#>            HC0          HC2          HC3          CR1
#> 1  0.088623908  0.088625020  0.088625979  0.088652653
#> 2 -0.024247190 -0.023878639 -0.023577737 -0.024026038
#> 3  0.025784637  0.025791260  0.025797429  0.025632358
#> 4  0.007022683  0.007006824  0.006991369  0.007008009
mat_se
#>           HC0        HC2         HC3        CR1
#> 1 0.011736372 0.01174108 0.011746082 0.01279375
#> 2 0.069234017 0.07084168 0.072492654 0.06914491
#> 3 0.008274419 0.00827868 0.008283159 0.00831174
#> 4 0.011338034 0.01136898 0.011400526 0.01133009

A common publication layout interleaves the point estimate and its SE in parentheses below it:

n_cells <- nrow(mat_pt)
out     <- data.frame(matrix(NA_character_, nrow = 2 * n_cells,
                             ncol = ncol(mat_pt)))
for (i in seq_len(n_cells)) {
  out[2 * i - 1, ] <- sprintf("%.3f", mat_pt[i, ])
  out[2 * i,     ] <- sprintf("(%.3f)", mat_se[i, ])
}
out <- cbind(Cell = rep(rownames(mat_pt), each = 2), out)
out$Cell[seq(2, nrow(out), by = 2)] <- ""
colnames(out)[-1] <- colnames(mat_pt)
knitr::kable(out, caption = "Per-cell estimates by variance option (SE in parentheses)")

Per-cell estimates by variance option (SE in parentheses)
Cell	HC0	HC2	HC3	CR1
1	0.089	0.089	0.089	0.089
	(0.012)	(0.012)	(0.012)	(0.013)
2	-0.024	-0.024	-0.024	-0.024
	(0.069)	(0.071)	(0.072)	(0.069)
3	0.026	0.026	0.026	0.026
	(0.008)	(0.008)	(0.008)	(0.008)
4	0.007	0.007	0.007	0.007
	(0.011)	(0.011)	(0.011)	(0.011)

(C) LaTeX export

xtable or kableExtra can render the same data frame as LaTeX. The snippet below produces a paper-ready table; uncomment the writeLines() call to write to a file:

tex <- xtable::xtable(
  tab_A[, c("term", "estimate", "std.error", "conf.low", "conf.high")],
  digits  = c(0, 0, 3, 3, 3, 3),
  caption = "Per-cell RD effects by ideology bucket",
  label   = "tab:rdhte-ideology"
)
print(tex, include.rownames = FALSE, booktabs = TRUE,
      caption.placement = "top")
#> % latex table generated in R 4.6.0 by xtable 1.8-8 package
#> % Tue May 26 09:50:57 2026
#> \begin{table}[ht]
#> \centering
#> \caption{Per-cell RD effects by ideology bucket} 
#> \label{tab:rdhte-ideology}
#> \begin{tabular}{lrrrr}
#>   \toprule
#> term & estimate & std.error & conf.low & conf.high \\ 
#>   \midrule
#> 1 & 0.089 & 0.013 & 0.062 & 0.112 \\ 
#>   2 & -0.024 & 0.069 & -0.195 & 0.077 \\ 
#>   3 & 0.026 & 0.008 & -0.001 & 0.032 \\ 
#>   4 & 0.007 & 0.011 & -0.012 & 0.032 \\ 
#>    \bottomrule
#> \end{tabular}
#> \end{table}
# writeLines(capture.output(print(tex, include.rownames = FALSE,
#                                booktabs = TRUE,
#                                caption.placement = "top")),
#            con = "rdhte_table_A.tex")

Replicating rdrobust

Average effects from rdhte() and rdrobust() will not match under default settings, because the two packages use different defaults for rho and vce:

summary(rdhte(y = y, x = x))
#> Sharp RD Average Treatment Effect.
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                      HC3
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9536         9550
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.097        0.097
#> 
#> ========================================================================
#>                    Point        Robust Inference
#>                 Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> ------------------------------------------------------------------------
#>            T       0.051       7.071       0.000   [0.035 , 0.062]      
#> ========================================================================
summary(rdrobust(y = y, x = x))
#> Call: rdrobust
#> 
#> Sharp RD estimates using local polynomial regression.
#> 
#> Number of Obs.                39534
#> BW type                       mserd
#> Kernel                   Triangular
#> VCE method                       NN
#> 
#>                                Left        Right
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9546         9560
#> Order est. (p)                    1            1
#> Order bias (q)                    2            2
#> BW est. (h)                   0.097        0.097
#> BW bias (b)                   0.159        0.159
#> rho (h/b)                     0.608        0.608
#> Unique Obs.                   19660        19746
#> 
#> =====================================================================
#>                    Point    Robust Inference
#>                 Estimate         z     P>|z|      [ 95% C.I. ]       
#> ---------------------------------------------------------------------
#>      RD Effect     0.051     8.894     0.000     [0.039 , 0.062]     
#> =====================================================================

To replicate exactly with rdrobust, set rho = 1, choose a matching vce, and enforce the same bandwidth:

summary(rdhte(y = y, x = x, h = 0.1, vce = "hc3"))
#> Sharp RD Average Treatment Effect.
#> 
#> Number of Obs.                39534
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      HC3
#> 
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9829         9843
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> BW est. (h)                   0.100        0.100
#> 
#> ========================================================================
#>                    Point        Robust Inference
#>                 Estimate           z    Pr(>|z|)      [ 95% C.I. ]      
#> ------------------------------------------------------------------------
#>            T       0.051       7.194       0.000   [0.035 , 0.062]      
#> ========================================================================
summary(rdrobust(y = y, x = x, h = 0.1, rho = 1, vce = "hc3"))
#> Call: rdrobust
#> 
#> Sharp RD estimates using local polynomial regression.
#> 
#> Number of Obs.                39534
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      HC3
#> 
#>                                Left        Right
#> Number of Obs.                19723        19811
#> Eff. Number of Obs.            9829         9843
#> Order est. (p)                    1            1
#> Order bias (q)                    2            2
#> BW est. (h)                   0.100        0.100
#> BW bias (b)                   0.100        0.100
#> rho (h/b)                     1.000        1.000
#> Unique Obs.                   19723        19811
#> 
#> =====================================================================
#>                    Point    Robust Inference
#>                 Estimate         z     P>|z|      [ 95% C.I. ]       
#> ---------------------------------------------------------------------
#>      RD Effect     0.051     7.194     0.000     [0.035 , 0.062]     
#> =====================================================================

The same trick reproduces the per-cell rdrobust estimates from a single rdhte call with per-cell bandwidths:

summary(rdhte(y = y, x = x, covs.hte = w_left,
              h = c(0.078, 0.116), vce = "hc3"))
#> Sharp RD Heterogeneous Treatment Effects: Subgroups.
#> 
#> Number of Obs.                39534
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      HC3
#> 
#> Number of Obs.                19723        19811
#> Order est. (p)                    1            1
#> Order bias  (q)                   2            2
#> 
#> ================================================================================================================
#>                    Point        Robust Inference
#>       w_left    Estimate           z    Pr(>|z|)      [ 95% C.I. ]             Nh-       Nh+        h-        h+
#> ----------------------------------------------------------------------------------------------------------------
#>            0       0.021       1.522       0.128  [-0.003 , 0.027]            4388      4453     0.078     0.078
#>            1       0.089       7.435       0.000   [0.064 , 0.110]            5010      4799     0.116     0.116
#> ================================================================================================================
summary(rdrobust(y = y[w_left == 1], x = x[w_left == 1],
                 h = 0.116, rho = 1, vce = "hc3"))
#> Call: rdrobust
#> 
#> Sharp RD estimates using local polynomial regression.
#> 
#> Number of Obs.                17596
#> BW type                      Manual
#> Kernel                   Triangular
#> VCE method                      HC3
#> 
#>                                Left        Right
#> Number of Obs.                 9436         8160
#> Eff. Number of Obs.            5010         4799
#> Order est. (p)                    1            1
#> Order bias (q)                    2            2
#> BW est. (h)                   0.116        0.116
#> BW bias (b)                   0.116        0.116
#> rho (h/b)                     1.000        1.000
#> Unique Obs.                    9436         8160
#> 
#> =====================================================================
#>                    Point    Robust Inference
#>                 Estimate         z     P>|z|      [ 95% C.I. ]       
#> ---------------------------------------------------------------------
#>      RD Effect     0.089     7.435     0.000     [0.064 , 0.110]     
#> =====================================================================

rdhte: Empirical Illustration

Overview

The data and the design

Single binary variable

Single categorical variable – unordered

Single categorical variable – ordered

Two binary variables – interaction

Single continuous variable

Interaction effect: binary x continuous

Standalone bandwidth selection (`rdbwhte`)

Efficiency-improving covariates (`covs.eff`)

Plotting

Building publication-ready tables

(A) per-cell table from a single call

(B) cell x specification comparison

(C) LaTeX export

Replicating rdrobust

See also

Reference

rdhte: Empirical Illustration

Overview

The data and the design

Single binary variable

Single categorical variable – unordered

Single categorical variable – ordered

Two binary variables – interaction

Single continuous variable

Interaction effect: binary x continuous

Standalone bandwidth selection (rdbwhte)

Efficiency-improving covariates (covs.eff)

Plotting

Building publication-ready tables

(A) per-cell table from a single call

(B) cell x specification comparison

(C) LaTeX export

Replicating rdrobust

See also

Reference

Standalone bandwidth selection (`rdbwhte`)

Efficiency-improving covariates (`covs.eff`)