Example usage of functions

Patrick Schratz

2016-10-11

Load example data

Data source: ?mgcv::predict.gam

library(oddsratio)
suppressPackageStartupMessages(library(mgcv))
set.seed(1234)
n <- 200
sig <- 2
dat <- suppressMessages(gamSim(1, n = n, scale = sig))
dat$x4 <- as.factor(c(rep("A", 50), rep("B", 50), rep("C", 50), rep("D", 50)))

fit.gam <- mgcv::gam(y ~ s(x0) + s(I(x1^2)) + s(x2) + offset(x3) + x4, data = dat)

GAM example

Calculate OR for specific increment step of continuous variable

To calculate the odds ratio for a specific increment of a continuous predictor in fit.gam, we take predictor x2 (chosen arbitrarily) and specify the two values between which the odds ratio should be calculated.
The output shows that the odds of response y happening are about 23 times higher when predictor x2 increases from 0.099 to 0.198 while all other predictors are held constant.

calc.oddsratio.gam(data = dat, model = fit.gam, 
                   pred = "x2", values = c(0.099, 0.198))
## Predictor: 'x2'
## 
## Odds ratio from '0.099' to '0.198': 23.32353

Usually, this calculation is done by setting all predictors to their mean value, predicting the response, changing the desired predictor to a new value and predicting the response again. These steps result in two log odds values, which are transformed into odds by exponentiating them. Finally, the odds ratio is calculated as the ratio of these two odds values.
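The steps described above can be sketched by hand. The following is a minimal illustration and not the internal code of calc.oddsratio.gam: it swaps in a plain logistic regression (stats::glm) on simulated data instead of the GAM, so that the result can be checked against the closed form exp(beta * increment). The data frame sim, the chosen values 0.2 and 0.6, and all object names are assumptions made for this example.

```r
# Simulated data (hypothetical, only for this sketch)
set.seed(42)
n <- 500
sim <- data.frame(x1 = runif(n), x2 = runif(n))
sim$y <- rbinom(n, 1, plogis(-1 + 2 * sim$x1 - 1 * sim$x2))

# Logistic regression used as a stand-in for the GAM
fit <- glm(y ~ x1 + x2, data = sim, family = binomial)

# Step 1: set all predictors to their mean value
ref <- data.frame(x1 = mean(sim$x1), x2 = mean(sim$x2))

# Step 2: change only the predictor of interest
low <- ref
low$x1 <- 0.2
high <- ref
high$x1 <- 0.6

# Step 3: predict on the link scale, which yields log odds
log_odds_low <- unname(predict(fit, newdata = low, type = "link"))
log_odds_high <- unname(predict(fit, newdata = high, type = "link"))

# Step 4: exponentiate to odds and take the ratio
odds_ratio <- exp(log_odds_high) / exp(log_odds_low)

# For a GLM the same value follows directly from the coefficient
closed_form <- exp(coef(fit)[["x1"]] * (0.6 - 0.2))
```

For a GAM with smooth terms no such closed form exists, which is why the prediction-based route is needed there.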

Calculate OR for level change of indicator variable

If the predictor is an indicator variable, i.e. consists of fixed levels, you can use the function in the same way by passing the two levels you are interested in:

calc.oddsratio.gam(data = dat, model = fit.gam, 
                   pred = "x4", values = c("A", "B"))
## Predictor: 'x4'
## 
## Odds ratio from 'A' to 'B': 1.377537

Here, the change in odds of y happening when predictor x4 changes from level A to level B is rather small: an increase in odds of 37.8% is reported.

Calculate ORs for percentage increments of predictor distribution

To get an impression of how the odds ratio behaves throughout the complete range of the smoothing function of the fitted GAM, you can calculate odds ratios based on percentage breaks of the predictor's distribution.
Here we slice predictor x2 into 5 parts by taking the predictor values at every 20% increment step.

calc.oddsratio.gam(data = dat, model = fit.gam, pred = "x2", 
                   percentage = 20, slice = TRUE)
## Predictor: 'x2'
## Steps:     5 (20%)
## 
## Odds ratio from 0.001(0%) to 0.2(20%): 2510.768
## Odds ratio from 0.2(20%) to 0.4(40%): 0.02870699
## Odds ratio from 0.4(40%) to 0.599(60%): 0.576121
## Odds ratio from 0.599(60%) to 0.799(80%): 0.06032289
## Odds ratio from 0.799(80%) to 0.998(100%): 0.4063187

We can see that a high odds ratio is reported when predictor x2 increases from 0.001 to 0.2, while each further increment step reduces the odds of response y happening (all remaining odds ratios are below 1).
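How the break points for such percentage steps can be derived is sketched below. This is an assumption about the approach, not the package's confirmed implementation: the example computes breaks that are equally spaced over the predictor's range and, for comparison, breaks based on quantiles. For a uniformly distributed predictor such as x2 both variants give similar values. The vector x and all object names are hypothetical.

```r
# Hypothetical predictor values, uniformly distributed like x2
set.seed(1234)
x <- runif(200)

# Percentage step, analogous to the 'percentage' argument used above
percentage <- 20
probs <- seq(0, 100, by = percentage) / 100

# Variant A (assumption): breaks equally spaced over the value range
breaks_range <- min(x) + probs * (max(x) - min(x))

# Variant B (assumption): breaks at the empirical quantiles
breaks_quantile <- quantile(x, probs = probs, names = FALSE)

# Consecutive pairs of breaks define the steps
# for which odds ratios would be calculated
steps <- data.frame(from = head(breaks_range, -1),
                    to = tail(breaks_range, -1))
```

Each row of steps corresponds to one "from ... to ..." line in an output like the one shown above.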

GLM example

Create example data.
Data source: http://www.ats.ucla.edu/stat/r/dae/logit.htm

dat <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
dat$rank <- factor(dat$rank)
fit.glm <- glm(admit ~ gre + gpa + rank, data = dat, family = "binomial")

Calculate odds ratio for continuous predictors

For GLMs, the odds ratio calculation is simpler because the odds ratio for a fixed predictor increase is the same throughout the complete value range of each predictor.

Hence, function calc.oddsratio.glm takes the increment steps of each predictor directly as an input in its parameter incr.

To avoid false predictor/value assignments, the combinations need to be given in a list.

Odds ratios of indicator variables are computed automatically and always refer to the base factor level.

Indicator predictor rank has four levels. Consequently, we will get three odds ratio outputs referring to the base factor level (here: rank1).

The output is interpreted as follows: “Having rank2 instead of rank1 while holding all other values constant results in a decrease in odds of 49.1% (1-0.509)”.

calc.oddsratio.glm(data = dat, model = fit.glm, incr = list(gre = 380, gpa = 5))
## Variable:   'gre'
## Increment:  '380'
## Odds ratio: 2.364
## 
## Variable:   'gpa'
## Increment:  '5'
## Odds ratio: 55.712
## 
## Variable:   'rank2'
## Increment:  'Indicator variable. Refer to base factor level!'
## Odds ratio: 0.509
## 
## Variable:   'rank3'
## Increment:  'Indicator variable. Refer to base factor level!'
## Odds ratio: 0.262
## 
## Variable:   'rank4'
## Increment:  'Indicator variable. Refer to base factor level!'
## Odds ratio: 0.212
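Odds ratios of this kind can also be derived by hand from the model coefficients, since in a logistic regression the odds ratio for an increment is exp(coefficient * increment). The sketch below shows this on the infert data set that ships with R instead of the admission data, so that it runs without a download. The model formula, the increments and all object names are assumptions chosen for illustration; the code is not taken from calc.oddsratio.glm.

```r
# Logistic regression on a built-in data set (stand-in for fit.glm)
fit <- glm(case ~ age + parity + education,
           data = infert, family = binomial)
betas <- coef(fit)

# Increments for the continuous predictors, given as a named list
incr <- list(age = 5, parity = 1)

# Odds ratio per continuous predictor: exp(beta * increment)
or_cont <- sapply(names(incr), function(v) exp(betas[[v]] * incr[[v]]))

# Factor levels: exp(beta) relative to the base level of 'education'
factor_terms <- grep("^education", names(betas), value = TRUE)
or_factor <- exp(betas[factor_terms])

# Percentage change in odds, e.g. an odds ratio of 0.509 means -49.1%
pct_change <- (or_factor - 1) * 100
```

A factor with k levels yields k - 1 odds ratios, each comparing one level with the base level, which matches the three rank outputs above.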