The R package coga calculates the density and distribution function of the convolution of gamma distributions, that is, the distribution of a sum of independent gamma random variables. The algorithm comes from Moschopoulos (1985) [1]. The R code in this vignette can also serve as a set of usage examples.
Assume that we have random variables \(X_1, ..., X_n\) that independently follow gamma distributions with shape parameters \(\alpha_i\) and scale parameters \(\beta_i\), where \(i = 1, ..., n\). Then the density of \(Y = X_1 + ... + X_n\) can be expressed as:
\[g(y) = C \sum_{k=0}^{\infty} \lambda_k y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})\]
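The symbols \(C\), \(\rho\), \(\beta_1\) and \(\lambda_k\) are not defined in this vignette. Following our reading of Moschopoulos (1985) [1], which should be checked against the paper, they can be taken as follows, with \(\gamma_k\) an auxiliary quantity used only in the recursion:

\[\beta_1 = \min_i \beta_i, \qquad \rho = \sum_{i=1}^{n} \alpha_i, \qquad C = \prod_{i=1}^{n} (\beta_1 / \beta_i)^{\alpha_i}\]
\[\gamma_k = \frac{1}{k} \sum_{i=1}^{n} \alpha_i (1 - \beta_1 / \beta_i)^{k}, \qquad \lambda_0 = 1, \qquad \lambda_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k+1} i \gamma_i \lambda_{k+1-i}\]

With a single variable (\(n = 1\)) every \(\gamma_k\) vanishes, so only the \(k = 0\) term survives and \(g(y)\) reduces to the ordinary gamma density.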
The distribution function \(G(w)=Pr(Y<w)\) can be expressed as:
\[G(w) = C \sum_{k=0}^{\infty} \lambda_k \int_{0}^{w} (y^{\rho + k - 1} e^{-y/\beta_1} / (\Gamma(\rho + k) \beta_{1}^{\rho + k})) dy\]
The integral in this formula is a regularized incomplete gamma function and can be evaluated with the distribution function of the gamma distribution.
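As a quick check of this statement, the sketch below integrates one term of the series numerically and compares it with `pgamma` from base R. The values of `rho`, `k`, `beta1` and `w` are arbitrary illustrative choices, not taken from the package.

```r
# Arbitrary illustrative values (assumptions, not from coga)
rho   <- 7      # sum of the shape parameters
k     <- 2      # index of the series term
beta1 <- 1 / 3  # scale parameter appearing in the series
w     <- 1.5    # upper limit of integration

# Integrand of the k-th term of G(w)
term <- function(y) {
  y^(rho + k - 1) * exp(-y / beta1) / (gamma(rho + k) * beta1^(rho + k))
}

by_integration <- integrate(term, lower = 0, upper = w)$value
by_pgamma      <- pgamma(w, shape = rho + k, scale = beta1)

# The two values should agree up to numerical integration error
all.equal(by_integration, by_pgamma, tolerance = 1e-6)
```

In other words, each term of \(G(w)\) is a gamma distribution function with shape \(\rho + k\) and scale \(\beta_1\).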
More details about this algorithm can be found in Moschopoulos (1985) [1].
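A minimal base-R sketch of the density series, truncated after a fixed number of terms, may make the formula more concrete. `dsum_gamma_series` and `n_terms` are hypothetical names, and the constant and coefficient recursion written in the comments are our reading of Moschopoulos (1985) [1]; this is not the package's own implementation.

```r
# Hypothetical helper, not part of coga: truncated series for the density of
# a sum of independent gamma variables with the given shapes and scales.
dsum_gamma_series <- function(y, shape, scale, n_terms = 200) {
  beta1 <- min(scale)                  # smallest scale parameter
  rho   <- sum(shape)                  # sum of the shape parameters
  C     <- prod((beta1 / scale)^shape) # normalising constant

  # gam[k] = sum_i shape_i * (1 - beta1 / scale_i)^k / k, for k = 1..n_terms
  gam <- vapply(seq_len(n_terms),
                function(k) sum(shape * (1 - beta1 / scale)^k) / k,
                numeric(1))

  # lambda[k + 1] stores lambda_k; lambda_0 = 1 and
  # lambda_{k+1} = sum_{i=1}^{k+1} i * gam_i * lambda_{k+1-i} / (k + 1)
  lambda <- numeric(n_terms + 1)
  lambda[1] <- 1
  for (k in 0:(n_terms - 1)) {
    i <- seq_len(k + 1)
    lambda[k + 2] <- sum(i * gam[i] * lambda[k + 2 - i]) / (k + 1)
  }

  # The k-th term is a gamma density with shape rho + k and scale beta1
  vapply(y, function(v) {
    C * sum(lambda * dgamma(v, shape = rho + 0:n_terms, scale = beta1))
  }, numeric(1))
}

# Shapes 3 and 4 with rates 2 and 3, i.e. scales 1/2 and 1/3
dsum_gamma_series(c(1, 2, 3), shape = c(3, 4), scale = c(1 / 2, 1 / 3))
```

Replacing `dgamma` with `pgamma` in the last step would give the corresponding truncated series for \(G(w)\). The fixed truncation at 200 terms is a simplification for illustration; convergence slows down when the scale parameters differ widely.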
Assume that we have two random variables, \(X_1\) and \(X_2\), where \(X_1\) follows a gamma distribution with shape parameter \(3\) and rate parameter \(2\), and \(X_2\) follows a gamma distribution with shape parameter \(4\) and rate parameter \(3\). The rate is the reciprocal of the scale used in the formulas above. We calculate the density and distribution function of \(Y = X_1 + X_2\).
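Before turning to the package functions, a coga-free sanity check of this setup is possible with base R alone: the mean and variance of \(Y\) are known exactly, so a simulated sample can be compared against them. The seed and sample size below are arbitrary choices for illustration.

```r
set.seed(1)
n <- 1e5

# Simulate Y = X1 + X2 directly with base R (no coga functions involved)
y <- rgamma(n, shape = 3, rate = 2) + rgamma(n, shape = 4, rate = 3)

# For a gamma variable, mean = shape / rate and variance = shape / rate^2;
# means add, and variances add under independence
c(sample_mean = mean(y), exact_mean = 3 / 2 + 4 / 3)
c(sample_var  = var(y),  exact_var  = 3 / 4 + 4 / 9)
```

The sample values should be close to the exact ones, which gives a reference point independent of the series algorithm.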
Correctness check for density function:
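library(coga)
# Simulate one million draws of Y (shapes 3 and 4, rates 2 and 3)
y <- rcoga(1000000, c(3,4), c(2,3))
grid <- seq(0, 8, length.out=1000)
pdf <- dcoga(grid, shape=c(3, 4), rate=c(2, 3))
# Kernel density estimate of the sample (blue) against dcoga (red)
plot(density(y), col="blue")
lines(grid, pdf, col="red")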
Correctness check for distribution function:
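# Simulate one million draws of Y (shapes 3 and 4, rates 2 and 3)
y <- rcoga(1000000, c(3,4), c(2,3))
grid <- seq(0, 8, length.out=1000)
cdf <- pcoga(grid, shape=c(3, 4), rate=c(2, 3))
# Empirical distribution function of the sample (blue) against pcoga (red)
plot(ecdf(y), col="blue")
lines(grid, cdf, col="red")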
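The ‘dcoga’ and ‘pcoga’ functions in the package ‘coga’ are implemented in C++. The following benchmark, run on a computer with an Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz, shows the speed advantage of the C++ code over pure R versions of the same functions: by median time, ‘dcoga’ is roughly 20 times faster and ‘pcoga’ roughly 8 times faster.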
grid <- seq(0, 15, length.out=10)
microbenchmark::microbenchmark(
  dcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
  coga:::dcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4)),
  pcoga(grid, shape=c(3,4,5), rate=c(2,3,4)),
  coga:::pcoga.R(grid, shape=c(3,4,5), rate=c(2,3,4))
)
## Unit: milliseconds
## expr                                                              min        lq      mean    median        uq        max neval
## dcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))           1.310279  1.483728  1.750481  1.757259  1.889137   2.677676   100
## coga:::dcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 30.424495 32.744795 39.082002 35.437426 38.668515 102.428354   100
## pcoga(grid, shape = c(3, 4, 5), rate = c(2, 3, 4))           4.553233  4.958614  9.088243  5.221953  5.796708  49.863904   100
## coga:::pcoga.R(grid, shape = c(3, 4, 5), rate = c(2, 3, 4)) 37.759052 40.370152 53.267497 43.868735 72.830472  95.992447   100
Note: In this example, ‘dcoga.R’ and ‘pcoga.R’ are pure R versions of the density and distribution functions of the convolution of gamma distributions. These two functions are not exported by the package ‘coga’, but you can still call them as ‘coga:::dcoga.R’ and ‘coga:::pcoga.R’.
The convolution of two gamma distributions is a special case of the convolution of gamma distributions. The functions ‘dcoga2dim’ and ‘pcoga2dim’ handle this case more efficiently than the general functions ‘dcoga’ and ‘pcoga’. In the benchmark below, by median time, ‘dcoga2dim’ is about 300 times faster than ‘dcoga’ and ‘pcoga2dim’ is about 16 times faster than ‘pcoga’.
grid <- seq(0, 15, length.out=100)
microbenchmark::microbenchmark(
  dcoga(grid, shape=c(3,4), rate=c(2,3)),
  dcoga2dim(grid, 3, 4, 2, 3),
  pcoga(grid, shape=c(3,4), rate=c(2,3)),
  pcoga2dim(grid, 3, 4, 2, 3)
)
## Unit: microseconds
## expr                                               min         lq      mean    median        uq        max neval
## dcoga(grid, shape = c(3, 4), rate = c(2, 3)) 16481.804 18782.0325 27693.791 21054.628 41875.764  54540.518   100
## dcoga2dim(grid, 3, 4, 2, 3)                     58.021    62.3715    72.482    71.025    76.619    144.368   100
## pcoga(grid, shape = c(3, 4), rate = c(2, 3)) 37958.314 39996.2400 57253.241 61935.891 67808.327 131291.390   100
## pcoga2dim(grid, 3, 4, 2, 3)                   3815.581  3830.6490  4029.803  3842.713  4074.259   5741.707   100
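For the two-variable case there is also a simple independent cross-check: the density of \(X_1 + X_2\) is the convolution integral of the two gamma densities, which base R can evaluate numerically. The helper `dconv2_numeric` below is hypothetical and is not how ‘dcoga2dim’ is implemented; its argument order simply mirrors the calls above (two shapes, then two rates).

```r
# Hypothetical helper, not part of coga: density of X1 + X2 obtained by
# numerically integrating f1(x) * f2(y - x) over 0 <= x <= y.
dconv2_numeric <- function(y, shape1, shape2, rate1, rate2) {
  vapply(y, function(v) {
    if (v <= 0) return(0)  # the sum of two gamma variables is positive
    integrate(function(x) {
      dgamma(x, shape = shape1, rate = rate1) *
        dgamma(v - x, shape = shape2, rate = rate2)
    }, lower = 0, upper = v)$value
  }, numeric(1))
}

# With equal rates the sum is again gamma distributed (shapes add),
# which gives an exact reference value for the numerical result
dconv2_numeric(2, shape1 = 3, shape2 = 4, rate1 = 2, rate2 = 2)
dgamma(2, shape = 7, rate = 2)
```

Numerical integration like this is much slower than a dedicated implementation, so it is only useful for spot checks at a few points.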
Note that the R functions dcoga, pcoga, and rcoga in this package can handle shape and rate parameters of different lengths by recycling the shorter one. That means that dcoga(3, c(2,3), c(3,4,5,3,4)) and dcoga(3, c(2,3,2,3,2), c(3,4,5,3,4)) give the same result. If the length of the longer parameter is not a multiple of the length of the shorter one, as in the first call here, these three functions issue a warning.
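The recycling in this example can be reproduced with base R. The sketch below only illustrates R's usual recycling rule using `rep_len`; we have not checked how coga implements it internally.

```r
shape <- c(2, 3)
rate  <- c(3, 4, 5, 3, 4)

# rep_len() repeats the shorter vector until it reaches the target length
recycled_shape <- rep_len(shape, length(rate))
recycled_shape
# 2 3 2 3 2

# TRUE means the longer length is not a multiple of the shorter one,
# which is the situation in which the text says a warning is issued
length(rate) %% length(shape) != 0
```

Passing parameter vectors of equal length avoids both the recycling and the warning.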
[1] Moschopoulos, Peter G. “The distribution of the sum of independent gamma random variables.” Annals of the Institute of Statistical Mathematics 37.1 (1985): 541-544.
[2] Mathai, A. M. “Storage capacity of a dam with gamma type inputs.” Annals of the Institute of Statistical Mathematics 34 (1982): 591-597.