The “forecastHybrid” package provides functions to build composite models using multiple individual component models from the “forecast” package. These hybridModel objects can then be manipulated with many of the familiar functions from the “forecast” and “stats” packages including forecast(), plot(), accuracy(), residuals(), and fitted().

Installation

The stable release of the package is hosted on CRAN and can be installed as usual.

install.packages("forecastHybrid")

The latest development version can be installed using the “devtools” package.

devtools::install_github("ellisp/forecastHybrid/pkg")

Version updates to CRAN will be published frequently after new features are implemented, so the development version is not recommended unless you plan to modify the code.

Basic usage

First load the package.

library(forecastHybrid)

Quick start

If you don't have time to read the whole guide and want to get started immediately with sane default settings to forecast the AirPassengers time series, run the following:

quickModel <- hybridModel(AirPassengers)
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the thetam model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
forecast(quickModel)
##          Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## Jan 1961       446.9855 420.9284 474.8496 409.9071 485.2807
## Feb 1961       430.4071 403.0907 467.2731 393.9304 482.4035
## Mar 1961       479.2576 429.7726 546.6654 419.4892 567.4945
## Apr 1961       487.6008 445.6985 536.6806 425.6737 559.7470
## May 1961       495.4361 443.0128 543.7546 420.7317 569.5254
## Jun 1961       561.7787 498.8558 623.6934 471.1949 655.8100
## Jul 1961       635.5908 550.5136 709.2802 517.2336 746.2962
## Aug 1961       628.1985 543.3613 706.3394 507.8542 747.9724
## Sep 1961       539.6004 468.8133 623.1380 435.9204 662.0555
## Oct 1961       480.1686 405.1625 547.4338 374.8081 583.4094
## Nov 1961       417.9191 349.4447 482.6467 321.6167 515.9216
## Dec 1961       461.2823 388.8923 542.5572 356.0996 581.6255
## Jan 1962       476.2971 392.2920 566.9139 357.3798 609.3690
## Feb 1962       459.5533 381.4061 555.7585 345.6841 599.1017
## Mar 1962       510.0848 433.7353 646.8169 391.0899 699.2010
## Apr 1962       519.1891 418.8586 632.5213 375.7208 685.5378
## May 1962       527.6191 416.8132 638.8202 371.9359 694.1236
## Jun 1962       597.6059 469.7070 730.7697 416.9301 796.0092
## Jul 1962       674.1815 518.5862 826.6530 457.8729 902.6495
## Aug 1962       666.2657 511.9731 824.0686 449.6093 901.9486
## Sep 1962       573.6394 441.7638 725.7011 385.8483 796.1653
## Oct 1962       511.4666 381.7623 636.5597 331.6121 699.9445
## Nov 1962       446.6849 329.2070 560.4302 284.3727 617.6363
## Dec 1962       490.8818 366.2764 629.2067 314.6143 694.9604
plot(forecast(quickModel), main = "Forecast from auto.arima, ets, thetam, nnetar, stlm, and tbats model")

plot of chunk quickstart

Fitting a model

The workhorse function of the package is hybridModel(), which combines several component models from the “forecast” package. At a minimum, the user must supply a ts object or numeric vector for y. In this case, the ensemble will include all six component models: auto.arima(), ets(), thetam(), nnetar(), stlm(), and tbats(). To use only a subset of these models, pass a character string to the models argument containing one letter for each model to include: the first letter of the model's name, except that thetam() is selected with "f". For example, to build an ensemble model on the gas dataset with auto.arima(), ets(), and tbats() components, run

# Build a hybrid forecast on the gas dataset using auto.arima, ets, and tbats models.
# Each model is given equal weight 
hm1 <- hybridModel(y = gas, models = "aet", weights = "equal")
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the tbats model
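
The letter codes accepted by the models argument can be illustrated with a small base R sketch. The helper expand_model_codes() is hypothetical and not part of “forecastHybrid”; the mapping is inferred from the examples in this guide ("aet", "aefnst") and the names of the tuning lists.

```r
# Hypothetical helper (not part of forecastHybrid): expand a models string
# such as "aet" into the names of the component models it selects.
expand_model_codes <- function(models) {
  # Assumed mapping, inferred from the examples in this guide
  lookup <- c(a = "auto.arima", e = "ets", f = "thetam",
              n = "nnetar", s = "stlm", t = "tbats")
  codes <- unique(strsplit(tolower(models), "")[[1]])
  unknown <- setdiff(codes, names(lookup))
  if (length(unknown) > 0) {
    stop("Unknown model code(s): ", paste(unknown, collapse = ", "))
  }
  unname(lookup[codes])
}

expand_model_codes("aet")
expand_model_codes("aefnst")
```

Under this assumed mapping, "aet" selects the same three components that were fit for hm1 above.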

The individual component models are stored inside the hybridModel object and can be viewed as its named elements. All the regular methods from the “forecast” package can be applied to these individual component models.

# View the individual models 
hm1$auto.arima
## Series: structure(c(1709, 1646, 1794, 1878, 2173, 2321, 2468, 2416, 2184,  2121, 1962, 1825, 1751, 1688, 1920, 1941, 2311, 2279, 2638, 2448,  2279, 2163, 1941, 1878, 1773, 1688, 1783, 1984, 2290, 2511, 2712,  2522, 2342, 2195, 1931, 1910, 1730, 1688, 1899, 1994, 2342, 2553,  2712, 2627, 2363, 2311, 2026, 1910, 1762, 1815, 2005, 2089, 2617,  2828, 2965, 2891, 2532, 2363, 2216, 2026, 1804, 1773, 2015, 2089,  2627, 2712, 3007, 2880, 2490, 2237, 2205, 1984, 1868, 1815, 2047,  2142, 2743, 2775, 3028, 2965, 2501, 2501, 2131, 2015, 1910, 1868,  2121, 2268, 2690, 2933, 3218, 3028, 2659, 2406, 2258, 2057, 1889,  1984, 2110, 2311, 2785, 3039, 3229, 3070, 2659, 2543, 2237, 2142,  1962, 1910, 2216, 2437, 2817, 3123, 3345, 3112, 2659, 2469, 2332,  2110, 1910, 1941, 2216, 2342, 2923, 3229, 3513, 3355, 2849, 2680,  2395, 2205, 1994, 1952, 2290, 2395, 2965, 3239, 3608, 3524, 3018,  2648, 2363, 2247, 1994, 1941, 2258, 2332, 3323, 3608, 3957, 3672,  3155, 2933, 2585, 2384, 2057, 2100, 2458, 2638, 3292, 3724, 4652,  4379, 4231, 3756, 3429, 3461, 3345, 4220, 4874, 5064, 5951, 6774,  7997, 7523, 7438, 6879, 6489, 6288, 5919, 6183, 6594, 6489, 8040,  9715, 9714, 9756, 8595, 7861, 7753, 8154, 7778, 7402, 8903, 9742,  11372, 12741, 13733, 13691, 12239, 12502, 11241, 10829, 11569,  10397, 12493, 11962, 13974, 14945, 16805, 16587, 14225, 14157,  13016, 12253, 11704, 12275, 13695, 14082, 16555, 17339, 17777,  17592, 16194, 15336, 14208, 13116, 12354, 12682, 14141, 14989,  16159, 18276, 19157, 18737, 17109, 17094, 15418, 14312, 13260,  14990, 15975, 16770, 19819, 20983, 22001, 22337, 20750, 19969,  17293, 16498, 15117, 16058, 18137, 18471, 21398, 23854, 26025,  25479, 22804, 19619, 19627, 18488, 17243, 18284, 20226, 20903,  23768, 26323, 28038, 26776, 22886, 22813, 22404, 19795, 18839,  18892, 20823, 22212, 25076, 26884, 30611, 30228, 26762, 25885,  23328, 21930, 21433, 22369, 24503, 25905, 30605, 34984, 37060,  34502, 31793, 29275, 28305, 25248, 27730, 27424, 32684, 31366,  37459, 41060, 
43558, 42398, 33827, 34962, 33480, 32445, 30715,  30400, 31451, 31306, 40592, 44133, 47387, 41310, 37913, 34355,  34607, 28729, 26138, 30745, 35018, 34549, 40980, 42869, 45022,  40387, 38180, 38608, 35308, 30234, 28801, 33034, 35294, 33181,  40797, 42355, 46098, 42430, 41851, 39331, 37328, 34514, 32494,  33308, 36805, 34221, 41020, 44350, 46173, 44435, 40943, 39269,  35901, 32142, 31239, 32261, 34951, 38109, 43168, 45547, 49568,  45387, 41805, 41281, 36068, 34879, 32791, 34206, 39128, 40249,  43519, 46137, 56709, 52306, 49397, 45500, 39857, 37958, 35567,  37696, 42319, 39137, 47062, 50610, 54457, 54435, 48516, 43225,  42155, 39995, 37541, 37277, 41778, 41666, 49616, 57793, 61884,  62400, 50820, 51116, 45731, 42528, 40459, 40295, 44147, 42697,  52561, 56572, 56858, 58363, 45627, 45622, 41304, 36016, 35592,  35677, 39864, 41761, 50380, 49129, 55066, 55671, 49058, 44503,  42145, 38698, 38963, 38690, 39792, 42545, 50145, 58164, 59035,  59408, 55988, 47321, 42269, 39606, 37059, 37963, 31043, 41712,  50366, 56977, 56807, 54634, 51367, 48073, 46251, 43736, 39975,  40478, 46895, 46147, 55011, 57799, 62450, 63896, 57784, 53231,  50354, 38410, 41600, 41471, 46287, 49013, 56624, 61739, 66600,  60054), .Tsp = c(1956, 1995.58333333333, 12), class = "ts") 
## ARIMA(2,1,1)(1,0,0)[12]                    
## 
## Coefficients:
##          ar1     ar2      ma1    sar1
##       0.5117  0.1824  -0.9638  0.8478
## s.e.  0.0502  0.0498   0.0134  0.0277
## 
## sigma^2 estimated as 3201509:  log likelihood=-4236.9
## AIC=8483.81   AICc=8483.94   BIC=8504.63
# See forecasts from the auto.arima model
plot(forecast(hm1$auto.arima))

plot of chunk individualModels

Model diagnostics

The hybridModel() function produces an S3 object of class hybridModel.

class(hm1) 
## [1] "hybridModel"
is.hybridModel(hm1)
## [1] TRUE

The print() and summary() methods print information about the ensemble model including the weights assigned to each individual component model.

print(hm1) 
## Hybrid forecast model comprised of the following models: auto.arima, ets, tbats
## ############
## auto.arima with weight 0.333 
## ############
## ets with weight 0.333 
## ############
## tbats with weight 0.333
summary(hm1)
##            Length Class  Mode     
## auto.arima  18    ARIMA  list     
## ets         19    ets    list     
## tbats       25    tbats  list     
## weights      3    -none- numeric  
## frequency    1    -none- numeric  
## x          476    ts     numeric  
## models       3    -none- character
## fitted     476    ts     numeric  
## residuals  476    ts     numeric

Two types of plots can be created for the ensemble model: either a plot showing the actual values and the fitted values of each component model, or individual plots of the component models as created by their regular S3 plot() methods. Note that a plot() method does not exist in the “forecast” package for objects generated with stlm(), so this component model will be ignored when type = "models", but the other component models will be plotted regardless.

plot(hm1, type = "fit") 

plot of chunk plots

plot(hm1, type = "models")

plot of chunk plots (three plots)

Since version 0.4.0, ggplot graphs are available. Note, however, that the nnetar and tbats models do not have ggplot2::autoplot() methods, so these are not plotted.

plot(hm1, type = "fit", ggplot = TRUE) 

plot of chunk plots_ggplot

plot(hm1, type = "models", ggplot = TRUE)

By default each component model is given equal weight in the final ensemble. Empirically this has been shown to give good performance in ensembles [see @Armstrong2001], but alternative combination methods are available: the inverse root mean square error (RMSE), inverse mean absolute error (MAE), and inverse mean absolute scaled error (MASE). To apply one of these weighting schemes, pass "RMSE", "MAE", or "MASE" to the errorMethod argument and pass either "insample.errors" or "cv.errors" to the weights argument.

hm2 <- hybridModel(wineind, weights = "insample.errors", errorMethod = "MASE", models = "aenst")
## Warning in hybridModel(wineind, weights = "insample.errors", errorMethod =
## "MASE", : Using insample.error weights is not recommended for accuracy and
## may be deprecated in the future.
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
hm2 
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.067 
## ############
## ets with weight 0.07 
## ############
## nnetar with weight 0.614 
## ############
## stlm with weight 0.088 
## ############
## tbats with weight 0.161
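
The idea behind inverse-error weighting can be sketched in base R. This is a minimal illustration, assuming each weight is proportional to the reciprocal of the model's error measure; whether hybridModel() computes its weights exactly this way is an assumption. The helper inverse_error_weights() is hypothetical, and the RMSE values are the in-sample figures that accuracy(hm1, individual = TRUE) reports for the gas models in this guide.

```r
# Hypothetical helper: turn per-model error measures into ensemble weights
# by taking the inverse of each error and rescaling the result to sum to one.
inverse_error_weights <- function(errors) {
  stopifnot(is.numeric(errors), all(errors > 0), !is.null(names(errors)))
  inv <- 1 / errors
  inv / sum(inv)
}

# In-sample RMSE of the hm1 components (used here only as illustration)
rmse <- c(auto.arima = 1779.854, ets = 1451.139, tbats = 1455.045)
w <- inverse_error_weights(rmse)
round(w, 3)
```

Models with smaller errors receive larger weights, which is why a component that fits the training data very closely can dominate when in-sample errors are used.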

After the model is fit, these weights are stored in the weights element of the model object. The user can view and manipulate these weights after the fit is complete. Note that the hybridModel() function automatically scales weights to sum to one, so a user should similarly scale any replacement weights to ensure the forecasts remain unbiased. Furthermore, the vector that replaces weights must retain names specifying the component model each weight corresponds to, since weights are assigned by component name rather than by position. Similarly, individual components may also be replaced.

hm2$weights 
## auto.arima        ets     nnetar       stlm      tbats 
## 0.06727048 0.06990493 0.61415737 0.08808064 0.16058657
newWeights <- c(0.1, 0.2, 0.3, 0.1, 0.3)
names(newWeights) <- c("auto.arima", "ets", "nnetar", "stlm", "tbats")
hm2$weights <- newWeights
hm2
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.1 
## ############
## ets with weight 0.2 
## ############
## nnetar with weight 0.3 
## ############
## stlm with weight 0.1 
## ############
## tbats with weight 0.3
hm2$weights[1] <- 0.2
hm2$weights[2] <- 0.1
hm2
## Hybrid forecast model comprised of the following models: auto.arima, ets, nnetar, stlm, tbats
## ############
## auto.arima with weight 0.2 
## ############
## ets with weight 0.1 
## ############
## nnetar with weight 0.3 
## ############
## stlm with weight 0.1 
## ############
## tbats with weight 0.3
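
Replacement weights can be rescaled and aligned by name before they are assigned. The following base R sketch shows one way to do this; rescale_weights() is a hypothetical helper, not a function from the package.

```r
# Hypothetical helper: rescale a named weight vector to sum to one and
# reorder it to follow the component names (matching by name, not position).
rescale_weights <- function(weights, components) {
  stopifnot(!is.null(names(weights)), setequal(names(weights), components))
  weights <- weights[components]
  weights / sum(weights)
}

components <- c("auto.arima", "ets", "nnetar", "stlm", "tbats")
# Unscaled weights, deliberately given in a different order
raw <- c(tbats = 3, nnetar = 3, ets = 2, stlm = 1, auto.arima = 1)
newWeights <- rescale_weights(raw, components)
newWeights
```

The result could then be assigned with hm2$weights <- newWeights, as in the example above.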

This hybridModel S3 object can be manipulated with the same familiar interface from the “forecast” package, including S3 generic functions such as accuracy, forecast, fitted, and residuals.

# View the first 6 fitted values and residuals
head(fitted(hm1))
##           Jan      Feb      Mar      Apr      May      Jun
## 1956 1617.665 1689.662 1803.437 1857.528 2162.227 2325.729
head(residuals(hm1))
##           Jan      Feb      Mar      Apr      May      Jun
## 1956 1617.665 1689.662 1803.437 1857.528 2162.227 2325.729

In-sample errors and various accuracy measures can be extracted with the accuracy method. The “forecastHybrid” package creates an S3 generic from the accuracy method in the “forecast” package, so accuracy will continue to function as normal with objects from the “forecast” package, but special functionality is now available for hybridModel objects. To view the in-sample accuracy for the entire ensemble, a simple call can be made.

accuracy(hm1) 
##                ME     RMSE      MAE       MPE    MAPE        ACF1
## Test set 72.77448 1439.943 783.3084 0.4407286 3.44164 -0.09199131
##          Theil's U
## Test set 0.4757149

In addition to retrieving the ensemble's accuracy, the individual component models' accuracies can be easily viewed by using the individual = TRUE argument.

accuracy(hm1, individual = TRUE) 
## $auto.arima
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 151.1913 1779.854 1005.769 0.8861332 4.446548 0.5391395
##                      ACF1
## Training set -0.002784589
## 
## $ets
##                    ME     RMSE      MAE       MPE    MAPE      MASE
## Training set 41.67757 1451.139 788.3641 0.2856501 3.54687 0.4226001
##                    ACF1
## Training set -0.1370769
## 
## $tbats
##                    ME     RMSE      MAE       MPE     MAPE      MASE
## Training set 25.45452 1455.045 796.1146 0.1504024 3.501799 0.4610487
##                     ACF1
## Training set -0.07185595

Forecasting

Now let's forecast future values. The forecast() function produces an S3 object of class forecast for the next 48 periods from the ensemble model.

hForecast <- forecast(hm1, h = 48) 

Now plot the forecast for the next 48 periods. The prediction intervals are preserved from the individual component models and currently use the most extreme value from an individual model, producing a conservative estimate for the ensemble's performance.

plot(hForecast) 

plot of chunk plot_forecast
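
The combination rule described above can be sketched in base R with made-up numbers. This is an illustration, not the package's implementation: it assumes the ensemble point forecast is the weighted mean of the component forecasts and that the interval bounds are the most extreme component bounds for each period.

```r
# Sketch only: combine component forecasts into an ensemble forecast.
# Rows are forecast periods, columns are component models (made-up numbers).
point <- cbind(auto.arima = c(100, 104), ets = c(98, 101),  tbats = c(103, 108))
lower <- cbind(auto.arima = c(90, 92),   ets = c(85, 86),   tbats = c(95, 97))
upper <- cbind(auto.arima = c(110, 116), ets = c(112, 118), tbats = c(111, 121))
weights <- c(auto.arima = 1, ets = 1, tbats = 1) / 3

# Weighted mean of the point forecasts (assumed combination rule)
ensemble_mean <- as.vector(point %*% weights[colnames(point)])
# Most extreme bound across the components for each period
ensemble_lower <- apply(lower, 1, min)
ensemble_upper <- apply(upper, 1, max)

data.frame(mean = ensemble_mean, lower = ensemble_lower, upper = ensemble_upper)
```

Because each bound is the extreme across the components, the resulting interval is at least as wide as every component's interval, which is what makes it conservative.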

Advanced usage

The package aims to make fitting ensembles easy and quick, but it still allows advanced tuning of all the parameters available in the “forecast” package. This is possible through usage of the a.args, e.args, n.args, s.args, and t.args lists. These optional list arguments may be applied to one, none, all, or any combination of the included individual component models. Consult the documentation in the “forecast” package for acceptable arguments to pass in the auto.arima, ets, nnetar, stlm, and tbats functions.

hm2 <- hybridModel(y = gas, models = "aefnst",
                   a.args = list(max.p = 12, max.q = 12, approximation = FALSE),
                   n.args = list(repeats = 50),
                   s.args = list(robust = TRUE),
                   t.args = list(use.arma.errors = FALSE)) 
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the thetam model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model

Since the lambda argument is shared between most of the models in the “forecast” framework, it is included as a special parameter that can be used to set the Box-Cox transform in all models instead of setting this individually. For example,

hm3 <- hybridModel(y = wineind, models = "ae", lambda = 0.15)
## Fitting the auto.arima model
## Fitting the ets model
hm3$auto.arima$lambda 
## [1] 0.15
## attr(,"biasadj")
## [1] FALSE
hm3$ets$lambda
## [1] 0.15
## attr(,"biasadj")
## [1] FALSE

Users can still apply the lambda argument through the tuning lists, but in this case the list-supplied argument overrides the default used across all models. Compare the lambda values of the components in the following example.

hm4 <- hybridModel(y = wineind, models = "aens", lambda = 0.2,
                   a.args = list(lambda = 0.5),
                   n.args = list(lambda = 0.6)) 
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
hm4$auto.arima$lambda
## [1] 0.5
## attr(,"biasadj")
## [1] FALSE
hm4$ets$lambda
## [1] 0.2
## attr(,"biasadj")
## [1] FALSE
hm4$nnetar$lambda
## [1] 0.6
hm4$stlm$lambda
## [1] 0.2
## attr(,"biasadj")
## [1] FALSE

Note that lambda has no impact on thetam models, and that there is no f.args argument to provide arguments to thetam. Like forecast::thetaf, on which thetam is based, it accepts no such arguments and always runs with the defaults.

Covariates can also be supplied to auto.arima and nnetar models as is done in the “forecast” package. To do this, use the a.args and n.args lists. Note that xreg may also be passed to a stlm model, but only when method = "arima" instead of the default method = "ets". Unlike the usage in the “forecast” package, the xreg argument should be passed as a data frame, not a matrix. The stlm models require the input series to be seasonal, so in the example below we convert the input data to a ts object. If an xreg is used in training, it must also be supplied to the forecast() function in the xreg argument. Note that if the number of rows in the xreg used for the forecast does not match the supplied forecast horizon h, the function will overwrite h with the number of rows in xreg and issue a warning.

# Use the beaver1 dataset with the variable "activ" as a covariate and "temp" as the time series
# Divide this into a train and test set
trainSet <- beaver1[1:100, ] 
testSet <- beaver1[-(1:100), ]
trainXreg <- data.frame(trainSet$activ)
testXreg <- data.frame(testSet$activ)

# Create the model
beaverhm <- hybridModel(ts(trainSet$temp, frequency = 6),
                        models = "aenst",
                        a.args = list(xreg = trainXreg),
                        n.args = list(xreg = trainXreg),
                        s.args = list(xreg = trainXreg, method = "arima"))
## Fitting the auto.arima model
## Fitting the ets model
## Fitting the nnetar model
## Fitting the stlm model
## Fitting the tbats model
# Forecast future values
beaverfc <- forecast(beaverhm, xreg = testXreg)

# View the accuracy of the model
accuracy(beaverfc, testSet$temp)
##                        ME       RMSE        MAE         MPE      MAPE
## Training set 0.0006061344 0.07607102 0.05167769 0.001285375 0.1399761
## Test set     0.0721321277 0.10286135 0.08332239 0.195132742 0.2256133
##                  MASE       ACF1
## Training set 0.783475 0.01698471
## Test set     1.263234         NA
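
The rule that the number of rows in xreg takes precedence over h can be sketched as follows. The helper resolve_horizon() is hypothetical and only mirrors the behaviour described above; it is not the package's own code.

```r
# Hypothetical helper mirroring the documented rule: when xreg is supplied,
# its number of rows determines the forecast horizon.
resolve_horizon <- function(h, xreg = NULL) {
  if (is.null(xreg)) {
    return(h)
  }
  stopifnot(is.data.frame(xreg))
  if (nrow(xreg) != h) {
    warning("h does not match nrow(xreg); setting h to nrow(xreg)")
    h <- nrow(xreg)
  }
  h
}

futureXreg <- data.frame(activ = c(0, 0, 1, 1))
suppressWarnings(resolve_horizon(h = 10, xreg = futureXreg))
```

In practice this means it is enough to supply an xreg with one row per period to be forecast.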