The R Inference Functions

Contents

General properties
Samples
Summary
Rank
DIC

General properties

These R functions are used to make inferences about the parameters of a model or about the fit of the model. The functions fall into three groups: the 'Samples' functions act on an entire set of monitored values for a variable; the 'Summary' and 'Rank' functions are space-saving short-cuts that monitor running statistics; and the 'DIC' functions evaluate the Deviance Information Criterion proposed by Spiegelhalter et al. (2002). Users should ensure that their simulation has converged before using functions in the Summary, Rank or DIC groups. Note that if the MCMC simulation has an adaptive phase, it is not possible to make inferences using values sampled before the end of that phase.

Samples

The samples functions analyse stored samples of variables produced by the MCMC simulation.

The functions samples.set, samples.clear, samples.stats, samples.history, samples.autoC, samples.density, samples.bgr and samples.correl act on a variable of interest, which must be given as the node argument of each function. It can be either the name of a variable in the model or an R object with the same name as a variable in the model. If the variable of interest is an array, slices of the array can be selected using the notation variable[lower0:upper0, lower1:upper1, ...]. A star '*' can be entered as shorthand for all the stored samples. The beg and end arguments can be used to select the slice of monitored values corresponding to iterations beg:end. Likewise, the firstChain and lastChain arguments can be used to select a subgroup of chains for which to calculate statistics. The thin argument can be used so that only every thin-th value of the stored sample contributes to the statistics. If these arguments are left at their default values, the whole sample for all chains is used in calculating statistics.
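For example, assuming a model containing a vector quantity mu has already been compiled and updated (the node name, iteration range and chain numbers below are illustrative, not part of any particular model), the selection arguments might be used as follows:

```r
# Summarise the first five components of mu, using iterations
# 1001-10000 of chains 1 and 2 only, taking every 5th stored value
# (mu and the numeric choices are illustrative)
samples.stats("mu[1:5]",
              beg = 1001, end = 10000,
              firstChain = 1, lastChain = 2,
              thin = 5)

# '*' is shorthand for all stored samples
samples.stats("*")
```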

WinBUGS generally sets up a logical node automatically to measure a quantity known as deviance; this may be accessed, in the same way as any other variable of interest, by giving its name, "deviance", as the node argument of the samples functions. The deviance is defined as -2 * log(likelihood), where 'likelihood' is p(y | theta), y comprises all stochastic nodes given values (i.e. data), and theta comprises the stochastic parents of y - 'stochastic parents' are the stochastic nodes upon which the distribution of y depends, when collapsing over all logical relationships.

samples.set: The function samples.set(node) is used to start recording a chain of values for the variable node.

samples.clear: The function samples.clear(node) removes the stored values of the variable node from computer memory.

samples.history: The function samples.history(node, beg = samples.get.beg(), end = samples.get.end(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(), thin = samples.get.thin()) plots a complete trace for the variable.
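A typical monitoring workflow is sketched below, assuming a model has already been compiled and loaded, and that a companion function such as model.update runs the sampler (model.update is a hypothetical name used here for illustration; substitute the updater provided by your interface):

```r
# Hypothetical workflow; model.update() is an assumption about the
# companion modelling functions, and "mu" is an illustrative node name
samples.set("mu")        # start storing values of mu
model.update(10000)      # run the sampler for 10000 iterations
samples.history("mu")    # trace plot of the stored chain
samples.clear("mu")      # discard the stored values when finished
```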

The next four functions can only be executed if the MCMC simulation is not in an adaptive phase.

samples.density: The function samples.density(node, beg = samples.get.beg(), end = samples.get.end(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(), thin = samples.get.thin()) plots a smoothed kernel density estimate for the variable if it is continuous, or a histogram if it is discrete.

samples.autoC: The function samples.autoC(node, beg = samples.get.beg(), end = samples.get.end(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(), thin = samples.get.thin()) plots the autocorrelation function of the variable.

samples.stats: The function samples.stats(node, beg = samples.get.beg(), end = samples.get.end(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(), thin = samples.get.thin()) produces summary statistics for the variable, pooling over the chains selected. The quantity reported in the MC error column gives an estimate of s / sqrt(N), the Monte Carlo standard error of the mean. The batch means method outlined by Roberts (1996; p. 50) is used to estimate s.

samples.bgr: The function samples.bgr(node, beg = samples.get.beg(), end = samples.get.end(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(), thin = samples.get.thin()) calculates the Gelman-Rubin convergence statistic, as modified by Brooks and Gelman (1998). The width of the central 80% interval of the pooled runs is green, the average width of the 80% intervals within the individual runs is blue, and their ratio R (= pooled / within) is red - for plotting purposes the pooled and within interval widths are normalised to have an overall maximum of one. The statistics are calculated in bins of length 50: R would generally be expected to be greater than 1 if the starting values are suitably over-dispersed. Brooks and Gelman (1998) emphasise that one should be concerned both with convergence of R to 1, and with convergence of both the pooled and within interval widths to stability.
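Since the diagnostic compares runs, at least two chains must have been stored. A minimal call might look like this (the node name is illustrative):

```r
# Brooks-Gelman-Rubin diagnostic for mu, pooling chains 1 and 2;
# the plotted ratio R should approach 1 as the chains converge
samples.bgr("mu", firstChain = 1, lastChain = 2)
```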

The following low level functions can be used to perform calculations on stored samples.

samples.set.beg: The function samples.set.beg(beg) sets the first iteration of the stored sample used for calculating statistics to beg.

samples.set.end: The function samples.set.end(end) sets the last iteration of the stored sample used for calculating statistics to end.

samples.set.thin: The function samples.set.thin(thin) sets the thinning parameter so that only every thin-th iteration of each chain contributes to the statistics being calculated. Note the difference between this and the thinning facility of the update function: when thinning via the update function, samples are permanently discarded as the MCMC simulation runs, whereas here a suitable number of (posterior) samples has already been generated (and stored) and some of them may be discarded only temporarily. Thus, setting thin > 1 here has no impact on the storage (memory) requirements; to reduce the number of samples actually stored (and so free up memory), thin via the update function instead.
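The difference can be sketched as follows (model.update and its thin argument are assumptions about the companion modelling functions, used here only to contrast the two mechanisms):

```r
# Permanent thinning: store only every 10th sample as the chain runs
# (model.update and its thin argument are illustrative)
model.update(10000, thin = 10)

# Temporary thinning: all stored samples are kept, but statistics
# use only every 10th of them; reset to 1 to use them all again
samples.set.thin(10)
samples.stats("mu")
samples.set.thin(1)
```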

samples.set.firstChain: The function samples.set.firstChain(firstChain) sets the first chain of the stored sample used for calculating statistics to firstChain.

samples.set.lastChain: The function samples.set.lastChain(lastChain) sets the last chain of the stored sample used for calculating statistics to lastChain.

samples.get.beg: The function samples.get.beg() returns the first iteration of the stored sample used for calculating statistics.

samples.get.end: The function samples.get.end() returns the last iteration of the stored sample used for calculating statistics.

samples.get.thin: The function samples.get.thin() returns the thin parameter.

samples.get.firstChain: The function samples.get.firstChain() returns the first chain of the stored sample used for calculating statistics.

samples.get.lastChain: The function samples.get.lastChain() returns the last chain of the stored sample used for calculating statistics.

The next three functions have the implicit arguments beg = samples.get.beg(), end = samples.get.end(), thin = samples.get.thin(), firstChain = samples.get.firstChain(), lastChain = samples.get.lastChain(). They can be used to retrieve stored samples for a set of nodes.

samples.size: The samples.size(node) function returns the size of the stored sample for the scalar quantity node.

samples.sample: The samples.sample(node) function returns an array of stored values for the scalar quantity node.

samples.monitors: The samples.monitors(node) function returns, as strings, the names of the scalar quantities that have a monitor set and for which some values are stored between beg and end. node can be a vector quantity with sub-ranges given for its indices, and node can be '*'.
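These low-level functions allow stored samples to be pulled into R for custom analysis. A sketch, with an illustrative node name:

```r
# List the monitored scalar components of mu that have stored values
comps <- samples.monitors("mu")

# Pull the stored chain for one component into an R vector
# and compute a custom statistic from it
n <- samples.size("mu[1]")
x <- samples.sample("mu[1]")
mean(x > 0)   # e.g. posterior probability that mu[1] is positive
```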

Summary

The functions summary.set, summary.stats and summary.clear are used to calculate running means, standard deviations and quantiles. These functions are less powerful and general than the samples functions, but they require much less storage (an important consideration when many variables and/or long runs are of interest). They take a single argument node, which can be either the name of a quantity in the model as a string or an R object with the same name as a quantity in the model.

summary.set: The function summary.set(node) creates a monitor that starts recording the running totals for node .

summary.stats: The function summary.stats(node) displays the running means, standard deviations, and 2.5%, 50% (median) and 97.5% quantiles for node. Note that these running quantiles are calculated via an approximate algorithm and should therefore be used with caution.

summary.clear: The function summary.clear(node) removes the monitor calculating running totals for node .
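A minimal sketch of the running-statistics workflow (model.update is an assumed name for the updater, and "theta" is an illustrative node name):

```r
# Hypothetical workflow; model.update() is an assumption about the
# companion modelling functions
summary.set("theta")     # start accumulating running totals for theta
model.update(10000)      # run the sampler
summary.stats("theta")   # running mean, sd and approximate quantiles
summary.clear("theta")   # release the monitor
```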

Rank

The functions rank.set, rank.stats and rank.clear are used to calculate ranks of vector-valued quantities in the model. They take a single argument node, which can be either the name of a quantity in the model as a string or an R object with the same name as a quantity in the model.

rank.set: The function rank.set(node) creates a monitor that starts building running histograms to represent the rank of each component of node . An amount of storage proportional to the square of the number of components of node is allocated. Even when node has thousands of components this can require less storage than calculating the ranks explicitly in the model specification and storing their samples, and it is also much quicker.

rank.stats: The function rank.stats(node) displays summaries of the distribution of the ranks of each component of the variable node.

rank.clear: The function rank.clear(node) removes the monitor calculating running histograms for node.
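For example, ranking the components of a vector of unit effects in a league-table style model (the node name and update call are illustrative assumptions):

```r
# Hypothetical ranking workflow; model.update() and the node name
# "effect" are illustrative
rank.set("effect")     # running histograms of each component's rank
model.update(10000)    # run the sampler
rank.stats("effect")   # summaries of the rank of each component
rank.clear("effect")   # release the monitor
```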

DIC

The DIC functions are used to evaluate the Deviance Information Criterion (DIC; Spiegelhalter  et al ., 2002 ) and related statistics - these can be used to assess model complexity and compare different models. Most of the examples packaged with WinBUGS contain an example of their usage.

It is important to note that DIC assumes the posterior mean to be a good estimate of the stochastic parameters. If this is not so, say because of extreme skewness or even bimodality, then DIC may not be appropriate. There are also circumstances, such as with mixture models, in which WinBUGS will not permit the calculation of DIC. Please see the WinBUGS 1.4 web-page for current restrictions:

    http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml

dic.set: The function dic.set() creates monitors that start calculating DIC and related statistics. The user should ensure that convergence has been achieved before calling dic.set, as all subsequent iterations will be used in the calculation.

dic.clear: The function dic.clear() deletes the monitors that have been created for calculating DIC and related statistics.

dic.stats: The function dic.stats() displays the calculated statistics, as described below; please see Spiegelhalter et al. (2002) for full details. The section Tricks: Advanced Use of the BUGS Language also contains some comments on the use of DIC.

Dbar: this is the posterior mean of the deviance, which is exactly the same as if the node 'deviance' had been monitored (see the Samples section above). This deviance is defined as -2 * log(likelihood), where 'likelihood' is p(y | theta), y comprises all stochastic nodes given values (i.e. data), and theta comprises the stochastic parents of y - 'stochastic parents' are the stochastic nodes upon which the distribution of y depends, when collapsing over all logical relationships.

Dhat: this is a point estimate of the deviance (-2 * log(likelihood)) obtained by substituting in the posterior means theta.bar of theta : thus Dhat = -2 * log(p( y | theta.bar )).

pD: this is 'the effective number of parameters', and is given by pD = Dbar - Dhat . Thus pD is the posterior mean of the deviance minus the deviance of the posterior means.

DIC: this is the 'Deviance Information Criterion', and is given by DIC = Dbar + pD = Dhat + 2 * pD . The model with the smallest DIC is estimated to be the model that would best predict a replicate dataset of the same structure as that currently observed.
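Putting the pieces together, a full DIC run might look like this (model.update is an assumed name for the updater, and the iteration counts are illustrative):

```r
# Hypothetical DIC workflow; model.update() is an assumption about
# the companion modelling functions
model.update(5000)    # burn-in; check convergence before proceeding
dic.set()             # start the DIC monitors
model.update(10000)   # all these iterations contribute to DIC
dic.stats()           # displays Dbar, Dhat, pD and DIC
dic.clear()           # remove the DIC monitors
```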