bagging                package:GeneTS                R Documentation

_B_a_g_g_e_d _V_e_r_s_i_o_n_s _o_f _C_o_v_a_r_i_a_n_c_e _a_n_d (_P_a_r_t_i_a_l) _C_o_r_r_e_l_a_t_i_o_n _M_a_t_r_i_x

_D_e_s_c_r_i_p_t_i_o_n:

     'bagged.cov', 'bagged.cor', and 'bagged.pcor' calculate the
     bootstrap aggregated (=bagged) versions of the covariance and
     (partial) correlation estimators.

     These estimators are advantageous especially for small sample
     size problems. For example, the bagged correlation matrix
     typically remains positive definite even when the sample size is
     much smaller than the number of variables.

     In Schaefer and Strimmer (2004) the inverse of the bagged
     correlation matrix is used to estimate graphical Gaussian models
     from sparse microarray data; see also 'ggm.estimate.pcor' for
     various strategies to estimate partial correlation coefficients.

_U_s_a_g_e:

     bagged.cov(x, R=1000, ...)
     bagged.cor(x, R=1000, ...)
     bagged.pcor(x, R=1000, ...)

_A_r_g_u_m_e_n_t_s:

       x: data matrix or data frame

       R: number of bootstrap replicates (default: 1000)

     ...: options passed to 'cov', 'cor', and 'partial.cor'  (e.g., to
          control handling of missing values) 

_D_e_t_a_i_l_s:

     Bagging was first suggested by Breiman (1996) as a means to
     improve an estimator using the bootstrap. The bagged estimate is
     simply the mean of the bootstrap sampling distribution. Thus,
     bagging is essentially a variance reduction method. The bagged
     estimate may also be interpreted as an (approximate) posterior
     mean estimate assuming some implicit prior.
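
     The averaging step described above can be sketched in base R as
     follows. This is only an illustration under stated assumptions,
     not the GeneTS implementation: 'bagged_cov_sketch' is a
     hypothetical helper, rows of 'x' are assumed to be the
     observations that get resampled, and '...' is simply forwarded to
     'cov'.

```r
# Hypothetical helper for illustration only; not the GeneTS code.
# Assumes rows of x are observations and columns are variables.
bagged_cov_sketch <- function(x, R = 1000, ...) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  acc <- matrix(0, nrow = p, ncol = p)
  for (b in seq_len(R)) {
    # draw one bootstrap sample: n rows, with replacement
    idx <- sample.int(n, size = n, replace = TRUE)
    # apply the plain estimator to the bootstrap sample and accumulate
    acc <- acc + cov(x[idx, , drop = FALSE], ...)
  }
  # bagged estimate = mean over the R bootstrap replicates
  acc / R
}

set.seed(1)
x <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)  # simulated toy data
b_cov <- bagged_cov_sketch(x, R = 200)

isSymmetric(b_cov)          # the average of symmetric matrices is symmetric
sum((b_cov - cov(x))^2)     # total squared difference to the plain estimate
```

     The same loop with 'cor' in place of 'cov' would give a sketch of
     a bagged correlation matrix; a smaller 'R' shortens the run time
     at the cost of more Monte Carlo noise in the average.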

_V_a_l_u_e:

     A symmetric matrix with one row and one column per variable
     (column) of 'x'.

_A_u_t_h_o_r(_s):

     Juliane Schaefer (<URL:
     http://www.stat.uni-muenchen.de/~schaefer/>) and Korbinian
     Strimmer (<URL: http://www.stat.uni-muenchen.de/~strimmer/>).

_R_e_f_e_r_e_n_c_e_s:

     Breiman, L. (1996). Bagging predictors. _Machine Learning_, *24*,
     123-140.

     Schaefer, J., and Strimmer, K. (2004).  An empirical Bayes
     approach to inferring large-scale gene association networks.
     _Bioinformatics_ in press.

_S_e_e _A_l_s_o:

     'cov', 'cor', 'partial.cor', 'ggm.estimate.pcor', 'robust.boot'.

_E_x_a_m_p_l_e_s:

     # load GeneTS library
     library(GeneTS)

     # small example data set 
     data(caulobacter)
     dat <- caulobacter[,1:15]
     dim(dat)

     # bagged estimates
     b.cov <- bagged.cov(dat)
     b.cor <- bagged.cor(dat)
     b.pcor <- bagged.pcor(dat)

     # total squared difference
     sum( (b.cov - cov(dat))^2  )
     sum( (b.cor - cor(dat))^2  )
     sum( (b.pcor - partial.cor(dat))^2  )

     # positive definiteness of bagged correlation
     is.positive.definite(cor(dat))
     is.positive.definite(b.cor)

