For each gene in sca, fits the hurdle model in formula
(linear for et>0), logistic for et==0 vs et>0.
Return an object of class ZlmFit
containing slots giving the coefficients, variance-covariance matrices, etc.
After each gene, optionally run the function on the fit named by 'hook'
zlm(formula, sca, method = "bayesglm", silent = TRUE, ebayes = TRUE, ebayesControl = NULL, force = FALSE, hook = NULL, parallel = TRUE, LMlike, onlyCoef = FALSE, ...)
formula | a formula with the measurement variable on the LHS and predictors present in colData on the RHS |
---|---|
sca | SingleCellAssay object |
method | character vector, either 'glm', 'glmer' or 'bayesglm' |
silent | Silence common problems with fitting some genes |
ebayes | if TRUE, regularize variance using empirical bayes method |
ebayesControl | list with parameters for empirical bayes procedure. See ebayes. |
force | Should we continue testing genes even after many errors have occurred? |
hook | a function called on the |
parallel | If TRUE and |
LMlike | if provided, then the model defined in this object will be used, rather than following the formulas. This is intended for internal use. |
onlyCoef | If TRUE then only an array of model coefficients will be returned (probably only useful for bootstrapping). |
... | arguments passed to the S4 model object upon construction. For example, |
a object of class ZlmFit
with methods to extract coefficients, etc.
OR, if data is a data.frame
just a list of the discrete and continuous fits.
The empirical bayes regularization of the gene variance assumes that the precision (1/variance) is drawn from a
gamma distribution with unknown parameters.
These parameters are estimated by considering the distribution of sample variances over all genes.
The procedure used for this is determined from
ebayesControl
, a named list with components 'method' (one of 'MOM' or 'MLE') and 'model' (one of 'H0' or 'H1')
method MOM uses a method-of-moments estimator, while MLE using the marginal likelihood.
H0 model estimates the precisions using the intercept alone in each gene, while H1 fits the full model specified by formula
ZlmFit-class, ebayes, GLMlike-class, BayesGLMlike-class
data(vbetaFA) zlmVbeta <- zlm(~ Stim.Condition, subset(vbetaFA, ncells==1)[1:10,])#> #>slotNames(zlmVbeta)#> [1] "coefC" "coefD" "vcovC" #> [4] "vcovD" "LMlike" "sca" #> [7] "deviance" "loglik" "df.null" #> [10] "df.resid" "dispersion" "dispersionNoshrink" #> [13] "priorDOF" "priorVar" "converged" #> [16] "hookOut"#A matrix of coefficients coef(zlmVbeta, 'D')['CCL2',]#> (Intercept) Stim.ConditionUnstim #> -3.8329217 -0.5108005#An array of covariance matrices vcov(zlmVbeta, 'D')[,,'CCL2']#> (Intercept) Stim.ConditionUnstim #> (Intercept) 0.1439196 -0.1254838 #> Stim.ConditionUnstim -0.1254838 0.9182853#> , , metric = lambda #> #> test.type #> primerid cont disc hurdle #> B3GAT1 0.9617702324 0.038250068 1.000020 #> BAX 7.2211565188 3.645901878 10.867058 #> BCL2 0.3766814067 2.202291748 2.578973 #> CCL2 0.8414775522 0.284135226 1.125613 #> CCL3 NA 3.548463195 NA #> CCL4 NA 2.012308210 NA #> CCL5 0.1746468538 0.862093478 1.036740 #> CCR2 5.3383734489 2.308408187 7.646782 #> CCR4 2.0437612666 0.003737042 2.047498 #> CCR5 0.0005534473 2.811952306 2.812506 #> #> , , metric = df #> #> test.type #> primerid cont disc hurdle #> B3GAT1 1 1 2 #> BAX 1 1 2 #> BCL2 1 1 2 #> CCL2 1 1 2 #> CCL3 1 1 2 #> CCL4 1 1 2 #> CCL5 1 1 2 #> CCR2 1 1 2 #> CCR4 1 1 2 #> CCR5 1 1 2 #> #> , , metric = Pr(>Chisq) #> #> test.type #> primerid cont disc hurdle #> B3GAT1 0.326741272 0.84494185 0.606524503 #> BAX 0.007204927 0.05620735 0.004367654 #> BCL2 0.539384664 0.13780573 0.275412150 #> CCL2 0.358974532 0.59400356 0.569608276 #> CCL3 NA 0.05960063 NA #> CCL4 NA 0.15602777 NA #> CCL5 0.676014596 0.35315351 0.595490308 #> CCR2 0.020860929 0.12867576 0.021853574 #> CCR4 0.152831343 0.95125460 0.359245545 #> CCR5 0.981231129 0.09356445 0.245059834 #>## Can also provide just a \code{data.frame} instead data<- data.frame(x=rnorm(500), z=rbinom(500, 1, .3)) logit.y <- with(data, x*2 + z*2); mu.y <- with(data, 10+10*x+10*z + rnorm(500)) y <- (runif(500)<exp(logit.y)/(1+exp(logit.y)))*1 y[y>0] <- mu.y[y>0] data$y <- y fit <- zlm(y ~ x+z, data) summary.glm(fit$disc)#> #> Call: #> NULL #> #> Deviance Residuals: #> Min 1Q Median 3Q Max #> -60.165 -1.166 1.019 1.175 10.305 #> #> Coefficients: #> Estimate Std. Error z value Pr(>|z|) #> (Intercept) -0.1700 0.1463 -1.162 0.245 #> x 2.4832 0.2276 10.911 < 2e-16 *** #> z 2.6332 0.3376 7.800 6.2e-15 *** #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #> #> (Dispersion parameter for binomial family taken to be 1) #> #> Null deviance: 671.36 on 499 degrees of freedom #> Residual deviance: 372.26 on 195 degrees of freedom #> AIC: 378.26 #> #> Number of Fisher Scoring iterations: 6 #>summary.glm(fit$cont)#> #> Call: #> NULL #> #> Deviance Residuals: #> Min 1Q Median 3Q Max #> -2.93103 -0.66362 0.00543 0.72475 3.12330 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 9.98163 0.09703 102.87 <2e-16 *** #> x 9.98076 0.08343 119.63 <2e-16 *** #> z 10.11319 0.12710 79.57 <2e-16 *** #> --- #> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 #> #> (Dispersion parameter for gaussian family taken to be 1.07014) #> #> Null deviance: 17942.40 on 301 degrees of freedom #> Residual deviance: 319.97 on 299 degrees of freedom #> AIC: 882.5 #> #> Number of Fisher Scoring iterations: 2 #>