Test of association between a count response and one or more covariate sets.
This test may be conceptualised as a test of overall significance in regression analysis, where the response variable is overdispersed, and where the number of explanatory variables (p) exceeds the sample size (n).
The negative binomial distribution accounts for overdispersion and a random effect model accounts for high dimensionality (p >> n).
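To make the notion of overdispersion concrete, the following base R sketch compares negative binomial and Poisson counts with the same mean. It assumes the parameterisation used by `rnbinom(mu = , size = )`, where the dispersion is `phi = 1/size` and the variance is `mu + phi * mu^2`; the variable names are illustrative and not part of globalSeq.

```r
# Illustrative sketch (base R only): overdispersion relative to the Poisson model.
set.seed(1)
mu <- 10      # mean parameter
phi <- 0.25   # dispersion parameter, assumed to correspond to size = 1 / phi

y_nb <- rnbinom(1e5, mu = mu, size = 1 / phi)  # overdispersed counts
y_pois <- rpois(1e5, lambda = mu)              # equidispersed counts

# The empirical variance of the negative binomial sample should be close to
# mu + phi * mu^2, whereas the Poisson variance should be close to mu.
var_theory <- mu + phi * mu^2
round(c(var_nb = var(y_nb), var_theory = var_theory, var_pois = var(y_pois)), 2)
```

With these values the theoretical negative binomial variance is 35, compared with 10 under the Poisson model.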
omnibus(y, X, offset = NULL, group = NULL, mu = NULL, phi = NULL, perm = 1000, kind = 1)
Argument | Description |
---|---|
y | response variable: numeric vector of length n |
X | one covariate set: numeric matrix with n rows (samples) and p columns (covariates) |
offset | numeric vector of length n |
group | confounding variable: factor of length n |
mu | mean parameters: numeric vector of length 1 or n |
phi | dispersion parameter: non-negative real number |
perm | number of iterations: positive integer |
kind | computation: number between 0 and 1 |
The function returns a dataframe, with the p-value in the first column, and the test statistic in the second column.
The user can provide a common mu for all samples or sample-specific mu, and a common phi.
Setting phi equal to zero is equivalent to using the Poisson model.
If mu is missing, then mu is estimated from y.
If phi is missing, then mu and phi are estimated from y.
The offset is only taken into account for estimating mu or phi.
By default the offset is rep(1,n).
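The relation between phi and the Poisson model, and the idea of estimating mu and phi from y, can be sketched in base R. The moment-based estimates below are an assumption made for illustration only; they are not necessarily the estimators used inside globalSeq. The dispersion is again assumed to correspond to `size = 1/phi` in `dnbinom` and `rnbinom`.

```r
# Illustrative sketch (base R only), not the package's internal estimation.
set.seed(1)
mu <- 10

# 1) As phi approaches zero, the negative binomial pmf approaches the Poisson pmf.
counts <- 0:60
pmf_pois <- dpois(counts, lambda = mu)
pmf_nb <- dnbinom(counts, mu = mu, size = 1 / 1e-8)  # phi = 1e-8, nearly zero
max_diff <- max(abs(pmf_nb - pmf_pois))

# 2) Simple moment-based estimates of a common mu and a common phi from y
#    (hypothetical helper quantities, using var(y) = mu + phi * mu^2).
y <- rnbinom(1e5, mu = mu, size = 1 / 0.25)
mu_hat <- mean(y)
phi_hat <- max(0, (var(y) - mu_hat) / mu_hat^2)  # truncated at zero

c(max_diff = max_diff, mu_hat = mu_hat, phi_hat = phi_hat)
```

Truncating the estimate at zero keeps phi within its documented range (non-negative), in which case the model reduces to the Poisson case.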
The user can provide the confounding variable group.
Note that each level of group must appear at least twice in order to allow stratified permutations.
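The requirement that each level appears at least twice follows from how a stratified permutation works: samples are only shuffled within their own level, so a level with a single sample could never be permuted. A minimal base R sketch is given here; the function name `stratified_permutation` is hypothetical and this is not the package's internal code.

```r
# Illustrative sketch (base R only): permute sample indices within levels of group.
stratified_permutation <- function(group) {
  group <- droplevels(as.factor(group))
  # each level must appear at least twice, otherwise nothing can be shuffled
  stopifnot(all(table(group) >= 2))
  perm <- seq_along(group)
  for (idx in split(seq_along(group), group)) {
    # sample.int avoids the pitfalls of sample() on a single number
    perm[idx] <- idx[sample.int(length(idx))]
  }
  perm
}

set.seed(1)
group <- factor(rep(c("a", "b", "c"), times = c(4, 3, 2)))
perm <- stratified_permutation(group)

# Permuted samples stay within their original stratum.
all(group[perm] == group)
```

Applying `y[perm]` would then give a permuted response in which the confounding structure of `group` is preserved.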
Efficient alternatives to classical permutation (kind=1) are the method of control variates (kind=0) and permutation in chunks (0 < kind < 1).
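The arguments described above can be combined as in the following sketch, which uses only the arguments listed in the usage line. It assumes that the globalSeq package is installed and is skipped otherwise; the specific values (three groups of ten samples, `kind = 0.5`) are arbitrary choices for illustration, and no particular output is implied.

```r
# Illustrative sketch: requires the globalSeq package (skipped if not installed).
if (requireNamespace("globalSeq", quietly = TRUE)) {
  set.seed(1)
  n <- 30; p <- 100
  y <- rnbinom(n, mu = 10, size = 1 / 0.25)
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)

  # confounding variable: every level appears at least twice
  group <- factor(rep(c("a", "b", "c"), each = 10))

  # classical permutation, stratified by group
  res_perm <- globalSeq::omnibus(y, X, group = group, perm = 1000, kind = 1)

  # method of control variates
  res_cv <- globalSeq::omnibus(y, X, kind = 0)

  # permutation in chunks
  res_chunk <- globalSeq::omnibus(y, X, kind = 0.5)

  print(res_perm)
  print(res_cv)
  print(res_chunk)
}
```

Because the p-values depend on random permutations, repeated calls without a fixed seed may give slightly different results.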
A Rauschenberger, MA Jonker, MA van de Wiel, and RX Menezes (2016). "Testing for association between RNA-Seq and high-dimensional data", BMC Bioinformatics. 17:118.
RX Menezes, L Mohammadi, JJ Goeman, and JM Boer (2016). "Analysing multiple types of molecular profiles simultaneously: connecting the needles in the haystack", BMC Bioinformatics. 17:77.
S le Cessie, and HC van Houwelingen (1995). "Testing the fit of a regression model via score tests in random effects models", Biometrics. 51:600-614.
The function proprius calculates the contributions of individual samples or covariates to the test statistic.
The function cursus tests for association between RNA-Seq and local genetic or epigenetic alterations across the whole genome.
All other functions of the R package globalSeq are internal.
    # simulate high-dimensional data
    n <- 30; p <- 100
    y <- rnbinom(n,mu=10,size=1/0.25)
    X <- matrix(rnorm(n*p),nrow=n,ncol=p)

    # hypothesis testing
    omnibus(y,X)
    #>   pvalue teststat covs
    #> 1  0.149 4.170036  100