toptable                package:limma                R Documentation

_T_a_b_l_e _o_f _T_o_p _G_e_n_e_s _f_r_o_m _L_i_n_e_a_r _M_o_d_e_l _F_i_t

_D_e_s_c_r_i_p_t_i_o_n:

     Extract a table of the top-ranked genes from a linear model fit.

_U_s_a_g_e:

     topTable(fit,coef=1,number=10,genelist=NULL,adjust.method="BH",sort.by="B",resort.by=NULL)
     toptable(fit,coef=1,number=10,genelist=NULL,A=NULL,eb=NULL,adjust.method="BH",sort.by="B",resort.by=NULL,...)

_A_r_g_u_m_e_n_t_s:

     fit: list containing a linear model fit produced by 'lmFit',
          'lm.series', 'gls.series' or 'mrlm'. For 'topTable', 'fit'
          should be an object of class 'MArrayLM' as produced by
          'lmFit'.

    coef: column number or column name specifying which coefficient or
          contrast of the linear model is of interest

  number: how many genes to pick out

genelist: data frame or character vector containing gene information.
          If not specified, this will be taken from the 'genes'
          component of 'fit'.

       A: matrix of A-values or vector of average A-values.

      eb: output list from 'ebayes(fit)'

adjust.method: method to use to adjust the p-values for multiple
          testing.  Options are '"bonferroni"', '"holm"', '"hochberg"',
          '"hommel"', '"fdr"' and '"none"'. If '"none"' then the
          p-values are not adjusted. A 'NULL' value will result in the
          default adjustment method, which is '"holm"'.

 sort.by: character string specifying statistic to rank genes by. 
          Possibilities are '"M"', '"A"', '"T"', '"t"', '"P"', '"p"' or
          '"B"'.

resort.by: character string specifying statistic to sort the selected
          genes by in the output data.frame.  Possibilities are '"M"',
          '"A"', '"T"', '"t"', '"P"', '"p"' or '"B"'.

     ...: any other arguments are passed to 'ebayes' if 'eb' is 'NULL'

_D_e_t_a_i_l_s:

     This function summarizes a linear model fit object produced by
     'lmFit', 'lm.series', 'gls.series' or 'mrlm' by selecting the
     top-ranked genes for any given contrast. 'topTable()' assumes that
     the linear model fit has already been processed by 'eBayes()'.
     Note that 'toptable' is the earlier interface and is being phased
     out.

     The p-values for the coefficient/contrast of interest are adjusted
     for multiple testing by a call to 'p.adjust'. The '"holm"' method
     is the default because it is conservative and valid for any type
     of dependence between the p-values. In most microarray contexts
     however the less conservative Benjamini and Hochberg method
     '"fdr"' may be more suitable. See 'help("p.adjust")' for more
     information. Note, if there is no good evidence for differential
     expression in the experiment, that it is quite possible for all
     the adjusted p-values to be large, even for all of them to be
     equal to one. It is quite possible for all the adjusted p-values
     to be equal to one if the smallest p-value is no smaller than
     '1/ngenes' where 'ngenes' is the number of genes with non-missing
     p-values. Note that p-values adjusted to control the false
     discovery rate are often called q-values.

     The 'sort.by' argument specifies the criterion used to select the
     top genes. The choices are: '"M"' to sort by the (absolute)
     coefficient representing the log-fold-change; '"A"' to sort by
     average expression level (over all arrays) in descending order;
     '"T"' or '"t"' for absolute t-statistic; '"P"' or '"p"' for
     p-values; or '"B"' for the 'lods' or B-statistic.

     Normally the genes appear in order of selection in the output
     table. If one wants the table to be in a different order, the
     'resort.by' argument may be used. For example, 'topTable(fit,
     sort.by="B", resort.by="M")' selects the top genes according to
     log-odds of differential expression and then orders the resulting
     genes by log-ratio in decreasing order. Or 'topTable(fit,
     sort.by="M", resort.by="M")' would select the genes by absolute
     log-ratio and then sort then by signed log-ratio from must
     positive to most negative.

_V_a_l_u_e:

     A dataframe with a row for the 'number' top genes and the
     following columns: 

genelist: if genelist was included as input

       M: estimate of the effect or the contrast, on the log2 scale

       t: moderated t-statistic

 P.Value: adjusted p-value or q-value

       B: log odds that the gene is differentially expressed

_A_u_t_h_o_r(_s):

     Gordon Smyth

_S_e_e _A_l_s_o:

     An overview of linear model and testing functions is given in
     06.LinearModels. See also 'p.adjust' in the 'stats' package.

_E_x_a_m_p_l_e_s:

     #  Simulate gene expression data,
     #  6 microarrays and 100 genes with first gene differentially expressed
     M <- matrix(rnorm(100*6,sd=0.3),100,6)
     M[1,1:3] <- M[1,1:3] + 2
     #  Design matrix includes two treatments, one for first 3 and one for last 3 arrays
     design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
     fit <- lmFit(M, design=design)
     fit <- eBayes(fit)
     topTable(fit)

