\name{1.Introduction}
\alias{1.Introduction}
\title{Introduction to the LIMMA Package}
\description{
LIMMA is a library for the analysis of gene expression microarray data, especially the use of linear models for analysing designed experiments and the assessment of differential expression.
LIMMA provides the ability to analyse comparisons between many RNA targets simultaneously.
The normalization and data analysis functions are for two-colour spotted microarrays.
The linear model and differential expression functions apply to all microarrays including Affymetrix and other multi-array oligonucleotide experiments.

There are three types of documentation available.
(1) The \emph{\link[limma:../doc/usersguide]{LIMMA User's Guide}} can be reached through the "Accompanying documentation" at the top of the LIMMA contents page.
(2) An overview of limma functions grouped by purpose is contained in the numbered chapters at the top of the \link[limma:00Index]{LIMMA contents page}, of which this page is the first.
(3) The \link[limma:00Index]{LIMMA contents page} gives an alphabetical index of detailed help topics.
}
\author{Gordon Smyth}
\references{
Smyth, G. K., Yang, Y.-H., Speed, T. P. (2003). Statistical issues in microarray data analysis. In: \emph{Functional Genomics: Methods and Protocols}, M. J. Brownstein and A. B. Khodursky (eds.), Methods in Molecular Biology Volume 224, Humana Press, Totowa, NJ, pages 111-136.
}
\keyword{documentation}

\eof
\name{2.Classes}
\alias{2.Classes}
\title{Classes Defined by this Package}

\description{

This package defines the following data classes.
\describe{
\item{ \code{\link[limma:rglist]{RGList}} }{
	A class used to store raw intensities as they are read in from an image analysis output file,
	usually by \code{read.maimages}.}

\item{ \code{\link[limma:malist]{MAList}} }{
	Intensities converted to M-values and A-values, i.e., to contrasts on the log-scale.
	Usually created from an \code{RGList} using \code{MA.RG} or \code{normalizeBetweenArrays}.}

\item{ \code{\link[limma:marraylm]{MArrayLM}} }{
	Stores the result of fitting gene-wise linear models to the normalized intensities or log-ratios.
	Usually created by \code{lmFit}.}
}

Objects of these classes may be \link[limma:subsetting]{subsetted} and multiple data objects may be \link[limma:cbind]{combined}.

All of these classes belong to the virtual class \code{\link[limma:LargeDataObject]{LargeDataObject}}.
A \code{show} method is defined for \code{LargeDataObject}s which uses the utility function \code{\link{printHead}}.
}

\author{Gordon Smyth}
\keyword{documentation}

\eof
\name{3.ReadingData}
\alias{3.ReadingData}
\title{Reading Microarray Data from Files}

\description{
This help page gives an overview of LIMMA functions used to read data into R from files.
}

\section{Reading Target Information}{
The function \code{\link{readTargets}} is designed to help with organizing information about which RNA sample is hybridized to each channel on each array and which files store information for each array.
}

\section{Reading Intensity Data}{
The first step in a microarray data analysis is to read into R the intensity data for each array provided by an image analysis program.
This is done using the function \code{\link{read.maimages}}.

\code{\link{read.maimages}} optionally constructs quality weights for each spot using quality functions listed in \link{QualityWeights}.

\code{read.maimages} produces an \code{RGList} object and stores only the information required from each image analysis output file.
If you wish to read all the image analysis output files into R as individual data frames containing all the original columns, you may use \code{\link{read.series}}.
An \code{RGList} object can be extracted from the data frames at a later stage using the functions \code{\link{rg.spot}}, \code{\link{rg.genepix}} or \code{\link{rg.quantarray}}.

Another function, \code{\link{rg.series.spot}}, is very similar to \code{\link{read.maimages}} with \code{source="spot"}.
This function will be removed in future versions of LIMMA.

\code{\link{read.maimages}} uses utility functions \code{\link{removeExt}}, \code{\link{read.matrix}}, \code{\link{read.imagene}} and \code{\link{readImageneHeaders}}.
}

\section{Reading the Gene List}{

Many image analysis programs provide gene IDs as columns in the image analysis output files, for example ArrayVision, Imagene and the Stanford Microarray Database.
In other cases you may have gene name and annotation information in a separate file.
The function \code{\link{readGAL}} reads information from a GenePix Allocation List (gal) file.
It produces a data frame with known column names.
If the gene names consist of a short name followed by annotation information, then \code{\link{splitName}} may be used to separate the name and annotation information into separate vectors.

The functions \code{\link{readSpotTypes}} and \code{\link{controlStatus}} assist with separating control spots from ordinary genes in the analysis and data exploration.

The function \code{\link{getLayout}} extracts from the gal-file data frame the print layout information for a spotted array.
The functions \code{\link{gridr}}, \code{\link{gridc}}, \code{\link{spotr}} and \code{\link{spotc}} use the extracted layout to compute grid positions and spot positions within each grid for each spot.
The function \code{\link{printorder}} calculates the printorder, plate number and plate row and column position for each spot given information about the printing process.

If each gene is printed more than once on the arrays, then \code{\link{uniquegenelist}} will remove duplicate names from the gal-file or gene list.

\code{\link[limma:cbind]{cbind}} allows different \code{RGList} or \code{MAList} objects to be combined assuming the layout of the arrays to be the same.
\code{\link[limma:merge]{merge}} can combine data even when the order of the genes on the arrays has changed.
\code{merge} uses utility function \code{\link{makeUnique}}.
}

\author{Gordon Smyth}
\keyword{documentation}

\eof
\name{4.Normalization}
\alias{4.Normalization}
\title{Normalization of Microarray Data}

\description{
This page gives an overview of the LIMMA functions available to normalize data from spotted two-colour microarrays.
Smyth and Speed (2003) give an overview of the normalization techniques implemented in the functions.

Usually data from spotted microarrays will be normalized using \code{\link{normalizeWithinArrays}}.
A minority of data will also be normalized using \code{\link{normalizeBetweenArrays}} if diagnostic plots suggest a difference in scale between the arrays.

In rare circumstances, data might be normalized using \code{\link{normalizeForPrintorder}} before using \code{\link{normalizeWithinArrays}}.

If one is planning analysis of single-channel information from the microarrays rather than analysis of differential expression based on log-ratios, then the data should be normalized using a single-channel normalization technique.
Single channel normalization uses further options of the \code{\link{normalizeBetweenArrays}} function.
For more details see the \emph{\link[limma:../doc/usersguide]{LIMMA User's Guide}} which includes a section on single-channel normalization.

\code{normalizeWithinArrays} uses utility functions \code{\link{loessFit}} and \code{\link{normalizeRobustSpline}}.
\code{normalizeBetweenArrays} uses utility functions \code{\link{normalizeMedians}}, \code{\link{normalizeMedianDeviations}} and \code{\link{normalizeQuantiles}}, none of which need to be called directly by users.
}

\section{Background Correction}{
Usually one doesn't need to explicitly ask for background correction of the intensities because this is done by default by \code{\link{normalizeWithinArrays}},
which subtracts the background from the foreground intensities before applying the normalization method.
This default background correction method can be over-ridden by using \code{\link{backgroundCorrect}} which offers a number of alternative
background correction methods to simple subtraction.
Simply use \code{backgroundCorrect} to correct the \code{RGList} before applying \code{normalizeWithinArrays}.

\code{\link{kooperberg}} is a Bayesian background correction tool designed specifically for GenePix data.
\code{kooperberg} is not currently used as the default method for GenePix data because it is computationally intensive.
It requires several columns of the GenePix data files which are not read in by \code{read.maimages}, so you will need to use \code{read.series} instead of \code{read.maimages} if you wish to use \code{kooperberg}.
}

\author{Gordon Smyth}
\references{
Smyth, G. K., and Speed, T. P. (2003). Normalization of cDNA microarray data. In: \emph{METHODS: Selecting Candidate Genes from DNA Array Screens: Application to Neuroscience}, D. Carter (ed.). To appear. \url{http://www.statsci.org/smyth/pubs/normalize.pdf}
}
\keyword{documentation}

\eof
\name{5.LinearModels}
\alias{5.LinearModels}
\title{Linear Models for Microarrays}

\description{
This page gives an overview of the LIMMA functions available to fit linear models and to interpret the results.

The core of this package is the fitting of gene-wise linear models to microarray data.
The basic idea is to estimate log-ratios between two or more target RNA samples simultaneously.
See the \emph{\link[limma:../doc/usersguide]{LIMMA User's Guide}} for several case studies.
}

\section{Forming the Design Matrix}{
The function \code{\link{designMatrix}} is provided to assist with creation of an appropriate design matrix for two-color microarray experiments using a common reference.
Design matrices for Affymetrix or single-color arrays can be easily created using the ordinary R command \code{\link[base]{model.matrix}}.
For the direct two-color designs the design matrix needs to be created by hand.
}

\section{Fitting Models}{

There are four functions in the package which fit linear models:

\describe{
\item{ \code{\link{lmFit}} }{
	This is a high level function which accepts objects and provides an entry point to the following three functions.}

\item{ \code{\link{lm.series}} }{
	Straightforward least squares fitting of a linear model for each gene.}

\item{ \code{\link{rlm.series}} }{
	An alternative to \code{lm.series} using robust regression as implemented by the \code{rlm} function in the MASS package.}

\item{ \code{\link{gls.series}} }{
	Generalized least squares taking into account correlations between duplicate spots (i.e., replicate spots on the same array).
	The functions \code{\link{duplicateCorrelation}} or \code{\link{dupcor.series}} are used to estimate the inter-duplicate correlation before using \code{gls.series}.}
}
Each of these functions accepts essentially the same argument list and produces a fitted model object of the same form.
The first function \code{lmFit} formally produces an object of class \code{\link[limma:marraylm]{MArrayLM}}.
The other three functions are lower level functions which produce similar output but in unclassed lists.

The main argument is the \bold{design matrix} which specifies which target RNA samples were applied to each channel on each array.
There is considerable freedom to choose the design matrix - there is always more than one choice which is correct provided it is interpreted correctly.
The fitted model object consists of coefficients, standard errors and residual standard errors for each gene.

All the functions which fit linear models use \code{\link{unwrapdups}} which provides a unified method for handling duplicate spots.
}

\section{Making Comparisons of Interest}{

Once a linear model has been fit using an appropriate design matrix, the command \code{\link{makeContrasts}} may be used to form a contrast matrix to make comparisons of interest.
The fit and the contrast matrix are used by \code{\link{contrasts.fit}} to compute fold changes and t-statistics for the contrasts of interest.
This is a way to compute all possible pairwise comparisons between treatments, for example, in an experiment which compares many treatments to a common reference.
}

\section{Assessing Differential Expression}{

After fitting a linear model, the standard errors are moderated using a simple empirical Bayes model using \code{\link{ebayes}} or \code{\link{eBayes}}.
A moderated t-statistic and a log-odds of differential expression are computed for each contrast for each gene.

\code{\link{ebayes}} and \code{\link{eBayes}} use internal functions \code{\link{fitFDist}}, \code{\link{tmixture.matrix}} and \code{\link{tmixture.vector}}.

The function \code{\link{zscoreT}} is sometimes used for computing z-score equivalents for t-statistics so as to place t-statistics with different degrees of freedom on the same scale.
\code{\link{zscoreGamma}} is used the same way with standard deviations instead of t-statistics.
These functions are for research purposes rather than for routine use.
}

\section{Summarizing Model Fits}{

After the above steps the results may be displayed or further processed using:
\describe{
\item{ \code{\link{toptable}} }{
	Presents a list of the genes most likely to be differentially expressed for a given contrast.}

\item{ \code{\link{classifyTests}} }{
	Uses nested F-tests to classify the genes as up, down or not significant over the contrasts in the linear model with special attention to genes which are significant in more than one contrast.
	\code{\link{classifyTestsT}} and \code{\link{classifyTestsP}} are simpler methods using cutoffs for the t-statistics or p-values individually.}

\item{ \code{\link{heatdiagram}} }{
	Allows visual comparison of the results across many different conditions in the linear model.
	Not the same as heatdiagrams produced by other packages!
	This function accepts a classification matrix produced by \code{classifyTests}.}

\item{ \code{\link{vennCounts}} }{
	Accepts output from \code{classifyTests} and counts the number of genes in each classification.}

\item{ \code{\link{vennDiagram}} }{
	Accepts output from \code{classifyTests} or \code{vennCounts} and produces a Venn diagram plot.}
}
}

\author{Gordon Smyth}
\references{
Smyth, G. K. (2003). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. \url{http://www.statsci.org/smyth/pubs/ebayes.pdf}

Smyth, G. K., Michaud, J., and Scott, H. (2003). The use of within-array duplicate spots for assessing differential expression in microarray experiments.
\url{http://www.statsci.org/smyth/pubs/dupcor.pdf}
}
\keyword{documentation}

\eof
\name{6.Diagnostics}
\alias{6.Diagnostics}
\title{Diagnostics and Quality Assessment}

\description{
This page gives an overview of the LIMMA functions available for microarray quality assessment and diagnostic plots.

This package provides an \code{\link[limma:anova-method]{anova}} method which is designed for assessing the quality of an array series or of a normalization method.
It is not designed to assess differential expression of individual genes.
\code{\link[limma:anova-method]{anova}} uses utility functions \code{\link{bwss}} and \code{\link{bwss.matrix}}.

Diagnostic plots can be produced by \code{\link{imageplot}}, \code{\link{plotMA}}, \code{\link{plotPrintTipLoess}}, \code{\link{plotPrintorder}} and \code{\link{plotDensities}}.

\code{plotPrintTipLoess} uses utility functions \code{\link{gridr}} and \code{\link{gridc}}.
}

\author{Gordon Smyth}
\keyword{documentation}

\eof
\name{LargeDataObject-class}
\docType{class}
\alias{LargeDataObject-class}
\alias{show,LargeDataObject-method}
\title{Large Data Object - class}

\description{
A virtual class including the data classes \code{RGList}, \code{MAList} and \code{MArrayLM}, all of which typically contain large quantities of numerical data in vectors, matrices and data.frames.
}

\section{Methods}{
A \code{show} method is defined for objects of class \code{LargeDataObject} which uses \code{printHead} to print only the leading elements or rows of components or slots which contain large quantities of data.
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
}

\examples{
#  see normalizeBetweenArrays
}

\keyword{classes}
\keyword{data}

\eof
\name{PrintLayout}
\docType{class}
\alias{PrintLayout-class}
\title{Print Layout - class}

\description{
A list-based class for storing information about the process used to print spots on a microarray.

\code{PrintLayout} objects can be created using \code{\link{getLayout}}.
The \code{printer} component of an \code{RGList} or \code{MAList} object is of this class.
}

\section{Slots/List Components}{
Objects of this class contain no slots but should contain the following list components:
\tabular{ll}{
  \code{ngrid.r}:\tab number of grid rows on the arrays\cr
  \code{ngrid.c}:\tab number of grid columns on the arrays\cr
  \code{nspot.r}:\tab number of rows of spots in each grid\cr
  \code{nspot.c}:\tab number of columns of spots in each grid\cr
  \code{ndups}:\tab number of duplicates of each DNA clone, i.e., number of times print-head dips into each well of DNA\cr
  \code{spacing}:\tab number of spots between duplicate spots.  Only applicable if \code{ndups>1}.
  \code{spacing=1} for side-by-side spots by rows, \code{spacing=nspot.c} for side-by-side spots by columns, \code{spacing=ngrid.r*ngrid.c*nspot.r*nspot.c/2} for duplicate spots in top and bottom halves of each array.\cr
  \code{npins}:\tab actual number of pins or tips on the print-head\cr
  \code{start}:\tab character string giving position of the spot printed first in each grid.
  Choices are \code{"topleft"} or \code{"topright"} and partial matches are accepted.
}
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
}

\examples{
#  Settings for Swirl and ApoAI example data sets in User's Guide

printer <- list(ngrid.r=4, ngrid.c=4, nspot.r=22, nspot.c=24, ndups=1, spacing=1, npins=16, start="topleft")

#  Typical settings at the Australian Genome Research Facility

#  Full pin set, duplicates side-by-side on same row
printer <- list(ngrid.r=12, ngrid.c=4, nspot.r=20, nspot.c=20, ndups=2, spacing=1, npins=48, start="topright")

#  Half pin set, duplicates in top and lower half of slide
printer <- list(ngrid.r=12, ngrid.c=4, nspot.r=20, nspot.c=20, ndups=2, spacing=9600, npins=24, start="topright")
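
#  Illustrative check in base R (not part of the limma API): for duplicates
#  printed in the top and bottom halves of the slide, the spacing is half
#  the total number of spots on the array.
#  The names 'layout', 'nspots' and 'half.spacing' are used only for this sketch.
layout <- list(ngrid.r=12, ngrid.c=4, nspot.r=20, nspot.c=20)
nspots <- layout$ngrid.r * layout$ngrid.c * layout$nspot.r * layout$nspot.c
half.spacing <- nspots/2
half.spacing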
}

\keyword{classes}
\keyword{data}

\eof
\name{anova.MAList-method}
\docType{methods}
\alias{anova.MAList}
\title{ANOVA Table - method}
\description{
Analysis of variance method for objects of class \code{MAList}.
Produces an ANOVA table useful for quality assessment by decomposing between and within gene sums of squares for a series of replicate arrays.
This method produces a single ANOVA Table rather than one for each gene and is not used to identify differentially expressed genes.
}
\section{Usage}{
\code{anova(object,design=NULL,ndups=2,...)}
}
\section{Arguments}{
\describe{
  \item{\code{object}}{object of class \code{MAList}. Missing values in the M-values are not allowed.}
  \item{\code{design}}{numeric matrix containing the design matrix for linear model. The number of rows should agree with the number of columns of M. The number of columns will determine the number of coefficients estimated for each gene.}
  \item{\code{ndups}}{number of duplicate spots. Each gene is printed ndups times in adjacent spots on each array.}
  \item{\code{...}}{other arguments are not used}
}
}
\section{Details}{
This function aids in quality assessment of microarray data and in the comparison of normalization methodologies.
}
\section{Value}{
  An object of class \code{anova} containing rows for between genes, between arrays, gene x array interaction, and between duplicates within arrays sums of squares.
  Variance components are estimated for each source of variation.
}
\warning{
This function does not give valid results in the presence of missing M-values.
}
\seealso{
\code{\link{MAList-class}}, \code{\link{bwss.matrix}}, \code{\link[base:anova]{anova}}.

An overview of quality assessment and diagnostic functions in LIMMA is given by \link{6.Diagnostics}.
}
\author{Gordon Smyth}
\keyword{models}

\eof
\name{as.MAList}
\alias{as.MAList}
\title{Convert marrayNorm Object to an MAList Object}

\description{
Convert a \code{marrayNorm} object to an \code{MAList} object.
}

\usage{
as.MAList(object)
}

\arguments{
\item{object}{an \code{\link[marrayClasses:marrayNorm-class]{marrayNorm}} object}
}

\value{
Object of class \code{\link[limma:MAList]{MAList}}
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
}

\keyword{classes}
\keyword{data}

\eof
\name{backgroundCorrect}
\alias{backgroundCorrect}
\title{Correct Intensities for Background}
\description{
Apply background correction to microarray expression intensities.
}
\usage{
backgroundCorrect(RG, method="subtract")
}
\arguments{
  \item{RG}{an \code{\link[limma:rglist]{RGList}} object or an unclassed list containing the same components as an \code{RGList}}
  \item{method}{character string specifying correction method.  Possible values are \code{"none"}, \code{"subtract"}, \code{"half"}, \code{"minimum"} or \code{"edwards"}.}
}
\details{
If \code{method="none"} then the corrected intensities are equal to the foreground intensities, i.e., the background intensities are treated as zero.
If \code{method="subtract"} then this function simply subtracts the background intensities from the foreground intensities which is the usual background correction method.
If \code{method="half"} then any intensity which is less than 0.5 after background subtraction is reset to be equal to 0.5.
If \code{method="minimum"} then any intensity which is zero or negative after background subtraction is set equal to half the minimum of the positive corrected intensities for that array.
If \code{method="edwards"} a log-linear interpolation method is used to adjust lower intensities as in Edwards (2003).

Background correction (background subtraction) is also performed by the \code{\link{normalizeWithinArrays}} method for \code{RGList} objects, so it is not necessary to call \code{backgroundCorrect} directly unless one wants to use a method other than simple subtraction.
Calling \code{backgroundCorrect} before \code{normalizeWithinArrays} will over-ride the default background correction.
}
\value{
An \code{RGList} object in which components \code{R} and \code{G} are background corrected
and components \code{Rb} and \code{Gb} are removed.
}
\references{
Edwards, D. E. (2003). Non-linear normalization and background correction in one-channel cDNA microarray studies.
\emph{Bioinformatics} 19, 825-833.

Yang, Y. H., Buckley, M. J., Dudoit, S., and Speed, T. P. (2002). Comparison of methods for image analysis on cDNA microarray data. \emph{Journal of Computational and Graphical Statistics} 11, 108-136.

Yang, Y. H., Buckley, M. J., and Speed, T. P. (2001). Analysis of microarray images. \emph{Briefings in Bioinformatics} 2, 341-349.
}
\author{Gordon Smyth}
\examples{
RG <- new("RGList", list(R=c(1,2,3,4),G=c(1,2,3,4),Rb=c(2,2,2,2),Gb=c(2,2,2,2)))
backgroundCorrect(RG)
backgroundCorrect(RG, method="half")
backgroundCorrect(RG, method="minimum")
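
#  Base R sketch of the rules described in Details, applied by hand to the
#  red channel of a single array. It illustrates the arithmetic only and is
#  not the package implementation; all names are chosen for this sketch.
R <- c(1,2,3,4)
Rb <- c(2,2,2,2)
#  "subtract": foreground minus background
R.subtract <- R - Rb
#  "half": values below 0.5 after subtraction are reset to 0.5
R.half <- pmax(R.subtract, 0.5)
#  "minimum": zero or negative values are set to half the smallest positive value
R.minimum <- R.subtract
R.minimum[R.minimum <= 0] <- min(R.subtract[R.subtract > 0])/2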
}
\seealso{
	\code{\link{read.maimages}}, \code{\link{normalizeWithinArrays}}
}
\keyword{models}

\eof
\name{bwss}
\alias{bwss}
\title{Between and within sums of squares}
\description{Sums of squares between and within groups. Allows for missing values.}
\usage{bwss(x,group)}
\arguments{
  \item{x}{a numeric vector giving the responses.}
  \item{group}{a vector or factor giving the grouping variable.}
}
\value{
  A list with components
  \item{bss}{sums of squares between the group means.}
  \item{wss}{sums of squares within the groups.}
  \item{bdf}{degrees of freedom corresponding to \code{bss}.}
  \item{wdf}{degrees of freedom corresponding to \code{wss}.}
}
\details{This is equivalent to one-way analysis of variance.}
\author{Gordon Smyth}
\seealso{\code{\link{bwss.matrix}}}
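\examples{
#  Base R sketch of between and within sums of squares, checked against a
#  one-way analysis of variance. It does not call bwss itself and ignores
#  missing values; the data and variable names are made up for illustration.
x <- c(1,2,3, 2,4,6, 5,5,8)
group <- factor(rep(c("a","b","c"), each=3))
group.mean <- tapply(x, group, mean)
group.n <- tapply(x, group, length)
#  Between: squared deviations of group means from the grand mean, weighted by group size
bss <- sum(group.n * (group.mean - mean(x))^2)
#  Within: squared deviations of each observation from its own group mean
wss <- sum((x - group.mean[as.character(group)])^2)
aov.table <- anova(lm(x ~ group))
all.equal(c(bss, wss), aov.table[["Sum Sq"]])
}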
\keyword{models}

\eof
\name{bwss.matrix}
\alias{bwss.matrix}
\title{Between and within sums of squares for matrix}
\description{Sums of squares between and within the columns of a matrix. Allows for missing values. This function is called by the \code{\link[limma:anova-method]{anova}} method for \code{MAList} objects.}
\usage{bwss.matrix(x)}
\arguments{
  \item{x}{a numeric matrix.}
}
\value{
  A list with components
  \item{bss}{sums of squares between the column means.}
  \item{wss}{sums of squares within the columns.}
  \item{bdf}{degrees of freedom corresponding to \code{bss}.}
  \item{wdf}{degrees of freedom corresponding to \code{wss}.}
}
\details{This is equivalent to a one-way analysis of variance where the columns of the matrix are the groups.
If \code{x} is a matrix then \code{bwss.matrix(x)} is the same as \code{bwss(x,col(x))} except for speed of execution.}
\author{Gordon Smyth}
\seealso{\code{\link{bwss}}, \code{\link{anova.MAList}}}
\keyword{models}

\eof
\name{cbind}
\alias{cbind.RGList}
\alias{cbind.MAList}
\title{Combine RGList or MAList Objects}
\description{
Combine a series of \code{RGList} objects or combine a series of \code{MAList} objects.
}
\usage{
\method{cbind}{RGList}(\dots, deparse.level=1)
}
\arguments{
  \item{\dots}{\code{RGList} objects or \code{MAList} objects}
  \item{deparse.level}{not currently used, see \code{\link[base]{cbind}} in the base package}
}
\details{
The matrices of expression data from the individual objects are cbinded.
The data.frames of target information, if they exist, are rbinded.
The combined data object will preserve any additional components or attributes found in the first object to be combined.
}
\value{
An \code{\link[limma:rglist]{RGList}} or \code{\link[limma:malist]{MAList}} object holding data from all the arrays from the individual objects.
}
\author{Gordon Smyth}
\seealso{
  \code{\link[base]{cbind}} in the base package.
  
  \link{3.ReadingData} gives an overview of data input and manipulation functions in LIMMA.
}
\examples{
M <- A <- matrix(11:14,4,2)
rownames(M) <- rownames(A) <- c("a","b","c","d")
colnames(M) <- colnames(A) <- c("A1","A2")
MA1 <- new("MAList",list(M=M,A=A))

M <- A <- matrix(21:24,4,2)
rownames(M) <- rownames(A) <- c("a","b","c","d")
colnames(M) <- colnames(A) <- c("B1","B2")
MA2 <- new("MAList",list(M=M,A=A))

cbind(MA1,MA2)
}
\keyword{manip}

\eof
\name{classifyTests}
\alias{classifyTests}
\alias{classifyTestsT}
\alias{classifyTestsP}
\title{Treat Simultaneous T-Tests as Classification Problem}
\description{
Classify a series of related t-statistics as up, down or not significant.
}
\usage{
classifyTests(tstat, cor.matrix=NULL, design=NULL, contrasts=NULL, df=Inf, p.value=0.01)
classifyTestsT(tstat, t1=4, t2=3)
classifyTestsP(tstat, df=Inf, p.value=0.05, method="holm")
}
\arguments{
  \item{tstat}{numeric matrix of t-statistics or an \code{MArrayLM} object from which the t-statistics may be extracted.}
  \item{cor.matrix}{correlation matrix of each row of t-statistics.  Defaults to the identity matrix.}
  \item{design}{full rank numeric design matrix.  Not used if \code{cor.matrix} is specified.}
  \item{contrasts}{numeric matrix with columns specifying contrasts of the coefficients of interest.  Not used if \code{cor.matrix} is specified.}
  \item{df}{numeric vector giving the degrees of freedom for the t-statistics.
  May have length 1 or length equal to the number of rows of \code{tstat}.}
  \item{p.value}{numeric value between 0 and 1 giving the desired size of the test}
  \item{t1}{first critical value for absolute t-statistics}
  \item{t2}{second critical value for absolute t-statistics}
  \item{method}{character string specifying p-value adjustment method.  See \code{\link[base]{p.adjust}} for possible values.}
}
\value{
A list with components
  \item{classification}{numeric matrix with elements \code{-1}, \code{0} or \code{1} depending on whether each t-statistic is classified as significantly negative, not significant or significantly positive respectively}
  \item{Fstat}{numeric vector containing moderated F-statistics for testing the contrasts simultaneously zero}
}
\details{
\code{classifyTests} classifies using a nested F-test approach giving particular attention to correctly classifying genes which have two or more significant t-statistics, i.e., are differentially expressed under two or more conditions.
\code{classifyTestsT} and \code{classifyTestsP} implement simpler classification schemes based on threshold or critical values for the individual t-statistics in the case of \code{classifyTestsT} or p-values obtained from the t-statistics in the case of \code{classifyTestsP}.

Rows of \code{tstat} correspond to genes and columns to coefficients or contrasts.
For each row of \code{tstat}, F-statistics are constructed from the t-statistics.
If the overall F-statistic is significant, then the function makes a best choice as to which t-statistics contributed to this result.
The methodology is based on the principle that any t-statistic should be called significant if the F-test is still significant for that row when all the larger t-statistics are set to the same absolute size as the t-statistic in question.

If \code{tstat} is an \code{MArrayLM} object, then all arguments except for \code{p.value} are extracted from it.

\code{cor.matrix} is the same as the correlation matrix of the coefficients from which the t-statistics are calculated.
If \code{cor.matrix} is not specified, then it is calculated from \code{design} and \code{contrasts} if at least \code{design} is specified or else defaults to the identity matrix.
In terms of \code{design} and \code{contrasts}, \code{cor.matrix} is obtained by standardizing the matrix
\code{ t(contrasts) \%*\% solve(t(design) \%*\% design) \%*\% contrasts }
to a correlation matrix.
}
\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\examples{
tstat <- matrix(c(0,5,0, 0,2.5,0, -2,-2,2, 1,1,1), 4, 3, byrow=TRUE)
classifyTests(tstat)

# See also the examples for contrasts.fit and vennDiagram
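
#  Base R sketch of the standardization described in Details: form the
#  matrix from a design and a contrast matrix, then convert it to a
#  correlation matrix. The matrices are illustrative, and crossprod() is
#  used in place of explicit matrix products.
design <- cbind(First3Arrays=c(1,1,1,0,0,0), Last3Arrays=c(0,0,0,1,1,1))
contrast.matrix <- cbind(First3=c(1,0), Last3=c(0,1), "Last3-First3"=c(-1,1))
#  t(contrasts) times inverse of t(design) design, times contrasts
cov.coef <- crossprod(contrast.matrix, solve(crossprod(design), contrast.matrix))
cor.matrix <- cov2cor(cov.coef)
round(cor.matrix, 3)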
}
\keyword{htest}

\eof
\name{contrasts.fit}
\alias{contrasts.fit}
\title{Contrast Information from Linear Model Fit}
\description{
Given an \code{lm.series} fit for a oneway model, compute estimated coefficients and standard errors for a given set of contrasts.
}
\usage{
contrasts.fit(fit,contrasts)
}
\arguments{
  \item{fit}{object produced by the function \code{lm.series} or equivalent. A list containing components \code{coefficients} and \code{stdev.unscaled}.}
  \item{contrasts}{matrix with columns containing contrasts. May be a vector if there is only one contrast.}
}
\value{
  An object of the same type as produced by \code{lm.series}. This is a list with components
  \item{coefficients}{numeric matrix containing the estimated coefficients for each contrast for each gene.}
  \item{stdev.unscaled}{numeric matrix conformal with \code{coefficients} containing the unscaled standard deviations for the coefficient estimators.}
  \item{...}{any other components input in \code{fit}}
}
\details{
This function accepts input from any of the functions \code{lm.series}, \code{rlm.series} or \code{gls.series}.
The design matrix used for this fit must have orthogonal columns.

The idea is to fit a saturated oneway model using one of the above functions, then use \code{contrasts.fit} to obtain coefficients and standard errors for any number of contrasts of the coefficients of the oneway model.
}
\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\examples{
#  Simulate gene expression data,
#  6 microarrays and 100 genes with one gene differentially expressed in first 3 arrays
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,1:3] <- M[1,1:3] + 2
#  Design matrix corresponds to oneway layout, columns are orthogonal
design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
fit <- lm.series(M,design=design)
#  Would like to consider original two estimates plus difference between first 3 and last 3 arrays
contrast.matrix <- cbind(First3=c(1,0),Last3=c(0,1),"Last3-First3"=c(-1,1))
fit2 <- contrasts.fit(fit,contrasts=contrast.matrix)
eb <- ebayes(fit2)
#  Large values of eb$t indicate differential expression
clas <- classifyTests(eb$t,design=design,contrasts=contrast.matrix,df=fit2$df.residual+eb$df.prior)
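#  A minimal base-R sketch of the arithmetic behind a single contrast (illustrative only).
#  coef.demo, stdev.demo and w are made-up objects, not output from lm.series or contrasts.fit.
#  The sketch assumes the coefficient estimators are uncorrelated, as for orthogonal design
#  columns, so that the unscaled variances of the weighted coefficients simply add.
coef.demo <- cbind(First3=c(2.1,0.1),Last3=c(0.2,-0.1))
stdev.demo <- matrix(1/sqrt(3),2,2)
w <- c(-1,1)    #  weights for the contrast Last3-First3
est.demo <- w[1]*coef.demo[,1] + w[2]*coef.demo[,2]
sd.demo <- sqrt(w[1]^2*stdev.demo[,1]^2 + w[2]^2*stdev.demo[,2]^2)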
}
\keyword{htest}

\eof
\name{controlStatus}
\alias{controlStatus}
\title{Get Spot Status from Spot Types}
\description{
Determine the type (or status) of each spot in the gene list.
}
\usage{
controlStatus(types, genes)
}
\arguments{
  \item{types}{dataframe containing spot type specifiers, usually input using \code{readSpotTypes}}
  \item{genes}{dataframe containing gene IDs and Names, or an \code{RGList} or \code{MAList} containing this dataframe}
}
\details{
This function matches up the regular expressions associated with spot types with the gene list.
}
\value{
Character vector specifying the type (or status) of each spot on the array
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{designMatrix}
\alias{designMatrix}
\title{Construct Design Matrix}
\description{
Construct a design matrix from the data.frame of target information.
Currently only for two color array experiments using a common reference.
}
\usage{
designMatrix(targets, ref) 
}
\arguments{
  \item{targets}{data.frame with columns \code{Cy3} and \code{Cy5} specifying which RNA was hybridized to each array}
  \item{ref}{character string giving name of reference RNA}
}

\value{
The design matrix.
}

\details{
Design matrices are also needed for direct two-color designs and for single-channel experiments, but this version handles reference designs only.
}

\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}

\author{Gordon Smyth}

\examples{
targets <- data.frame(Cy3=c("Ref","Control","Ref","Treatment"),Cy5=c("Control","Ref","Treatment","Ref"))
designMatrix(targets, "Ref")
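#  A base-R sketch of how a reference design matrix could be built by hand.
#  This is not the package code: tg.demo, rnas.demo and design.demo are illustrative names,
#  and the sign convention is an assumption (log-ratios taken as Cy5 over Cy3, so an RNA
#  in the Cy5 channel gets +1 and the same RNA in the Cy3 channel gets -1).
tg.demo <- data.frame(Cy3=c("Ref","Control","Ref","Treatment"),Cy5=c("Control","Ref","Treatment","Ref"))
rnas.demo <- setdiff(unique(c(as.character(tg.demo$Cy3),as.character(tg.demo$Cy5))),"Ref")
design.demo <- sapply(rnas.demo, function(r) (tg.demo$Cy5 == r) - (tg.demo$Cy3 == r))
design.demo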
}

\keyword{regression}

\eof
\name{dim}
\alias{dim.RGList}
\alias{dim.MAList}
\alias{dim.MArrayLM}
\title{Retrieve the Dimensions of an RGList, MAList or MArrayLM Object}
\description{
Retrieve the number of rows (genes) and columns (arrays) for an RGList, MAList or MArrayLM object.
}
\usage{
\method{dim}{RGList}(x)
}
\arguments{
  \item{x}{an object of class \code{RGList}, \code{MAList} or \code{MArrayLM}}
}
\details{
The dimensions are taken from the numeric data matrices stored in the object.
}
\value{
Numeric vector of length 2.
The first element is the number of rows (genes) and the second is the number of columns (arrays).
}
\author{Gordon Smyth}
\seealso{
  \code{\link[base]{dim}} in the base package.
  
  \link{2.Classes} gives an overview of data classes used in LIMMA.
}
\examples{
M <- A <- matrix(11:14,4,2)
rownames(M) <- rownames(A) <- c("a","b","c","d")
colnames(M) <- colnames(A) <- c("A1","A2")
MA <- new("MAList",list(M=M,A=A))
dim(MA)
}
\keyword{array}

\eof
\name{dupcor.series}
\alias{duplicateCorrelation}
\alias{dupcor.series}
\title{Correlation Between Duplicates}
\description{Estimate the correlation between duplicate spots (replicate spots on the same array) from a series of arrays.}
\usage{
duplicateCorrelation(object,design=rep(1,ncol(M)),ndups=2,spacing=1,initial=0.8,trim=0.15,weights=NULL)
dupcor.series(M,design=rep(1,ncol(M)),ndups=2,spacing=1,initial=0.7,trim=0.15,weights=NULL)
}
\arguments{
  \item{object}{a numeric matrix of log-ratios or an \code{\link[limma:malist]{MAList}} object from which the log-ratios can be extracted.
  If \code{object} is an \code{MAList} then the arguments \code{design}, \code{ndups}, \code{spacing} and \code{weights} will be extracted from it if available and do not have to be specified as arguments.}
  \item{M}{a numeric matrix. Usually the log-ratios of expression for a series of cDNA microarrays with rows corresponding to genes and columns to arrays.}
  \item{design}{the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of \code{M}. Defaults to the unit vector meaning that the arrays are treated as replicates.} 
  \item{ndups}{a positive integer giving the number of times each gene is printed on an array. \code{nrow(M)} must be divisible by \code{ndups}.}
  \item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots}
  \item{initial}{a numeric value between -1 and 1 giving an initial estimate for the correlation.}
  \item{trim}{the fraction of observations to be trimmed from each end of \code{tanh(cor.genes)} when computing the trimmed mean.}
  \item{weights}{an optional numeric matrix of the same dimension as \code{M} containing weights for each spot. If smaller than \code{M} then it will be filled out to the same size.}
}
\value{
  A list with components
  \item{cor}{the average estimated inter-duplicate correlation. The average is the trimmed mean of the correlations for individual genes on the tanh-transformed scale, with the trimming fraction given by \code{trim}.}
  \item{cor.genes}{a numeric vector of length \code{nrow(M)/ndups} giving the individual gene correlations.}
}
\details{
This function estimates the between-duplicate correlation using REML individually for each gene.
It also returns a robust average of the individual correlations which can be used as input for 
functions such as \code{gls.series}.

\code{duplicateCorrelation} is a more object-orientated version of \code{dupcor.series} but produces the same value.
}
\note{
This function may take a long time to execute as it makes a call to \code{\link[nlme]{gls}} for each gene.
Execution could be sped up greatly if it could be assumed that \code{M} contains no NAs.
}
\seealso{
These functions use \code{\link[nlme]{gls}} in the nlme package.

An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\references{
Smyth, G. K., Michaud, J., and Scott, H. (2003). The use of within-array duplicate spots for assessing differential expression in microarray experiments.
\url{http://www.statsci.org/smyth/pubs/dupcor.pdf}
}
\examples{
#  See gls.series for an example
}
\keyword{multivariate}

\eof
\name{ebayes}
\alias{ebayes}
\alias{eBayes}
\title{Empirical Bayes Statistics for Differential Expression}
\description{Given a series of related parameter estimates and standard errors, compute moderated t-statistics and log-odds of differential expression by empirical Bayes shrinkage of the standard errors towards a common value.}
\usage{
ebayes(fit,proportion=0.01,std.coef=NULL)
eBayes(fit,proportion=0.01,std.coef=NULL)
}
\arguments{
  \item{fit}{a list object produced by \code{lm.series}, \code{gls.series}, \code{rlm.series} or \code{lmFit} containing components \code{coefficients}, \code{stdev.unscaled}, \code{sigma} and \code{df.residual}}
  \item{proportion}{assumed proportion of genes which are differentially expressed}
  \item{std.coef}{assumed standard deviation of log2 fold changes for differentially expressed genes. Normally this parameter is estimated from the data.}
}
\value{
\code{ebayes} produces an ordinary list with the following components.
\code{eBayes} adds the following components to \code{fit} to produce an augmented object, usually of class \code{MArrayLM}.
  \item{t}{numeric vector or matrix of penalized t-statistics}
  \item{p.value}{numeric vector of p-values corresponding to the t-statistics}
  \item{s2.prior}{estimated prior value for \code{sigma^2}}
  \item{df.prior}{degrees of freedom associated with \code{s2.prior}}
  \item{s2.post}{vector giving the posterior values for \code{sigma^2}}
  \item{lods}{numeric vector or matrix giving the log-odds of differential expression}
  \item{var.prior}{estimated prior value for the variance of the log2-fold-change for differentially expressed genes}
}
\details{This function is used to rank genes in order of evidence for differential expression.
The function accepts as input the output of the functions \code{lm.series}, \code{rlm.series} or \code{gls.series}.
The estimates \code{s2.prior} and \code{df.prior} are computed by \code{fitFDist}.
\code{s2.post} is the weighted average of \code{s2.prior} and \code{sigma^2} with weights proportional to \code{df.prior} and \code{df.residual} respectively.

The \code{lods} is sometimes known as the B-statistic.
}
\seealso{
\code{\link{fitFDist}}, \code{\link{tmixture.matrix}}.

An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\references{
Lönnstedt, I. and Speed, T. P. (2002). Replicated microarray data. \emph{Statistica Sinica} \bold{12}, 31-46.

Smyth, G. K. (2003). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. \url{http://www.statsci.org/smyth/pubs/ebayes.pdf}
}
\examples{
#  Simulate gene expression data,
#  6 microarrays and 100 genes with one gene differentially expressed
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,] <- M[1,] + 1.6
fit <- lm.series(M)
eb <- ebayes(fit)
qqt(eb$t,df=eb$df.prior+fit$df.residual)
abline(0,1)
#  Points off the line may be differentially expressed
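#  A minimal base-R sketch of the weighted average described in Details (illustrative only).
#  All values below are made up: s2.prior.demo and df.prior.demo are assumed numbers,
#  not estimates produced by ebayes, and sigma2.demo stands in for sigma^2 for three genes.
sigma2.demo <- c(0.05,0.10,0.20)
df.residual.demo <- 5
s2.prior.demo <- 0.09
df.prior.demo <- 4
s2.post.demo <- (df.prior.demo*s2.prior.demo + df.residual.demo*sigma2.demo) / (df.prior.demo + df.residual.demo)
#  Each posterior value lies between the prior value and the corresponding residual variance
s2.post.demo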
}
\keyword{htest}

\eof
\name{exprSet2-class}
\docType{class}
\alias{exprSet2-class}
\title{Expression Set - class}

\description{
A class for storing intensity values from microarray experiments.
This class is similar to \code{\link[Biobase:exprSet-class]{exprSet}}.

This class is not yet used in the limma package but is intended to unify single-channel and log-ratio analyses of spotted microarray data in the future.
}

\section{Slots}{
\describe{
	\item{expressions}{\code{matrix} containing intensity data on log-2 scale.
	For two-color arrays, odd columns will usually correspond to channel 1 (green) and even columns to channel 2 (red) for different arrays.}
	\item{weights}{\code{matrix} containing non-negative quality weights}
	\item{targets}{\code{data.frame} containing factors corresponding to the columns of \code{expressions}}
	\item{probes}{\code{data.frame} containing gene IDs or annotation information.
	Should have the same number of rows as \code{expressions}}
	\item{printer}{\code{list} containing information about the printing process}
	\item{notes}{\code{character}}
}
}

\section{Methods}{
\code{exprSet2} objects inherit a \code{\link[methods]{show}} method from the virtual class \code{\link[limma:LargeDataObject]{LargeDataObject}}, which means that \code{exprSet2} objects will print in a compact way.
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
  
  \code{\link[Biobase:exprSet-class]{exprSet}} is the corresponding class in the Biobase package.
}

\keyword{classes}
\keyword{data}

\eof
\name{fitFDist}
\alias{fitFDist}
\title{Moment Estimation of Scaled F-Distribution}
\description{
Moment estimation of the parameters of a scaled F-distribution given one of the degrees of freedom.
This function is called internally by \code{ebayes} and is not usually called directly by a user.
}
\usage{
fitFDist(x,df1)
}
\arguments{
  \item{x}{numeric vector or array of positive values representing a sample from an F-distribution.}
  \item{df1}{the first degrees of freedom of the F-distribution. May be an integer or a vector of the same length as \code{x}.}
}
\details{
The function estimates \code{scale} and \code{df2} under the assumption that \code{x} is distributed as \code{scale} times an F-distributed random variable on \code{df1} and \code{df2} degrees of freedom.
}
\value{
A list containing the components
  \item{scale}{scale factor for F-distribution}
  \item{df2}{the second degrees of freedom of the F-distribution}
}
\author{Gordon Smyth}
\seealso{
\code{\link{ebayes}}, \code{\link{trigammaInverse}}
}
\keyword{distribution}

\eof
\name{getLayout}
\alias{getLayout}
\title{Extract the Print Layout of an Array from the GAL File}
\description{
From the Block, Row and Column information in the GAL file, determine the number of grid rows and columns on the array and the number of spot rows and columns within each grid.
}
\usage{
getLayout(gal)
}
\arguments{
  \item{gal}{data.frame containing the GAL, i.e., giving the position and gene identifier of each spot}
}
\details{
A GAL file is a list of genes and associated information produced by an Axon microarray scanner.
This function assumes that the data.frame contains columns \code{Block}, \code{Column} and \code{Row}.
The number of tip columns is not determinable from the GAL but is assumed to be four.
}
\value{
A list with components
  \item{ngrid.r}{integer, number of grid rows on the arrays}
  \item{ngrid.c}{integer, number of grid columns on the arrays}
  \item{nspot.r}{integer, number of rows of spots in each grid}
  \item{nspot.c}{integer, number of columns of spots in each grid}
}
\author{Gordon Smyth}
\seealso{
\code{\link[marrayTools:marrayTools]{gpTools}}.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\examples{
# gal <- readGAL()
# layout <- getLayout(gal)
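#  A base-R sketch of the calculation described in Details, using a made-up GAL-like
#  data.frame. gal.demo and layout.demo are illustrative names, and this is one way the
#  layout could be derived, not necessarily how getLayout does it. As noted in Details,
#  the number of grid columns cannot be read from the GAL and is assumed to be four.
gal.demo <- expand.grid(Column=1:3,Row=1:2,Block=1:8)
ngrid.c.demo <- 4
layout.demo <- list(ngrid.r=max(gal.demo$Block)/ngrid.c.demo,ngrid.c=ngrid.c.demo,
	nspot.r=max(gal.demo$Row),nspot.c=max(gal.demo$Column))
layout.demo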
}
\keyword{IO}

\eof
\name{gls.series}
\alias{gls.series}
\title{Generalized Least Squares for Series of Microarrays}
\description{Fit linear models for each gene to a series of microarrays. Fit is by generalized least squares allowing for correlation between duplicate spots.}
\usage{gls.series(M,design=rep(1,ncol(M)),ndups=2,spacing=1,correlation=NULL,weights=NULL,...)}
\arguments{
  \item{M}{a numeric matrix. Usually the log-ratios of expression for a series of cDNA microarrays with rows corresponding to genes and columns to arrays.}
  \item{design}{the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of \code{M}. Defaults to the unit vector meaning that the arrays are treated as replicates.} 
  \item{ndups}{a positive integer giving the number of times each gene is printed on an array. \code{nrow(M)} must be divisible by \code{ndups}.}
  \item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots}
  \item{correlation}{the inter-duplicate correlation.}
  \item{weights}{an optional numeric matrix of the same dimension as \code{M} containing weights for each spot. If it is of different dimension to \code{M}, it will be filled out to the same size.}
  \item{...}{other optional arguments to be passed to \code{dupcor.series}.}
}
\value{
  A list with components
  \item{coefficients}{numeric matrix containing the estimated coefficients for each linear model. Same number of rows as \code{M}, same number of columns as \code{design}.}
  \item{stdev.unscaled}{numeric matrix conformal with \code{coef} containing the unscaled standard deviations for the coefficient estimators. The standard errors are given by \code{stdev.unscaled * sigma}.}
  \item{sigma}{numeric vector containing the residual standard deviation for each gene.}
  \item{df.residual}{numeric vector giving the degrees of freedom corresponding to \code{sigma}.}
  \item{correlation}{inter-duplicate correlation.}
}
\details{
Normally \code{dupcor.series} will be called before \code{gls.series} to estimate the inter-duplicate correlation. Then t-statistics will be formed to rank the genes in order of evidence for differential expression.
}
\seealso{
\code{\link{dupcor.series}}.

An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\examples{
M <- matrix(rnorm(10*6),10,6)
cor.out <- dupcor.series(M)
#  See cor.out$cor for estimated common correlation
fit <- gls.series(M,correlation=cor.out$cor)
}
\keyword{models}
\keyword{regression}


\eof
\name{gridr}
\alias{gridr}
\alias{gridc}
\alias{spotr}
\alias{spotc}
\title{Row and Column Positions on Microarray}
\description{
Grid and spot row and column positions.
}
\usage{
gridr(layout)
gridc(layout)
spotr(layout)
spotc(layout)
}
\arguments{
  \item{layout}{list with the components \code{ngrid.r}, \code{ngrid.c}, \code{nspot.r} and \code{nspot.c}}
}
\value{
Vector of length \code{prod(unlist(layout))} giving the grid rows (\code{gridr}), grid columns (\code{gridc}), spot rows (\code{spotr}) or spot columns (\code{spotc}).
}
\author{Gordon Smyth}
\keyword{IO}

\eof
\title{Stemmed Heat Diagram}
\name{heatdiagram}
\alias{heatdiagram}
\alias{heatDiagram}
\description{
Creates a heat diagram showing the co-regulation of genes under one condition with a range of other conditions.
}
\usage{
heatdiagram(stat,coef,primary=1,names=NULL,treatments=colnames(stat),critical.primary=4,critical.other=3,limit=NULL,orientation="landscape",cex=1,low="green",high="red",ncolors=123,...)
heatDiagram(classification,coef,primary=1,names=NULL,treatments=colnames(coef),limit=NULL,orientation="landscape",cex=1,low="green",high="red",ncolors=123,...)
}
\arguments{
  \item{classification}{classification matrix, containing elements -1, 0 or 1, from \code{\link{classifyTests}}}
  \item{stat}{numeric matrix of test statistics. Rows correspond to genes and columns to treatments or contrasts between treatments.}
  \item{coef}{numeric matrix of the same size as \code{stat}. Holds the coefficients to be displayed in the plot.}
  \item{primary}{number or name of the column to be compared to the others. Genes are included in the diagram according to this column of \code{stat} and are sorted according to this column of \code{coef}. If \code{primary} is a name, then \code{stat} and \code{coef} must have the same column names.}
  \item{names}{optional character vector of gene names}
  \item{treatments}{optional character vector of treatment names}
  \item{critical.primary}{critical value above which the test statistics for the primary column are considered significant and included in the plot}
  \item{critical.other}{critical value above which the other test statistics are considered significant. Should usually be no larger than \code{critical.primary} although larger values are permitted.}
  \item{limit}{optional value for \code{coef} above which values will be plotted in extreme color. Defaults to \code{max(abs(coef))}.}
  \item{orientation}{\code{"portrait"} for upright plot or \code{"landscape"} for plot orientated to be wider than high. \code{"portrait"} is likely to be appropriate for inclusion in printed document while \code{"landscape"} may be appropriate for a presentation on a computer screen.}
  \item{low}{color associated with repressed gene regulation}
  \item{cex}{factor to increase or decrease size of column and row text}
  \item{high}{color associated with induced gene regulation}
  \item{ncolors}{number of distinct colors used for each of up and down regulation}
  \item{...}{any other arguments will be passed to the \code{image} function}
}
\details{
This function plots an image of gene expression profiles in which rows (or columns for portrait orientation) correspond to treatment conditions and columns (or rows) correspond to genes.
Only genes which are significantly differentially expressed in the primary condition are included.
Genes are sorted by differential expression under the primary condition.

Note: the plot produced by this function is unique to the limma package.
It should not be confused with "heatmaps" often used to display results from cluster analyses.
}
\value{An image is created on the current graphics device.
A dataframe containing the coefficients used in the plot is also invisibly returned.}
\author{Gordon Smyth}
\seealso{\code{\link[base]{image}}.}
\examples{
library(sma)
data(MouseArray)
MA <- stat.ma(mouse.data,layout=mouse.setup)
design <- cbind(c(1,1,1,0,0,0),c(0,0,0,1,1,1))
fit <- lm.series(MA$M,design=design)
contrasts.mouse <- cbind(c(1,0),c(0,1),c(-1,1))
colnames(contrasts.mouse) <- c("First3","Second3","Difference")
fit <- contrasts.fit(fit,contrasts=contrasts.mouse)
eb <- ebayes(fit)
heatdiagram(abs(eb$t),fit$coef,primary="Difference")
}
\keyword{hplot}

\eof
\name{helpMethods}
\alias{helpMethods}
\title{Prompt for Method Help Topics}
\description{
For any S4 generic function, find all methods defined in currently loaded packages.
Prompt the user to choose one of these to display the help document.
}
\usage{
helpMethods(genericFunction)
}
\arguments{
  \item{genericFunction}{a generic function or a character string giving the name of a generic function}
}
\author{Gordon Smyth}
\seealso{
\code{\link[methods]{showMethods}}
}
\examples{
\dontrun{helpMethods(show)}
}
\keyword{methods}

\eof
\title{Image Plot of Microarray Statistics}
\name{imageplot}
\alias{imageplot}
\description{
Creates an image of shades
of gray or colours, that represents the values of a statistic for each
spot on the array.
The statistic can be a log intensity ratio, quality
information such as spot size or shape, or a t-statistic.
This function can be used to explore whether there are any spatial effects in the data.
}
\usage{
imageplot(z, layout, low = NULL, high = NULL, ncolors = 123, zerocenter = NULL, 
zlim = NULL, mar=rep(1,4), ...)
}
\arguments{
  \item{z}{numeric vector or array. This vector can contain any spot 
statistics, such
as log intensity ratios, spot sizes or shapes, or t-statistics. Missing values 
are allowed and will result in blank spots on the image.}
  \item{layout}{a list specifying the dimensions of the spot matrix
and the grid matrix.}
  \item{low}{color associated with low values of \code{z}. May be specified as a character string 
such as \code{"green"}, \code{"white"} etc, or as a rgb vector in which \code{c(1,0,0)} is red, 
\code{c(0,1,0)} is green and \code{c(0,0,1)} is blue. The default value is \code{"green"} if \code{zerocenter=T} or \code{"white"} if \code{zerocenter=F}.}
  \item{high}{color associated with high values of \code{z}. The default value is \code{"red"} if \code{zerocenter=T} or \code{"blue"} if \code{zerocenter=F}.}
  \item{ncolors}{number of color shades used in the image including low and high.}
  \item{zerocenter}{should zero values of \code{z} correspond to a shade exactly halfway between the colors 
low and high? The default is TRUE if \code{z} takes positive and negative values, 
otherwise FALSE.}
  \item{zlim}{numerical vector of length 2 giving the extreme values of \code{z} to associate with 
colors \code{low} and \code{high}. By default \code{zlim} is the range of \code{z}. Any values of \code{z} outside 
the interval \code{zlim} will be truncated to the relevant limit.}
 \item{mar}{numeric vector of length 4 specifying the width of the margin around the plot.
 This argument is passed to \code{\link[base]{par}}.}
\item{...}{any other arguments will be passed to the function \code{image}}
}
\details{
The image follows the layout of an actual microarray slide with the bottom left corner representing the spot (1,1,1,1). 
This function is very similar to the \code{sma} function \code{plot.spatial} but is intended 
to display spatial patterns and artefacts rather than to highlight extreme 
values. The function differs from \code{plot.spatial} most noticeably in that all the 
spots are plotted and the image is plotted from bottom left rather than from 
top left.
}
\value{An image is created on the current graphics device.}
\author{Gordon Smyth}
\seealso{
\code{\link[marrayPlots]{maImage}}, \code{\link[base]{image}}.

An overview of diagnostic functions available in LIMMA is given in \link{6.Diagnostics}.
}
\examples{
M <- rnorm(8*4*16*16)
imageplot(M,layout=list(ngrid.r=8,ngrid.c=4,nspot.r=16,nspot.c=16))
}
\keyword{hplot}

\eof
\name{is.fullrank}
\alias{is.fullrank}

\title{Check for Full Column Rank}

\description{
Test whether a numeric matrix has full column rank.
}

\usage{
is.fullrank(x)
}

\arguments{
\item{x}{a numeric matrix or vector}
}

\value{\code{TRUE} or \code{FALSE}}

\details{
This function is used to check the integrity of design matrices in limma, for example after \link[limma:subsetting]{subsetting} operations.
}

\author{Gordon Smyth}

\examples{
# TRUE
is.fullrank(1)
is.fullrank(cbind(1,0:1))

# FALSE
is.fullrank(0)
is.fullrank(matrix(1,2,2))
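
#  A base-R sketch of one way to make the same check: compare the rank reported by
#  the QR decomposition with the number of columns. fullrank.demo is an illustrative
#  name, and this is an assumption about the approach, not the package implementation.
fullrank.demo <- function(x) {
	x <- as.matrix(x)
	qr(x)$rank == ncol(x)
}
fullrank.demo(cbind(1,0:1))     # TRUE
fullrank.demo(matrix(1,2,2))    # FALSE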
}
\keyword{algebra}

\eof
\name{isNumeric}
\alias{isNumeric}

\title{Test for Numeric Argument}
\description{
Test whether argument is numeric or a data.frame with numeric columns.
}

\usage{
isNumeric(x)
}

\arguments{
\item{x}{any object}
}

\value{\code{TRUE} or \code{FALSE}}

\details{
This function is used to check the validity of arguments for numeric functions.
It is an attempt to emulate the behavior of internal generic math functions.

\code{isNumeric} differs from \code{is.numeric} in that data.frames with all columns numeric are accepted as numeric.
}

\author{Gordon Smyth}

\examples{
isNumeric(3)
isNumeric("a")
x <- data.frame(a=c(1,1),b=c(0,1))
isNumeric(x)   # TRUE
is.numeric(x)  # FALSE
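
#  A base-R sketch of the rule stated in Details: accept anything that is.numeric
#  accepts, plus data.frames whose columns are all numeric. isNumeric.demo is an
#  illustrative name, not the package function.
isNumeric.demo <- function(x) {
	is.numeric(x) || (is.data.frame(x) && all(sapply(x,is.numeric)))
}
isNumeric.demo(data.frame(a=c(1,1),b=c(0,1)))    # TRUE
isNumeric.demo(data.frame(a=c(1,1),b=c("u","v")))    # FALSE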
}
\seealso{
   \code{\link[base]{is.numeric}}, \code{\link[base]{Math}}
}
\keyword{programming}

\eof
\title{Kooperberg Model-Based Background Correction}
\name{kooperberg}
\alias{kooperberg}
\description{
This function uses a Bayesian model to background correct
data from a series of microarray experiments.
It currently works only with GenePix data.
}

\usage{
kooperberg(names, fg="mean", bg="median", a=FALSE, layout)
}
\arguments{
\item{names}{character vector giving the names of data.frames containing GenePix data}
\item{fg}{character string giving foreground estimator.
Choices are \code{"mean"} or \code{"median"}.}
\item{bg}{character string giving background estimator.
Choices are \code{"mean"} or \code{"median"}.}
\item{a}{logical.  If \code{TRUE}, the 'a' parameters in the model (equations 3 and 4) are estimated for each slide.  If \code{FALSE} the 'a' parameters are set to unity.}
\item{layout}{list containing print layout with components \code{ngrid.r}, \code{ngrid.c}, \code{nspot.r} and \code{nspot.c}}
}

\details{
This function is for use with GenePix data and is designed to cope with the problem of large numbers of negative intensities and hence missing values on the log-intensity scale.
It avoids missing values in most cases and at the same time dampens down the variability of log-ratios for low intensity spots.
See Kooperberg et al (2002) for more details.

\code{kooperberg} serially extracts the foreground and background intensities, standard
deviations and number of pixels from GenePix data frames.
This information is used to compute empirical estimates of the model parameters
as described in equation 2 of Kooperberg et al (2002).

The foreground and background estimates extracted from the GenePix files may be based on means or medians of pixel values.
Setting \code{fg="mean"} uses the GenePix column \code{F635.Mean} for red foreground and
the GenePix column \code{F532.Mean} for green foreground.
Setting \code{fg="median"} uses columns \code{F635.Median} and \code{F532.Median}.
Similarly for the background, \code{bg="mean"} uses columns \code{B635.Mean} and \code{B532.Mean} while \code{bg="median"} uses columns \code{B635.Median} and \code{B532.Median}.
}

\value{
A list containing the components
\item{R}{matrix containing the background adjusted intensities for
the red channel for each spot for each array}
\item{G}{matrix containing the background adjusted intensities for the green channel for each spot for each array}
}

\author{Matthew Ritchie}

\references{
Kooperberg, C., Fazzio, T. G., Delrow, J. J., and Tsukiyama, T. (2002)
Improved background correction for spotted DNA microarrays.
\emph{Journal of Computational Biology} \bold{9}, 55-66.
}
	
\seealso{
\link{4.Normalization} gives an overview of normalization and background correction functions defined in the LIMMA package.
}

\examples{
#  This is example code for reading and background correcting GenePix data
#  given GenePix Results (gpr) files in the working directory (data not
#  provided).
\dontrun{
genepixFiles <- dir(pattern="\\\\.gpr") # get the names of the GenePix image analysis output files in the current directory
read.series(genepixFiles, suffix=NULL, skip=26, sep="\t") # read in GenePix files
layout <- list(ngrid.r=12, ngrid.c=4, nspot.r=26, nspot.c=26) # specify array layout
RGmodel <- kooperberg(genepixFiles, layout=layout) # model-based background correction
MA <- normalizeWithinArrays(RGmodel, layout) # normalize the data
}
}

\keyword{models}

\eof
\name{lm.series}
\alias{lm.series}
\title{Linear Model for Series of Arrays}
\description{Fit linear model for each gene given a series of arrays}
\usage{lm.series(M,design=NULL,ndups=1,spacing=1,weights=NULL)}
\arguments{
  \item{M}{a numeric matrix containing log-ratios (M-values) for each spot on each array. Rows correspond to spots and columns to arrays.}
  \item{design}{a numeric matrix containing the design matrix for linear model. The number of rows should agree with the number of columns of M. The number of columns will determine the number of coefficients estimated for each gene.}
  \item{ndups}{number of duplicate spots. Each gene is printed ndups times in adjacent spots on each array.}
  \item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots}
  \item{weights}{an optional numeric matrix of the same dimension as \code{M} containing weights for each spot. If it is of different dimension to \code{M}, it will be filled out to the same size.}
}
\value{
  A list with components
  \item{coefficients}{numeric matrix containing the estimated coefficients for each linear model. Same number of rows as \code{M}, same number of columns as \code{design}.}
  \item{stdev.unscaled}{numeric matrix conformal with \code{coef} containing the unscaled standard deviations for the coefficient estimators. The standard errors are given by \code{stdev.unscaled * sigma}.}
  \item{sigma}{numeric vector containing the residual standard deviation for each gene.}
  \item{df.residual}{numeric vector giving the degrees of freedom corresponding to \code{sigma}.}
}
\details{
The linear model is fit for each gene by calling the function \code{lm.fit} or \code{lm.wfit} from the base library.
}
\author{Gordon Smyth}
\seealso{
\code{\link[base:lmfit]{lm.fit}}.

An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\examples{
#  Simulate gene expression data,
#  6 microarrays and 100 genes with one gene differentially expressed in first 3 arrays
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,1:3] <- M[1,1:3] + 2
#  Design matrix includes two treatments, one for first 3 and one for last 3 arrays
design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
fit <- lm.series(M,design=design)
eb <- ebayes(fit)
#  Large values of eb$t indicate differential expression
qqt(eb$t[,1],df=fit$df.residual+eb$df.prior)
abline(0,1)
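#  A base-R sketch of the per-gene fitting described in Details (illustrative only).
#  It calls lm.fit once per row and ignores duplicate spots and weights; M.demo,
#  design.demo, fit.demo, coef.demo and sigma.demo are made-up names, not package objects.
M.demo <- matrix(rnorm(5*6,sd=0.3),5,6)
design.demo <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
fit.demo <- lapply(1:nrow(M.demo), function(i) lm.fit(design.demo,M.demo[i,]))
#  One row of coefficients per gene, one column per column of the design matrix
coef.demo <- t(sapply(fit.demo, function(f) f$coefficients))
#  Residual standard deviation for each gene
sigma.demo <- sapply(fit.demo, function(f) sqrt(sum(f$residuals^2)/f$df.residual))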
}
\keyword{models}
\keyword{regression}

\eof
\name{lmFit}
\alias{lmFit}
\title{Linear Model for Series of Arrays}
\description{Fit linear model for each gene given a series of arrays}
\usage{
lmFit(object,design=NULL,ndups=1,spacing=1,correlation=0.75,weights=NULL,method="ls",...) 
}
\arguments{
  \item{object}{object of class \code{numeric}, \code{matrix}, \code{MAList}, \code{marrayNorm} or \code{exprSet} containing log-ratios or log-values of expression for a series of microarrays}
  \item{design}{the design matrix of the microarray experiment, with rows corresponding to arrays and columns to coefficients to be estimated.  Defaults to the unit vector meaning that the arrays are treated as replicates.} 
  \item{ndups}{a positive integer giving the number of times each gene is printed on an array}
  \item{spacing}{the spacing between duplicate spots, \code{spacing=1} for consecutive spots}
  \item{correlation}{the inter-duplicate correlation}
  \item{weights}{an optional numeric matrix containing weights for each spot}
  \item{method}{character string, \code{"ls"} for least squares or \code{"robust"} for robust regression}
  \item{...}{other optional arguments to be passed to \code{lm.series}, \code{gls.series} or \code{rlm.series}}
}

\value{
Object of class \code{\link[limma:marraylm]{MArrayLM}}
}

\details{
A linear model is fitted for each gene by calling one of \code{lm.series}, \code{gls.series} or \code{rlm.series}.
Note that the arguments \code{design} and \code{ndups} will be extracted from the data \code{object} if available and do not normally need to be set explicitly in the call.
If arguments are set in the call then they will over-ride slots or components in the data \code{object}.
}

\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
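
\examples{
#  Illustrative sketch only, reusing the simulated data from the lm.series
#  example: 6 arrays and 100 genes, the data and design are assumptions
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,1:3] <- M[1,1:3] + 2
design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
#  With the default method="ls" and ndups=1 this should give an MArrayLM object
fit <- lmFit(M,design=design)
#  The fitted object can then be passed on to ebayes
eb <- ebayes(fit)
}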

\author{Gordon Smyth}
\keyword{models}
\keyword{regression}

\eof
\name{loessFit}
\alias{loessFit}
\title{Fast Simple Loess}
\description{
A fast version of locally weighted regression when there is only one x-variable and only the fitted values and residuals are required.
}
\usage{
loessFit(y, x, weights=NULL, span=0.3, bin=0.01/(2-is.null(weights)), iterations=4)
}
\arguments{
  \item{y}{numeric vector of response values.  Missing values are allowed.}
  \item{x}{numeric vector of predictor values.  Missing values are allowed.}
  \item{weights}{numeric vector of non-negative weights.  Missing values are allowed.}
  \item{span}{numeric parameter between 0 and 1 specifying proportion of data to be used in the local regression moving window.
  Larger numbers give smoother fits.}
  \item{bin}{numeric value between 0 and 1 giving the proportion of the data which can be grouped in a single bin when doing local regression fit.
  \code{bin=0} forces an exact local regression fit with no interpolation.}
  \item{iterations}{number of iterations of loess fit}
}

\details{
This function is a low-level equivalent to \code{lowess} in the base library if \code{weights} is null and to \code{loess} in the modreg package otherwise.
It is used by \code{\link{normalizeWithinArrays}}.
The parameters \code{span} and \code{iterations} have the same meaning as in \code{loess}.
\code{span} is equivalent to the argument \code{f} to \code{lowess} and \code{iterations} is equivalent to \code{iter+1}.
Unlike \code{lowess} this function returns values in original rather than sorted order.

The parameter \code{bin} is equivalent to \code{delta=bin*diff(range(x))} in a call to \code{lowess} when \code{weights=NULL} or to \code{cell=bin/span} in a call to \code{loess} when \code{weights} are given.

The treatment of missing values is analogous to \code{na.exclude}.
}
\value{
A list with components
\item{fitted}{numeric vector of same length as \code{y} giving the loess fit}
\item{residuals}{numeric vector of same length as \code{y} giving residuals from the fit}
}

\author{Gordon Smyth}

\seealso{
An overview of LIMMA functions for normalization is given in \link{4.Normalization}.

See also \code{\link[base]{lowess}} in the base library and \code{\link[modreg]{loess}} in the modreg package.
}

\examples{
y <- rnorm(1000)
x <- rnorm(1000)
w <- rep(1,1000)
# The following are equivalent apart from execution time
system.time(fit <- loessFit(y,x)$fitted)
system.time(fit <- loessFit(y,x,w)$fitted)
system.time(fit <- fitted(loess(y~x,weights=w,span=0.3,family="symmetric",iterations=4)))
# Similar but with sorted x-values
system.time(fit <- lowess(x,y,f=0.3)$y)
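# Base R sketch (illustrative, not part of loessFit): lowess returns its fit in
# order of increasing x, so the fitted values are put back in the original
# order here.  iter=3 and delta follow the equivalences stated in Details.
o <- order(x)
lo <- lowess(x,y,f=0.3,iter=3,delta=0.01*diff(range(x)))
fit.lowess <- numeric(length(x))
fit.lowess[o] <- lo$y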
}

\keyword{models}

\eof
\name{m.spot}
\alias{m.spot}
\alias{a.spot}
\title{Extract M or A-values from SPOT data.frame or matrix}
\description{
Extract M-values or A-values from a SPOT data.frame or matrix.
}
\usage{
m.spot(spot)
a.spot(spot)
}
\arguments{
  \item{spot}{data.frame or matrix giving SPOT output for one microarray}
}
\value{
Vector of M-values (\code{m.spot}) or A-values (\code{a.spot})
}
\author{Gordon Smyth}
\keyword{IO}

\eof
\name{makeContrasts}
\alias{makeContrasts}
\title{Construct Matrix of Custom Contrasts}
\description{
Construct the contrasts matrix corresponding to specified contrasts of a set of given parameters.
}
\usage{
makeContrasts(\dots, levels) 
}
\arguments{
  \item{\dots}{expressions, or character strings which can be parsed to expressions, specifying contrasts}
  \item{levels}{character vector giving the names of the parameters to be contrasted, or a factor or design matrix from which the names can be extracted.}
}

\value{
Matrix whose columns correspond to contrasts.
}

\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}

\author{Gordon Smyth}

\examples{
makeContrasts(B-A,C-B,C-A,levels=c("A","B","C"))
makeContrasts("A","B","B-A",levels=c("A","B"))
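#  Hand-built matrix for comparison with the first call above (a sketch,
#  assuming rows correspond to the levels A, B, C and columns to the contrasts)
cont <- cbind(c(-1,1,0),c(0,-1,1),c(-1,0,1))
dimnames(cont) <- list(c("A","B","C"),c("B-A","C-B","C-A"))
cont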
}

\keyword{regression}

\eof
\name{makeUnique}
\alias{makeUnique}
\title{Make Values of Character Vector Unique}
\description{
Paste characters on to values of a character vector to make them unique.
}
\usage{
makeUnique(x)
}
\arguments{
  \item{x}{object to be coerced to a character vector}
}
\details{
Repeated values of \code{x} are labelled with suffixes "1", "2" etc.
}
\value{
A character vector of the same length as \code{x}
}
\author{Gordon Smyth}
\seealso{
\code{makeUnique} is called by \code{\link{merge.RGList}}.
}
\examples{
x <- c("a","a","b")
makeUnique(x)
}
\keyword{character}

\eof
\name{MAList-class}
\docType{class}
\alias{MAList-class}
\title{M-value, A-value Expression List - class}

\description{
A simple list-based class for storing M-values and A-values for a batch of spotted microarrays.
\code{MAList} objects are usually created during normalization by the functions \code{\link{normalizeWithinArrays}} or \code{\link{MA.RG}}.
}

\section{Slots/List Components}{
\code{MAList} objects can be created by \code{new("MAList",MA)} where \code{MA} is a list.
This class contains no slots (other than \code{.Data}), but objects should contain the following list components:
\tabular{ll}{
  \code{M}:\tab numeric matrix containing the M-values or log-2 expression ratios.  Rows correspond to spots and columns to arrays.\cr
  \code{A}:\tab numeric matrix containing the A-values or average log-2 expression values
}
\tabular{ll}{
  \code{weights}:\tab numeric matrix containing relative spot quality weights.  Should be non-negative.\cr
  \code{printer}:\tab list containing information on the process used to print the spots on the arrays.  See \link[limma:PrintLayout]{PrintLayout}.\cr
  \code{genes}:\tab data.frame containing information on the genes spotted on the arrays.  Should include a character column \code{Name} containing names for the genes or controls.\cr
  \code{targets}:\tab data.frame containing information on the target RNA samples.  Should include factor or character columns \code{Cy3} and \code{Cy5} specifying which RNA was hybridized to each array.
}
All of the matrices should have the same dimensions.
The row dimension of \code{targets} should match the column dimension of the matrices.
}

\section{Methods}{
This class inherits directly from class \code{list} so any operation appropriate for lists will work on objects of this class.
In addition, \code{MAList} objects can be \link[limma:subsetting]{subsetted} and \link[limma:cbind]{combined}.
\code{MAList} objects will return dimensions and hence functions such as \code{\link[limma:dim]{dim}}, \code{\link[base:nrow]{nrow}} and \code{\link[base:nrow]{ncol}} are defined.
\code{MALists} also inherit a \code{\link[methods]{show}} method from the virtual class \code{\link[limma:LargeDataObject]{LargeDataObject}}, which means that \code{MALists} will print in a compact way.

Other functions in LIMMA which operate on \code{MAList} objects include
\code{\link{normalizeWithinArrays}},
\code{\link{normalizeBetweenArrays}},
\code{\link{normalizeForPrintorder}},
\code{\link{plotMA}}
and \code{\link{plotPrintTipLoess}}.
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
  
  \code{\link[marrayClasses]{marrayNorm-class}} is the corresponding class in the marrayClasses package.
}
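
\examples{
#  Illustrative sketch with simulated values: the matrices are assumptions,
#  5 spots by 4 arrays, and only the components M and A are supplied
M <- matrix(rnorm(20),5,4)
A <- matrix(rnorm(20,mean=10),5,4)
MA <- new("MAList",list(M=M,A=A))
#  Dimensions and compact printing, as described under Methods
dim(MA)
MA
}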

\keyword{classes}
\keyword{data}

\eof
\name{MArrayLM-class}
\docType{class}
\alias{MArrayLM-class}
\title{Microarray Linear Model Fit - class}

\description{
A list-based class for storing the results of fitting gene-wise linear models to a batch of microarrays.
Objects are normally created by \code{\link{lmFit}}.
}

\section{Slots/Components}{
\code{MArrayLM} objects do not contain any slots (apart from \code{.Data}) but they should contain the following list components:
  \describe{
    \item{\code{coefficients}:}{\code{matrix} containing fitted coefficients or contrasts}
    \item{\code{stdev.unscaled}:}{\code{matrix} containing unscaled standard deviations of the coefficients or contrasts}
    \item{\code{sigma}:}{\code{numeric} vector containing residual standard deviations for each gene}
    \item{\code{df.residual}:}{\code{numeric} vector containing residual degrees of freedom for each gene}
  }
  Objects may also contain the following optional components:
  \describe{
    \item{\code{genes}:}{\code{data.frame} containing gene names and annotation}
    \item{\code{design}:}{design \code{matrix} of full column rank}
    \item{\code{contrasts}:}{\code{matrix} defining contrasts of coefficients for which results are desired}
    \item{\code{s2.prior}:}{\code{numeric} value giving empirical Bayes estimated prior value for residual variances}
    \item{\code{df.prior}:}{\code{numeric} vector giving empirical Bayes estimated degrees of freedom associated with \code{s2.prior} for each gene}
    \item{\code{s2.post}:}{\code{numeric} vector giving posterior residual variances}
    \item{\code{t}:}{\code{matrix} containing empirical Bayes t-statistics}
    \item{\code{var.prior}:}{\code{numeric} vector giving empirical Bayes estimated variance for each true coefficient}
  }
}

\section{Methods}{
\code{MArrayLM} objects will return dimensions and hence functions such as \code{\link[limma:dim]{dim}}, \code{\link[base:nrow]{nrow}} and \code{\link[base:nrow]{ncol}} are defined.
\code{MArrayLM} objects inherit a \code{show} method from the virtual class \code{LargeDataObject}.

The functions \code{\link{ebayes}} and \code{\link{classifyTests}} accept \code{MArrayLM} objects as arguments.
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
}

\keyword{classes}
\keyword{regression}

\eof
\name{matvec}
\alias{matvec}
\alias{vecmat}

\title{Multiply a Matrix by a Vector}
\description{Multiply the rows or columns of a matrix by the elements of a vector.}

\usage{
matvec(M, v)
vecmat(v, M)
}

\arguments{
\item{M}{numeric matrix, or object which can be coerced to a matrix.}
\item{v}{numeric vector, or object which can be coerced to a vector. Length should match the number of columns of \code{M} (for \code{matvec}) or the number of rows of \code{M} (for \code{vecmat})}
}

\value{A matrix of the same dimensions as \code{M}.}

\details{
\code{matvec(M,v)} is equivalent to \code{M \%*\% diag(v)} but is faster to execute.
Similarly \code{vecmat(v,M)} is equivalent to \code{diag(v) \%*\% M} but is faster to execute.
}

\examples{
A <- matrix(1:12,3,4)
A
matvec(A,c(1,2,3,4))
vecmat(c(1,2,3),A)
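#  Base R check of the equivalences stated in Details (sketch only, it does
#  not call matvec or vecmat): scaling columns by v and rows by w
v <- c(1,2,3,4)
all.equal(A \%*\% diag(v), t(t(A)*v))
w <- c(1,2,3)
all.equal(diag(w) \%*\% A, A*w)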
}

\author{Gordon Smyth}

\keyword{array}
\keyword{algebra}

\eof
\name{merge.RGList}
\alias{merge.RGList}
\title{Merge RGList Data Objects}
\description{
Merge two microarray data sets represented by RGLists.
}
\usage{
\method{merge}{RGList}(x,y,\dots)
}
\arguments{
  \item{x}{\code{\link{RGList-class}} object with list components \code{R}, \code{G}, \code{Rb} and \code{Gb} containing the foreground and background intensities for each spot on each array.}
  \item{y}{\code{RGList} object, corresponding to the same genes as for \code{x}, possibly in a different order, but with different arrays.}
  \item{\dots}{other arguments are accepted but not used at present}
}
\details{
An \code{RGList} is a list object containing numeric matrices all of the same dimensions.
The RGLists are merged by merging each of the components by row names.
Unlike when using \code{\link{cbind}}, row names are not required to be in the same order or to be unique.
In the case of repeated row names, the order of the rows with repeated names is preserved.
This means that the first occurrence of each name in \code{x$R} is matched with the first occurrence of the same name in \code{y$R}, the second with the second, and so on.
The final vector of row names is the same as in \code{x}.
}
\value{
An \code{RGList} with the same components as \code{x}.
Component matrices have the same row names as in \code{x} but columns from \code{y} as well as \code{x}.
}
\author{Gordon Smyth}
\seealso{
R base provides a \code{\link[base]{merge}} method for merging data.frames.

An overview of limma commands for reading, subsetting and merging data is given in \link{3.ReadingData}.
}
\examples{
R <- G <- matrix(11:14,4,2)
rownames(R) <- rownames(G) <- c("a","a","b","c")
RG1 <- new("RGList",list(R=R,G=G))

R <- G <- matrix(21:24,4,2)
rownames(R) <- rownames(G) <- c("b","a","a","c")
RG2 <- new("RGList",list(R=R,G=G))

merge(RG1,RG2)
merge(RG2,RG1)
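
#  Base R sketch of the matching rule described in Details (illustration only:
#  make.unique is used here as a stand-in and is not the limma implementation)
nx <- c("a","a","b","c")
ny <- c("b","a","a","c")
i <- match(make.unique(nx),make.unique(ny))
i
#  Taking the rows of y in the order i lines them up with the rows of x,
#  the first "a" with the first "a" and the second with the second
ny[i]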
}
\keyword{manip}

\eof
\name{normalizeRobustSpline}
\alias{normalizeRobustSpline}
\title{Normalize Single Microarray Using Shrunk Robust Splines}
\description{
Normalize the M-values for a single microarray using robustly fitted regression splines and empirical Bayes shrinkage.
}
\usage{
normalizeRobustSpline(M,A,layout,df=5,method="M")
}
\arguments{
  \item{M}{numeric vector of M-values}
  \item{A}{numeric vector of A-values}
  \item{layout}{list specifying the dimensions of the spot matrix and the grid matrix}
  \item{df}{degrees of freedom for regression spline, i.e., the number of regression coefficients and the number of knots}
  \item{method}{choices are \code{"M"} for M-estimation or \code{"MM"} for high breakdown point regression}
}
\details{
This function implements an idea similar to print-tip loess normalization but uses regression splines in place of the loess curves and uses empirical Bayes ideas to shrink the individual print-tip curves towards a common value.
This allows the technique to introduce less noise into good quality arrays with little spatial variation while still giving good results on arrays with strong spatial variation.
}
\value{
Numeric vector containing normalized M-values.
}
\author{Gordon Smyth}
\references{
The function is based on unpublished work by the author.
}
\seealso{
  An overview of LIMMA functions for normalization is given in \link{4.Normalization}.
}
\examples{
library(sma)
data(MouseArray)
M <- m.spot(mouse1)
A <- a.spot(mouse1)
M <- normalizeRobustSpline(M,A,mouse.setup)
}
\keyword{models}

\eof
\name{normalizeWithinArrays}
\alias{normalizeWithinArrays}
\alias{MA.RG}
\alias{RG.MA}
\title{Normalize Within Arrays}
\description{
Normalize the expression log-ratios for one or more two-colour spotted microarray experiments so that the log-ratios average to zero within each array or sub-array.
}
\usage{
normalizeWithinArrays(object, layout, method="printtiploess", weights=object$weights, span=0.3, iterations=4, controlspots=NULL, df=5, robust="M")
MA.RG(object, log.transform=TRUE)
RG.MA(object)
}
\arguments{
  \item{object}{object of class \code{list}, \code{RGList} or \code{MAList} containing two-color microarray data}
  \item{layout}{list specifying the dimensions of the spot matrix and the grid matrix}
  \item{method}{character string specifying the normalization method.
  Choices are \code{"none"}, \code{"median"}, \code{"loess"}, \code{"printtiploess"}, \code{"composite"} and \code{"robustspline"}.
  A partial string sufficient to uniquely identify the choice is permitted.}
  \item{weights}{numeric matrix or vector of the same size and shape as the components of \code{object}. Will use by default weights found in \code{object} if they exist.}
  \item{span}{numeric scalar giving the smoothing parameter for the \code{loess} fit}
  \item{iterations}{number of iterations used in loess fitting.  More iterations give a more robust fit.}
  \item{controlspots}{numeric or logical vector specifying the subset of spots which are non-differentially expressed control spots, for use with \code{method="composite"}}
  \item{df}{degrees of freedom for spline if \code{method="robustspline"}}
  \item{robust}{robust regression method if \code{method="robustspline"}.  Choices are \code{"M"} or \code{"MM"}.}
  \item{log.transform}{logical indicating whether intensities should be log2 transformed.
  If \code{FALSE} then intensities are assumed to be already logged.}
}

\details{
Normalization is intended to remove from the expression measures any systematic trends which arise from the microarray technology rather than from differences between the probes or between the target RNA samples hybridized to the arrays.

This function normalizes M-values (log-ratios) for dye-bias within each array.
Apart from \code{method="none"} and \code{method="median"}, all the normalization methods make use of the relationship between dye-bias and intensity.
The loess normalization methods were proposed by Yang et al (2001, 2002).
Smyth and Speed (2003) give a detailed statement of the methods.

More information on the loess control parameters \code{span} and \code{iterations} can be found under \code{\link{loessFit}}.
The default values given here are equivalent to those for the older function \code{stat.ma} in the SMA package.

The \code{"robustspline"} method calls \code{\link{normalizeRobustSpline}}.

\code{MA.RG} converts an unlogged \code{RGList} object into an \code{MAList} object.
\code{MA.RG(object)} is equivalent to \code{normalizeWithinArrays(object,method="none")}.

\code{RG.MA(object)} converts back from an \code{MAList} object to a \code{RGList} object with intensities on the log2 scale.
}
\value{
An object of class \code{\link[limma:malist]{MAList}}.
}

\author{Gordon Smyth}

\references{
Yang, Y. H., Dudoit, S., Luu, P., and Speed, T. P. (2001). Normalization for cDNA microarray data. In \emph{Microarrays: Optical Technologies and Informatics}, M. L. Bittner, Y. Chen, A. N. Dorsel, and E. R. Dougherty (eds), Proceedings of SPIE, Vol. 4266, pp. 141-152. 

Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. \emph{Nucleic Acids Research} \bold{30}(4):e15.

Smyth, G. K., and Speed, T. P. (2003). Normalization of cDNA microarray data. In: \emph{METHODS: Selecting Candidate Genes from DNA Array Screens: Application to Neuroscience}, D. Carter (ed.). To appear.
}

\seealso{
An overview of LIMMA functions for normalization is given in \link{4.Normalization}.

See also \code{\link{normalizeBetweenArrays}} and \code{\link[marrayNorm]{maNorm}} in the marrayNorm package.
}

\examples{
#  See normalizeBetweenArrays
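
#  Base R sketch of the M and A definitions behind MA.RG and RG.MA, using
#  hypothetical unlogged intensities and ignoring any background correction
R <- c(1000,250,4000)
G <- c(500,250,1000)
M <- log2(R)-log2(G)
A <- (log2(R)+log2(G))/2
#  Converting back gives the intensities on the log2 scale
logR <- A+M/2
logG <- A-M/2
all.equal(logR,log2(R))
all.equal(logG,log2(G))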
}

\keyword{models}

\eof
\name{normalizeBetweenArrays}
\alias{normalizeBetweenArrays}
\title{Normalize Between Arrays}

\description{
Normalizes expression intensities so that the intensities or log-ratios have similar distributions across a series of arrays.
}

\usage{
normalizeBetweenArrays(object, method, ties=FALSE)
}

\arguments{
  \item{object}{a \code{matrix} or \code{\link[limma:malist]{MAList}} object containing expression ratios for a series of arrays}
  \item{method}{character string specifying the normalization method to be used.
  Choices are \code{"none"}, \code{"scale"}, \code{"quantile"} or \code{"Aquantile"}.
  A partial string sufficient to uniquely identify the choice is permitted.}
  \item{ties}{logical, if \code{TRUE} then ties are treated in a careful way when \code{method="quantile"} or \code{method="Aquantile"}}
}

\details{
\code{normalizeWithinArrays} normalizes expression values to make intensities consistent within each array.
\code{normalizeBetweenArrays} normalizes expression values to achieve consistency between arrays.

The scale normalization method was proposed by Yang et al (2001, 2002) and is further explained by Smyth and Speed (2003).
The idea is simply to scale the log-ratios to have the same median-absolute-deviation (MAD) across arrays.
This idea has also been implemented by the \code{maNormScale} function in the marrayNorm package.
The implementation here is slightly different in that the MAD scale estimator is replaced with the median-absolute-value and the A-values are normalized as well as the M-values.

Quantile normalization was proposed by Bolstad et al (2003) for Affymetrix-style single-channel arrays and by Yang and Thorne (2003) for two-color cDNA arrays.
\code{method="quantile"} ensures that the intensities have the same empirical distribution across arrays and across channels.
\code{method="Aquantile"} ensures that the A-values (average intensities) have the same empirical distribution across arrays leaving the M-values (log-ratios) unchanged.
These two methods are called "q" and "Aq" respectively in Yang and Thorne (2003).

If \code{object} is a \code{matrix} then the scale or quantile normalization will be applied to the columns.
Applying \code{method="Aquantile"} when \code{object} is a \code{matrix} will produce an error.
}

\value{
If \code{object} is a matrix then \code{normalizeBetweenArrays} produces a matrix of the same size.
Otherwise, \code{normalizeBetweenArrays} produces an \code{\link[limma:malist]{MAList}} object.
}

\author{Gordon Smyth}

\references{
Bolstad, B. M., Irizarry R. A., Astrand, M., and Speed, T. P. (2003), A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. \emph{Bioinformatics} \bold{19}, 185-193.

Smyth, G. K., and Speed, T. P. (2003). Normalization of cDNA microarray data. In: \emph{METHODS: Selecting Candidate Genes from DNA Array Screens: Application to Neuroscience}, D. Carter (ed.). To appear.

Yang, Y. H., Dudoit, S., Luu, P., and Speed, T. P. (2001). Normalization for cDNA microarray data. In \emph{Microarrays: Optical Technologies and Informatics}, M. L. Bittner, Y. Chen, A. N. Dorsel, and E. R. Dougherty (eds), Proceedings of SPIE, Volume 4266, pp. 141-152. 

Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. \emph{Nucleic Acids Research} \bold{30}(4):e15.

Yang, Y. H., and Thorne, N. P. (2003). Normalization for two-color cDNA microarray data.
In: D. R. Goldstein (ed.), \emph{Science and Statistics: A Festschrift for Terry Speed}, IMS Lecture Notes - Monograph Series, Volume 40, pp. 403-418.
}

\seealso{
  An overview of LIMMA functions for normalization is given in \link{4.Normalization}.

  See also \code{\link[marrayNorm]{maNormScale}} in the marrayNorm package and \code{\link[affy:normalize-methods]{normalize}} in the affy package.
}

\examples{
library(sma)
data(MouseArray)
MA <- normalizeWithinArrays(mouse.data, mouse.setup)
plot.scale.box(MA$M)

#  Between array scale normalization as in Yang et al (2001):
MA <- normalizeBetweenArrays(MA,method="scale")
print(MA)
show(MA)
plot.scale.box(MA$M)

#  One can get the same results using the matrix method:
M <- normalizeBetweenArrays(MA$M,method="scale")
plot.scale.box(M)

#  MpAq normalization as in Yang and Thorne (2003):
MpAq <- normalizeWithinArrays(mouse.data, mouse.setup)
MpAq <- normalizeBetweenArrays(MpAq, method="Aq")
plotDensities(MpAq)
}

\keyword{models}
\keyword{multivariate}

\eof
\name{normalizeForPrintorder}
\alias{normalizeForPrintorder}
\alias{normalizeForPrintorder.rg}
\alias{plotPrintorder}
\title{Print-Order Normalization}

\description{
Normalize intensity values on one or more spotted microarrays to adjust for print-order effects.
}
\usage{
normalizeForPrintorder(object, layout, start="topleft", method = "loess", separate.channels = FALSE, span = 0.1, plate.size = 32)
normalizeForPrintorder.rg(R, G, printorder, method = "loess", separate.channels = FALSE, span = 0.1, plate.size = 32, plot = FALSE)
plotPrintorder(object, layout, start="topleft", slide = 1, method = "loess", separate.channels = FALSE, span = 0.1, plate.size = 32)
}
\arguments{
  \item{object}{an \code{RGList} or \code{list} object containing components \code{R} and \code{G} which are matrices containing the red and green channel intensities for a series of arrays}
  \item{R}{numeric vector containing red channel intensities for a single microarray}
  \item{G}{numeric vector containing the green channel intensities for a single microarray}
  \item{layout}{list specifying the printer layout}
  \item{start}{character string specifying where printing starts in each pin group.  Choices are \code{"topleft"} or \code{"topright"}.}
  \item{printorder}{numeric vector specifying order in which spots are printed.
  Can be computed from \code{printorder(layout,start=start)}.}
  \item{slide}{positive integer giving the column number of the array for which a plot is required}
  \item{method}{character string, \code{"loess"} if a smooth loess curve should be fitted through the print-order trend or \code{"plate"} if plate effects are to be estimated}
  \item{separate.channels}{logical, \code{TRUE} if normalization should be done separately for the red and green channel and \code{FALSE} if the normalization should be proportional for the two channels}
  \item{span}{numerical constant between 0 and 1 giving the smoothing span for the loess curve.  Ignored if \code{method="plate"}.}
  \item{plate.size}{positive integer giving the number of consecutive spots corresponding to one plate or plate pack.  Ignored if \code{method="loess"}.}
  \item{plot}{logical. If \code{TRUE} then a scatter plot of the print order effect is sent to the current graphics device.}
}
\details{
Print-order is associated with the 384-well plates used in the printing of spotted microarrays.
There may be variations in DNA concentration or quality between the different plates.
There may also be variations in ambient conditions during the time the array is printed.

This function is intended to pre-process the intensities before other normalization methods are applied to adjust for variations in DNA quality or concentration and other print-order effects.

Printorder means the order in which spots are printed on a microarray.
Spotted arrays are printed using a print head with an array of print-tips.
Spots in the various tip-groups are printed in parallel.
Printing is assumed to start in the top right hand corner of each tip-group and to proceed left and down by rows.
(WARNING: this is not always the case.)
This is true for microarrays printed at the Australian Genome Research Facility but might not be true for arrays from other sources.

If \code{object} is an \code{RGList} then print-order normalization is performed for each intensity in each array.

\code{plotPrintorder} is a non-generic function which calls \code{normalizeForPrintorder.rg} with \code{plot=TRUE}.
}
\value{
\code{normalizeForPrintorder} produces an \code{RGList} containing normalized intensities.
The function \code{plotPrintorder} or \code{normalizeForPrintorder.rg} with \code{plot=TRUE} returns no value but produces a plot as a side-effect.
\code{normalizeForPrintorder.rg} with \code{plot=FALSE} returns a list with the following components: 
  \item{R}{numeric vector containing the normalized red channel intensities}
  \item{G}{numeric vector containing the normalized green channel intensities}
  \item{R.trend}{numeric vector containing the fitted printorder trend for the red channel}
  \item{G.trend}{numeric vector containing the fitted printorder trend for the green channel}
}
\references{
Smyth, G. K. Print-order normalization of cDNA microarrays. March 2002.  \url{http://www.statsci.org/smyth/pubs/porder/porder.html}
}
\author{Gordon Smyth}
\seealso{
\code{\link{printorder}}.

An overview of LIMMA functions for normalization is given in \link{4.Normalization}.
}
\examples{
library(sma)
data(MouseArray)
plotPrintorder(mouse.data,mouse.setup,slide=1,separate.channels=TRUE)
RG <- normalizeForPrintorder(mouse.data,mouse.setup)
}
\keyword{models}

\eof
\name{normalizeQuantiles}
\alias{normalizeQuantiles}
\title{Normalize Columns of a Matrix to have the same Quantiles}
\description{
Normalize the columns of a matrix to have the same quantiles, allowing for missing values.
Users do not normally need to call this function directly - use \code{\link{normalizeBetweenArrays}} instead.
}
\usage{
normalizeQuantiles(A, ties=FALSE)
}
\arguments{
  \item{A}{numeric matrix. Missing values are allowed.}
  \item{ties}{logical. If \code{TRUE}, ties in each column of \code{A} are treated in a careful way: tied values will be normalized to the mean of the corresponding pooled quantiles.}
}
\details{
This function is intended to normalize single channel or A-value microarray intensities between arrays.
Each quantile of each column is set to the mean of that quantile across arrays.
The intention is to make all the normalized columns have the same empirical distribution.
This will be exactly true if there are no missing values and no ties within the columns: the normalized columns are then simply permutations of one another.

If there are ties amongst the intensities for a particular array, then with \code{ties=FALSE} the ties are broken in an unpredictable order.
If \code{ties=TRUE}, all the tied values for that array will be normalized to the same value, the average of the quantiles for the tied values.
}
\value{
A matrix of the same dimensions as \code{A} containing the normalized values.
}
\references{
Bolstad, B. M., Irizarry R. A., Astrand, M., and Speed, T. P. (2003), A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. \emph{Bioinformatics} \bold{19}, 185-193.
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for normalization is given in \link{4.Normalization}.
}  
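\examples{
#  Base R sketch of the idea described in Details, for a small hypothetical
#  matrix with no missing values and no ties within columns.  This is an
#  illustration only, not the code used by normalizeQuantiles.
A <- cbind(Array1=c(5,2,3,4),Array2=c(4,1,4.5,2),Array3=c(3,4,6,8))
S <- apply(A,2,sort)
#  Mean of each quantile across arrays
m <- rowMeans(S)
#  Replace each value by the mean quantile of the same rank
N <- apply(A,2,function(a) m[rank(a)])
N
#  In this no-ties case the result should agree with
normalizeQuantiles(A)
}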
\keyword{models}

\eof
\name{normalizeScale}
\alias{normalizeMedianDeviations}
\alias{normalizeMedians}
\title{Normalize Columns of a Matrix to have the Same Scale}
\description{
Performs scale normalization of an M-value matrix or an A-value matrix across a series of arrays.
Users do not normally need to call these functions directly - use \code{normalizeBetweenArrays} instead.
}
\usage{
normalizeMedianDeviations(x)
normalizeMedians(x)
}
\arguments{
  \item{x}{numeric matrix}
}
\value{

\code{normalizeMedianDeviations} produces a numeric matrix of the same size as that input which has been scaled so that each column has the same median-absolute value.

\code{normalizeMedians} produces a numeric matrix which has been scaled so that each column has the same median-value.

}

\details{
If \code{x} is a matrix of log-ratios of expression (M-values) then \code{normalizeMedianDeviations} is very similar to scaling to equalize the median absolute deviation (MAD) as in Yang et al (2001, 2002).
Here the median-absolute value is used in preference so as not to re-center the M-values.

\code{normalizeMedians} is used for A-values of overall expression.
}
\author{Gordon Smyth}
\seealso{
  An overview of LIMMA functions for normalization is given in \link{4.Normalization}.
}
\examples{
M <- cbind(Array1=rnorm(10),Array2=2*rnorm(10))
normalizeMedianDeviations(M)

A <- cbind(Array1=rlnorm(10),Array2=2*rlnorm(10))
normalizeMedians(A)
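
#  Base R sketch of scaling columns by their median absolute values
#  (illustration only: the common scale chosen by normalizeMedianDeviations
#  may differ from the value 1 used here)
mav <- apply(abs(M),2,median)
M.scaled <- t(t(M)/mav)
#  Each column of M.scaled now has the same median absolute value
apply(abs(M.scaled),2,median)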
}
\keyword{array}

\eof
\title{Single-channel Densities Plot}
\name{plotDensities}
\alias{plotDensities}
\description{
Creates a plot of the densities of single-channels from two-color cDNA
microarray data.
}
\usage{
plotDensities(object, log.transform=FALSE, arrays=NULL, singlechannels=NULL,
              groups=NULL, col=NULL) 
}
\arguments{
  \item{object}{must be either a list with components \code{M}
    containing log-ratios and \code{A} containing average intensities or
    a list with components \code{R} containing log2 red intensities
    and \code{G} containing log2 green intensities.  If object is
    given as an \code{MAList} it is converted to an \code{RGList}.}

  \item{log.transform}{logical which needs to be \code{TRUE} if object
    supplied is an \code{RGList} of unlogged intensities.}
  
  \item{arrays}{vector of integers giving the arrays from which the
    single-channels will be selected to be plotted.
    Corresponds to columns of \code{M}
    and \code{A} (or \code{R} and \code{G}).  If \code{NULL} (which is the
    default), arrays is given by \code{1:ncol(object$R)}.}
  
  \item{singlechannels}{vector of integers indicating which
    single-channels will be selected to be plotted.  Values correspond
    to the columns of the matrix of \code{cbind(R,G)} and range
    between \code{1:ncol(R)} for red single-channels and
    \code{( (ncol(R)+1):(ncol(R)+ncol(G)) )} for the green
    single-channels in \code{object}.}
  
  \item{groups}{vector of consecutive integers beginning at 1 indicating
    the groups of arrays or single-channels (depending on which of
    \code{arrays} or \code{singlechannels} is non-\code{NULL}).  This is used
    to color any groups of the single-channel densities.
    If \code{NULL} (default), \code{groups} correspond to the
    red and green channels.  If both \code{arrays} and
    \code{singlechannels} are \code{NULL} all arrays are selected and
    groups (if specified) must correspond to the arrays.}

  \item{col}{vector of colors of the same length as the number of
    different groups. If \code{NULL} (default) the \code{col} equals
    \code{c("red","green")}.  See details for more specifications.}
}

\details{
This function is used as a data display technique associated with single-channel normalization.
See the section on single-channel normalization in the LIMMA User's Guide.

If no \code{col} is specified, the default is to color singlechannels
according to red and green. If both \code{arrays} and \code{groups} are
non-\code{NULL}, then the length of \code{groups} must equal the length
of \code{arrays} and the maximum of \code{groups} (i.e. the number of
groups) must equal the length of \code{col} otherwise the default color
of black will be used for all single-channels.
If \code{arrays} is \code{NULL} and both \code{singlechannels} and
\code{groups} are non-\code{NULL}, then the length of \code{groups} must
equal the length of \code{singlechannels} and the maximum of \code{groups}
(i.e. the number of groups) must equal the length of \code{col}
otherwise the default color of black will be used for all single-channels.
}
\value{A plot is created on the current graphics device.}
\author{Natalie Thorne}
\seealso{
An overview of diagnostic plots in LIMMA is given in \link{6.Diagnostics}.
There is a section using \code{plotDensities} in conjunction with single-channel normalization
in the \emph{\link[limma:../doc/usersguide]{LIMMA User's Guide}}.
}
\examples{
library(sma)
data(MouseArray)

#  no normalization but background correction is done
MA.n <- MA.RG(mouse.data)

#  Default settings for plotDensities.
plotDensities(MA.n)

#  One can reproduce the default settings.
plotDensities(MA.n,arrays=c(1:6),groups=c(rep(1,6),rep(2,6)),
col=c("red","green"))

#  Color R and G single-channels by blue and purple.
plotDensities(MA.n,arrays=NULL,groups=NULL,col=c("blue","purple"))

#  Indexing single-channels using singlechannels (arrays=NULL).
plotDensities(MA.n,singlechannels=c(1,2,7))

#  Change the default colors from c("red","green") to c("pink","purple")
plotDensities(MA.n,singlechannels=c(1,2,7),col=c("pink","purple"))

#  Specified too many colors since groups=NULL defaults to two groups.
plotDensities(MA.n,singlechannels=c(1,2,7),col=c("pink","purple","blue"))

#  Three single-channels, three groups, three colors.
plotDensities(MA.n,singlechannels=c(1,2,7),groups=c(1,2,3),
col=c("pink","purple","blue"))

#  Three single-channels, one group, one color.
plotDensities(MA.n,singlechannels=c(1,2,7),groups=c(1,1,1),
col=c("purple"))

#  All single-channels, three groups (ctl,tmt,reference), three colors.
plotDensities(MA.n,singlechannels=c(1:12),
groups=c(rep(1,3),rep(2,3),rep(3,6)),col=c("darkred","red","green"))

}
\keyword{hplot}




\eof
\title{MA-Plot}
\name{plotMA}
\alias{plotMA}
\description{
Creates an MA-plot with color coding for various sorts of control spots.
}
\usage{
plotMA(MA,array=1,pch=16,status=NULL,
       values=c("gene","blank","buffer","utility","negative","calibration","ratio"),
       col=c("black","yellow","orange","pink","brown","blue","red"),
       cex=c(0.1,0.6,0.6,0.6,0.6,0.6,0.6))
}
\arguments{
  \item{MA}{list with components \code{M} containing log-ratios and \code{A} containing average intensities}
  \item{array}{integer giving the array to be plotted. Corresponds to columns of \code{M} and \code{A}.}
  \item{pch}{vector or list of plotting characters}
  \item{status}{character vector giving the control status of each spot on the array.
  If \code{NULL} then subsequent arguments are ignored.}
  \item{values}{character vector giving unique values of \code{status} corresponding to control states of interest}
  \item{col}{vector of colors, of the same length as \code{values}}
  \item{cex}{numeric vector of the same length as \code{values} giving sizes for plot symbols}
}

\details{
See \code{\link[base]{points}} for possible values for \code{pch}, \code{col} and \code{cex}.
}

\value{A plot is created on the current graphics device.}
\author{Gordon Smyth}
\seealso{
An overview of diagnostic functions available in LIMMA is given in \link{6.Diagnostics}.
}
\keyword{hplot}

\eof
\title{MA Plots by Print-Tip Group}
\name{plotPrintTipLoess}
\alias{plotPrintTipLoess}
\description{
Creates a coplot giving MA-plots with lowess curves by print-tip groups.
}
\usage{
plotPrintTipLoess(MA,layout,array=1,span=0.3,...)
}
\arguments{
  \item{MA}{list with components \code{M} containing log-ratios and \code{A} containing average intensities}
  \item{layout}{a list specifying the number of tip rows and columns and the number of spot rows and columns printed by each tip}
  \item{array}{integer giving the array to be plotted. Corresponds to columns of \code{M} and \code{A}.}
  \item{span}{span of window for \code{lowess} curve}
  \item{...}{other arguments passed to \code{panel.smooth}}
}
\value{A plot is created on the current graphics device}
\seealso{
An overview of diagnostic functions available in LIMMA is given in \link{6.Diagnostics}.
}
\author{Gordon Smyth}
\keyword{hplot}

\eof
\name{printHead}
\alias{printHead}
\title{Print Leading Rows of Large Objects}

\description{
Print the leading rows of a large vector, matrix or data.frame.
This function is used by \code{show} methods for data classes defined in LIMMA.
}

\usage{
printHead(x)
}

\arguments{
  \item{x}{any object}
}

\details{
If \code{x} is a vector with more than 20 elements, then \code{printHead(x)} prints only the first 5 elements.
If \code{x} is a matrix or data.frame with more than 10 rows, then \code{printHead(x)} prints only the first 5 rows.
Any other type of object is printed normally.
}

\author{Gordon Smyth}

\seealso{
An overview of classes defined in LIMMA is given in \link{2.Classes}.
}
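
\examples{
#  A base-R sketch of the rule described in the details section.
#  headSketch is a hypothetical name and this is not the package source;
#  it returns the part of x that would be printed instead of printing it.
headSketch <- function(x) {
  if(is.matrix(x) || is.data.frame(x)) {
    if(nrow(x) > 10) x[1:5, , drop=FALSE] else x
  } else if(is.vector(x) && length(x) > 20) {
    x[1:5]
  } else {
    x
  }
}
headSketch(1:100)                  # first 5 elements only
headSketch(matrix(1:60, 30, 2))    # first 5 rows only
headSketch(1:8)                    # short vector, returned whole
}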

\keyword{hplot}

\eof
\name{printorder}
\alias{printorder}
\title{Identify Order in which Spots were Printed}
\description{
Identify order in which spots were printed and the 384-well plate from which they were printed.
}
\usage{
printorder(layout, ndups=1, npins=layout$ngrid.r*layout$ngrid.c, start="topleft")
}
\arguments{
  \item{layout}{list with the components \code{ngrid.r}, \code{ngrid.c}, \code{nspot.r} and \code{nspot.c}, or an \code{RGList} or \code{MAList} object from which the printer layout may be extracted.}
  \item{ndups}{number of duplicate spots, i.e., number of times print-head dips into each well}
  \item{npins}{actual number of pins or tips on the print-head}
  \item{start}{character string giving position of the spot printed first in each grid.
  Choices are \code{"topleft"} or \code{"topright"} and partial matches are accepted.}
}
\details{
In most cases the print-head contains \code{layout$ngrid.r} times \code{layout$ngrid.c} pins or tips and the array is printed using \code{layout$nspot.r} times \code{layout$nspot.c} dips of the head.
The plate holding the DNA to be printed is assumed to have 384 wells in 16 rows and 24 columns.

In some cases a smaller number of physical pins is used and the total number of grids is built up by effectively printing two or more sub-arrays on the same slide.
In this case the number of grids should be a multiple of the number of pins.

Printing is assumed to proceed by rows within each grid, starting either from the top-left or the top-right.
}
\value{
List with components
\item{printorder}{numeric vector giving printorder of each spot, i.e., which dip of the print-head was used to print it}
\item{plate}{numeric vector giving plate number from which each spot was printed}
\item{plate.r}{numeric vector giving plate-row number of the well from which each spot was printed}
\item{plate.c}{numeric vector giving plate-column number of the well from which each spot was printed}
\item{plateposition}{character vector summarizing plate number and plate position of the well from which each spot was printed, with letters for plate rows and numbers for columns.
For example \code{02B13} is second row, 13th column, of the second plate.}
}
\seealso{
\code{\link{normalizeForPrintorder}}.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\author{Gordon Smyth}
\examples{
printorder(list(ngrid.r=2,ngrid.c=2,nspot.r=12,nspot.c=8))
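
#  A base-R sketch of how a plate position label of the documented form can
#  be assembled from plate number, plate row and plate column.
#  The values are made up for illustration and are not computed from a layout.
plate <- 2
plate.r <- 2      # rows 1-16 are labelled A-P on a 384-well plate
plate.c <- 13     # columns 1-24 are labelled by number
label <- paste0(formatC(plate, width=2, flag="0"), LETTERS[plate.r], plate.c)
label             # second row, 13th column, of the second plate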
}
\keyword{IO}

\eof
\name{qqt}
\alias{qqt}
\title{Student's t Quantile-Quantile Plot}
\description{Plots the quantiles of a data sample against the theoretical quantiles of a Student's t distribution.}
\usage{qqt(y, df = Inf, ylim = range(y), main = "Student's t Q-Q Plot", 
    xlab = "Theoretical Quantiles", ylab = "Sample Quantiles", plot.it = TRUE, ...) 
}
\arguments{
\item{y}{a numeric vector or array containing the data sample}
\item{df}{degrees of freedom for the t-distribution.  The default \code{df=Inf} represents the normal distribution.}
\item{ylim}{plotting range for \code{y}}
\item{main}{main title for the plot}
\item{xlab}{x-axis title for the plot}
\item{ylab}{y-axis title for the plot}
\item{plot.it}{whether or not to produce a plot}
\item{...}{other arguments to be passed to \code{plot}}
}
\value{A list is invisibly returned containing the values plotted in the QQ-plot:

\item{x}{theoretical quantiles of the t-distribution}
\item{y}{the data sample, same as input \code{y}}
}

\details{
This function is analogous to \code{qqnorm} for normal probability plots.
In fact \code{qqt(y,df=Inf)} is identical to \code{qqnorm(y)} in all respects except the default title on the plot.
}
\author{Gordon Smyth}
\seealso{\code{\link[base]{qqnorm}}}
\examples{
y <- rt(50,df=4)
qqt(y,df=4)
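
#  A base-R sketch of the quantities returned, assuming the theoretical
#  quantiles are taken at ppoints() as qqnorm does; this is not the
#  package source and the names ysim, tq and nq are hypothetical.
ysim <- rt(50, df=4)
tq <- qt(ppoints(length(ysim)), df=4)[order(order(ysim))]
plot(tq, ysim, xlab="Theoretical Quantiles", ylab="Sample Quantiles")
#  With df=Inf the same construction gives the normal quantiles of qqnorm.
nq <- qt(ppoints(length(ysim)), df=Inf)[order(order(ysim))]
all.equal(nq, qqnorm(ysim, plot.it=FALSE)$x)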
}
\keyword{distribution}

\eof
\name{QualityWeights}
\alias{QualityWeights}
\alias{wtarea}
\alias{wtflags}
\alias{wtIgnore.Filter}
\title{Spot Quality Weights}
\description{
Functions to calculate quality weights for individual spots based on image analysis output files.
}
\usage{
wtarea(ideal=c(160,170))
wtflags(w=0.1)
wtIgnore.Filter
}
\arguments{
  \item{ideal}{numeric vector giving the ideal area or range of areas for a spot in pixels}
  \item{w}{weight to be given to flagged spots}
}
\details{
These functions can be passed as an argument to \code{read.maimages} to construct quality weights as the microarray data is read in.

\code{wtarea} downweights unusually small or large spots and is designed for SPOT output.
It gives weight 1 to spots which have areas in the ideal range, given in pixels, and linearly downweights spots which are smaller or larger than this range.

\code{wtflags} is designed for GenePix output and gives weight \code{w} to spots with \code{Flags} value less than zero.

\code{wtIgnore.Filter} is designed for QuantArray output and sets the weights equal to the column \code{Ignore Filter} produced by QuantArray.
These weights are 0 for spots to be ignored and 1 otherwise.
}
\value{
A function which takes a data frame or matrix as argument and produces a numeric vector of weights between 0 and 1.
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\examples{
#  Read in spot output files from current directory and give full weight to 165
#  pixel spots.  Note: for this example to run you must set fnames to the names
#  of actual spot output files (data not provided).
\dontrun{
RG <- read.maimages(fnames,source="spot",wt.fun=wtarea(165))
#  Spots will be downweighted according to weights found in RG
MA <- normalizeWithinArrays(RG,layout)
}
}
\keyword{regression}

\eof
\name{read.maimages}
\alias{read.maimages}
\alias{read.imagene}
\title{Read RGList from Image Analysis Output Files}
\description{
Reads an RGList from a series of microarray image analysis output files
}
\usage{
read.maimages(files,source="spot",path=NULL,ext=NULL,names=files,columns=NULL,wt.fun=NULL,verbose=TRUE,sep="\t",quote="\"",\dots)
read.imagene(files,path=NULL,ext=NULL,names=NULL,columns=NULL,wt.fun=NULL,verbose=TRUE,sep="\t",quote="\"",...)
}
\arguments{
  \item{files}{character vector giving the names of the files containing image analysis output or, for Imagene data, a character matrix of names of files.
  If omitted, then all files with extension \code{ext} will be read.}
  \item{source}{character string specifying the image analysis program which produced the output files.  Choices are \code{"arrayvision"}, \code{"genepix"}, \code{"imagene"}, \code{"quantarray"}, \code{"smd"}, \code{"spot"} or \code{"spot.close.open"}.}
  \item{path}{character string giving the directory containing the files. Can be omitted if the files are in the current working directory.}
  \item{ext}{character string giving optional extension to be added to each file name}
  \item{names}{character vector of names to be associated with each array as column name}
  \item{columns}{list with fields \code{Rf}, \code{Gf}, \code{Rb} and \code{Gb} giving the column names to be used for red and green foreground and background.  This is not usually specified by the user but, if it is, it over-rides \code{source}.}
  \item{wt.fun}{function to calculate quality weights}
  \item{verbose}{logical, \code{TRUE} to report each time a file is read in}
  \item{sep}{the field separator character}
  \item{quote}{character string of characters to be treated as quote marks}
  \item{\dots}{any other arguments are passed to \code{read.table}}
}
\details{
This is the main data input function for the LIMMA package.
It extracts the foreground and background intensities from a series of files, produced by an image analysis program, and assembles them into the components of one list.
The image analysis programs ArrayVision, GenePix, Imagene, QuantArray, Stanford Microarray Database (SMD) and SPOT are supported explicitly.
Data from some other image analysis programs can be read if the appropriate column names containing the foreground and background intensities are specified using the \code{columns} argument.
(This will work if the column names are unique and if there are no rows in the file after the last line of data.
Header lines are ok.)
In the case of SPOT, two possible background estimators are supported:
if \code{source="spot.close.open"} then background intensities are estimated from \code{morph.close.open} rather than \code{morph}.

Spot quality weights may be extracted from the image analysis files using a ready-made or a user-supplied weight function \code{\link[limma:QualityWeights]{wt.fun}}.
\code{wt.fun} may be any user-supplied function which accepts a data.frame argument and returns a vector of non-negative weights.
The columns of the data.frame are as in the image analysis output files.
See \code{\link{QualityWeights}} for provided weight functions.

For Imagene image data the argument \code{files} should be a matrix with two columns.
The first column should contain the names of the files containing green channel (cy3) data and the second column should contain names of files containing red channel (cy5) data.
The function \code{read.imagene} is called by \code{read.maimages} when \code{source="imagene"}.
It does not need to be called directly by users.
}
\value{
An \code{\link[limma:rglist]{RGList}} object containing the components
  \item{R}{matrix containing the red channel foreground intensities for each spot for each array.}
  \item{Rb}{matrix containing the red channel background intensities for each spot for each array.}
  \item{G}{matrix containing the green channel foreground intensities for each spot for each array.}
  \item{Gb}{matrix containing the green channel background intensities for each spot for each array.}
  \item{weights}{spot quality weights, if \code{wt.fun} is given}
  \item{printer}{list of class \code{\link[limma:printlayout]{PrintLayout}}, currently set only if \code{source="imagene"}}
  \item{genes}{data frame containing gene names and IDs and spatial positions on the array, currently set only if \code{source="imagene"}}
}
\author{Gordon Smyth}
\references{
Web pages for the image analysis software packages mentioned here are listed at \url{http://www.statsci.org/micrarra/image.html}
}
\seealso{
\code{read.maimages} is based on \code{\link[base]{read.table}} in the base package.
\code{\link[marrayInput]{read.marrayRaw}} is the corresponding function in the marrayInput package.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\examples{
#  Read all .gpr files from current working directory
#  and give weight 0.1 to spots with negative flags

\dontrun{files <- dir(pattern="*\\\\.gpr$")
RG <- read.maimages(files,"genepix",wt.fun=wtflags(0.1))}

#  Read all .spot files from current working directory and down-weight
#  spots smaller or larger than 150 pixels

\dontrun{files <- dir(pattern="*\\\\.spot$")
RG <- read.maimages(files,"spot",wt.fun=wtarea(150))}
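
#  A sketch of a user-supplied weight function of the kind accepted by
#  wt.fun: it takes the data.frame read from one file and returns one
#  non-negative weight per spot.  The function name flagWeights and the toy
#  data are hypothetical; the column name Flags is assumed, as in GenePix output.
flagWeights <- function(spots) ifelse(spots[,"Flags"] < 0, 0.1, 1)
toy <- data.frame(Flags=c(0, -50, 0, -100))
flagWeights(toy)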
}
\keyword{file}

\eof
\name{read.matrix}
\alias{read.matrix}
\title{Read Matrix with Headers from File}
\description{
Read a numeric matrix from a file assuming column headings on the first line.
Not normally used directly by users.
}
\usage{
read.matrix(file,nrows=0,skip=0,...)
}
\arguments{
  \item{file}{character string giving the file name}
  \item{nrows}{maximum number of rows of data to read, if greater than zero}
  \item{skip}{number of lines of the data file to skip before reading data}
  \item{...}{any other arguments to be passed to \code{scan}}
}
\details{
This function is similar to but faster than \code{read.table(file,header=TRUE)} when all the columns are numeric.
}
\value{
A numeric matrix with column names.
}
\author{Gordon Smyth}
\seealso{
\code{\link[base]{read.table}}, \code{\link[base]{scan}}.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}

\keyword{file}

\eof
\name{read.series}
\alias{read.series}
\title{Read series of image files}
\description{
Read in a series of array image analysis output files as data frames.
}
\usage{
read.series(slides, path=NULL, suffix="spot", ...)
}
\arguments{
  \item{slides}{character vector giving the names of files to be read in. Any suffix such as ".spot" or ".gpr" which is common to all the files can be omitted.}
  \item{path}{character string giving the directory containing the files. Can be omitted if the files are in the current working directory.}
  \item{suffix}{character string giving a suffix such as "spot" or "gpr" to be added to each file. If \code{NULL} then no suffix is added.}
  \item{\dots}{any other arguments to be passed to \code{read.table}}
}
\details{
This function performs a series of calls to \code{read.table}.
The image analysis output files are assumed to have been edited to remove all pre-heading material.
The files are assumed to contain only column names and data.
In most cases one can use \code{read.maimages} instead.

The data.frames produced by this command will typically be processed further using one of the functions \code{rg.spot}, \code{rg.genepix} or \code{rg.quantarray}.
}
\value{
No value is returned.
However, a series of data.frames is created in the current environment with names of the form filename.suffix.
The file names are given by the elements of \code{slides} and the suffix is given by \code{suffix}.
}
\author{Gordon Smyth}
\seealso{
\code{\link[base]{read.table}}.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}

\keyword{file}

\eof
\name{readGPRHeaders}
\alias{readGPRHeaders}
\title{Read GenePix Results File Header Information}
\description{
Read the header information from a GenePix Results (GPR) file.
This function is used internally by \code{read.maimages} and is not usually called directly by users.
}
\usage{
readGPRHeaders(file)
}
\arguments{
  \item{file}{character string giving file name, including path if not in current working directory}
}
\details{
Output files produced by the image analysis software GenePix include a number of lines of header which contain information about the scanning process.
This function extracts that information and locates the line where the intensity data begins.
}
\value{
A list with components corresponding to lines of header information.
All components are character vectors.
}
\references{
\url{http://www.axon.com/gn_GenePix_File_Formats.html}
}
\author{Gordon Smyth}
\seealso{\code{\link{read.maimages}}

An overview of LIMMA functions to read data is given in \link{3.ReadingData}.
}
\keyword{file}

\eof
\name{readImageneHeaders}
\alias{readImageneHeaders}
\title{Read Imagene Header Information}
\description{
Read the header information from an Imagene image analysis output file.
This function is used internally by \code{read.maimages} and is not usually called directly by users.
}
\usage{
readImageneHeaders(file)
}
\arguments{
  \item{file}{character string giving file name, including path if not in current working directory}
}
\details{
Output files produced by the image analysis software Imagene include a number of lines of header which contain information about the printing process.
This function extracts that information and locates the line where the intensity data begins.
}
\value{
A list with components
  \item{Begin.Raw.Data}{line number immediately before intensity data begins}
  \item{Version}{version number of Imagene software}
  \item{Date}{character string giving time and date that array was scanned}
  \item{Image.File}{character string giving original file name to which data was written by Imagene}
  \item{Inverted}{logical}
  \item{Field.Dimension}{list with components \code{Field} containing a character string,
  \code{Metarows} containing number of grid rows,
  \code{Metacols} containing number of grid columns,
  \code{Rows} containing number of spot rows in each grid,
  \code{Cols} containing number of spot columns in each grid}
  \item{Measurement.Parameters}{list with numerical components \code{Signal.Low}, \code{Signal.High}, \code{Background.Low}, \code{Background.High}, \code{Background.Buffer} and \code{Background.Width}}
}
\references{
\url{http://www.biodiscovery.com/imagene.asp}
}
\author{Gordon Smyth}
\seealso{\code{\link{read.imagene}}

An overview of LIMMA functions to read data is given in \link{3.ReadingData}.
}
\examples{
#  This function is not intended to be called by users.
#  There is an example of use in the code for function read.imagene.
print(read.imagene)
}
\keyword{file}

\eof
\name{readSpotTypes}
\alias{readSpotTypes}
\title{Read Spot Types File}
\description{
Read a table giving regular expressions to identify different types of spots in the gene-dataframe.
}
\usage{
readSpotTypes(file="SpotTypes.txt", sep="\t")
}
\arguments{
  \item{file}{character string giving the name of the file specifying the spot types.}
  \item{sep}{the field separator character}
}
\details{
The file is a text file with rows corresponding to types of spots and the following columns: \code{SpotType} gives the name for the spot type, \code{ID} is a regular expression matching the ID column, \code{Name} is a regular expression matching the Name column, and \code{Color} is the R name for the color to be associated with this type.
}
\value{
A data frame with columns
  \item{SpotType}{character vector giving names of the spot types}
  \item{ID}{character vector giving regular expressions}
  \item{Name}{character vector giving regular expressions}
  \item{Color}{character vector giving names of colors}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
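
\examples{
#  A base-R sketch of how a table of this form can be used, assuming
#  ordinary regular expressions and that later rows override earlier ones.
#  Only the ID column is matched here.  The spot types, IDs and variable
#  names are made up for illustration; this is not the package source.
types <- data.frame(
  SpotType=c("gene","blank","buffer"),
  ID=c(".*","^blank","^buffer"),
  Name=c(".*",".*",".*"),
  Color=c("black","yellow","orange"),
  stringsAsFactors=FALSE)
ids <- c("clone123","blank","buffer_3xSSC","clone456")
status <- rep(NA_character_, length(ids))
for(i in seq_len(nrow(types))) status[grepl(types$ID[i], ids)] <- types$SpotType[i]
data.frame(ID=ids, Status=status)
}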
\keyword{IO}

\eof
\name{readTargets}
\alias{readTargets}
\title{Read RNA Targets File}
\description{
Read an RNA targets file into a data frame.
}
\usage{
readTargets(file="Targets.txt", sep="\t")
}
\arguments{
  \item{file}{character string giving the name of the targets file.}
  \item{sep}{the field separator character}
}
\details{
The targets file is a text file with rows corresponding to microarrays and columns \code{Cy3} and \code{Cy5} specifying which RNA samples are hybridized to which channel of each microarray.
Other columns are optional.
}
\value{
A data frame including columns
  \item{Cy3}{character vector giving names of RNA samples}
  \item{Cy5}{character vector giving names of RNA samples}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{readGAL}
\alias{readGAL}
\title{Read a GAL file}
\description{
Read a GenePix Allocation List (GAL) file into a dataframe.
}
\usage{
readGAL(galfile=NULL,path=NULL,header=TRUE,sep="\t",quote="\"",skip=NULL,as.is=TRUE,...)
}
\arguments{
  \item{galfile}{character string giving the name of the GAL file.  If \code{NULL} then a file with extension \code{.gal} is found in the directory specified by \code{path}.}
  \item{path}{character string giving the directory containing the files.  If \code{NULL} then assumed to be the current working directory.}
  \item{header}{logical variable, if \code{TRUE} then the first line after \code{skip} is assumed to contain column headings.  If \code{FALSE} then a value should be specified for \code{skip}.}
  \item{sep}{the field separator character}
  \item{quote}{the set of quoting characters}
  \item{skip}{number of lines of the GAL file to skip before reading data.  If \code{NULL} then this number is determined by searching the file for column headings.}
  \item{as.is}{logical variable, if \code{TRUE} then read in character columns as vectors rather than factors.}
  \item{...}{any other arguments are passed to \code{read.table}}
}
\details{
A GAL file is a list of genes and associated information produced by an Axon microarray scanner.
This function reads in the list with a minimum of user information. The file is assumed to be in a standard format apart from an unspecified number of lines at the beginning.
In most cases the function can be used without specifying any of the arguments.
}
\value{
A data frame with columns
  \item{Block}{numeric vector containing the print tip indices}
  \item{Column}{numeric vector containing the spot columns}
  \item{Row}{numeric vector containing the spot rows}
  \item{ID}{character vector, or factor if \code{as.is=FALSE}, containing gene library identifiers}
  \item{Name}{character vector, or factor if \code{as.is=FALSE}, containing gene names}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\references{
\url{http://www.axon.com/gn_GenePix_File_Formats.html}
}
\examples{
# readGAL()
# will read in the first GAL file (with suffix ".gal")
# found in the current working directory
}
\keyword{IO}

\eof
\name{removeExt}
\alias{removeExt}

\title{Remove Common Extension from File Names}
\description{Finds and removes any common extension from a vector of file names.}

\usage{
removeExt(x)
}

\arguments{
\item{x}{character vector}
}

\value{
A character vector of the same length as \code{x} in which any common extension has been stripped off.
}

\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}

\examples{
x <- c("slide1.spot","slide2.spot","slide3.spot")
removeExt(x)
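
#  A base-R sketch of the same idea, not the package source: strip the
#  extension only when every name has one and all extensions agree.
#  The name stripCommonExt is hypothetical.
stripCommonExt <- function(x) {
  has.ext <- grepl("[.][^.]*$", x)
  ext <- sub("^.*[.]", "", x)
  if(all(has.ext) && length(unique(ext)) == 1) sub("[.][^.]*$", "", x) else x
}
stripCommonExt(c("slide1.spot","slide2.spot","slide3.spot"))
stripCommonExt(c("slide1.spot","slide2.gpr"))   # no common extension, unchanged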
}

\author{Gordon Smyth}

\keyword{character}
\keyword{file}

\eof
\name{rg.genepix}
\alias{rg.genepix}
\title{Extract RGList from data.frames Containing Genepix Data}
\description{
Extracts an RGList from GenePix image analysis output when the data has already been read from files into data.frame objects.
}
\usage{
rg.genepix(slides,names.slides=names(slides),suffix="gpr")
}
\arguments{
  \item{slides}{character vector giving the names of the data frames containing the GenePix output.}
  \item{names.slides}{character vector giving column names to be associated with each slide.}
  \item{suffix}{the dataframe names are assumed to have this suffix added to names in \code{slides}.}
}
\details{
This function extracts the foreground and background intensities from a series of data frames and assembles them in the components of one list.
}
\value{
A list containing the components
  \item{R}{A matrix containing the red channel foreground intensities for each spot for each array.}
  \item{Rb}{A matrix containing the red channel background intensities for each spot for each array.}
  \item{G}{A matrix containing the green channel foreground intensities for each spot for each array.}
  \item{Gb}{A matrix containing the green channel background intensities for each spot for each array.}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{rg.quantarray}
\alias{rg.quantarray}
\title{Extract RGList from data.frames Containing Quantarray Data}
\description{
Extracts an RGList from QuantArray image analysis output when the data has already been read from files into data.frame objects.
}
\usage{
rg.quantarray(slides,names.slides=names(slides),suffix="qta")
}
\arguments{
  \item{slides}{Character vector giving the names of the data frames containing the QuantArray output.}
  \item{names.slides}{Names to be associated with each slide as column name.}
  \item{suffix}{The dataframe names are assumed to have this suffix added to names in \code{slides}.}
}
\details{
This function extracts the foreground and background intensities from a series of data frames and assembles them in the components of one list.
}
\value{
A list containing the components
  \item{R}{A matrix containing the red channel foreground intensities for each spot for each array.}
  \item{Rb}{A matrix containing the red channel background intensities for each spot for each array.}
  \item{G}{A matrix containing the green channel foreground intensities for each spot for each array.}
  \item{Gb}{A matrix containing the green channel background intensities for each spot for each array.}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{rg.series.spot}
\alias{rg.series.spot}
\title{Read RGList from SPOT Image Analysis Output Files}
\description{
Extracts an RGList from a series of Spot image analysis files.
This is a deprecated function.
Use \code{\link{read.maimages}} instead.
}
\usage{
rg.series.spot(slides,path=NULL,names.slides=names(slides),suffix="spot",wt.fun=NULL,verbose=TRUE,...)
}
\arguments{
  \item{slides}{character vector giving the names of the files containing the Spot output}
  \item{path}{character string giving the directory containing the files. Can be omitted if the files are in the current working directory.}
  \item{names.slides}{names to be associated with each slide as column name}
  \item{suffix}{the file names are assumed to have this suffix added to names in \code{slides}}
  \item{wt.fun}{function to calculate quality weights}
  \item{verbose}{\code{TRUE} to report each time a file is read in}
  \item{\dots}{any other arguments to be passed to \code{scan}}
}
\details{
This function extracts the foreground and background intensities from a series of files, produced by the image analysis program SPOT, and assembles them in the components of one list.

Spot quality weights may also be extracted using an optional weight function.
}
\value{
A list containing the components
  \item{R}{matrix containing the red channel foreground intensities for each spot for each array.}
  \item{Rb}{matrix containing the red channel background intensities for each spot for each array.}
  \item{G}{matrix containing the green channel foreground intensities for each spot for each array.}
  \item{Gb}{matrix containing the green channel background intensities for each spot for each array.}
  \item{weights}{spot quality weights, if \code{wt.fun} is not \code{NULL}}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{rg.spot}
\alias{rg.spot}
\title{Extract RGList from data.frames Containing SPOT Data}
\description{
Extracts an RGList from Spot image analysis output when the data have already been read from files into data.frame objects.
}
\usage{
rg.spot(slides,names.slides=names(slides),suffix="spot",area=FALSE)
}
\arguments{
  \item{slides}{Character vector giving the names of the data frames containing the Spot output.}
  \item{names.slides}{Names to be associated with each slide as column name.}
  \item{suffix}{The dataframe names are assumed to have this suffix added to names in \code{slides}.}
  \item{area}{If \code{TRUE} then the output list includes a component containing the spot areas.}
}
\details{
This function extracts the foreground and background intensities from a series of data frames and assembles them in the components of one list.

Spot areas may also be extracted, which is useful for downweighting unusually small or large spots in subsequent analyses.
}
\value{
A list containing the components
  \item{R}{A matrix containing the red channel foreground intensities for each spot for each array.}
  \item{Rb}{A matrix containing the red channel background intensities for each spot for each array.}
  \item{G}{A matrix containing the green channel foreground intensities for each spot for each array.}
  \item{Gb}{A matrix containing the green channel background intensities for each spot for each array.}
}
\author{Gordon Smyth}
\seealso{
An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}
\keyword{IO}

\eof
\name{RGList-class}
\docType{class}
\alias{RGList-class}
\title{Red, Green Intensity List - class}

\description{
A simple list-based class for storing red and green channel foreground and background intensities for a batch of spotted microarrays.
\code{RGList} objects are normally created by \code{\link{read.maimages}}.
}

\section{Slots/List Components}{
\code{RGList} objects can be created by \code{new("RGList",RG)} where \code{RG} is a list.
Objects of this class contain no slots (other than \code{.Data}), but they should contain the following list components:
\tabular{ll}{
  \code{R}:\tab numeric matrix containing the red (cy5) foreground intensities.  Rows correspond to spots and columns to arrays.\cr
  \code{G}:\tab numeric matrix containing the green (cy3) foreground intensities\cr
  \code{Rb}:\tab numeric matrix containing the red (cy5) background intensities\cr
  \code{Gb}:\tab numeric matrix containing the green (cy3) background intensities
}
Optional components include
\tabular{ll}{
  \code{weights}:\tab numeric matrix containing relative spot quality weights.  Should be non-negative.\cr
  \code{printer}:\tab list containing information on the process used to print the spots on the arrays.  See \link[limma:PrintLayout]{PrintLayout}.\cr
  \code{genes}:\tab data.frame containing information on the genes spotted on the arrays.  Should include a character column \code{Name} containing names for the genes or controls.\cr
  \code{targets}:\tab data.frame containing information on the target RNA samples.  Should include factor or character columns \code{Cy3} and \code{Cy5} specifying which RNA was hybridized to each array.
}
All of the matrices should have the same dimensions.
The row dimension of \code{targets} should match the column dimension of the matrices.
}

\section{Methods}{
This class inherits directly from class \code{list} so any operation appropriate for lists will work on objects of this class.
In addition, \code{RGList} objects can be \link[limma:subsetting]{subsetted}, \link[limma:cbind]{combined} and \link[limma:merge]{merged}.
\code{RGList} objects will return dimensions and hence functions such as \code{\link[limma:dim]{dim}}, \code{\link[base:nrow]{nrow}} and \code{\link[base:nrow]{ncol}} are defined. 
\code{RGLists} also inherit a \code{\link[methods]{show}} method from the virtual class \code{\link[limma:LargeDataObject]{LargeDataObject}}, which means that \code{RGLists} will print in a compact way.

Other functions in LIMMA which operate on \code{RGList} objects include
\code{\link{normalizeBetweenArrays}},
\code{\link{normalizeForPrintorder}},
\code{\link{normalizeWithinArrays}}.
}

\author{Gordon Smyth}

\seealso{
  \link{2.Classes} gives an overview of all the classes defined by this package.
  
  \code{\link[marrayClasses]{marrayRaw-class}} is the corresponding class in the marrayClasses package.
}
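
\examples{
#  A minimal sketch, in base R only, of the list layout described in the
#  Slots/List Components section. All names and values are made up for illustration.
RG <- list(
  R=matrix(c(900,800,700,600,950,850,750,650),4,2),
  G=matrix(c(500,400,300,200,550,450,350,250),4,2),
  Rb=matrix(100,4,2),
  Gb=matrix(80,4,2),
  targets=data.frame(Cy3=c("Ref","Ref"),Cy5=c("Mutant1","Mutant2"))
)
#  All intensity matrices should have the same dimensions
dims <- sapply(RG[c("R","G","Rb","Gb")],dim)
all(dims == dims[,1])
#  One row of targets per array (column of the matrices)
nrow(RG$targets) == ncol(RG$R)
#  With the package attached, new("RGList",RG) would turn this list into an RGList object
}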

\keyword{classes}
\keyword{data}

\eof
\name{rlm.series}
\alias{rlm.series}
\title{Robust Linear Model for Series of Microarrays}
\description{Fit linear models for each gene to a series of microarrays. Fit is by robust M-estimation.}
\usage{rlm.series(M,design=NULL,ndups=1,spacing=1,weights=NULL,...)}
\arguments{
  \item{M}{a numeric matrix. Usually the log-ratios of expression for a series of cDNA microarrays with rows corresponding to genes and columns to arrays.}
  \item{design}{the design matrix of the microarray experiment, with rows corresponding to arrays and columns to comparisons to be estimated. The number of rows must match the number of columns of \code{M}. Defaults to the unit vector meaning that the arrays are treated as replicates.} 
  \item{ndups}{a positive integer giving the number of times each gene is printed on an array. \code{nrow(M)} must be divisible by \code{ndups}.}
  \item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots}
  \item{weights}{an optional numeric matrix of the same dimension as \code{M} containing weights for each spot. If it is of different dimension to \code{M}, it will be filled out to the same size.}
  \item{...}{any other arguments are passed to \code{rlm.default}.}
}
\value{
  A list with components
  \item{coefficients}{numeric matrix containing the estimated coefficients for each linear model. Same number of rows as \code{M}, same number of columns as \code{design}.}
  \item{stdev.unscaled}{numeric matrix conformal with \code{coefficients} containing the unscaled standard deviations for the coefficient estimators. The standard errors are given by \code{stdev.unscaled * sigma}.}
  \item{sigma}{numeric vector containing the residual standard deviation for each gene.}
  \item{df.residual}{numeric vector giving the degrees of freedom corresponding to \code{sigma}.}
}
\details{
The linear model is fit for each gene by calling the function \code{rlm} from the MASS library.

Warning: don't use weights with this function unless you understand how \code{rlm} treats weights.
The treatment of weights is different from that of \code{lm.series} and \code{gls.series}.
}
\seealso{
\code{\link[MASS]{rlm}}.

An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth}
\examples{
#  Simulate gene expression data,
#  6 microarrays and 100 genes with one gene differentially expressed in first 3 arrays
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,1:3] <- M[1,1:3] + 2
#  Design matrix includes two treatments, one for first 3 and one for last 3 arrays
design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
fit <- rlm.series(M,design=design)
eb <- ebayes(fit)
#  Large values of eb$t indicate differential expression
qqt(eb$t[,1],df=fit$df.residual+eb$df.prior)
abline(0,1)
}
\keyword{models}
\keyword{regression}

\eof
\name{splitName}
\alias{splitName}

\title{Split Composite Gene Names}
\description{
Split composite gene names into short names and gene annotation strings.}

\usage{
splitName(x, split=";", extended=TRUE)
}

\arguments{
\item{x}{character vector}
\item{split}{character to split each element of vector on, see \code{strsplit}}
\item{extended}{logical.  If \code{TRUE}, extended regular expression matching is used, see \code{strsplit}.}
}

\value{
A list containing components
\item{Name}{character vector of the same length as \code{x} containing the first split of each element}
\item{Annotation}{character vector of the same length as \code{x} containing the second split of each element}
}

\details{
Gene names are assumed to comprise a short name or identifier followed by more detailed annotation information.
}

\seealso{
\code{\link[base]{strsplit}}.

An overview of LIMMA functions for reading data is given in \link{3.ReadingData}.
}

\examples{
x <- c("AA196000;actinin, alpha 3",
"AA464163;acyl-Coenzyme A dehydrogenase, very long chain",
"3E7;W15277;No Annotation")
splitName(x)
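#  A minimal base-R sketch of the splitting described in Details, for illustration only.
#  It is not necessarily how splitName is implemented. The names y, parts, short.name
#  and annotation are hypothetical. This sketch keeps only the first two fields.
y <- c("AA196000;actinin, alpha 3","3E7;W15277;No Annotation")
parts <- strsplit(y,split=";")
short.name <- vapply(parts,function(p) p[1],character(1))
annotation <- vapply(parts,function(p) p[2],character(1))
data.frame(Name=short.name,Annotation=annotation)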
}

\author{Gordon Smyth}

\keyword{character}

\eof
\name{subsetting}
\alias{[.RGList}
\alias{[.MAList}
\title{Subset RGList or MAList Objects}
\description{
Extract a subset of an \code{RGList} or \code{MAList} object.
}
\usage{
object[i, j]
}
\arguments{
  \item{object}{object of class \code{RGList} or \code{MAList}}
  \item{i,j}{elements to extract. \code{i} subsets the genes or spots while \code{j} subsets the arrays}
}
\details{
\code{i,j} may take any values acceptable for the matrix components of \code{object}.
See the \link[base]{Extract} help entry for more details on subsetting matrices.
}
\value{
An \code{\link[limma:rglist]{RGList}} or \code{\link[limma:malist]{MAList}} object holding data from the specified subset of genes and arrays.
}
\author{Gordon Smyth}
\seealso{
  \code{\link[base]{Extract}} in the base package.
  
  \link{3.ReadingData} gives an overview of data input and manipulation functions in LIMMA.
}
\examples{
M <- A <- matrix(11:14,4,2)
rownames(M) <- rownames(A) <- c("a","b","c","d")
colnames(M) <- colnames(A) <- c("A","B")
MA <- new("MAList",list(M=M,A=A))
MA[1:2,]
MA[1:2,2]
MA[,2]
}
\keyword{manip}

\eof
\name{tmixture}
\alias{tmixture.vector}
\alias{tmixture.matrix}
\title{Estimate Scale Factor in Mixture of t-Distributions}
\description{
This function estimates the unscaled standard deviation of the log fold change for differentially expressed genes.
It is called by the function \code{ebayes} and is not intended to be called by users.
}
\usage{
tmixture.vector(tstat,stdev.unscaled,df,proportion,c0lim=NULL)
tmixture.matrix(tstat,stdev.unscaled,df,proportion,c0lim=NULL)
}
\arguments{
  \item{tstat}{numeric vector or matrix of t-statistics}
  \item{stdev.unscaled}{numeric matrix conformal with \code{tstat} containing the unscaled standard deviations for the coefficient estimators}
  \item{df}{numeric vector giving the degrees of freedom associated with \code{tstat}}
  \item{proportion}{assumed proportion of genes which are differentially expressed}
  \item{c0lim}{optional upper limit for the estimated unscaled standard deviation}
}
\value{
Numeric vector of length equal to the number of columns of \code{tstat} and \code{stdev.unscaled}.
}
\seealso{
\code{\link{ebayes}}
}
\author{Gordon Smyth}
\keyword{htest}

\eof
\name{toptable}
\alias{toptable}
\alias{topTable}
\title{Table of Top Genes from Linear Model Fit}
\description{
Extract a table of the top-ranked genes from a linear model fit.
}
\usage{
toptable(fit,coef=1,number=10,genelist=NULL,A=NULL,eb=NULL,adjust.method="holm",sort.by="B",...)
topTable(fit,coef=1,number=10,genelist=NULL,adjust.method="holm",sort.by="B")
}
\arguments{
  \item{fit}{for \code{toptable}, this is an output list from \code{lm.series}, \code{gls.series} or \code{rlm.series}.
  For \code{topTable}, this is an object of class \code{MArrayLM}.}
  \item{coef}{column number of the effect or contrast to rank the genes on}
  \item{number}{how many genes to pick out}
  \item{genelist}{a data frame containing the gene allocation list or a vector containing the gene names}
  \item{A}{matrix of A-values or vector of average A-values.}
  \item{eb}{output list from \code{ebayes(fit)}}
  \item{adjust.method}{method to use to adjust the P-values for multiple testing, e.g., \code{"holm"} or \code{"fdr"}. See \code{\link[ctest]{p.adjust}} for the available options. If \code{NULL} or \code{"none"} then the P-values are not adjusted.}
  \item{sort.by}{statistic to rank genes by.  Possibilities are \code{"M"}, \code{"A"}, \code{"T"}, \code{"P"} or \code{"B"}.}
  \item{...}{any other arguments are passed to \code{ebayes} if \code{eb} is \code{NULL}}
}
\value{
  A data frame with one row for each of the \code{number} top-ranked genes and the following columns:
  \item{genelist}{if genelist was included as input}
  \item{M}{estimate of the effect or the contrast, on the log2 scale}
  \item{t}{moderated t-statistic}
  \item{P.Value}{nominal P-value}
  \item{B}{log odds that the gene is differentially expressed}
}
\details{
This function summarizes a fit object produced by \code{lm.series}, \code{gls.series} or \code{rlm.series} by selecting the top-ranked genes for any given contrast.
}
\seealso{
\code{\link{ebayes}}, \code{\link[base]{p.adjust}}, \code{\link{lm.series}}, \code{\link{gls.series}}, \code{\link{rlm.series}}.
}
\author{Gordon Smyth}
\examples{
#  Simulate gene expression data,
#  6 microarrays and 100 genes with first gene differentially expressed
M <- matrix(rnorm(100*6,sd=0.3),100,6)
M[1,1:3] <- M[1,1:3] + 2
#  Design matrix includes two treatments, one for first 3 and one for last 3 arrays
design <- cbind(First3Arrays=c(1,1,1,0,0,0),Last3Arrays=c(0,0,0,1,1,1))
fit <- lm.series(M,design=design)
toptable(fit)
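#  Hedged illustration of the multiple-testing adjustment selected by adjust.method,
#  calling p.adjust directly on a made-up vector of nominal P-values.
#  p.nominal and p.holm are hypothetical names used only in this sketch.
p.nominal <- c(0.001,0.01,0.02,0.04)
p.holm <- p.adjust(p.nominal,method="holm")
p.holm
p.adjust(p.nominal,method="fdr")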
}
\keyword{htest}

\eof
\name{trigammaInverse}
\alias{trigammaInverse}
\title{Inverse Trigamma Function}
\description{
The inverse of the trigamma function.
}
\usage{
trigammaInverse(x)
}
\arguments{
  \item{x}{numeric vector or array}
}
\details{
The function uses Newton's method with a clever starting value to ensure monotonic convergence.
}
\value{
Numeric vector or array \code{y} satisfying \code{trigamma(y)==x}.
}
\author{Gordon Smyth}
\seealso{
   \code{\link[base:Special]{trigamma}}
}
\warning{
This function does not accept a data.frame as argument although the internal function \code{trigamma} does.
}
\examples{
y <- trigammaInverse(5)
trigamma(y)
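#  A sketch of a Newton iteration for inverting trigamma, for illustration only.
#  Assumptions not taken from this help page: the iteration is applied to
#  1/trigamma(y), the starting value is 0.5 + 1/x, and x is a single positive number.
#  x0, y0 and delta are hypothetical names.
x0 <- 5
y0 <- 0.5 + 1/x0
for (iter in 1:50) {
  tri <- trigamma(y0)
  #  Newton step for solving 1/trigamma(y) = 1/x0; psigamma(,deriv=2) is the derivative of trigamma
  delta <- tri*(1 - tri/x0)/psigamma(y0,deriv=2)
  y0 <- y0 + delta
  if (abs(delta) < 1e-10*y0) break
}
trigamma(y0)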
}
\keyword{math}

\eof
\name{uniquegenelist}
\alias{uniquegenelist}
\title{Eliminate Duplicate Names from the Gene List}
\description{
Eliminate duplicate names from the gene list. The new list is shorter than the full list by a factor of \code{ndups}.
}
\usage{
uniquegenelist(genelist,ndups=2,spacing=1)
}
\arguments{
  \item{genelist}{vector of gene names}
  \item{ndups}{number of duplicate spots. The length of \code{genelist} must be divisible by \code{ndups}.}
  \item{spacing}{the spacing between duplicate names in \code{genelist}}
}
\value{
A vector of length \code{length(genelist)/ndups} containing each gene name once only.
}
\author{Gordon Smyth}
\seealso{
\code{\link{unwrapdups}}
}
\examples{
genelist <- c("A","A","B","B","C","C","D","D")
uniquegenelist(genelist,ndups=2)
genelist <- c("A","B","A","B","C","D","C","D")
uniquegenelist(genelist,ndups=2,spacing=2)
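#  Base-R sketch of the same selection, for illustration only (not necessarily the
#  package's implementation): view the names as a spacing x ndups x groups array
#  and keep the first duplicate of each gene. g, nd and sp are hypothetical names.
g <- c("A","B","A","B","C","D","C","D")
nd <- 2
sp <- 2
dim(g) <- c(sp,nd,length(g)/(nd*sp))
as.vector(g[,1,])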
}
\keyword{array}

\eof
\name{unwrapdups}
\alias{unwrapdups}
\title{Unwrap Duplicate Spot Values from Rows into Columns}
\description{Reshape a matrix so that a set of consecutive rows becomes a single row in the output.}
\usage{
unwrapdups(M,ndups=2,spacing=1)
}
\arguments{
  \item{M}{a matrix.}
  \item{ndups}{number of duplicate spots. The number of rows of M must be divisible by \code{ndups}.}
  \item{spacing}{the spacing between the rows of \code{M} corresponding to duplicate spots, \code{spacing=1} for consecutive spots}
}
\value{A matrix containing the same values as \code{M} but with fewer rows and more columns by a factor of \code{ndups}.
Each set of \code{ndups} rows in \code{M} is strung out to a single row so that duplicate values originally in consecutive rows in the same column are in consecutive columns in the output.
}
\details{
This function is used on matrices corresponding to a series of microarray experiments.
Rows corresponding to duplicate spots are re-arranged so that all values corresponding to a single gene are on the same row.
This facilitates fitting models or computing statistics for each gene.
}
\author{Gordon Smyth}
\examples{
M <- matrix(1:12,6,2)
unwrapdups(M,ndups=2)
unwrapdups(M,ndups=3)
unwrapdups(M,ndups=2,spacing=3)
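#  Base-R sketch of the reshaping described in the Value section, for illustration only.
#  reshapeDups is a hypothetical helper, not part of the package.
reshapeDups <- function(M,ndups=2,spacing=1) {
  M <- as.matrix(M)
  ngroups <- nrow(M)/(ndups*spacing)
  narrays <- ncol(M)
  #  Index rows as (position within block, duplicate, block), then move duplicates next to arrays
  dim(M) <- c(spacing,ndups,ngroups,narrays)
  M <- aperm(M,c(1,3,2,4))
  dim(M) <- c(spacing*ngroups,ndups*narrays)
  M
}
reshapeDups(matrix(1:12,6,2),ndups=2)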
}
\keyword{array}

\eof
\name{venn}
\alias{vennCounts}
\alias{vennDiagram}
\title{Venn Diagrams}
\description{
Compute classification counts or plot classification counts in a Venn diagram.
}
\usage{
vennCounts(classification, include="both")
vennDiagram(object, include="both", names, \dots)
}
\arguments{
  \item{classification}{classification matrix of 0's and 1's indicating significance of a test.
  Usually created by \code{\link{classifyTests}}.}
  \item{object}{either a classification matrix or a \code{VennCounts} object produced by \code{vennCounts}.}
  \item{include}{character string specifying whether to count genes that are up-regulated, down-regulated or both.
  Choices are \code{"both"}, \code{"up"} or \code{"down"}.}
  \item{names}{optional character vector giving names for the sets or contrasts}
  \item{\dots}{any other arguments are passed to \code{plot}}
}
\value{
\code{vennCounts} produces a \code{VennCounts} object, which is a numeric matrix with last column \code{"Counts"} giving counts for each possible vector outcome.
}
\seealso{
An overview of linear model functions in limma is given by \link{5.LinearModels}.
}
\author{Gordon Smyth and James Wettenhall}
\examples{
tstat <- matrix(rt(300,df=10),100,3)
tstat[1:33,] <- tstat[1:33,]+2
clas <- classifyTests(tstat,df=10,p.value=0.05)
a <- vennCounts(clas)
print(a)
vennDiagram(a)
}
\keyword{htest}

\eof
\name{zscore}
\alias{zscoreGamma}
\alias{zscoreT}

\title{z-score equivalents}

\description{
Compute z-score equivalents of gamma or t-distribution random deviates.
}

\usage{
zscoreGamma(q, shape, rate = 1, scale = 1/rate) 
zscoreT(x, df)
}

\arguments{
\item{q, x}{numeric matrix or vector giving deviates of a random variable}
\item{shape}{gamma shape parameter}
\item{rate}{gamma rate parameter}
\item{scale}{gamma scale parameter}
\item{df}{degrees of freedom}
}

\value{
Numeric vector giving equivalent deviates from a standard normal distribution.
}

\details{
This function computes the deviates of a standard normal distribution which have the same quantiles as the given values in the specified distribution.
For example, if \code{z <- zscoreT(x,df=df)} then \code{pnorm(z)} equals \code{pt(x,df=df)}.

Care is taken to do the computations accurately in both tails of the distributions.
}

\author{Gordon Smyth}
\seealso{
\code{\link[base]{qnorm}}, \code{\link[base]{pgamma}}, \code{\link[base]{pt}}
}
\examples{
zscoreGamma(1, shape=1, scale=1)
zscoreT(0, df=3)
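#  Base-R check of the identity stated in Details, for illustration only.
#  The naive version below (z.naive, a hypothetical name) can lose accuracy far in the
#  upper tail, where pt() gets close to 1; for large positive x it is safer to work
#  with the upper tail probability, as in z.tail. This is a sketch, not the package's code.
x.t <- c(-2,0,1.5)
z.naive <- qnorm(pt(x.t,df=3))
all.equal(pnorm(z.naive),pt(x.t,df=3))
z.tail <- -qnorm(pt(40,df=3,lower.tail=FALSE))
z.tail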
}
\keyword{distribution}

\eof
