sam {siggenes}    R Documentation

Significance Analysis of Microarrays

Description

Performs a Significance Analysis of Microarrays (SAM) for a set of positive thresholds.

Usage

sam(data,x,y,paired=FALSE,mat.samp=NULL,B=100,balanced=FALSE,
    na.rm=FALSE,s0=NA,alpha.s0=seq(0,1,.05),include.s0=TRUE,factor.s0=
    1.4826,p0=NA,lambda.p0=1,vec.lambda.p0=(0:95)/100,delta.fdr=
    (1:10)/5,med.fdr=TRUE,graphic.fdr=TRUE,thres.fdr=seq(0.5,2,.5),
    pty.fdr=TRUE,help.fdr=TRUE,ngenes=NA,iteration=3,initial.delta=
    c(0.1,seq(.2,2,.2),4),rand=NA)

Arguments

data the data set that should be analyzed. Every row of this data set must correspond to a gene.
x vector of the columns of the data set that correspond to the treatment group. In the paired case, (x[i], y[i]) form an observation pair. If, e.g., the first n1 columns contain the gene expression values of the treatment group, x=1:n1.
y vector of the columns of the data set that correspond to the control group. In the paired case, (x[i], y[i]) form an observation pair.
paired paired (TRUE) or unpaired (FALSE) data. Default is FALSE.
mat.samp a permutation matrix. If specified, this matrix will be used, even if rand and B are specified.
B number of permutations used in the calculation of the null density. Default is B=100.
balanced if TRUE, balanced permutations will be used. Default is FALSE.
na.rm if FALSE (default), the expression scores d of genes with one or more missing values will be set to NA. If TRUE, the missing values will be replaced by the genewise mean of the non-missing values.
s0 the fudge factor. If NA (default), the fudge factor s0 will be computed automatically.
alpha.s0 the possible values of the fudge factor s0 in terms of quantiles of the standard deviations of the genes.
include.s0 if TRUE (default), s0=0 is a possible choice for the fudge factor.
factor.s0 constant with which the MAD is multiplied in the computation of the fudge factor.
p0 the probability that a gene is not differentially expressed. If not specified (default), it will be computed.
lambda.p0 number between 0 and 1 that is used to estimate p0. If set to 1 (default), the automatic p0 selection using the natural cubic spline fit is used.
vec.lambda.p0 vector of values for lambda used in the automatic computation of p0.
delta.fdr a vector of values for the threshold Delta for which the SAM analysis is performed.
med.fdr if TRUE (default), the median number, otherwise the expected number, of falsely called genes will be computed.
graphic.fdr if TRUE (default), both the SAM plot and the plots of Delta vs. FDR and Delta vs. number of significant genes will be generated.
thres.fdr for each value contained in thres.fdr, two lines parallel to the 45-degree line are generated in the SAM plot.
pty.fdr if TRUE (default), a square SAM Plot will be generated.
help.fdr if TRUE (default), help-lines will be drawn in both Delta plots.
ngenes a number or proportion of genes for which the FDR is estimated.
iteration the number of iterations used in the estimation of the FDR for a given number or proportion of genes.
initial.delta a set of initial guesses for Delta in the computation of the FDR for a given number or proportion of genes.
rand if specified, the random number generator will be put in a reproducible state.
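
To make the role of the fudge factor s0 concrete, the following base R sketch computes the unpaired expression score d = r / (s + s0) described in Tusher et al. (2001). It is an illustration only, not the code used by sam: the helper name sam.d.sketch, the simulated data, and the choice of the median of the gene-wise standard deviations as s0 are all assumptions made for this example.

```r
# Hypothetical helper (not part of siggenes): unpaired SAM score per gene.
sam.d.sketch <- function(data, x, y, s0 = 0) {
  data <- as.matrix(data)
  n1 <- length(x)
  n2 <- length(y)
  mean.x <- rowMeans(data[, x, drop = FALSE])
  mean.y <- rowMeans(data[, y, drop = FALSE])
  r <- mean.x - mean.y                       # numerator: difference of group means
  # Pooled sum of squared deviations per gene (row-wise recycling of the means)
  ss <- rowSums((data[, x, drop = FALSE] - mean.x)^2) +
        rowSums((data[, y, drop = FALSE] - mean.y)^2)
  s <- sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
  list(d = r / (s + s0), r = r, s = s)
}

set.seed(1)
toy <- matrix(rnorm(50 * 8), nrow = 50)      # simulated: 50 genes, 8 samples
out0 <- sam.d.sketch(toy, x = 1:4, y = 5:8, s0 = 0)
# Illustrative choice only: s0 set to the 50% quantile of the gene-wise s
out1 <- sam.d.sketch(toy, x = 1:4, y = 5:8, s0 = median(out0$s))
```

With s0 = 0 the score reduces to the ordinary equal-variance t statistic; a positive s0 shrinks every score towards zero, most strongly for genes with a small standard deviation.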

Value

a table of statistics (estimate of p0, number of significant genes, number of falsely called genes and FDR) for the specified set of Deltas, a SAM Plot, a Delta vs. FDR plot, and a plot of Delta vs. the number of significant genes.
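
The relation between the columns of this table can be sketched in a few lines of base R. The sketch assumes the usual SAM estimate, FDR = p0 * (number of falsely called genes) / (number of significant genes), where the number of falsely called genes is the median (med.fdr=TRUE) or the mean of the counts over the permutations. The helper fdr.sketch and all numbers below are made up for illustration and are not output of sam.

```r
# Hypothetical helper (not part of siggenes): one row of a SAM-style table.
fdr.sketch <- function(n.sig, false.per.perm, p0, med.fdr = TRUE) {
  # Summarise the falsely called genes over the permutations
  n.false <- if (med.fdr) median(false.per.perm) else mean(false.per.perm)
  # FDR is undefined when no gene is called significant
  fdr <- if (n.sig > 0) p0 * n.false / n.sig else NA
  c(p0 = p0, significant = n.sig, false = n.false, FDR = fdr)
}

false.counts <- c(3, 5, 4, 10, 2)   # made-up counts from five permutations
row.med  <- fdr.sketch(n.sig = 40, false.per.perm = false.counts, p0 = 0.5)
row.mean <- fdr.sketch(n.sig = 40, false.per.perm = false.counts, p0 = 0.5,
                       med.fdr = FALSE)
```

The median is less sensitive than the mean to a single permutation with many falsely called genes, which is why the two rows differ here.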

Note

For further analyses with sam.plot, the results of sam must be assigned to an object.
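
A minimal sketch of preparing the inputs and assigning the result is given below. The data are simulated, and the call to sam is left as a comment because it assumes an installed siggenes version whose sam has the signature shown under Usage; the object name sam.out and the value passed to rand are illustrative choices.

```r
set.seed(123)
n.genes <- 200
n1 <- 5                                   # treatment samples
n2 <- 5                                   # control samples

# Simulated expression matrix: rows are genes, columns are samples
data <- matrix(rnorm(n.genes * (n1 + n2)), nrow = n.genes)
# Shift the first 10 genes in the treatment group so that some genes differ
data[1:10, 1:n1] <- data[1:10, 1:n1] + 2

x <- 1:n1                                 # columns of the treatment group
y <- (n1 + 1):(n1 + n2)                   # columns of the control group

# Assumption: sam() as documented in the Usage section is available.
# Assign the result so that it can be passed on to sam.plot later.
# sam.out <- sam(data, x, y, B = 100, rand = 123)
```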

Author(s)

Holger Schwender holger.schw@gmx.de

References

Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response, PNAS, 98, 5116-5121.

Storey, J.D. (2002). A direct approach to the false discovery rate, Journal of the Royal Statistical Society, Series B, 64, 479-498.

Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genome-wide experiments, Technical Report, Department of Statistics, Stanford University.

Schwender, H. (2003). Assessing the false discovery rate in a statistical analysis of gene expression data, Chapter 5, Diploma thesis, Department of Statistics, University of Dortmund, http://de.geocities.com/holgerschw/thesis.pdf.

See Also

sam.plot, sam.wilc

