ebam.wilc {siggenes}    R Documentation

Empirical Bayes Analysis using Wilcoxon Rank Sums

Description

Performs an Empirical Bayes Analysis of Microarrays by using Wilcoxon Rank Sums as expression scores for the genes.
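
As a rough illustration of what a rank-sum expression score looks like in the unpaired case, the following base R sketch computes, for each gene, the sum of the ranks of the treatment-group values. The data matrix, the group sizes and the helper name rank_sum are hypothetical and only serve the example; this is not the package's internal code.

```r
# Simulated data (hypothetical): 6 genes in rows, 4 treatment and 4 control columns.
set.seed(1)
data <- matrix(rnorm(6 * 8), nrow = 6)
x <- 1:4   # columns of the treatment group
y <- 5:8   # columns of the control group

# Rank sum of the treatment group for a single gene (one row of data).
rank_sum <- function(row, x, y) {
  r <- rank(row[c(x, y)])   # joint ranks; mid-ranks are used if ties occur
  sum(r[seq_along(x)])      # sum of the ranks that belong to the treatment group
}

# One expression score per gene.
W <- apply(data, 1, rank_sum, x = x, y = y)
```

With 4 columns per group, each score lies between 10 and 26, and subtracting n1*(n1+1)/2 = 10 gives the statistic reported by wilcox.test.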

Usage

    ebam.wilc(data,x,y,paired=FALSE,delta=.9,p0=NA,stable.p0=TRUE,use.offset=TRUE,use.weights=TRUE,
    ties.rand=TRUE,zero.rand=TRUE,ns.df=5,col.accession=NA,col.gene.name=NA,R.fold=TRUE,
    R.dataset=data,file.out=NA,rand=NA,na.rm=FALSE)

Arguments

data the data set that should be analyzed. Every row of this data set must correspond to a gene.
x vector of the columns of data that correspond to the treatment group. In the paired case, (x[i], y[i]) form an observation pair. If, e.g., the first n1 columns of data form the treatment group, x=1:n1.
y vector of the columns of data that correspond to the control group. In the paired case, (x[i], y[i]) are an observation pair.
paired paired (TRUE) or unpaired (FALSE) data. Default is FALSE.
delta a gene will be called significant if its posterior probability of being differentially expressed is larger than or equal to delta.
p0 prior probability that a gene is not differentially expressed. If not specified, it will be computed automatically.
stable.p0 if TRUE (default), p0 will be computed by the algorithm of Storey and Tibshirani (2003). If FALSE, the (unstable) estimate will be computed that ensures that the posterior probability of being differentially expressed is always nonnegative.
use.offset if TRUE (default), an offset will be used in the Poisson regression for the estimation of the density of the expression scores of all genes.
use.weights if TRUE (default), weights are used in the natural cubic spline fit for the estimation of p0.
ties.rand if TRUE (default), non-integer expression scores will be randomly assigned to the next lower or upper integer. Otherwise, they are assigned to the integer that is closer to the mean.
zero.rand if TRUE (default), the sign of each zero difference in the computation of the Wilcoxon signed rank sums will be assigned at random. If FALSE, the sign of the zeros will be set to '-'.
ns.df the number of degrees of freedom used in the Poisson regression for the estimation of the mixture density of the expression scores of all genes.
col.accession the column of data containing the accession numbers of the genes. If specified, the accession numbers of the significant genes will be added to the output.
col.gene.name the column of data that contains the names of the genes. If specified, the names of the significant genes will be added to the output.
R.fold if TRUE (default), the fold change for each differentially expressed gene will be computed.
R.dataset the data set used in the computation of the fold change. This data set can be a transformed version of data.
file.out if specified, general information like the number of significant genes and the estimated FDR and gene-specific information like the expression scores, the q-values, the R fold etc. of the differentially expressed genes are stored in this file.
rand if specified, the random number generator will be put into a reproducible state.
na.rm if FALSE (default), the fold change of genes with at least one missing value will be set to NA. If TRUE, missing values will be replaced by the genewise mean.
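
The behaviour described for ties.rand and zero.rand can be sketched in base R as follows. The helper names round_scores and zero_signs, the argument mean_W and the example values are hypothetical, and the sketch only mirrors the descriptions above; the package's own implementation may differ in detail (for example, in how a score exactly equal to the mean is handled).

```r
# Turn non-integer scores (e.g. from mid-ranks) into integers.
# mean_W is assumed to be the mean of the scores under the null hypothesis.
round_scores <- function(W, mean_W, ties.rand = TRUE) {
  frac <- W != floor(W)                  # which scores are non-integer
  if (any(frac)) {
    if (ties.rand) {
      up <- runif(sum(frac)) < 0.5       # coin flip per non-integer score
      W[frac] <- ifelse(up, ceiling(W[frac]), floor(W[frac]))
    } else {
      # move to the neighbouring integer that is closer to the mean
      W[frac] <- ifelse(W[frac] > mean_W, floor(W[frac]), ceiling(W[frac]))
    }
  }
  W
}

# Signs of paired differences d; zeros get a random sign or a negative sign.
zero_signs <- function(d, zero.rand = TRUE) {
  s <- sign(d)
  z <- s == 0
  s[z] <- if (zero.rand) sample(c(-1, 1), sum(z), replace = TRUE) else -1
  s
}

# Fixing the seed first makes the random assignments reproducible,
# which is the kind of effect the rand argument is described as having.
set.seed(123)
scores <- round_scores(c(10, 12.5, 17.5), mean_W = 18)
signs  <- zero_signs(c(-2, 0, 3, 0))
```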

Value

a plot of the expression scores vs. their posterior probability of being differentially expressed, and (optionally) a file containing general information like the FDR and the number of differentially expressed genes and gene-specific information on the differentially expressed genes like their names, their q-values and their fold change.

nsig number of significant genes.
fdr estimated FDR.
ebam.output table containing gene-specific information on the differentially expressed genes.
row.sig.genes vector containing the row numbers that belong to the differentially expressed genes.
...
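
To make the returned quantities more concrete, here is a minimal base R sketch of how posterior probabilities, nsig and row.sig.genes relate to delta. It follows the empirical Bayes formulation of Efron et al. (2001), in which the posterior probability of differential expression is 1 - p0*f0(z)/f(z), with p0 taken as the prior probability of no differential expression. Several simplifications are assumptions of this sketch and not the package's method: the data are simulated, p0 is fixed by hand, the mixture density f is estimated by raw relative frequencies instead of a Poisson regression, and the FDR line is just one simple estimate.

```r
set.seed(42)
n1 <- 5; n2 <- 5; m <- 2000
delta <- 0.9
p0 <- 0.9                                       # assumed, not estimated

# Simulated data (hypothetical): the first 200 genes are shifted in the treatment group.
data <- matrix(rnorm(m * (n1 + n2)), nrow = m)
data[1:200, 1:n1] <- data[1:200, 1:n1] + 2
W <- apply(data, 1, function(r) sum(rank(r)[1:n1]))   # rank sums of the treatment group

w_min <- n1 * (n1 + 1) / 2
support <- w_min + 0:(n1 * n2)                  # all possible rank sums

f0 <- dwilcox(support - w_min, n1, n2)          # exact null distribution
f  <- as.numeric(table(factor(W, levels = support))) / m   # observed relative frequencies

# Posterior probability of differential expression, truncated at 0.
posterior <- pmax(1 - p0 * f0 / f, 0)
post_gene <- posterior[match(W, support)]       # one value per gene

sig <- post_gene >= delta
nsig <- sum(sig)                                # number of significant genes
row.sig.genes <- which(sig)                     # their row numbers
fdr <- if (nsig > 0) mean(1 - post_gene[sig]) else NA   # simple FDR estimate (assumption)
```

Plotting support against posterior gives a picture comparable to the plot described above.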

Author(s)

Holger Schwender, holger.schw@gmx.de

References

Efron, B., Storey, J.D., Tibshirani, R. (2001). Microarrays, empirical Bayes methods, and the false discovery rate, Technical Report, Department of Statistics, Stanford University.

Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genome-wide experiments, Technical Report, Department of Statistics, Stanford University.

Schwender, H. (2003). Assessing the false discovery rate in a statistical analysis of gene expression data, Chapter 8, Diploma thesis, Department of Statistics, University of Dortmund, http://de.geocities.com/holgerschw/thesis.pdf.

See Also

ebam
