[BioC] stat/math question on Category vignette

Seth Falcon sfalcon at fhcrc.org
Thu Aug 23 16:25:37 CEST 2007


Hi Mark,

Mark W Kimpel <mkimpel at iupui.edu> writes:
> I am working my way through the Category vignette and have a question as 
> to how the t statistics for categories are computed from the incidence 
> matrix and individual probeset t-statistics. The code that does this can 
> be found on the bottom of page 3 (development version vignette) and is 
> as follows:
>
> There are 135 pathways (categories)...
> tA = AmER2 %*% tobs$statistic
> tA = tA/sqrt(rs2)
> names(tA) = row.names(AmER2)
>
> I know this is matrix multiplication, but don't know the mathematical or 
> statistical basis for the computation. I am interested in turning the t 
> statistic values in tA into p values, so I need to know the df. for each 
> resultant t. Is that the rs2?

Each row of the matrix represents a gene set (a category) and each
column a gene.  Each cell in the matrix is 0/1 depending on whether
the given gene is in the given gene set.

The vector tobs$statistic has the t-stat for each gene.  The matrix
multiplication is a convenient way to obtain the sum of the t-stats
for each gene set.
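
As a rough illustration of the sum described above, here is a toy sketch
with made-up numbers. The names Am, tstat, rs and tAadj are hypothetical
stand-ins, and it assumes that rs2 in the vignette holds the number of
genes in each gene set (the row sums of the incidence matrix), which is
not stated in this thread. Under that assumption, and if the per-gene
statistics were independent with roughly unit variance, dividing the sum
by sqrt(rs2) would put gene sets of different sizes on a comparable
scale; rs2 would then be a count of genes rather than a df.

```r
# Toy incidence matrix: 2 gene sets (rows) x 4 genes (columns);
# a 1 means the gene belongs to the gene set.
Am <- matrix(c(1, 1, 0, 0,
               0, 1, 1, 1),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("setA", "setB"), paste0("g", 1:4)))

# Made-up per-gene t-statistics (stand-in for tobs$statistic).
tstat <- c(g1 = 2.0, g2 = -1.0, g3 = 0.5, g4 = 1.5)

# Matrix product: each entry is the sum of the t-stats of the genes
# that belong to that gene set.
tA <- as.vector(Am %*% tstat)
names(tA) <- rownames(Am)

# Assumption: the scaling vector is the number of genes per gene set.
rs <- rowSums(Am)
tAadj <- tA / sqrt(rs)

print(tA)
print(tAadj)
```

Here setA sums the statistics for g1 and g2, and setB sums those for g2,
g3 and g4, so a gene that sits in several gene sets contributes to each
of them.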

Does that help?

+ seth

-- 
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
BioC: http://bioconductor.org/
Blog: http://userprimary.net/user/


