[BioC] Problem of using MultiClass SAM (siggenes package)

Holger Schwender holger.schw at gmx.de
Sat Nov 10 13:44:47 CET 2007


Hi,

I do not know the reason for these differences, since I do not know exactly what the Excel version of SAM implements. What sam in siggenes does is compute an ordinary F-statistic, as implemented in mt.teststat.num.denum of the multtest package, and then add the fudge factor to the denominator of this test statistic. If Excel SAM also uses this statistic (and the fudge factors are equal), then at least the values of the test statistics for the genes should be identical. (If out is the output of sam, i.e. out <- sam(...), then out@d gives you these values.) So for a start, check whether the fudge factors in siggenes and Excel SAM are equal. Most likely they are not, although I implemented the fudge factor as described in the Excel SAM manual. Then compare the test scores returned by sam and Excel SAM, taking the values of the fudge factors into account.
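
As a rough illustration of the statistic described above, the following base-R sketch computes a one-way ANOVA F-statistic per gene and adds a fudge factor s0 to its denominator. The helper name f_with_fudge and the simulated data are made up for this example. It assumes that the numerator and denominator are the between-group and within-group mean squares. It is not the siggenes source code, and the exact scaling used by multtest or by Excel SAM may differ.

```r
# Hypothetical helper (not part of siggenes): F-type score per gene (row of x)
# with a fudge factor s0 added to the denominator.
f_with_fudge <- function(x, cl, s0 = 0) {
  cl <- as.factor(cl)
  n <- ncol(x)        # number of samples
  k <- nlevels(cl)    # number of classes
  grand <- rowMeans(x)
  bss <- numeric(nrow(x))  # between-group sum of squares
  wss <- numeric(nrow(x))  # within-group sum of squares
  for (g in levels(cl)) {
    xg <- x[, cl == g, drop = FALSE]
    mg <- rowMeans(xg)
    bss <- bss + ncol(xg) * (mg - grand)^2
    wss <- wss + rowSums((xg - mg)^2)
  }
  num <- bss / (k - 1)    # between-group mean square
  denom <- wss / (n - k)  # within-group mean square
  num / (denom + s0)
}

# Simulated data: 5 genes, 9 samples in 3 classes of 3 (same layout as in the question)
set.seed(1)
x <- matrix(rnorm(45), nrow = 5)
cl <- rep(1:3, each = 3)

f_plain <- f_with_fudge(x, cl)            # ordinary F-statistic
f_fudge <- f_with_fudge(x, cl, s0 = 1)    # same numerator, inflated denominator
```

With s0 = 0 this reproduces the usual F value from anova(lm(...)); any s0 > 0 shrinks every score, so two programs that use different fudge factors will report different scores and different Delta scales. When comparing against Excel SAM, the siggenes scores are in out@d; the fudge factor should be available as out@s0, assuming the output object stores it in a slot of that name.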

Another difference is that Excel SAM uses the median number of falsely called genes, whereas siggenes by default uses the mean number. To change the latter, set med=TRUE. However, this only affects the estimated FDR, so it cannot explain the other differences.
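
To see why the choice between mean and median matters only for the FDR column, here is a small base-R sketch. The per-permutation counts and the number of called genes are made up. p0 is taken from the siggenes output in the question, whose rows are roughly consistent with FDR = p0 * False / Called (for example 0.061 * 0.538 / 59 is about 0.00056).

```r
# Made-up numbers of falsely called genes in 10 permutations (illustration only)
false_calls <- c(0, 0, 0, 0, 0, 0, 1, 2, 5, 42)

mean_false   <- mean(false_calls)    # summary by the mean
median_false <- median(false_calls)  # summary by the median

# FDR estimate of the form p0 * (false calls) / (called genes).
# p0 comes from the siggenes output in the question; 'called' is hypothetical.
p0 <- 0.061
called <- 100
fdr_mean   <- p0 * mean_false / called
fdr_median <- p0 * median_false / called
```

Because the counts are skewed, the mean-based FDR is positive while the median-based FDR is zero, even though the test scores and the number of called genes are the same in both cases.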

The chosen permutations might play a role, though. Note that setting the random number generator to the same seed in sam and Excel SAM does not mean that you get the same permutations; the seed only lets you reproduce the results between two runs of sam. However, sam lets you supply a permutation matrix (see the argument mat.samp of d.stat). So if you can obtain the matrix of permuted class labels from Excel SAM, you can use this matrix in sam.
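
A minimal sketch of how such a matrix could be built and passed on, assuming that mat.samp expects one permutation per row and one sample per column (please check the d.stat help page). The labels are permuted in R here only as a stand-in; a matrix exported from Excel SAM would take its place. The sam call is commented out because it needs siggenes and the data matrix dd from the question.

```r
# Class labels as in the question: 9 samples, 3 classes of 3
cl <- rep(1:3, each = 3)

# Stand-in for the Excel SAM permutations: B rows of permuted class labels
B <- 500
set.seed(123)
perm <- t(replicate(B, sample(cl)))  # B x 9 matrix, one permutation per row

# Each row must contain exactly the original labels, just reordered
rows_ok <- all(apply(perm, 1, function(r) identical(sort(r), cl)))

# Assumed call (requires siggenes and the data matrix 'dd'):
# library(siggenes)
# out <- sam(dd, cl, mat.samp = perm)
```

Running both programs on the same permutation matrix removes the random number generator as a source of disagreement, so any remaining differences must come from the statistic or the fudge factor.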

Best,
Holger



-------- Original Message --------
> Date: Sat, 10 Nov 2007 17:55:05 +0800
> From: "呂若陽" <davidlue7 at gmail.com>
> To: bioconductor at stat.math.ethz.ch
> Subject: [BioC] Problem of using MultiClass SAM (siggenes package)

> Hello,
> I am using the siggenes package for a multiclass problem
> (R 2.6.0; Bioconductor 2.1; siggenes 1.2.11).
> My dataset has 9 samples in 3 classes, with 3 samples per class.
> There are 13915 genes in total.
> When I use siggenes to do multiclass SAM
> (code:   samResults <- sam(dd, cl, B=500, rand=123) ),
> the results are as follows:
> 
> SAM Analysis for the Multi-Class Case with 3 Classes
> 
>     Delta    p0     False     Called      FDR
> 1     0.1 0.061     11893.018 13745       0.052440
> 2   913.6 0.061     0.538     59          0.000553
> 3  1827.2 0.061     0.084     12          0.000424
> 
> I know I can also use SAM for Excel to do this.
> However, the results are quite different:
> 
> delta   # med false pos  90th perc false pos  # called  median FDR   90th perc FDR
> 0.099   2119.652318      2173.516349          13863     0.152899972  0.156785425
> 1.046   110.2554078      341.349479           10251     0.010755576  0.033299139
> 7.156   0                3.475098814          1166      0            0.002980359
> 14.86   0                0.473877111          166       0            0.002854681
> 61.90   0                0                    1         0            0
> 
> Both were computed with 500 permutations and random seed 123.
> What is going wrong with my work?
> 
> I have read the manual (siggenes.pdf) several times,
> but the only information about the multiclass case is how to assign the grouping.
> Should I set more parameters when using SAM? How?
> 
> Thank you very much for answering.
> 
> 
> Sincerely,
> Ruo Yang, Lu
> 
> 
> ===== ===== ===== ===== =====
> NTU Research Center For Medical Excellence
> Bioinformatics and Biostatistics Core
> TEL:(02)2312-3456#8685
> FAX:(02)3322-4179
> 
