[BioC] SAM analysis : samr package vs siggenes package

Holger Schwender holger.schw at gmx.de
Tue Apr 22 16:56:09 CEST 2008


Hi Cecile,

have you used the same set of permutations of class labels? If not, this might be a reason why the statistics differ slightly. There might be other reasons. But I do not know them cause I do not know samr.

The reason why samr and siggenes differ is that these are two independent implementations of SAM -- samr done by the developers of SAM themselves, and siggenes done by me years ago in my diploma thesis (the SAM version in siggenes has much improved since then). How to use SAM in siggenes is explained in the vignette to siggenes which you can obtain by calling

> vignette("siggenes")

and in the article also available in siggenes. The latest version of this vignette -- currently only available in the developmental branch of BioC, but will be part of the next BioC release -- also contains explanations for which arguments are important and which arguments of sam and d.stat should only be changed if completely understood and one wants, e.g., reproduce the results of a SAM analysis with samr.

Best,
Holger 

-------- Original-Nachricht --------
> Datum: Tue, 22 Apr 2008 15:22:14 +0200
> Von: "Cécile Laurent" <cecile.laurent at curie.fr>
> An: Holger Schwender <holger.schw at gmx.de>
> CC: bioconductor at stat.math.ethz.ch
> Betreff: Re: [BioC] SAM analysis : samr package vs siggenes package

> Hi Holger,
> 
> Thanks for your reply.
> samr package is available in CRAN, developped by R. Tibshirani, G. Chu, 
> T. Hastie, Balasubramanian Narasimhan and maintained by R. Tibshirani 
> (so it does the same things as the Excel SAM, I think).
> 
> Actually, when I change the parameters, the results are almost but not 
> exactly the same and I don't understand why.
> For a same delta, FDR and false positive genes are slightly different. 
> Can you explain me why ??
> 
> I also have a last question : why default parameters of these 2 packages 
> are different?
> Indeed, I don't know which parameters I have to choose and why?
> 
> Regards,
> Cecile
> 
> 
> 
> Holger Schwender a écrit :
> > Hi Cecile,
> >
> > I do not know the samr package. But if it does the same things as the
> Excel SAM has done a few years ago, then you should set var.equal=TRUE,
> med=TRUE and lambda=0.5(??) in the siggenes SAM. The latter should lead to the
> same estimation of pi0. Moreover, you need to use the same permutations of
> the class labels. If you can get these permutations from samr, you will be
> able to use in these permutations in sam by specifying mat.samp. 
> >
> > Best,
> > Holger
> >
> >
> > -------- Original-Nachricht --------
> >   
> >> Datum: Mon, 21 Apr 2008 14:54:48 +0200
> >> Von: "Cécile Laurent" <cecile.laurent at curie.fr>
> >> An: bioconductor at stat.math.ethz.ch
> >> Betreff: [BioC] SAM analysis : samr package vs siggenes package
> >>     
> >
> >   
> >> Hi,
> >>
> >> I would like to know why when I run SAM analysis with SamR and Siggenes
> >> packages, I don't have the same results for the small pvalues.
> >> I test here on golub data, and I don't have the same differential genes
> >> number when I observe the delta tables (delta.table for samr and 
> >> sam.out at mat.fdr for siggenes) for a FDR 0% (250 are differential with 
> >> samr package and 9 with siggenes package).
> >> On my personnal data, I observe the same difference in differential
> gene 
> >> number at FDR 5%.
> >> I also observe the same s0, but not the same pi0, in the 2 sam objects.
> >> I think, in my code, the parameters are the same... So, I don't know 
> >> which package used...
> >> Is it something wrong???
> >> Can you explain me these differences in the results..
> >>
> >> Thanks,
> >> Cécile
> >>
> >>
> >> Here is my sessionInfo()
> >>
> >> R version 2.6.0 (2007-10-03)
> >> i486-pc-linux-gnu
> >>
> >> locale:
> >>
> LC_CTYPE=fr_FR.UTF-8;LC_NUMERIC=C;LC_TIME=fr_FR.UTF-8;LC_COLLATE=fr_FR.UTF-8;LC_MONETARY=fr_FR.UTF-8;LC_MESSAGES=fr_FR.UTF-8;LC_PAPER=fr_FR.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_FR.UTF-8;LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] splines   tools     stats     graphics  grDevices utils    
> datasets
> >> [8] methods   base    
> >>
> >> other attached packages:
> >> [1] siggenes_1.12.0 samr_1.25       impute_1.0-5    multtest_1.18.0
> >> [5] survival_2.33   Biobase_1.16.0
> >>
> >> loaded via a namespace (and not attached):
> >> [1] rcompgen_0.1-15
> >>
> >>
> >> ### My code :
> >> library(multtest)
> >> library(samr)
> >> library(siggenes)
> >> data(golub)
> >> golubCl=golub.cl
> >> golubCl[which(golub.cl==1)]=2
> >> golubCl[which(golub.cl==0)]=1
> >>
> >> ## SamR package
> >> samrData=list(x=golub, y=golubCl, geneid=golub.gnames[,3], 
> >> genesnames=golub.gnames[,3], logged2=T)
> >> samr.obj=samr(samrData, resp.type="Two class unpaired", 
> >> testStatistic="standard", nperms=1000, random.seed=123)
> >> delta.table=samr.compute.delta.table(samr.obj, dels=seq(0.1,5,0.05))
> >> pv.samr=samr.pvalues.from.perms(samr.obj$tt, samr.obj$ttstar)
> >>
> >> ## Siggenes package
> >> sam.out=sam(golub, golub.cl, rand=123,B=1000, 
> >> gene.names=golub.gnames[,3], method="d.stat", var.equal=T, s0=NA, 
> >> include.zero=F, delta=seq(0.1,5,0.05) )
> >>
> >> ## plot to observe the values (the version with -log() to see the 
> >> difference in the small pvalues)
> >> #plot(sam.out at d[order(sam.out at d)],samr.obj$tt[order(samr.obj$tt)])
> >>
> #plot(-log(sam.out at d[order(sam.out at d)]),-log(samr.obj$tt[order(samr.obj$tt)]))
> >> plot(sam.out at p.value[order(sam.out at p.value)],pv.samr[order(pv.samr)])
> >>
> plot(-log(sam.out at p.value[order(sam.out at p.value)]),-log(pv.samr[order(pv.samr)]))
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>     
> >
> >   

-- 

Jetzt dabei sein: http://www.shortview.de/?mc=sv_ext_mf@gmx



More information about the Bioconductor mailing list