[BioC] problem with siggenes

Holger Schwender holger.schw at gmx.de
Fri Jan 14 11:35:29 CET 2005


Hi Fangxin,

there are some differences between the default options of the R version and
the Excel version. First of all, R SAM computes by default Welch's
t-statistic (i.e. a t-statistic that does not assume equal group variances)
while Excel SAM computes the "usual" t-statistic. Set var.equal=TRUE to
obtain the usual t-statistic. Second, R SAM computes by default the mean
number of falsely called genes, whereas Excel SAM computes the median number
of falsely called genes. Set med=TRUE to obtain the median number. There are
some other differences but the above differences should be the main reasons
for the different number of genes you obtain. I have summarized all the
changes more than once at the Excel SAM forum sam-software at yahoo.com and I
will add a function RvsExcelSAM in the next version of siggenes.
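
To make the two differences concrete, here is a minimal sketch in base R. It does not call siggenes at all: it uses t.test from the stats package on made-up toy numbers, and the vector false_calls is a hypothetical set of per-permutation counts invented for illustration.

```r
# Toy data with unequal group sizes and variances (made-up numbers).
# With equal group sizes the two statistics would coincide.
x <- c(5.1, 4.9, 5.3, 5.0)
y <- c(6.0, 7.5, 4.2, 8.1, 5.4, 6.6)

# Welch's t-statistic: does not assume equal group variances.
t_welch <- unname(t.test(x, y, var.equal = FALSE)$statistic)

# "Usual" t-statistic: pooled variance, assumes equal group variances.
t_pooled <- unname(t.test(x, y, var.equal = TRUE)$statistic)

# Hypothetical numbers of falsely called genes, one per permutation.
false_calls <- c(0, 1, 1, 2, 16)

mean_false   <- mean(false_calls)    # pulled up by the single large count
median_false <- median(false_calls)  # robust to the single large count
```

Because the estimated FDR is based on the number of falsely called genes, a larger value (here the mean) gives a larger estimated FDR for the same delta and hence fewer genes below a fixed FDR threshold.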

You are correct. There is a bug in the summary function for SAM: you will
get an error when there is no or just one differentially expressed gene.
This will be fixed in the next version of siggenes. I planned to publish
this version in the middle of January (i.e. sometime around today), but
because of lots of other work it will take a little longer.

The only purpose of the list.siggenes function is to list the significant
genes and not to give you any of the statistics (summary is the function
that does this). It was originally intended for writing these names to a
file. That's why it currently returns no value. But it will return one in
the next version of siggenes.
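
The difference between printing and returning can be sketched in base R as follows. Both functions are hypothetical helpers written for this illustration; neither is part of siggenes, and the sketch makes no claim about how list.siggenes is implemented.

```r
# Hypothetical helper: only prints, so an assignment receives NULL.
print_only <- function(genes) {
  cat(genes, sep = "\n")  # cat() writes to the console and returns NULL
}

# Hypothetical helper: prints and also returns the names invisibly.
print_and_return <- function(genes) {
  cat(genes, sep = "\n")
  invisible(genes)        # value can be assigned without printing it again
}

genes <- c("geneA", "geneB")  # made-up gene names
a <- print_only(genes)        # a is NULL
b <- print_and_return(genes)  # b holds the gene names
```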

Best,
Holger


> First, please ignore the emails I sent out yesterday, I was using an old
> version of siggenes.
> 
> However, I do find problems with siggenes. There is an Excel version of
> SAM which can be downloaded from the Stanford website. For the same data
> set, I got very different results from siggenes and from Excel SAM:
> 
> > out=sam(data,cl,delta=seq(0.1,7,0.1),rand=123)
> > FDR=summary(out)[,5]
> > Delta=summary(out)[,1]
> > d.min=min(Delta[FDR<0.05])
> > gene.list1=summary(out,d.min,ll=FALSE)$row.sig.genes
> > gene.list2=list.siggenes(out,d.min)
> 
> siggenes identifies far fewer genes than Excel SAM does. In addition, if
> only one gene is identified for a certain delta value,
> summary()$row.sig.genes (gene.list1 above) will not list that gene since
> there is an error in the function. list.siggenes will only print the
> identified genes, but won't assign the gene list/names to another object
> (gene.list2 is empty in the example).
> 
> Does anyone know what is going on here, or what mistakes I might have made?
> 
> Thanks.
> Fangxin
> 
> > As far as I know, if you only have two arrays, one from each "treatment"
> > in your experiment, there is no way that you can do any kind of
> > statistics at all....
> >
> > -----Original Message-----
> > From: bioconductor-bounces at stat.math.ethz.ch
> > [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Edoardo
> > Saccenti
> > Sent: 13 January 2005 16:45
> > To: bioconductor at stat.math.ethz.ch
> > Subject: [BioC] problem with siggenes
> >
> >
> > I would like to run an FDR analysis via
> > SAM as implemented in the siggenes package.
> >
> > First I read 2 CEL files into an AffyBatch object called "mydata". Then I
> > used the rma routine to correct my data, obtaining an exprSet object
> > called "myeset".
> >
> > According to the guide I need to pass to sam
> > the data (myeset in this case) and a vector cl.
> >
> > This is a one-class case, so
> > cl must be a vector of ones of length equal to the number
> > of samples.
> > As the number of samples is 2 (2 CEL files):
> >
> > cl <- c(1,1)
> >
> > Typing at the R prompt:
> >
> > 	out <- sam.dstat(myeset, cl, rand=123)
> >
> > I get the following:
> >
> > 	We're doing 4 complete permutations
> > 	Error in rowSums(x, prod(dn), p, na.rm) : invalid value of n
> > 	In addition: Warning message:
> > 	There are 147 genes with zero variance. These genes are removed,
> > 	and their d-values are set to NA.
> >
> > I'm sure I'm making some stupid mistake because I'm new to R and BioC;
> > nevertheless, can anybody help me?
> >
> > Thanks
> > edoardo
> >
> >
> >
> > "Raffiniert ist der Herr Gott,
> >  aber boshaft ist Er nicht."
> >
> > ---
> > Dr. Edoardo Saccenti
> > FiorGen Pharmacogenomics Foundation
> > CERM Nuclear Magnetic Resonace Research Center
> > Scientific Pole - University of Florence
> > Via Luigi Sacconi n° 6
> > 50019 Sesto Fiorentino (FI)
> > tel: +39 055 4574193
> > fax: +39 055 4574253
> > saccenti at cerm.unifi.it
> > www.cerm.unifi.it
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> >
> >
> 
> 
> -- 
> Fangxin Hong, Ph.D.
> Plant Biology Laboratory
> The Salk Institute
> 10010 N. Torrey Pines Rd.
> La Jolla, CA 92037
> E-mail: fhong at salk.edu
> 
