[BioC] differentially expressed genes with limma

Naomi Altman naomi at stat.psu.edu
Thu Apr 20 18:39:56 CEST 2006

Generally, the power is so low with only 3 replicates that you are 
better off using all the samples.  What I do is plot the 
log(expression) values of all the arrays against each other using the 
"pairs" command in R.  You should see the points fall along a diagonal 
line near y=x on each plot, at least within treatment.  The spread in 
the orthogonal direction is the important one for ANOVA.  It should be 
about the same for each group.
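
A minimal sketch of this check in R, using simulated values in place of real arrays. The matrix log_expr, the group labels, and the replicate noise level of 0.3 are all assumptions made up for illustration. Since the distance of a point (x, y) from the line y=x is |x - y|/sqrt(2), the standard deviation of (x - y)/sqrt(2) gives a rough numeric companion to the spread seen in the plots.

```r
set.seed(1)

# Simulated stand-in for a genes x arrays matrix of log2 expression values
# (hypothetical data: 5 groups of 3 arrays, 500 genes, replicate noise SD 0.3).
n_genes <- 500
groups <- factor(rep(paste0("group", 1:5), each = 3))
gene_level <- rnorm(n_genes, mean = 8, sd = 2)
log_expr <- sapply(seq_along(groups),
                   function(i) gene_level + rnorm(n_genes, sd = 0.3))
colnames(log_expr) <- paste0(groups, "_rep", rep(1:3, times = 5))

# Scatterplot matrix of every array against every other array.
pairs(log_expr, pch = ".")

# Spread orthogonal to y = x for each group: SD of (x - y) / sqrt(2),
# averaged over the three replicate pairs within the group.
orthogonal_spread <- sapply(levels(groups), function(g) {
  x <- log_expr[, groups == g]
  pair_idx <- combn(ncol(x), 2)
  mean(apply(pair_idx, 2,
             function(p) sd((x[, p[1]] - x[, p[2]]) / sqrt(2))))
})
print(round(orthogonal_spread, 2))
```

With real data, log_expr would be the matrix of normalized log expression values, and a group whose value is much larger than the others would be the one to look at more closely.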

I have never used the affy control probes.  Perhaps someone else can 
address this part of the question.
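
On the 15-versus-6-sample question, the denominator effect described in the earlier message quoted below can be illustrated for a single gene. This is only a sketch using plain lm() on simulated values, not limma itself (lmFit fits the same kind of linear model to every gene). The data, the seed, and the names d, fit_all and fit_two are made up for illustration.

```r
set.seed(2)

# Hypothetical log2 expression of one gene on 15 arrays (5 groups x 3).
d <- data.frame(
  group = factor(rep(paste0("group", 1:5), each = 3)),
  y = rnorm(15, mean = 8, sd = 0.4)
)

# Fit 1: all 15 arrays, one mean per group.
fit_all <- lm(y ~ 0 + group, data = d)

# Fit 2: only the 6 arrays from group1 and group2.
d2 <- droplevels(d[d$group %in% c("group1", "group2"), ])
fit_two <- lm(y ~ 0 + group, data = d2)

# Numerator: the group1 - group2 difference in means is the same in both fits.
diff_all <- unname(coef(fit_all)["groupgroup1"] - coef(fit_all)["groupgroup2"])
diff_two <- unname(coef(fit_two)["groupgroup1"] - coef(fit_two)["groupgroup2"])

# Denominator: the residual variance is pooled over different sets of arrays.
df_all <- df.residual(fit_all)   # 15 arrays - 5 group means = 12
df_two <- df.residual(fit_two)   # 6 arrays - 2 group means = 4
s2_all <- summary(fit_all)$sigma^2
s2_two <- summary(fit_two)$sigma^2

# Ordinary t-statistics for the contrast (3 arrays per group).
t_all <- diff_all / sqrt(s2_all * (1/3 + 1/3))
t_two <- diff_two / sqrt(s2_two * (1/3 + 1/3))
print(c(df_all = df_all, df_two = df_two, t_all = t_all, t_two = t_two))
```

The eBayes step in limma additionally shrinks each gene's variance toward a value estimated from all genes, so its moderated statistics will differ from these ordinary ones, but the residual degrees of freedom going into that step differ in the same way (12 versus 4).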

At 11:02 AM 4/20/2006, Lisa Luo wrote:
>Thanks, Naomi.
>   Now I understand where the difference comes from.  But how do I 
> measure the difference between groups to decide whether to include 
> them in the analysis together?  Of the five sample groups I mentioned, 
> some are tumor samples and others are normal tissue (cell) samples.  I 
> would like to know which tissue (cell) type the tumor is most 
> similar to.  Should I analyze them together or separately?
>   Another question is about the p-value.  I used the p-value from 
> affy control probes to select the p-value cutoff.  In the case of 
> using 15 samples, the p-value for affy probes is 1e-7.  Does this 
> indicate problems in the analysis?
>   Thank you,
>   Lisa
>Naomi Altman <naomi at stat.psu.edu> wrote:
>   The difference between your analyses comes from the denominator of
>the test. In both cases, the numerator is the difference in
>means. But in the first case, all 15 samples are used to compute
>the within-group sums of squares, and all of these sums of squares are
>used in the limma eBayes adjustment. In the second case, only the 6
>samples were used to compute the within-group sums of squares.
>Assuming that the groups have about the same variance, the method
>using all 15 samples is more powerful (has a lower false-negative rate)
>and is preferable. If the 2 groups of interest have VERY different
>variances, then you might be better off using just the 2 groups.
>If you did gcrma first using all the data and then using only the 6
>samples, that would also contribute to the differences. Unless the
>groups are very different, my choice would be to use all the samples.
>At 09:27 AM 4/20/2006, Lisa Luo wrote:
> >Dear list,
> > I am confused by a problem and hope to get some help from you.
> > I have 5 groups of samples, each with 3 samples (all AFFY). I
> > first read in all 15 samples and ran lmFit. I am interested in
> > the difference between group1 and group2, so I made a contrast
> > matrix with "group1-group2". Then I read in only the 6 samples of
> > group1 and group2 and did the same thing. However, the
> > differentially expressed gene lists are very different.
> > I used gcrma to normalize the dataset. Do you think the
> > difference is caused by normalization, or did I do something wrong?
> > Thanks,
> > Lisa
> >_______________________________________________
> >Bioconductor mailing list
> >Bioconductor at stat.math.ethz.ch
> >https://stat.ethz.ch/mailman/listinfo/bioconductor
> >Search the archives:
> >http://news.gmane.org/gmane.science.biology.informatics.conductor

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111
