[BioC] limma: print-tip loess and empty spots
J.delasHeras at ed.ac.uk
Sat Jun 2 17:41:52 CEST 2007
Quoting Gordon Smyth <smyth at wehi.edu.au>:
> Dear Adrian,
> At 08:36 AM 2/06/2007, Adrian Steward wrote:
>> Thank you for your reply, Dr. Smyth.
>> I do not yet completely understand exactly HOW normalizing works
>> (I've seen the data, transformations, and so I know what it does,
>> just not how, yet) but it appears to me that I can simply change the
>> sign of the normalized output to make the proper tests
> In general, you cannot simplify the constructions of tests by
> swapping the sign of the normalized log-ratios. The only experiment
> in which people might be tempted to do this is a simple replicated
> comparison using two-colour arrays with dye-swaps (and you have given
> no indication that this is your experiment.) For anything more
> complicated, swapping the signs of the log-ratios would only
> complicate matters. Even for the replicated comparison, swapping the
> log-ratios is unhelpful because it prevents the inclusion of
> probe-specific dye-effects in the model.
>> (or as someone else stated, reverse the contrast / estimate statements).
>> You picked up on my motivations here - I am chiefly concerned that
>> the exported normalized data has proper signs
> The normalized data already has what we consider to be the "proper" signs.
>> because at present I am required to do all of my linear modeling
>> in SAS, and large datasets need to be 'read in.' I personally
>> would rather do it all in R which is why I am running things in
>> parallel to make the case for limma-only analysis.
> You can certainly fit linear models in SAS, but you can't do a limma
> empirical Bayes analysis.
>> You people are both programmers and teachers, and thanks for your
>> patience with the noobs.
> You can easily change the signs of columns of data in either R or
> SAS. You could get advice on how to do this from the R help list. But
> don't expect this from me or Keith because I believe it is undesirable.
> There is absolutely no reason why linear modelling in SAS or R
> requires any prior fudging of the data. You can easily handle the
> data as it actually is. Spend a little more time understanding how
> linear modelling works for microarray data, then you'll see why this
> is so. That would be time much better spent than trying to persuade
> limmaGUI to do what it doesn't want to do.
> Best wishes
Very reluctantly I will jump in, because I remember my own experience
as a total "newbie" to this world not long ago... and I feel that the
reason why Adrian is asking about changing the signs may have little
to do with linear modelling and what it does to the data. At least, I
had similar questions... but I don't want to insult Adrian by
comparing him with me ;-)
When I started with limma (actually limmaGUI), I did it with data from
dye-swap experiments. After normalisation, the sign of the M values is
determined by the log2 of the ratio Cy5/Cy3, as Gordon explained.
That's the convention. Just like we generally agree to call Cy3 the
Green channel, and Cy5 the Red channel... which was counterintuitive
for somebody like me, who was used to using Cy3 in microscopy and it's
usually seen as red (reddish, but the computer then goes and paints it
bright red)... Just a convention.
According to that convention, the signs of my dye-swapped arrays were
either positive or negative, depending on the orientation of the hyb
in question. At first that was a little disorienting, because I had
to make sure I remembered which array was hybridised in which
orientation (info that's stored in the 'targets' object, if using limma).
However, one doesn't need to worry about that. I only look at the
normalised data (per array) to check the quality of the hybs,
really... to make a few MA plots and see general patterns, check for
After that step, we take the normalised data, and we fit a linear
model to it with the function 'lmFit'. Limma does this taking into
account the orientation of the separate hybs (information present in
the 'targets' object), and using a design matrix of our choice.
The same applies if we want to specify particular contrasts. After this, we
obtain M values that have the "correct" sign, according to whatever
orientation we indicate in the design matrix... so not only is it
unnecessary to manually change the signs of the normalised data, per
array, but if we did so, we'd mess up the linear model fitting...
which is the whole point of using Limma.
So, if I have four slides, comparing samples A and B, with two dye swaps:
Array  Cy3  Cy5
1      A    B
2      A    B
3      B    A
4      B    A
and I am ultimately interested in B-A, and I have a gene X that has
higher expression in sample B than in A... when I normalise the data,
the M values for that gene X will be positive in arrays 1 and 2, and
negative in arrays 3 and 4 [log2(Cy5/Cy3)].
After fitting the linear model, where we indicate we want the
comparison B-A, what we'll get is a single M value, and its sign will
be positive.
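To make the four-array example concrete, here is a minimal sketch of the arithmetic for a single gene. It is plain Python, not limma's API, and it only stands in for the least-squares step; the M values, the variable names and both helper functions are invented for illustration. It assumes the design column is +1 where B is in Cy5 and -1 where the dyes are swapped.

```python
# Hypothetical normalised M values, log2(Cy5/Cy3), for one gene X that is
# higher in sample B than in sample A. The numbers are invented.
m_values = [1.3, 1.1, -1.0, -0.6]   # arrays 1..4

# Design column for the B-A comparison: +1 where B is in Cy5 (arrays 1, 2),
# -1 where the dyes are swapped and B is in Cy3 (arrays 3, 4).
design = [1, 1, -1, -1]


def fit_single_coefficient(design, m):
    """Least-squares estimate of beta in the model M_i = design_i * beta."""
    return sum(d * y for d, y in zip(design, m)) / sum(d * d for d in design)


def fit_with_dye_effect(design, m):
    """Least-squares estimates for the model M_i = dye + design_i * beta.

    Uses the closed form that holds only when the design column sums to
    zero (a balanced dye-swap), so the two model columns are orthogonal.
    """
    if sum(design) != 0:
        raise ValueError("closed form assumes a balanced dye-swap design")
    dye = sum(m) / len(m)
    beta = fit_single_coefficient(design, m)
    return dye, beta


log_fc_b_vs_a = fit_single_coefficient(design, m_values)
dye_effect, log_fc_adjusted = fit_with_dye_effect(design, m_values)

print(f"B-A log2 fold change: {log_fc_b_vs_a:.2f}")
print(f"dye effect: {dye_effect:.2f}, B-A with dye effect: {log_fc_adjusted:.2f}")
```

The per-array M values keep their mixed signs, and the design column alone turns them into one positive B-A estimate. Leaving the signs untouched is also what allows a separate dye-effect term to be estimated, which I take to be the point Gordon makes above.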
I am not sure if this helped any, or if it was too obvious to be of any
use... I just felt you were using limma only half-way, stopping at the
normalisation stage, and ignoring the 'best' part of it: the linear
modelling.
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR