[BioC] limma and Across Array Normalisation

Gordon K Smyth smyth at wehi.EDU.AU
Wed Feb 12 00:08:31 CET 2014



---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Tel: (03) 9345 2326, Fax (03) 9347 0852,
http://www.statsci.org/smyth

On Tue, 11 Feb 2014, Saket Choudhary wrote:

> On 11-Feb-2014, at 10:52 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>
>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>
>>> On 11-Feb-2014, at 10:31 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>
>>>> Yes, obviously there'll be a baseline shift when you subtract background, then add an offset and log transform.
>>>>
>>>> Your plots do not appear to be valid MA plots.
>>>
>>> Could you please point out the error?
>>> I understand a baseline shift is expected, but I can't figure out what
>>> is going wrong otherwise.
>>
>> Well, you manually create an MAList object from your single channel data, even though an MAList is strictly for two colour data.
>>
>> If you deceive limma as to the true nature of your data, it's not surprising that the resulting plot might not be correct.
>>
>> I am not clear why you need to make so many variations on the standard limma single channel analysis pipeline.
>>
>
> Is there any other way to visualise MA plots for single channel data?

plotMA() already works directly on any data object:

   x <- read.maimages(targets$FileName,source="genepix",green.only=TRUE)
   plotMA(x)

What could be easier than that?
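
For readers wondering what a between-array MA plot means for single channel data, here is a minimal base-R sketch. It assumes (from the limma documentation, not from anything stated in this thread) that for single channel objects the chosen array is compared with an artificial array formed by averaging all the other arrays. The simulated matrix E and the helper ma_values() are hypothetical names used only for illustration; they are not part of limma.

```r
# Simulated log2 intensities: 1000 probes on 4 single channel arrays.
# Hypothetical data, used only to make the sketch self-contained.
set.seed(1)
E <- matrix(rnorm(4000, mean = 8, sd = 2), nrow = 1000, ncol = 4)

# Hypothetical helper: M and A values for one array versus the
# average of all the remaining arrays.
ma_values <- function(E, array = 1) {
  others <- rowMeans(E[, -array, drop = FALSE])
  list(A = (E[, array] + others) / 2,  # average log-expression
       M = E[, array] - others)        # log-ratio, this array vs the rest
}

ma <- ma_values(E, array = 1)
plot(ma$A, ma$M, pch = 16, cex = 0.3,
     xlab = "Average log-expression",
     ylab = "Expression log-ratio (this array vs others)")
abline(h = 0, col = "blue")
```

No MAList needs to be constructed by hand for this; in practice plotMA(x) on the object returned by read.maimages() is the simpler route.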

Gordon

>
>> Gordon
>>
>>
>>>
>>> Thanks,
>>> Saket
>>>
>>>
>>>> Gordon
>>>>
>>>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>>>
>>>>> Hello Gordon,
>>>>>
>>>>> Is there a reason to believe the MA plots should inherently be
>>>>> baseline shifted after normalisation?
>>>>>
>>>>> Raw MA: https://db.tt/kDBod1EJ
>>>>> background correction with 'nec': https://db.tt/0vVWeD21
>>>>> background correction with nec followed by normalisation: https://db.tt/f0M0rWeg
>>>>> background correction with 'normexp': https://db.tt/OJO0zea5
>>>>> background correction with normexp followed by normalisation:
>>>>> https://db.tt/rbLJmFBE
>>>>>
>>>>>
>>>>> The files are a bit heavy so might take some time to load into any pdf reader.
>>>>>
>>>>> Code: https://gist.github.com/saketkc/8931951
>>>>>
>>>>> Saket
>>>>>
>>>>> On 9 February 2014 20:45, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>> Related question: Similar to your case, my final topTable() output
>>>>>> indicates that some genes have a negative logFC, though the literature
>>>>>> expects them to have a positive logFC.
>>>>>>
>>>>>> I looked up the calculations and the transition from positive to
>>>>>> negative logFC for these genes seems to happen after the
>>>>>> normalizeBetweenArrays step (irrespective of the kind of normalisation
>>>>>> I choose).
>>>>>>
>>>>>> This is a naive question again, but I am trying to understand what would be
>>>>>> a good metric to decide which method tends to give the fewest false
>>>>>> positives like this, given that I have limited knowledge of which
>>>>>> genes should be up or down regulated (unlike in your case, where you
>>>>>> knew the kind of regulation [up/down] expected).
>>>>>>
>>>>>> Thanks,
>>>>>> Saket
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 9 February 2014 04:00, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>
>>>>>>> On Sat, 8 Feb 2014, Saket Choudhary wrote:
>>>>>>>
>>>>>>>> Hello Gordon,
>>>>>>>>
>>>>>>>> I had a chance to go through the paper. I have a set of negative and
>>>>>>>> positive controls from a single channel Genepix platform.
>>>>>>>> From what I could gather, the 'nec' method in limma performs
>>>>>>>> background correction using these negative control spots.
>>>>>>>
>>>>>>>
>>>>>>> Yes, but the negative controls are assumed to behave exactly like probes for
>>>>>>> unexpressed genes.  This is true for Illumina Beadchips, but is often not
>>>>>>> the case for other platforms.  If not, then you would be better to stick
>>>>>>> with normexp as you are already using.
>>>>>>>
>>>>>>>
>>>>>>>> However one of the inputs to 'nec' is also "detection.p", which the
>>>>>>>> .gprs don't have.
>>>>>>>
>>>>>>>
>>>>>>> detection.p is not a required argument.  It is used only when negative
>>>>>>> controls are not available.
>>>>>>>
>>>>>>>
>>>>>>>> I could simply take a mean of all the negative controls' E and Eb, and
>>>>>>>> subtract it from each probe's E and Eb, doing it for all the arrays. Would
>>>>>>>> this mimic what I want to achieve with the 'nec' function?
>>>>>>>
>>>>>>>
>>>>>>> No, that naive approach is not equivalent and typically performs poorly.
>>>>>>>
>>>>>>> Gordon
>>>>>>>
>>>>>>>
>>>>>>>> Saket
>>>>>>>>
>>>>>>>> On 6 February 2014 13:04, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello Gordon,
>>>>>>>>>
>>>>>>>>> Unfortunately I do not have access to this as of now. I will however
>>>>>>>>> get hold of it soon.
>>>>>>>>>
>>>>>>>>> After implementing this, I would expect the 'CONTROL' to have similar,
>>>>>>>>> if not same values, right?
>>>>>>>>>
>>>>>>>>> However, some of the values for these Control genes have high variance
>>>>>>>>> after the normalizeBetweenArrays step. Is this behaviour
>>>>>>>>> normal or am I missing something?
>>>>>>>>>
>>>>>>>>> Saket
>>>>>>>>>
>>>>>>>>> On 6 February 2014 06:32, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>>>>
>>>>>>>>>> If 'x' is your background-corrected EList, then
>>>>>>>>>>
>>>>>>>>>> w <- rep(1,nrow(x))
>>>>>>>>>> w[controls] <- 100
>>>>>>>>>> y <- normalizeBetweenArrays(x, method="cyclicloess", weights=w)
>>>>>>>>>>
>>>>>>>>>> does what you want.
>>>>>>>>>>
>>>>>>>>>> For an example of this approach:
>>>>>>>>>>
>>>>>>>>>> http://rnajournal.cshlp.org/content/19/7/876
>>>>>>>>>>
>>>>>>>>>> Best wishes
>>>>>>>>>> Gordon
>>>>>>>>>>
>>>>>>>>>> --------- original message ----------
>>>>>>>>>> Saket Choudhary saketkc at gmail.com
>>>>>>>>>> Thu Feb 6 06:59:42 CET 2014
>>>>>>>>>>
>>>>>>>>>> I am analysing a proteomics microarray data set for a two-group design
>>>>>>>>>> (Normal and Disease) using a single colour channel. The arrays have a set
>>>>>>>>>> of pre-defined CONTROL points whose expression levels are supposed to be
>>>>>>>>>> similar/same across all the arrays.
>>>>>>>>>>
>>>>>>>>>> I would like to 'normalise' the levels of all probes such that normalisation
>>>>>>>>>> ends up with all CONTROL points having similar expression levels. If I
>>>>>>>>>> understand it right, normalizeBetweenArrays does not allow this kind of
>>>>>>>>>> normalisation.
>>>>>>>>>>
>>>>>>>>>> Is there a pre-implemented function to do this? If not, what would be a way
>>>>>>>>>> to achieve this kind of normalisation?
>>>>>>>>>>
>>>>>>>>>> Code: https://gist.github.com/saketkc/8669586
>>>>>>>>>>
>>>>>>>>>> ______________________________________________________________________
>>>>>>>>>> The information in this email is confidential and intended solely for the addressee.
>>>>>>>>>> You must not disclose, forward, print or use it without the permission of the sender.
>>>>>>>>>> ______________________________________________________________________


