[BioC] limma and Across Array Normalisation

Gordon K Smyth smyth at wehi.EDU.AU
Tue Feb 11 23:52:01 CET 2014


On Tue, 11 Feb 2014, Saket Choudhary wrote:

> On 11-Feb-2014, at 10:31 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>
>> Yes, obviously there'll be a baseline shift when you subtract background, then add an offset and log transform.
>>
>> Your plots do not appear to be valid MA plots.
>>
>
> Could you please point out the error?
> I understand a baseline shift is expected, but I can't figure out what
> is going wrong otherwise.

Well, you manually create an MAList object from your single channel data, 
even though an MAList is strictly for two colour data.

If you deceive limma as to the true nature of your data, it's not 
surprising that the resulting plot might not be correct.

I am not clear why you need to make so many variations on the standard 
limma single channel analysis pipeline.

Gordon
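
For reference, the standard single-channel route that the reply alludes to
can be sketched roughly as follows. This is a minimal illustration, not the
code from the gist: the intensities are simulated, and the probe and array
counts, the offset of 50 and the choice of quantile normalisation are all
assumptions made for the example. The point is that single-channel foreground
and background intensities go into an EListRaw rather than a hand-built
MAList, so that backgroundCorrect(), normalizeBetweenArrays() and plotMA()
treat the data as one-colour.

```r
library(methods)
library(limma)

set.seed(1)
n.probes <- 2000   # hypothetical array size
n.arrays <- 4      # hypothetical number of arrays

# Simulated intensities (illustrative only): exponential signal plus
# normally distributed background, which is the model normexp assumes
signal <- matrix(rexp(n.probes * n.arrays, rate = 1/500), n.probes, n.arrays)
E  <- signal + matrix(rnorm(n.probes * n.arrays, mean = 200, sd = 20),
                      n.probes, n.arrays)
Eb <- matrix(rnorm(n.probes * n.arrays, mean = 200, sd = 20),
             n.probes, n.arrays)
colnames(E) <- colnames(Eb) <- paste0("array", seq_len(n.arrays))

# Single-channel data: foreground E and background Eb in an EListRaw
x <- new("EListRaw", list(E = E, Eb = Eb))

# Background correct with normexp plus an offset (offset value is an assumption)
xb <- backgroundCorrect(x, method = "normexp", offset = 50)

# Normalise between arrays; for an EListRaw this also log2-transforms
# and returns an EList
y <- normalizeBetweenArrays(xb, method = "quantile")

# Change in average log2 intensity per array relative to the raw foreground
print(colMeans(y$E) - colMeans(log2(x$E)))

# Between-array MA plot: array 1 against the average of the other arrays
plotMA(y, array = 1)
```

With real GenePix files the EListRaw would normally come from
read.maimages() with green.only=TRUE rather than being built by hand. The
printed differences in column means give a rough view of the baseline shift
that background subtraction, the offset and the log transform introduce.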


>
> Thanks,
> Saket
>
>
>> Gordon
>>
>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>
>>> Hello Gordon,
>>>
>>> Is there a reason to believe the MA plots should inherently be
>>> baseline shifted after normalisation?
>>>
>>> Raw MA: https://db.tt/kDBod1EJ
>>> background correction with 'nec': https://db.tt/0vVWeD21
>>> background correction with nec followed by normalisation: https://db.tt/f0M0rWeg
>>> background correction with 'normexp': https://db.tt/OJO0zea5
>>> background correction with normexp followed by normalisation:
>>> https://db.tt/rbLJmFBE
>>>
>>>
>>> The files are a bit heavy so might take some time to load into any pdf reader.
>>>
>>> Code: https://gist.github.com/saketkc/8931951
>>>
>>> Saket
>>>
>>> On 9 February 2014 20:45, Saket Choudhary <saketkc at gmail.com> wrote:
>>>> Related question: Similar to your case, my final topTable() output
>>>> indicates some genes having a negative logFC, though the literature
>>>> expects them to have a positive logFC.
>>>>
>>>> I looked up the calculations and the transition from positive to
>>>> negative logFC for these genes seems to happen after the
>>>> normalizeBetweenArrays step (irrespective of the kind of normalisation
>>>> I choose).
>>>>
>>>> This is a naive question again, but I am trying to understand what would be
>>>> a good metric to decide which method tends to give the fewest false
>>>> positives like this, given that I have limited knowledge of which
>>>> genes should be up or down regulated (unlike in your case, where you
>>>> knew the kind of regulation [up/down] expected).
>>>>
>>>> Thanks,
>>>> Saket
>>>>
>>>>
>>>>
>>>>
>>>> On 9 February 2014 04:00, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>
>>>>> On Sat, 8 Feb 2014, Saket Choudhary wrote:
>>>>>
>>>>>> Hello Gordon,
>>>>>>
>>>>>> I had a chance to go through the paper. I have a set of negative and
>>>>>> positive controls, arising out of single channel Genepix platform.
>>>>>> From what I could gather, the 'nec' method in limma performs
>>>>>> background correction using these negative control spots.
>>>>>
>>>>>
>>>>> Yes, but the negative controls are assumed to behave exactly like probes for
>>>>> unexpressed genes.  This is true for Illumina Beadchips, but is often not
>>>>> the case for other platforms.  If not, then you would be better to stick
>>>>> with normexp as you are already using.
>>>>>
>>>>>
>>>>>> However one of the inputs to 'nec' is also "detection.p", which the
>>>>>> .gprs don't have.
>>>>>
>>>>>
>>>>> detection.p is not a required argument.  It is used only when negative
>>>>> controls are not available.
>>>>>
>>>>>
>>>>>> I could simply take the mean of all the negative controls' E and Eb, and
>>>>>> subtract it from each probe's E and Eb, doing it for all the arrays. Would
>>>>>> this mimic what I want to achieve with the 'nec' function?
>>>>>
>>>>>
>>>>> No, that naive approach is not equivalent and typically performs poorly.
>>>>>
>>>>> Gordon
>>>>>
>>>>>
>>>>>> Saket
>>>>>>
>>>>>> On 6 February 2014 13:04, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>>
>>>>>>> Hello Gordon,
>>>>>>>
>>>>>>> Unfortunately I do not have access to this as of now. I will however
>>>>>>> get hold of it soon.
>>>>>>>
>>>>>>> After implementing this, I would expect the 'CONTROL' to have similar,
>>>>>>> if not same values, right?
>>>>>>>
>>>>>>> However some of the values for these Control genes after the
>>>>>>> normalizeBetweenArrays step have high variance. Is this behaviour
>>>>>>> normal or am I missing something?
>>>>>>>
>>>>>>> Saket
>>>>>>>
>>>>>>> On 6 February 2014 06:32, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>>
>>>>>>>> If 'x' is your background-corrected EList, then
>>>>>>>>
>>>>>>>>  w <- rep(1,nrow(x))
>>>>>>>>  w[controls] <- 100
>>>>>>>>  y <- normalizeBetweenArrays(x, method="cyclicloess", weights=w)
>>>>>>>>
>>>>>>>> does what you want.
>>>>>>>>
>>>>>>>> For an example of this approach:
>>>>>>>>
>>>>>>>>  http://rnajournal.cshlp.org/content/19/7/876
>>>>>>>>
>>>>>>>> Best wishes
>>>>>>>> Gordon
>>>>>>>>
>>>>>>>> --------- original message ----------
>>>>>>>> Saket Choudhary saketkc at gmail.com
>>>>>>>> Thu Feb 6 06:59:42 CET 2014
>>>>>>>>
>>>>>>>> I am analysing a proteomics microarray data set for a two-group
>>>>>>>> sample (Normal and Disease) using a single color channel. The arrays
>>>>>>>> have a set of pre-defined CONTROL points whose expression levels are
>>>>>>>> supposed to be similar/same across all the arrays.
>>>>>>>>
>>>>>>>> I would like to 'normalise' the levels of all probes such that
>>>>>>>> normalisation ends up with all CONTROL points having similar expression
>>>>>>>> levels. If I understand it right, normalizeBetweenArrays does not allow
>>>>>>>> this kind of normalisation.
>>>>>>>>
>>>>>>>> Is there a pre-implemented function to do this? If not, what would be a
>>>>>>>> way to achieve this kind of normalisation?
>>>>>>>>
>>>>>>>> Code: https://gist.github.com/saketkc/8669586
>>>>>>>>
>>>>>>>> ______________________________________________________________________
>>>>>>>> The information in this email is confidential and intended solely for
>>>>>>>> the
>>>>>>>> addressee.
>>>>>>>> You must not disclose, forward, print or use it without the permission
>>>>>>>> of
>>>>>>>> the sender.
>>>>>>>> ______________________________________________________________________