[BioC] limma and Across Array Normalisation

Tue Feb 11 16:12:36 CET 2014

On 11 February 2014 14:03, James W. MacDonald <jmacdon at uw.edu> wrote:
> Hi Saket,
>
>
> On 2/11/2014 4:52 AM, Saket Choudhary wrote:
>>
>> Hello Gordon,
>>
>> Is there a reason to believe the MA plots should inherently be
>> baseline shifted after normalisation?
>>
>> Raw MA: https://db.tt/kDBod1EJ
>> background correction with 'nec': https://db.tt/0vVWeD21
>> background correction with nec followed by normalisation:
>> https://db.tt/f0M0rWeg
>> background correction with 'normexp': https://db.tt/OJO0zea5
>> background correction with normexp followed by normalisation:
>> https://db.tt/rbLJmFBE
>>
>>
>> The files are a bit heavy so might take some time to load into any pdf
>> reader.
>
>
> That's why you don't use a vector graphics format for plots with lots of
> points. Instead, use png or jpeg.
>

The motivation for this was to combine everything using Sweave into a
high-res report.
Here are the low-resolution versions:

Raw MA: https://db.tt/keqtevVR
background correction with 'nec': https://db.tt/0vVWeD21
background correction with nec followed by normalisation: https://db.tt/3eFFJXKk
background correction with 'normexp': https://db.tt/TNb5CHMc
background correction with normexp followed by normalisation:
https://db.tt/FBw5NLAN
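
For anyone reproducing these, here is a minimal sketch of writing an
MA-style plot to a PNG device rather than PDF. The intensities are
simulated, the output path is arbitrary, and M and A are computed by hand
against the average of the other arrays. That layout is an assumption
about how a single-channel MA plot is usually drawn, not a call into limma.

```r
# Simulated log2 intensities (hypothetical data): 20000 probes x 4 arrays
set.seed(1)
E <- matrix(rnorm(20000 * 4, mean = 8, sd = 1.5), ncol = 4)

# Compare array 1 with the average of the remaining arrays
ref <- rowMeans(E[, -1])
A <- (E[, 1] + ref) / 2   # average log-intensity
M <- E[, 1] - ref         # log-ratio against the reference

# Raster device: file size depends on pixel dimensions, not on point count
out <- file.path(tempdir(), "ma_array1.png")
png(out, width = 1600, height = 1600, res = 200)
plot(A, M, pch = 16, cex = 0.3, xlab = "A", ylab = "M",
     main = "MA plot, array 1 vs. average of others")
abline(h = 0, col = "red")   # reference line for spotting a baseline shift
invisible(dev.off())
```

The resulting PNG can then be pulled into a Sweave report with
\includegraphics, which keeps the compiled PDF small even with many points.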

> Best,
>
> Jim
>
>
>>
>> Code: https://gist.github.com/saketkc/8931951
>>
>> Saket
>>
>> On 9 February 2014 20:45, Saket Choudhary <saketkc at gmail.com> wrote:
>>>
>>> Related question: Similar to your case, my final topTable() output
>>> indicates some genes having a negative logFC, though the literature
>>> expects them to have a positive logFC.
>>>
>>> I looked up the calculations and the transition from positive to
>>> negative logFC for these genes seems to happen after the
>>> normalizeBetweenArrays step (irrespective of the kind of normalisation
>>> I choose).
>>>
>>> This is a naive question again, but I am trying to understand what would
>>> be a good metric to decide which method tends to give the fewest false
>>> positives like this, given that I have limited knowledge of which
>>> genes should be up- or down-regulated (unlike in your case, where you
>>> knew the kind of regulation [up/down] expected).
>>>
>>> Thanks,
>>> Saket
>>>
>>>
>>>
>>>
>>> On 9 February 2014 04:00, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>
>>>> On Sat, 8 Feb 2014, Saket Choudhary wrote:
>>>>
>>>>> Hello Gordon,
>>>>>
>>>>> I had a chance to go through the paper. I have a set of negative and
>>>>> positive controls from a single-channel GenePix platform.
>>>>> From what I could gather, the 'nec' method in limma performs
>>>>> background correction using these negative control spots.
>>>>
>>>>
>>>> Yes, but the negative controls are assumed to behave exactly like probes
>>>> for unexpressed genes.  This is true for Illumina Beadchips, but is often
>>>> not the case for other platforms.  If not, then you would be better to
>>>> stick with normexp as you are already using.
>>>>
>>>>
>>>>> However one of the inputs to 'nec' is also "detection.p", which the
>>>>> .gprs don't have.
>>>>
>>>>
>>>> detection.p is not a required argument.  It is used only when negative
>>>> controls are not available.
>>>>
>>>>
>>>>> I could simply take the mean of all the negative controls' E and Eb, and
>>>>> subtract it from each probe's E and Eb, doing this for all the arrays.
>>>>> Would this mimic what I want to achieve with the 'nec' function?
>>>>
>>>>
>>>> No, that naive approach is not equivalent and typically performs poorly.
>>>>
>>>> Gordon
>>>>
>>>>
>>>>> Saket
>>>>>
>>>>> On 6 February 2014 13:04, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>
>>>>>> Hello Gordon,
>>>>>>
>>>>>> Unfortunately I do not have access to this as of now. I will however
>>>>>> get hold of it soon.
>>>>>>
>>>>>> After implementing this, I would expect the 'CONTROL' points to have
>>>>>> similar, if not the same, values, right?
>>>>>>
>>>>>> However, some of the values for these Control genes after the
>>>>>> normalizeBetweenArrays step have high variance. Is this behaviour
>>>>>> normal or am I missing something?
>>>>>>
>>>>>> Saket
>>>>>>
>>>>>> On 6 February 2014 06:32, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>
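>>>>>>> If 'x' is your background-corrected EList and 'controls' indexes the
>>>>>>> control probes, then
>>>>>>>
>>>>>>>    w <- rep(1,nrow(x))    # unit weight for every probe
>>>>>>>    w[controls] <- 100     # up-weight the control probes
>>>>>>>    y <- normalizeBetweenArrays(x, method="cyclicloess", weights=w)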
>>>>>>>
>>>>>>> does what you want.
>>>>>>>
>>>>>>> For an example of this approach:
>>>>>>>
>>>>>>>    http://rnajournal.cshlp.org/content/19/7/876
>>>>>>>
>>>>>>> Best wishes
>>>>>>> Gordon
>>>>>>>
>>>>>>> --------- original message ----------
>>>>>>> Saket Choudhary saketkc at gmail.com
>>>>>>> Thu Feb 6 06:59:42 CET 2014
>>>>>>>
>>>>>>> I am analysing a proteomics microarray data set for a two-group
>>>>>>> sample (Normal and Disease) using a single colour channel. The arrays
>>>>>>> have a set of pre-defined CONTROL points whose expression levels are
>>>>>>> supposed to be similar/same across all the arrays.
>>>>>>>
>>>>>>> I would like to 'normalise' the levels of all probes such that
>>>>>>> normalisation ends up with all CONTROL points having similar expression
>>>>>>> levels. If I understand it right, normalizeBetweenArrays does not allow
>>>>>>> this kind of normalisation.
>>>>>>>
>>>>>>> Is there a pre-implemented function to do this? If not, what would be
>>>>>>> a way to achieve this kind of normalisation?
>>>>>>>
>>>>>>> Code: https://gist.github.com/saketkc/8669586
>>>>>>>
>>>>>>>
>>>>>>> ______________________________________________________________________
>>>>>>> The information in this email is confidential and intended solely for
>>>>>>> the
>>>>>>> addressee.
>>>>>>> You must not disclose, forward, print or use it without the
>>>>>>> permission
>>>>>>> of
>>>>>>> the sender.
>>>>>>>
>>>>>>> ______________________________________________________________________
>>>>>
>>>>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>