[BioC] help: limma and changing gene results!

Koen Marien koenmarien87 at gmail.com
Wed May 19 12:42:46 CEST 2010


>>>> Dear Jim and others who can help

Koen Marien wrote:
> Thanks for the clear and fast reply, Jim. Indeed a-(b+c+d) isn't a contrast,
> but I think I'm having a different problem. Here is the experiment, briefly
> explained:

Yes, but... a-(b+c+d) doesn't make any sense. Why would you do such a 
thing? Let's say the mean of all four samples for a given gene is 
identical (I dunno, say 5).
Any of a-b, a-c, a-d will be zero, whereas a-(b+c+d) is -10. So what 
does that tell us, in a biological sense?
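
The point above can be checked numerically. This is a minimal sketch in base R, not part of the original exchange; the vector names and the helper is_contrast() are made up for illustration. It assumes the usual definition that the coefficients of a contrast sum to zero.

```r
# Coefficients are given in the order a, b, c, d.
not_a_contrast <- c(a = 1, b = -1,   c = -1,   d = -1)    # a - (b + c + d)
avg_contrast   <- c(a = 1, b = -1/3, c = -1/3, d = -1/3)  # a - (b + c + d)/3

# Hypothetical helper: a contrast has coefficients that sum to zero.
is_contrast <- function(w, tol = 1e-8) abs(sum(w)) < tol

# Jim's example: every group mean is 5 for some gene.
group_means <- c(a = 5, b = 5, c = 5, d = 5)

is_contrast(not_a_contrast)          # FALSE
is_contrast(avg_contrast)            # TRUE
sum(not_a_contrast * group_means)    # -10, although no group differs
sum(avg_contrast * group_means)      # 0, up to floating-point error
```

With identical group means, only the version divided by 3 returns zero, which is what a test for "a differs from the others" should report.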

>>>> I compare a progenitor population with three offspring populations to
>>>> identify surface markers. So I need genes that are upregulated in the 'a'
>>>> population compared to the 'b', 'c' and 'd' populations.


>  
> I have four populations of cells with three biological replicates for each
> population -> a1,a2,a3,b1,b2,b3,c1,c2,c3,d1,d2,d3. I normalized them and
> looked at the differentially expressed genes between the 'a' population and
> each of those other populations individually: a-b, a-c, a-d. The Venn
> approach is done with the online web application Venny and only looks at the
> probe set IDs common to the three lists (let's call it the 'one-on-one
> strategy').
> I also looked at the differentially expressed genes when the b, c and d values
> were put together: a-e with e=b+c+d (let's call it the 'group strategy').
> So it's not really the contrasts that are changed.

How are the contrasts not changed? You are comparing a contrast with a 
not-a-contrast that doesn't even make sense. That there will be 
differences is a foregone conclusion.

>>>> I don't really change the contrast (look at the code, it's always
>>>> 'group2-group1').
>>>> I'll try to explain again:
>>>> one-on-one strategy: compared a to b, a to c, a to d and compared the
>>>> differentially expressed genes with the online Venny tool
>>>> (http://bioinfogp.cnb.csic.es/tools/venny/index.html). So e.g. group1 =
>>>> 'a' population (always) and group2 = 'b' or 'c' or 'd' (I ran the code
>>>> three times).

>>>> group strategy: compared a to (b&c&d) (look at the code: I annotated the
>>>> 'a' files by assigning them to population '1' and the 'b','c','d' files by
>>>> assigning them to population '2'), so group1 = 'a' population and
>>>> group2 = 'b'+'c'+'d'.

>>>> My questions are: Why do I get different lists with these two approaches?
>>>> Which approach gives me the best results when I look for genes that are
>>>> specifically upregulated in the 'a' population?

>>>> I'm still learning, and I learn especially a lot from you, so thanks for
>>>> your patience, Koen
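
One way to look for genes that are up in 'a' relative to each of 'b', 'c' and 'd' is to fit all twelve arrays in a single model and test the three pairwise contrasts together. The sketch below is only an illustration under stated assumptions: the expression matrix expr is simulated, the object names (expr, calls, up_in_a) are hypothetical, and the contrast matrix is written by hand rather than with makeContrasts(). The limma calls (lmFit, contrasts.fit, eBayes, decideTests) only run if limma is installed.

```r
set.seed(42)

# Four populations, three biological replicates each (a1..a3, b1..b3, ...).
group  <- factor(rep(c("a", "b", "c", "d"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Pairwise contrasts, each oriented so that a positive value means "up in a".
# Rows follow the design columns a, b, c, d.
cont.matrix <- cbind(
  a_vs_b = c(1, -1,  0,  0),
  a_vs_c = c(1,  0, -1,  0),
  a_vs_d = c(1,  0,  0, -1)
)
rownames(cont.matrix) <- colnames(design)

# Simulated data (assumption): 200 probes x 12 arrays, with the first
# 10 probes shifted upwards in the 'a' arrays only.
expr <- matrix(rnorm(200 * 12), nrow = 200,
               dimnames = list(paste0("probe", 1:200), NULL))
expr[1:10, group == "a"] <- expr[1:10, group == "a"] + 4

if (requireNamespace("limma", quietly = TRUE)) {
  fit   <- limma::lmFit(expr, design)
  fit2  <- limma::eBayes(limma::contrasts.fit(fit, cont.matrix))
  calls <- limma::decideTests(fit2)   # -1, 0 or 1 per probe and contrast
  # Probes called significantly up in 'a' in all three comparisons.
  up_in_a <- rownames(expr)[rowSums(calls == 1) == ncol(cont.matrix)]
  print(up_in_a)
}
```

Because all three contrasts come from the same fit, they share one variance estimate per probe, which is not the case when each pair of populations is analysed in a separate run.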


Best,

Jim


> 
> Now, when looking at the one-on-one strategy list there are only five genes
> common to the three groups with a B-value > 2, while in the group strategy
> there are 181 probe sets with a B-value > 2.
> 
> Relevant code used:
> read all the .cel files (a,b,c,d)
> pd<-data.frame(population=c(rep(1,3),rep(2,8)),replicate=c(seq(1,3),seq(1,8)))
> => group strategy
> or
> only read the .cel files of two populations (a,b or a,c or a,d)
> pd<-data.frame(population=c(rep(1,3),rep(2,3)),replicate=c(seq(1,3),seq(1,3)))
> => one-on-one strategy (repeated three times, once for each comparison)
> 
> group<-factor(eset$population)
> design = model.matrix(~0+group)
> design
> cont.matrix = makeContrasts(eset = (group2 - group1), levels = design)
> cont.matrix
> 
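
To see what the 'group strategy' coding estimates, here is a small base R sketch. It is an illustration only: it reuses the group means from Jim's earlier example (a = 5, b = 2, c = 5, d = 8), assumes noise-free toy data with three replicates per population, and the object names (y, population, coefs) are hypothetical.

```r
# Toy data: three replicates per population, no noise (assumption).
group_means <- c(a = 5, b = 2, c = 5, d = 8)
y <- rep(group_means, each = 3)

# Group strategy: 'a' arrays are population 1, all other arrays population 2.
population <- factor(c(rep(1, 3), rep(2, 9)))
design     <- model.matrix(~ 0 + population)

fit   <- lm.fit(design, y)
coefs <- fit$coefficients

# group2 - group1 is the mean of the pooled b, c, d arrays minus the mean of a.
# With equal group sizes this equals (b + c + d)/3 - a.
unname(coefs["population2"] - coefs["population1"])   # 0 for these means

# The b/c/d differences now sit in the residuals of population 2.
sum(fit$residuals^2)                                   # 54
```

For this gene the pooled comparison reports no difference, while the separate a-b and a-d comparisons would each show a difference of 3 in absolute value, so the two strategies are answering different questions.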
> 
> Regards
> 
> Koen
> 
> -----Original Message-----
> From: James MacDonald [mailto:jmacdon at med.umich.edu] 
> Sent: Wednesday 12 May 2010 4:40
> To: Koen Marien
> Cc: 'Joseph Skaf'; bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] help: limma and changing gene results!
> 
> Hi Koen,
> 
> Koen Marien wrote:
>> Dear
>>
>>
>> Is this also the reason why there is a difference in the (differentially
>> expressed) gene lists of a-(b+c+d) and venny(a-b,a-c,a-d)?
> 
> I am not familiar with the venny() function, so it's hard to say. But if 
> you mean a contrast of a-(b+c+d) versus individual contrasts of a-b, 
> a-c, a-d, then no.
> 
> In the first place, a-(b+c+d) isn't a contrast, and in most cases 
> doesn't make sense. You might mean a-(b+c+d)/3, which is a contrast, and 
> tests the difference between the a group and the mean of the other 
> three. The denominator will be the same in each case, being based on (in 
> simple terms) the average variability of the four groups.
> 
> However, if what I am assuming is correct, then the two contrasts are 
> quite different, and shouldn't be expected to result in the same gene 
> lists. As an example, say the means of the groups for one gene are:
> 
> a = 5
> b = 2
> c = 5
> d = 8
> 
> since the denominator will be the same we can ignore that here. So do 
> you think there will be a difference in what is called significant when 
> we compare
> 
> 5 - (2+5+8)/3 = 0
> 
> and
> 
> 5 - 2 = 3
> 5 - 5 = 0
> 5 - 8 = -3
> 
> ?
> 
> Best,
> 
> Jim
> 
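
The arithmetic in Jim's example can be reproduced in a few lines of base R. This is only a sketch of the numerators; as in the example, the shared denominator is ignored. The names contrast_coefs and estimates are made up for illustration.

```r
# Group means for one gene, as in the example above.
group_means <- c(a = 5, b = 2, c = 5, d = 8)

# One column per contrast; rows are the coefficients for a, b, c, d.
contrast_coefs <- cbind(
  a_vs_b   = c(1, -1,    0,    0),
  a_vs_c   = c(1,  0,   -1,    0),
  a_vs_d   = c(1,  0,    0,   -1),
  a_vs_avg = c(1, -1/3, -1/3, -1/3)   # a - (b + c + d)/3
)
rownames(contrast_coefs) <- names(group_means)

# Estimated value of each contrast: coefficients times group means.
estimates <- drop(crossprod(contrast_coefs, group_means))
estimates
# a_vs_b = 3, a_vs_c = 0, a_vs_d = -3, a_vs_avg = 0 (up to floating point)
```

The averaged contrast is zero even though two of the three pairwise differences are not, so a gene like this could be called in the pairwise lists and missed by the averaged comparison.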
> 
>> a-(b+c+d): putting the b, c and d values in one group (b+c+d) and using
>> limma
>> venny(a-b,a-c,a-d): using limma on the separate groups and creating a list
>> by looking at the intersection of the Venn diagram of the three 'sublists'
>> a-b, a-c, a-d
>>
>>
>> Thanks a lot
>>
>>
>> Koen Marien
>> student bioscience engineering: cell and gene biotechnology
>> University of Ghent, Belgium
>>
>>
>> -----Original Message-----
>> From: bioconductor-bounces at stat.math.ethz.ch
>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of James W.
>> MacDonald
>> Sent: Thursday 29 April 2010 18:46
>> To: Joseph Skaf
>> Cc: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] help: limma and changing gene results!
>>
>> Hi Joseph,
>>
>> Joseph Skaf wrote:
>>> To whom it may concern,
>>>
>>> I've been having some problems with consistency in my limma results for
>>> genes that are found to have significant differential transcript abundance.
>>> In a given example, I may have 4 different groups (a, b, c, and d) in an
>>> array set of 12.
>>>
>>> From here, I make a contrast matrix that has contrasts for a-b, a-c, and
>>> a-d.  Eventually, I output an eBayes-corrected contrast fit and I use
>>> decideTests from there to find out what genes are differentially expressed.
>>> My misunderstanding is that when I take away an entire group (such as
>>> removing all d's) and redo all steps in the limma analysis, I find that I
>>> end up with a different set of genes after using decideTests.  I am confused
>>> here, because I would not think that removing group 'd' from the analysis
>>> would have an effect on contrasts a-b and a-c.
>>>
>>> If anyone could even hint to me a reason as to why this is happening, it
>>> would be greatly appreciated.
>> It's because of how the denominator for your contrast is computed. The 
>> denominator is computed using the intra-group variance for all the 
>> groups in your study, not just the two groups being compared in the 
>> contrast.
>>
>> So if you remove one of the groups, you lose both degrees of freedom and 
>> the contribution from the intra-group variance of that group. 
>> Losing the degrees of freedom will reduce your power to detect 
>> differences. The effect of losing the intra-group variance contribution 
>> will depend on how variable the group d data are compared to groups a-c.
>>
>> Best,
>>
>> Jim
>>
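
The effect described above can be seen for a single gene with ordinary least squares in base R. This is a simplified sketch on simulated values: it shows only the ordinary pooled variance and residual degrees of freedom, not the moderated variance that eBayes computes, and the object names (y, fit_all, fit_abc) are hypothetical.

```r
set.seed(1)

# Simulated log-expression for one gene: 4 groups x 3 replicates.
# Group d is made more variable than the others (assumption).
group <- factor(rep(c("a", "b", "c", "d"), each = 3))
y <- rnorm(12,
           mean = rep(c(8, 6, 6, 6), each = 3),
           sd   = rep(c(0.3, 0.3, 0.3, 1.0), each = 3))

# All four groups in the model.
fit_all <- lm(y ~ 0 + group)

# Same analysis after removing every 'd' array.
keep    <- group != "d"
y_abc   <- y[keep]
g_abc   <- droplevels(group[keep])
fit_abc <- lm(y_abc ~ 0 + g_abc)

df.residual(fit_all)        # 8 = 12 arrays - 4 group means
df.residual(fit_abc)        # 6 =  9 arrays - 3 group means

# Pooled residual variance: the denominator ingredient shared by every
# contrast in the fit, including a-b and a-c.
summary(fit_all)$sigma^2
summary(fit_abc)$sigma^2
```

The two variance estimates differ because group d no longer contributes, so the test statistics for a-b and a-c change even though the a, b and c data are untouched.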
>>
>>
>>> Thanks and regards,
>>> Joseph Skaf
>>>
>>>
>>>
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826