[BioC] Advice on experimental setup

Thu Sep 6 12:17:27 CEST 2012

Hi Alex,

That was a very informative reply to a difficult question. Thank you
very much for your advice. I might revive this mail at some point,
when we get closer to the actual statistical analysis.

Best,
David

2012/9/6 Alex Gutteridge <alexg at ruggedtextile.com>:
> On 05.09.2012 13:53, David Westergaard wrote:
>>
>> Hi Alex,
>>
>> There is no control group as such. One of the diets is somewhat of a
>> control group, but not quite because it is still a diet that has some
>> 'special' properties. I am used to working with experiments which have
>> at least one control group, so this setup is a bit out of my domain,
>> which is the reason I'm asking this list for advice.
>>
>> I guess what I meant by 'differentially expressed genes for each
>> diet', was a list of genes that can be attributed to this exact diet.
>> Now that I think about it, it may be more appropriate to collect mRNA
>> at the start, mid and end of the experiment, and measure the change in
>> each group, instead of comparing these. The experiment is set to run
>> for 4 months. I have not before dealt with experiments which have run
>> for so long.
>
>
> Collecting a baseline measure sounds sensible. If these are human subjects
> you should expect a lot of variation (more than in an inbred animal model),
> and the baseline measure can help correct for that.
>
> Your question is still quite hard though. It's often useful for me to think
> through some scenarios for patterns of expression that might appear and plot
> them out before deciding which ones will be interesting and then how to
> design the experiment to find them. E.g.: say Gene X goes up two-fold after 4
> months on Diet A and eight-fold after 4 months on Diet B; do you consider
> that a Diet B 'specific' gene or not? It goes up in both A and B, but much
> more in B, so either interpretation is possible. If you do consider that
> gene Diet B specific then you could do a contrast like (DietBEnd -
> DietBStart) - (DietAEnd - DietAStart), which shows you genes where the
> effect was greater in diet B than A without excluding genes that still
> showed an effect in A.
>
> In experiments like these I am always quite wary of the temptation to get
> differentially expressed gene sets and then do set subtraction. I.e. Diet A
> 'specific' genes = Diet A DE genes - Diet B DE genes - Diet C DE genes. I
> always find that approach is very sensitive to the cutoff used to define DE,
> but it can be easier to interpret I suppose. Again if there is really no
> control diet then creating a mean 'meta-diet' might simplify the analysis
> (at the cost of the interpretation being more abstract). So something like:
> (DietAEnd - DietAStart) - ((DietAEnd - DietAStart)+(DietBEnd -
> DietBStart)+(DietCEnd - DietCStart))/3.
>
>
>> Would the data collected be suited for microarray analysis? And if so,
>> when should the microarray analysis be performed? When each sample is
>> collected, or all together at the end?
>
>
> I would go for all together at the end. RNA is very prone to degradation
> though, so you need to take all necessary steps to preserve the samples
> (remove RNases and get to -80C) as soon as possible after collection.
>
>
>> Best,
>> David
>>
>>
>> 2012/9/5 Alex Gutteridge <alexg at ruggedtextile.com>:
>>>
>>> On 05.09.2012 09:55, David Westergaard wrote:
>>>>
>>>>
>>>> Hello,
>>>>
>>>> I am assisting in the setup of an experiment, in which 3 groups, each
>>>> consisting of 8 subjects, will be fed 3 diets:
>>>> Group 1 - Diet A
>>>> Group 2 - Diet B
>>>> Group 3 - Diet C
>>>>
>>>> We plan on using limma to identify the differentially expressed genes.
>>>> Reading the limma users guide, a factorial design matrix seems to be
>>>> appropriate. I am, however, wondering if we, by using this setup, can
>>>> elucidate the differentially expressed genes for each diet, and not
>>>> just the ones between groups, e.g. when comparing Group 1 - Group 2.
>>>
>>>
>>>
>>> From your reply to Sean it's not clear what you mean by this last
>>> sentence. What are the 'differentially expressed genes for each diet'?
>>> Any differential expression analysis must compare groups of samples by
>>> definition, no?
>>>
>>> You could compare, say, Diet A with the average of Diet B and Diet C (or
>>> even the average of all three). Is that what you mean? Whether that makes
>>> any sense depends on your experimental design. Most obviously, is one of
>>> the three diets a 'control' diet? If not, then would it be appropriate to
>>> consider an average of the three diets a kind of meta-control (probably
>>> not a word, but hopefully you know what I mean!)?
>>>
>>> --
>>> Alex Gutteridge
>>>
>
>
> --
> Alex Gutteridge
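
The two contrasts described in the quoted reply, (DietBEnd - DietBStart) -
(DietAEnd - DietAStart) and the 'meta-diet' comparison, can be sketched
numerically as below. This is only an arithmetic illustration in plain
Python, not limma code (in limma, contrasts of this kind are normally written
with makeContrasts in R). The expression values, the two-subjects-per-group
layout and the function names are all hypothetical; the numbers are chosen to
match the example of Gene X going up two-fold on Diet A and eight-fold on
Diet B, with Diet C assumed unchanged.

```python
from statistics import mean

# Hypothetical log2 expression values for a single gene ("Gene X").
# Keys are (diet, time point); each list holds one value per subject.
log2_expr = {
    ("A", "start"): [5.0, 5.5],
    ("A", "end"):   [6.0, 6.5],   # about two-fold up (log2 change of 1)
    ("B", "start"): [5.0, 4.5],
    ("B", "end"):   [8.0, 7.5],   # about eight-fold up (log2 change of 3)
    ("C", "start"): [5.0, 5.0],
    ("C", "end"):   [5.0, 5.0],   # assumed unchanged, for illustration only
}

DIETS = ("A", "B", "C")


def change(diet):
    """Mean log2 change from start to end within one diet (End - Start)."""
    return mean(log2_expr[(diet, "end")]) - mean(log2_expr[(diet, "start")])


def diet_vs_diet(diet, reference):
    """(DietEnd - DietStart) - (ReferenceEnd - ReferenceStart)."""
    return change(diet) - change(reference)


def diet_vs_meta_diet(diet, diets=DIETS):
    """Change in one diet minus the average change over all diets."""
    return change(diet) - mean(change(d) for d in diets)


if __name__ == "__main__":
    for d in DIETS:
        print(f"Diet {d}: log2 change = {change(d):.3f}")
    print(f"B vs A contrast: {diet_vs_diet('B', 'A'):.3f}")
    for d in DIETS:
        print(f"Diet {d} vs meta-diet: {diet_vs_meta_diet(d):.3f}")
```

With these toy numbers the B-versus-A contrast is 2 on the log2 scale, i.e.
the increase on Diet B is four-fold larger than on Diet A, even though the
gene also goes up on Diet A. A real analysis would fit a model per gene and
test these contrasts with appropriate variance estimates rather than
comparing raw means.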