[BioC] choosing the right model in limma

James W. MacDonald jmacdon at med.umich.edu
Thu Sep 21 17:18:45 CEST 2006


Hi Lina,

Lina Hultin-Rosenberg wrote:
> Dear list,
> 
> I am analyzing some affymetrix chicken data using limma and have a question
> on the best approach regarding random and fixed effects. The target matrix
> is as follows:
> 
> samplenames                       sex     tissue  date  individual
> PA009_kyckling_11_H16_060630.CEL  male    heart   1     16
> PA009_kyckling_11_H19_060705.CEL  male    heart   3     19
> PA009_kyckling_11_H21_060630.CEL  male    heart   1     21
> PA009_kyckling_11_H9_060704.CEL   male    heart   2     9
> PA009_kyckling_12_B16_060704.CEL  male    brain   2     16
> PA009_kyckling_12_B19_060704.CEL  male    brain   2     19
> PA009_kyckling_12_B21_060705.CEL  male    brain   3     21
> PA009_kyckling_12_B9_060630.CEL   male    brain   1     9
> PA009_kyckling_13_G16_060705.CEL  male    gonad   3     16
> PA009_kyckling_13_G19_060630.CEL  male    gonad   1     19
> PA009_kyckling_13_G21_060704.CEL  male    gonad   2     21
> PA009_kyckling_13_G9_060705.CEL   male    gonad   3     9
> PA009_kyckling_21_H10_060705.CEL  female  heart   3     10
> PA009_kyckling_21_H12_060705.CEL  female  heart   3     12
> PA009_kyckling_21_H20_060630.CEL  female  heart   1     20
> PA009_kyckling_21_H2_060704.CEL   female  heart   2     2
> PA009_kyckling_22_B10_060704.CEL  female  brain   2     10
> PA009_kyckling_22_B12_060630.CEL  female  brain   1     12
> PA009_kyckling_22_B20_060705.CEL  female  brain   3     20
> PA009_kyckling_22_B2_060630.CEL   female  brain   1     2
> PA009_kyckling_23_G10_060704.CEL  female  gonad   2     10
> PA009_kyckling_23_G12_060630.CEL  female  gonad   1     12
> PA009_kyckling_23_G20_060704.CEL  female  gonad   2     20
> PA009_kyckling_23_G2_060705.CEL   female  gonad   3     2
> 
> The question of interest is which genes differ between male and female
> in the different tissues and also in general. My concern is whether I have to
> block for the date/batch and individual effects. In a PCA plot (and other
> quality control plots) there is no sign of any obvious batch or individual
> effect. I also used duplicateCorrelation to calculate the correlations for
> the batch and individual effects and the results were 0.1 for individual and
> -0.03 for batch. Would it be ok to exclude the batch effect from the model
> and treat the individual as a random effect or is there a way in limma to
> include two random effects?
> 
> I also have a more general question regarding lmFit and eBayes. I fitted a
> model to the gonad samples only and then compared that to fitting a model to
> all samples and extracting the gonad contrast only (see design matrices
> below). Obviously the resulting p-values etc. differ between the two
> approaches, but I don't really understand the difference or know which is
> the preferred/correct approach.

There are two differences between these approaches. First, when you 
restrict the samples to just the gonad samples, you are reducing the 
amount of data that can be used in the empirical Bayes step to estimate 
the prior variability. I haven't looked at this aspect very closely, but 
I do know that reducing the number of probesets can have a pretty large 
effect on the number of probesets called significant. I'm sure Gordon will 
have a better idea of how important this is.
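
For reference, the empirical Bayes step shrinks each probeset's residual 
variance toward a prior value estimated from all probesets. Below is a 
minimal sketch of that shrinkage formula in plain Python (not limma itself). 
The prior values d0 and s0_sq are made-up placeholders; in limma they are 
estimated from the data by eBayes. The point is only to show how the 
residual degrees of freedom weight the observed variance against the prior.

```python
def moderated_variance(s2, df, s0_sq, d0):
    """Posterior variance: average of the prior and observed variances,
    weighted by their respective degrees of freedom."""
    return (d0 * s0_sq + df * s2) / (d0 + df)

# Hypothetical prior (illustration only; eBayes would estimate these).
d0, s0_sq = 4, 0.05

# Same observed variance, but different residual degrees of freedom:
# 6 for the gonad-only fit (8 arrays, 2 groups),
# 18 for the fit to all arrays (24 arrays, 6 groups).
s2 = 0.10
post_gonad = moderated_variance(s2, 6, s0_sq, d0)
post_all = moderated_variance(s2, 18, s0_sq, d0)

print(f"gonad only: moderated variance {post_gonad:.4f} on {d0 + 6} total df")
print(f"all arrays: moderated variance {post_all:.4f} on {d0 + 18} total df")
```

With fewer residual degrees of freedom the prior carries relatively more 
weight, and the moderated t-statistic has fewer total degrees of freedom.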

Second, you have to remember that the denominator of the t-statistic for 
your contrast is based on the residual sum of squares (SSE) for the model 
(plus the eBayes prior). When you only use the gonad samples, this is 
essentially an ordinary two-sample t-statistic. However, when you use all 
samples, the SSE measures how well the model fits all the data. This is 
often more powerful because you have more residual degrees of freedom 
(18 rather than 6 here), which implies that your variance estimate is 
more precise (hence the test is more powerful).
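
To make the second point concrete, here is a small sketch in plain Python 
(not limma itself, and without the eBayes moderation). The expression 
values are invented purely for illustration. It computes the pooled 
residual variance and its degrees of freedom for one probeset, first from 
the two gonad groups alone and then from all six sex-by-tissue groups of 
four arrays each, matching the layout of the design matrices quoted below.

```python
from statistics import mean

def pooled_variance(groups):
    """Return (residual variance, residual df) for a group-means model."""
    sse = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df = sum(len(g) for g in groups) - len(groups)
    return sse / df, df

def contrast_t(a, b, s2):
    """Ordinary t-statistic for mean(a) - mean(b) given residual variance s2."""
    se = (s2 * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / se

# Made-up log2 expression values for one probeset, four arrays per group.
data = {
    "mh": [7.1, 7.3, 6.9, 7.2],
    "mb": [8.0, 8.2, 7.9, 8.1],
    "mg": [9.0, 9.6, 8.8, 9.4],
    "fh": [7.0, 7.2, 7.1, 6.9],
    "fb": [8.1, 7.9, 8.0, 8.2],
    "fg": [8.3, 8.9, 8.5, 8.7],
}

# Approach 1: fit only the gonad arrays (2 groups, 8 arrays).
s2_gonad, df_gonad = pooled_variance([data["mg"], data["fg"]])
# Approach 2: fit all arrays (6 groups, 24 arrays), then test mg - fg.
s2_all, df_all = pooled_variance(list(data.values()))

t_gonad = contrast_t(data["mg"], data["fg"], s2_gonad)
t_all = contrast_t(data["mg"], data["fg"], s2_all)

print(f"gonad only: df={df_gonad}, s2={s2_gonad:.4f}, t={t_gonad:.2f}")
print(f"all arrays: df={df_all}, s2={s2_all:.4f}, t={t_all:.2f}")
```

The numerator (the difference in gonad means) is identical in both fits; 
only the variance estimate and its degrees of freedom (6 versus 18) change. 
Pooling assumes the residual variance is similar across tissues. In this 
invented data the gonad groups happen to be the most variable, so the 
pooled estimate is smaller, but with real data it could go either way.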

As for which is preferred, I usually follow the second approach (fitting 
all samples and extracting the gonad contrast). In most microarray 
analyses the statistics we use are woefully underpowered, so I try to 
use whatever method will tend to increase power.

HTH,

Jim



> 
> Only gonad samples:
>                                   m   f
> PA009_kyckling_13_G16_060705.CEL  1   0
> PA009_kyckling_13_G19_060630.CEL  1   0
> PA009_kyckling_13_G21_060704.CEL  1   0
> PA009_kyckling_13_G9_060705.CEL   1   0
> PA009_kyckling_23_G10_060704.CEL  0   1
> PA009_kyckling_23_G12_060630.CEL  0   1
> PA009_kyckling_23_G20_060704.CEL  0   1
> PA009_kyckling_23_G2_060705.CEL   0   1
> 
> All samples:
>                                   mh  mb  mg  fh  fb  fg
> PA009_kyckling_11_H16_060630.CEL  1   0   0   0   0   0
> PA009_kyckling_11_H19_060705.CEL  1   0   0   0   0   0
> PA009_kyckling_11_H21_060630.CEL  1   0   0   0   0   0
> PA009_kyckling_11_H9_060704.CEL   1   0   0   0   0   0
> PA009_kyckling_12_B16_060704.CEL  0   1   0   0   0   0
> PA009_kyckling_12_B19_060704.CEL  0   1   0   0   0   0
> PA009_kyckling_12_B21_060705.CEL  0   1   0   0   0   0
> PA009_kyckling_12_B9_060630.CEL   0   1   0   0   0   0
> PA009_kyckling_13_G16_060705.CEL  0   0   1   0   0   0
> PA009_kyckling_13_G19_060630.CEL  0   0   1   0   0   0
> PA009_kyckling_13_G21_060704.CEL  0   0   1   0   0   0
> PA009_kyckling_13_G9_060705.CEL   0   0   1   0   0   0
> PA009_kyckling_21_H10_060705.CEL  0   0   0   1   0   0
> PA009_kyckling_21_H12_060705.CEL  0   0   0   1   0   0
> PA009_kyckling_21_H20_060630.CEL  0   0   0   1   0   0
> PA009_kyckling_21_H2_060704.CEL   0   0   0   1   0   0
> PA009_kyckling_22_B10_060704.CEL  0   0   0   0   1   0
> PA009_kyckling_22_B12_060630.CEL  0   0   0   0   1   0
> PA009_kyckling_22_B20_060705.CEL  0   0   0   0   1   0
> PA009_kyckling_22_B2_060630.CEL   0   0   0   0   1   0
> PA009_kyckling_23_G10_060704.CEL  0   0   0   0   0   1
> PA009_kyckling_23_G12_060630.CEL  0   0   0   0   0   1
> PA009_kyckling_23_G20_060704.CEL  0   0   0   0   0   1
> PA009_kyckling_23_G2_060705.CEL   0   0   0   0   0   1
> 
> 
> Any comments or suggestions would be greatly appreciated. Thank you!
> 
> Best regards,
> Lina Rosenberg
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

