[BioC] Nested Design (Again) & Subset WithinArray Correlation

Tue Jul 27 13:57:36 CEST 2010

Hello, 

I have two questions that may be trivial, but since I am stuck I would
appreciate any help.

Question 1: Nested design. This has been addressed before, but I am not
sure whether I am doing it right. The experiment consisted of two groups of
fish (treated and untreated) with three tanks in each group. Each tank
hosted three fish (18 in total), of which n=10 (5 per treatment group)
were selected for microarray analysis (note the unequal number of fish per
tank!).

I am interested in:
  1) Treatment effect (individual fish)
  2) Treatment effect (fish nested within tanks, i.e. averaging the gene
     expression of the fish within each tank)
  3) Whether there is a tank effect
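
To make item 2 concrete, this is roughly the tank-level averaging I have in
mind. It is only a sketch in base R: the expression values are made-up toy
numbers (my real data are in ES_Filt), and the names expr, tank_means and
tank_key are mine, not from any package. The tank and treatment labels are
copied from my targets table.

```r
# Tank and treatment labels for the 10 arrays, in targets-table order
tank <- c(1, 1, 2, 3, 4, 4, 4, 5, 5, 6)
key  <- c("CON", "CON", "EXP", "CON", "EXP", "EXP", "EXP", "CON", "CON", "EXP")

# Toy expression matrix (made-up numbers): 3 genes x 10 arrays
set.seed(1)
expr <- matrix(rnorm(30), nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("array", 1:10)))

# rowsum() adds up rows by group, so transpose to arrays x genes,
# sum per tank, divide by the number of fish per tank, transpose back
tank_sums  <- rowsum(t(expr), group = tank)
tank_n     <- as.vector(table(tank))
tank_means <- t(tank_sums / tank_n)      # genes x tanks

# One treatment label per tank (each tank belongs to one treatment group)
tank_key <- tapply(key, tank, unique)
```

This would leave one column per tank (six in total), so the treatment
comparison would then be three tanks against three tanks.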

#ExpressionSet =ES_Filt
#targets= see below:

Sample              Key  tank  Fish   SAMPLE_LABEL
25407102_532.xys    CON  1     CON_3  SOM01K28
25407202_532.xys    CON  1     CON_2  SOM01K29
25414902_532.xys    EXP  2     EXP_1  SOM01K2D
25407302_532.xys    CON  3     CON_1  SOM01K2C
25406602_532.xys    EXP  4     EXP_2  SOM01K25
25407002_532.xys    EXP  4     EXP_3  SOM01K27
25415502_532.xys    EXP  4     EXP_4  SOM01K2E
25405602_532.xys    CON  5     CON_4  SOM01K23
25406702_532.xys    CON  5     CON_5  SOM01K26
25415702_532.xys    EXP  6     EXP_5  SOM01K24

I have tried the following design based upon what I found online, but was
not really sure whether this is the right way of doing it.

design.nested_ES<- model.matrix(~Key + (tank/Fish), data=targets)
colnames(design.nested_ES)
#I am getting many contrasts, and I am not sure which one represents
#"tank/Fish"

fit.nested_ES <- lmFit(ES_Filt, design.nested_ES)
Fit.nested_ES <- eBayes(fit.nested_ES)
Pred2_Nested_ES <- topTable(Fit.nested_ES, coef=2, adjust.method="BH", number=Inf)
Pred2_Nested_ES[1:10,]
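
In case it helps anyone reproduce the problem, the design can be inspected
with base R alone. This is a sketch that assumes tank and Fish are treated
as factors (in my real targets object they may be read in differently); the
names n_arrays, n_coef and design_rank are just mine.

```r
# Targets table from above, with tank and Fish coded as factors (assumption)
targets <- data.frame(
  Key  = factor(c("CON", "CON", "EXP", "CON", "EXP",
                  "EXP", "EXP", "CON", "CON", "EXP")),
  tank = factor(c(1, 1, 2, 3, 4, 4, 4, 5, 5, 6)),
  Fish = factor(c("CON_3", "CON_2", "EXP_1", "CON_1", "EXP_2",
                  "EXP_3", "EXP_4", "CON_4", "CON_5", "EXP_5"))
)

# Same formula as in my design.nested_ES
design <- model.matrix(~ Key + tank/Fish, data = targets)

n_arrays    <- nrow(design)      # one row per array
n_coef      <- ncol(design)      # one column per coefficient
design_rank <- qr(design)$rank   # number of linearly independent columns

cat("arrays:", n_arrays, " columns:", n_coef, " rank:", design_rank, "\n")
```

The design has more columns than arrays, and the rank cannot exceed the
number of arrays, so I suspect that not all of these coefficients can be
estimated. That may be part of my problem.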

I will really appreciate your help.

Question 2: Testing a subset of within-array replicates with different gene
names. I have a subset of "overlapping" genes [listed below] and I would like
to see how well they correlate, in order to assess the hybridization
efficiency on the chip. The sequences and the probes are not identical, but
they overlap significantly. From reading previous postings, I know I can't
use duplicateCorrelation, because the probes are randomly scattered on the
array, and I am not sure how to use "avedups" on a subset of genes with
different names.

GENSCAN_ID            Matched transcript ID
GENSCAN00000010293    ENSGACT00000002218
GENSCAN00000003508    ENSGACT00000001310
GENSCAN00000021873    ENSGACT00000000225
GENSCAN00000007931    ENSGACT00000000496
GENSCAN00000022171    ENSGACT00000002296
GENSCAN00000026278    ENSGACT00000000071
GENSCAN00000000631    ENSGACT00000002139
GENSCAN00000008636    ENSGACT00000002427
GENSCAN00000008635    ENSGACT00000002432
GENSCAN00000022111    ENSGACT00000007564
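
One naive possibility I have considered is to simply correlate each pair of
probes across the arrays with base R. The sketch below assumes the
expression matrix has the probe IDs as row names; it uses only the first two
pairs from the list and made-up toy values, and the names probe_pairs, expr
and pair_cor are mine.

```r
# First two pairs from the list above
probe_pairs <- data.frame(
  genscan = c("GENSCAN00000010293", "GENSCAN00000003508"),
  ensembl = c("ENSGACT00000002218", "ENSGACT00000001310"),
  stringsAsFactors = FALSE
)

# Toy expression matrix (made-up numbers): 4 probes x 10 arrays,
# with probe IDs as row names (assumption about my real data)
set.seed(1)
expr <- matrix(rnorm(40), nrow = 4,
               dimnames = list(c(probe_pairs$genscan, probe_pairs$ensembl),
                               paste0("array", 1:10)))

# Pearson correlation across arrays for each GENSCAN/ENSGACT pair
pair_cor <- mapply(function(a, b) cor(expr[a, ], expr[b, ]),
                   probe_pairs$genscan, probe_pairs$ensembl)
```

pair_cor then holds one correlation per pair, named by the GENSCAN ID. I do
not know whether this is a sensible measure of hybridization efficiency.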

Thank you so much, and my apologies if this has been addressed before (in
which case, please point me to the discussion).

Cheers,

Osee