[BioC] cross study comparison of cell populations

Wed Jun 25 11:44:27 CEST 2008

Dear List,

I want to compare a tumor cell population to several normal cell populations, all measured on Affymetrix chips but in different batches (studies).

The problem is that the tumor cell population and one of the normal cell populations come from one batch, whereas the other cell populations come from two other studies. In principle the data look like this (the number of replicates differs between cell populations, so X stands for anything between 3 and 5):

batch 1: X repl. of tumor cells
	 X repl. of cellpop A

batch 2: X repl. of cellpop B
	 X repl. of cellpop C
	 X repl. of cellpop D

batch 3: X repl. of cellpop E
	 X repl. of cellpop F
	 X repl. of cellpop G

The question is: which of the cell populations is most similar to the tumor cell population?
Of course, without correcting for the batch effect, the samples simply cluster by batch.
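
As a rough illustration of this comparison, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the data are simulated toy values (not real measurements), the function name rank_by_similarity is hypothetical, and Pearson correlation between mean expression profiles is only one of many possible similarity measures.

```python
import numpy as np

def rank_by_similarity(expr, labels, reference="tumor"):
    """Rank populations by Pearson correlation of their mean profile
    with the mean profile of the reference population.

    expr   : genes x samples array of log-expression values
    labels : one population label per sample (column of expr)
    """
    labels = np.asarray(labels)
    # Mean expression profile (centroid) of each population.
    centroids = {str(pop): expr[:, labels == pop].mean(axis=1)
                 for pop in np.unique(labels)}
    ref = centroids.pop(reference)
    scores = {pop: float(np.corrcoef(ref, c)[0, 1])
              for pop, c in centroids.items()}
    # Most similar population first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data only: 200 genes, 3 replicates per population, two batches.
rng = np.random.default_rng(0)
n_genes = 200
base = rng.normal(8.0, 2.0, size=n_genes)          # shared expression level
batch_shift = {1: rng.normal(0.0, 1.0, n_genes),   # simulated batch effects
               2: rng.normal(0.0, 1.0, n_genes)}
layout = [("tumor", 1), ("A", 1), ("B", 2), ("C", 2)]

cols, labels = [], []
for pop, batch in layout:
    pop_effect = rng.normal(0.0, 0.3, n_genes)      # simulated biology
    for _ in range(3):
        noise = rng.normal(0.0, 0.1, n_genes)
        cols.append(base + batch_shift[batch] + pop_effect + noise)
        labels.append(pop)

expr = np.column_stack(cols)
ranking = rank_by_similarity(expr, labels)
print(ranking)
```

In this toy setup the simulated batch effect is larger than the simulated biological effect, so the population that shares a batch with the tumor samples comes out on top regardless of biology, which is exactly the problem described here.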

Now, if I remove the batch effect (I used e.g. the ComBat script of Cheng Li and W.E. Johnson, Biostatistics (2007), 8(1): 118-127), the samples no longer cluster by batch, so the tumor cells no longer cluster with cellpop A. However, I think that by removing the batch effect in such an unbalanced "design" I would also remove some of the real biological differences and similarities, especially because the cell populations within batch 2 and within batch 3 are probably more similar to each other than they are across studies (batches).
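
The concern about the unbalanced "design" can be made concrete with a small sketch. It assumes 3 replicates per population (the post only says 3 to 5) and uses a plain linear-model design matrix built with NumPy; it does not reproduce what ComBat does internally. Because every population occurs in exactly one batch, the batch columns are linear combinations of the population columns, so batch and population effects cannot be estimated separately.

```python
import numpy as np

# Layout from the post; the replicate count is an assumption (X is 3 to 5).
layout = {1: ["tumor", "A"], 2: ["B", "C", "D"], 3: ["E", "F", "G"]}
n_rep = 3

batches, pops = [], []
for batch, members in layout.items():
    for pop in members:
        batches += [batch] * n_rep
        pops += [pop] * n_rep

def dummies(values):
    """Treatment-coded indicator columns, dropping the first level."""
    levels = sorted(set(values))
    return np.array([[float(v == lev) for lev in levels[1:]] for v in values])

intercept = np.ones((len(pops), 1))
# Intercept + 2 batch columns + 7 population columns = 10 columns.
X = np.hstack([intercept, dummies(batches), dummies(pops)])

rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], "rank:", rank)
```

The rank equals the number of populations (8), two less than the number of columns, which means the two batch contrasts are completely confounded with population differences. Any method that removes between-batch differences without further information therefore also removes whatever biology distinguishes the populations of one study from those of another.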

So my simple questions are:
Is it sensible at all to do this kind of comparison?
And what would be the most appropriate method? I know that there are different packages and methods (e.g. metaArray, RankProd), but I would like an opinion on which, if any, would be most appropriate in my particular case.

Thanks!
Max

--------------------------------------
Maximilian Kauer
CCRI - Children's Cancer Research Institute
Kinderspitalgasse 6
1090 Vienna Austria