[BioC] How to analyze Affy data, CEL files not available

James W. MacDonald jmacdon at med.umich.edu
Wed Feb 7 19:02:52 CET 2007


Hi Bobby,

Bobby Prill wrote:
> I would like to analyze a set of 40 Affy experiments, but I do not  
> have the CEL files.  What I have is a spreadsheet of the MAS  
> expression measures, one column per array.  Each row corresponds to  
> one gene.
> 
> I load the data:
> eset = read.exprSet(exprs="mydata.txt", phenoData="phenoData.txt")
> 
> My general question is, should/can I perform some sort of  
> normalization so that the arrays are comparable from one to  
> another?   or is this what MAS has already done?  (I'm not familiar  
> with Affy MAS.)
> 
> Other problems include:
> 
> 1. MA plots indicate that the data cloud is skewed (not perfectly  
> centered on M==0 line).  Should I loess?

Almost certainly not. Loess normalization is almost always used as an 
intra-array normalization for two-color spotted cDNA microarrays, so it 
is not really suited to single-channel Affy chips. I would look at a 
boxplot of the data to see if the samples tend to line up. MAS5.0 
usually ends up doing a scaling and centering of the data, so you will 
likely see boxes with fairly equal medians and inter-quartile ranges.
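
A minimal sketch of that boxplot check in base R. The matrix `x` is 
simulated stand-in data, not anything from your spreadsheet; with a real 
exprSet you would presumably start from exprs(eset) instead. It also 
assumes the MAS values are on the raw scale, hence the log2.

```r
# Simulated stand-in for a probesets-by-arrays matrix of MAS5 signals
# (hypothetical data; with a real object use x <- exprs(eset)).
set.seed(1)
x <- matrix(2^rnorm(1000 * 4, mean = 8, sd = 2), ncol = 4,
            dimnames = list(NULL, paste0("array", 1:4)))

# MAS5 signals are on the raw scale, so take log2 before comparing arrays.
lx <- log2(x)

# One box per array; roughly equal medians and box heights suggest the
# arrays are already comparable.
boxplot(as.data.frame(lx), ylab = "log2 expression")

# The same check numerically: per-array median and inter-quartile range.
array_medians <- apply(lx, 2, median)
array_iqrs    <- apply(lx, 2, IQR)
print(round(rbind(median = array_medians, IQR = array_iqrs), 2))
```

If one array's median or IQR sits well away from the rest, that array 
deserves a closer look before any higher level analysis.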

I suppose you could do a quantile normalization at this point, but that 
may be neither necessary nor a good idea.
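
For reference, here is a rough base R sketch of what a quantile 
normalization does. The function name quantile_normalize and the data 
are made up for illustration; this is not the code any package uses, 
ties are broken by position rather than averaged, and missing values 
are not handled.

```r
# Hand-rolled quantile normalization (illustrative only).
quantile_normalize <- function(m) {
  # Reference distribution: mean of the k-th smallest value across arrays.
  ref <- rowMeans(apply(m, 2, sort))
  # Give each value the reference value that matches its within-array rank.
  out <- apply(m, 2, function(col) ref[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(m)
  out
}

# Hypothetical log2 data: three arrays with different locations and spreads.
set.seed(2)
lx <- cbind(a = rnorm(500, 8, 2),
            b = rnorm(500, 9, 2.5),
            c = rnorm(500, 7.5, 1.5))

qn <- quantile_normalize(lx)

# After normalization every array holds the same set of values, so the
# per-array quantiles coincide.
print(apply(qn, 2, quantile))
```

In practice you would more likely use a packaged version, for example 
normalizeBetweenArrays() in limma with method = "quantile".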

> 
> 2.  Also, the M values have high variance at low A, which I think is  
> a byproduct of the MAS. Probably nothing I can do about this.

Nope.
> 
> I think the typical advice would be to obtain CEL files and run
> rma().  But if I'm stuck with the MAS expression calls, what to do?

I would make sure the boxplots line up reasonably well, then go on to 
higher level analyses. If you have the P/M/A calls you can filter out 
the probesets that are called 'absent', or use one of the various 
filtering options in the genefilter package.
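
A small sketch of a P/M/A filter in base R, assuming the calls come as a 
character matrix laid out like the expression values. The matrices and 
the min_present threshold are invented for illustration.

```r
# Hypothetical detection calls: probesets in rows, arrays in columns,
# each entry one of "P" (present), "M" (marginal) or "A" (absent).
calls <- matrix(c("P", "P", "A", "P",
                  "A", "A", "A", "M",
                  "P", "M", "P", "P",
                  "A", "A", "A", "A"),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("probe", 1:4), paste0("array", 1:4)))

# Matching hypothetical log2 expression values.
set.seed(3)
lx <- matrix(rnorm(16, mean = 8), nrow = 4, dimnames = dimnames(calls))

# Keep a probeset if it is called present on at least `min_present` arrays.
# The threshold is an arbitrary choice for illustration.
min_present <- 2
keep <- rowSums(calls == "P") >= min_present

filtered <- lx[keep, , drop = FALSE]
print(rownames(filtered))
```

If you only have the expression values, genefilter offers 
intensity-based filters instead, such as pOverA() combined with 
filterfun() and genefilter().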

HTH,

Jim


> 
> Thanks.
> 
> - Bobby
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

