[BioC] ReadAffy question

Kimpel, Mark William mkimpel at iupui.edu
Tue Jan 17 15:59:25 CET 2006

I work with CEL files whose names are frequently assigned at random
with respect to phenotype. I create my pdata files by modifying
spreadsheets that already have file and phenotype information in the
appropriate columns. I had been assuming that the order of the
filenames in the first column of the pdata file did not matter: that
after being read in, each CEL file would be matched by name to the
appropriate row in pdata and would thus have the correct phenotype
assigned.

Some recent work has indicated to me that this is probably NOT the
case. Instead, it appears that the files are read in alphanumeric order
by filename, and that the phenotype and sample name are assigned by the
row order of the pdata file. This, of course, will often result in
incorrect sample names and phenotypes being assigned to files.
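
To make the suspected problem concrete, here is a small sketch in
Python. It is not affy code and does not call ReadAffy; it only
contrasts pairing files with pdata rows by position against pairing
them by filename. The file names and phenotypes are made up for
illustration, and the positional behavior is my assumption about what
is happening, not something I have confirmed.

```python
# Hypothetical pdata table: (filename, phenotype), in spreadsheet order.
pdata_rows = [
    ("c.CEL", "treated"),
    ("a.CEL", "control"),
    ("b.CEL", "treated"),
]

# Assumption: the CEL files come back in alphanumeric order by filename.
files_read = sorted(name for name, _ in pdata_rows)


def assign_by_position(files, rows):
    """Pair the i-th file read with the i-th pdata row (suspected behavior)."""
    return {f: rows[i][1] for i, f in enumerate(files)}


def assign_by_name(files, rows):
    """Look up each file's phenotype by its filename (expected behavior)."""
    lookup = dict(rows)
    missing = [f for f in files if f not in lookup]
    if missing:
        raise KeyError(f"no pdata row for: {missing}")
    return {f: lookup[f] for f in files}


by_position = assign_by_position(files_read, pdata_rows)
by_name = assign_by_name(files_read, pdata_rows)

# Files whose phenotype differs between the two pairing schemes.
mislabeled = [f for f in files_read if by_position[f] != by_name[f]]
print(mislabeled)
```

If the positional pairing is what actually happens, the safe approach
would be either to sort the pdata rows by filename before reading, or
to verify after reading that the sample names line up with the pdata
rows.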

I have searched the documentation and help files for an answer to this
question to no avail.

How is this supposed to work?


Version 2.3.0 Under development (unstable) (2006-01-01 r36947) 

attached base packages:
 [1] "tcltk"     "splines"   "tools"     "methods"   "stats"
 [7] "grDevices" "utils"     "datasets"  "base"     

other attached packages:
    tkWidgets        DynDoc    reposTools   widgetTools    rat2302cdf 
      "1.9.0"       "1.9.0"       "1.9.1"       "1.7.0"       "1.5.1" 
affycoretools       GOstats      multtest    genefilter      survival 
      "1.3.1"       "1.5.4"       "1.8.0"       "1.9.2"        "2.20" 
       xtable          RBGL      annotate            GO         graph 
      "1.3-0"       "1.7.6"       "1.8.0"       "1.6.5"       "1.9.4" 
        Ruuid       cluster         limma          affy       Biobase 
      "1.9.0"      "1.10.2"       "2.4.4"       "1.9.6"       "1.9.2" 

Mark W. Kimpel
I.U. School of Medicine

