[BioC] transferring AffyBatch object and 64 bit R

Donna Toleno toleno at usc.edu
Fri Feb 8 19:19:16 CET 2008


I am having some technical difficulties with my Affymetrix gene expression analysis. I recently received an account on a Linux cluster because I want to analyze some large data sets. I installed the extra packages I need into a local library, compiled them as 64-bit, and tested that they load in an R session. R itself is compiled as a 64-bit application.

I have access to an account on one other cluster as well, but I don't have much disk space to spare there. I also have my personal MacBook, and a Windows computer at work. I would now like to do my analysis on the new cluster. To start, I want to put my AffyBatch object on the new system, and I tried transferring it several different ways.

Transferring uncompressed objects from Linux to Linux with command-line scp gave the best results, but I still have a problem when the AffyBatch object loads, which I describe further down.
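
For reference, saving without compression and checking the file around the scp step could look roughly like this. It is only a minimal sketch: `abatch` is a small stand-in object rather than the real AffyBatch, and the file name is hypothetical.

```r
# Stand-in for the real AffyBatch (hypothetical object, for illustration only)
abatch <- list(samples = 55, cdf = "HG-U133_Plus_2")

# Save without compression
rda_file <- file.path(tempdir(), "abatch.RData")
save(abatch, file = rda_file, compress = FALSE)

# Checksum before the transfer; run the same call on the destination
# after scp and compare the two strings
sum_before <- unname(tools::md5sum(rda_file))

# Loading into a fresh environment shows what the file actually contains
env <- new.env()
loaded_names <- load(rda_file, envir = env)
round_trip_ok <- identical(env$abatch, abatch)
```

If the checksum matches on both machines, the file itself survived the transfer unchanged, and any remaining problem is more likely in the R session that loads it.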

I am still a bit confused about 32-bit vs 64-bit systems. Do objects carry with them information about the operating system?
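
One way to compare the R builds on the different machines is to print the pointer size and architecture from within each R session. The sketch below uses only base R. The documentation for save() describes the binary format as portable across R platforms, so the saved file itself should not depend on 32-bit vs 64-bit, though that is worth confirming for the R versions involved.

```r
# Pointer size in bytes: 8 on a 64-bit build of R, 4 on a 32-bit build
pointer_bytes <- .Machine$sizeof.pointer
r_bits <- pointer_bytes * 8

# Architecture and platform string of the running R
r_arch <- R.version$arch
r_platform <- R.version$platform

cat("R build:", r_bits, "bit;", r_arch, ";", r_platform, "\n")
```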

Another side note is that I had to load each library separately, including the dependencies in the proper order. For example:

library(puma, lib.loc = 'path/to_my_local/R_libraries')

will fail if I don't first do

library(ROCR, lib.loc = 'same_path')
library(gtools, lib.loc = 'same_path')
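
An alternative that may avoid loading each dependency by hand is to put the local library on R's search path with .libPaths() before calling library(), so that dependencies are looked up there as well. A minimal sketch, with a temporary directory standing in for the placeholder path above:

```r
# Stand-in for 'path/to_my_local/R_libraries' (hypothetical directory)
local_lib <- file.path(tempdir(), "R_libraries")
dir.create(local_lib, showWarnings = FALSE, recursive = TRUE)

# Prepend the local library to the search path for this session
.libPaths(c(local_lib, .libPaths()))

# library(puma) would now search local_lib first, both for puma and for
# the packages it depends on (ROCR, gtools)
local_lib_first <- normalizePath(.libPaths()[1]) == normalizePath(local_lib)
```

Setting the R_LIBS environment variable in the job script is another documented way to get the same effect, which may matter for sessions started through R CMD BATCH.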


When I load the data in R on the head node (64-bit login), all the packages load properly and the AffyBatch displays correctly.

To do my real work I need to submit a script to the queue, and I submit it to the 64-bit processors. The script copies the R object to the temporary directory where I am supposed to do my work, and then uses R CMD BATCH to load the AffyBatch object. At this point the object loads but does not display properly: it does not have the cdf information attached to it. When I display the AffyBatch object it looks like this:

AffyBatch object
size of arrays=1164x1164 features
cdf=HG-U133_Plus_2 (??? affyids)
number of samples=55
Error in getCdfInfo(object) :
   Could not obtain CDF environment, problems encountered:
Specified environment does not contain HG-U133_Plus_2
Library - package hgu133plus2cdf not installed
Data for package affy did not contain hgu133plus2cdf
Bioconductor - could not connect
Calls: <Anonymous> ... <Anonymous> -> cat -> featureNames -> featureNames -> getCdfInfo
In addition: Warning message:
missing cdf environment! in show(AffyBatch)
Execution halted
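
The error lists the places that were searched for the CDF environment, including an installed hgu133plus2cdf package. A small diagnostic at the top of the batch script could show whether that package is visible from the compute node. This is only a sketch: the package name is taken from the error output above, and nothing in it is specific to the cluster.

```r
# Package name taken from the error output
cdf_pkg <- "hgu133plus2cdf"

# Library directories that this R session (e.g. under R CMD BATCH) searches
print(.libPaths())

# TRUE if the package is installed in one of those directories
cdf_installed <- nzchar(system.file(package = cdf_pkg))

if (cdf_installed) {
  cat(cdf_pkg, "found in", dirname(system.file(package = cdf_pkg)), "\n")
} else {
  cat(cdf_pkg, "not found in any library on .libPaths()\n")
}
```

Running the same lines on the head node and inside the queued job would show whether the two sessions search different library directories.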

Any ideas or clarifications about what is going on would be helpful. The computer support people don't know much about Bioconductor or R. I would appreciate any advice or even questions to ask the computer support people.

Thank you in advance.

