[BioC] Working flow_own CEL file reading problem

Junshi Yazaki jyazaki at salk.edu
Fri May 20 20:51:42 CEST 2005


Hi Jim, Seth, Reddy, Paul,

Thank you very much for your suggestions. I think I may have made the
CDF environment. Could you please help me confirm whether the
environment is OK or not? Next I tried reading and normalizing CEL
files from our custom affy array. If my workflow could be useful for
affy beginners like me, could you please help me?

First, I typed the following:
>source("http://www.bioconductor.org/getBioC.R")
>getBioC("all")
>library(makecdfenv)
>library(affy)
>make.cdf.package("arabidopsistlgF.cdf")

Then I moved to the Terminal on my Mac,
>R CMD INSTALL arabidopsistlgFcdf
and returned to R,
>arabidopsistlgF = make.cdf.env("arabidopsistlgF.cdf")

Then I shut down my Mac. Are these steps correct for making the CDF
environment? After that I started R again.

>  source("http://www.bioconductor.org/getBioC.R")
>  getBioC()
>  library(affy)
>  Data <- ReadAffy()
>  eset <- rma(Data)

I got the error below:
***********
Note: You did not specify a download type.  Using a default value of: Source
This will be fine for almost all users

Error in getCdfInfo(object) : Could not obtain CDF environment, 
problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********
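
The two names in the error above look related: the chip name stored in
the CEL files (arabidopsis_tlgF_4x) and the package name that affy
searches for (arabidopsistlgf4xcdf). Below is a minimal base-R sketch of
that mapping, inferred only from the error text. The helper name
guess_cdf_package_name is hypothetical and not part of affy, whose own
cleancdfname() may handle more cases.

```r
# Hypothetical helper: approximates how the package name in the error
# message relates to the chip name reported by the CEL files.
guess_cdf_package_name <- function(chip_name) {
  stripped <- gsub("[_ -]", "", chip_name)  # drop underscores, spaces, dashes
  paste0(tolower(stripped), "cdf")          # lower-case and append "cdf"
}

guess_cdf_package_name("arabidopsis_tlgF_4x")
```

Under this assumption, an environment or package named after the file
arabidopsistlgF.cdf would not match the name the CEL files ask for.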
Q1. Do I need to type the following every time after a restart? If I
have to repeat this each time to make the CDF environment, this step
will take a lot of time (the CDF file is big).
**********
>source("http://www.bioconductor.org/getBioC.R")
>getBioC("all")
>library(makecdfenv)
>library(affy)
>make.cdf.package("arabidopsistlgF.cdf")
**********
Next, I tried makecdfenv again as follows:

>  env =  make.cdf.env("arabidopsistlgF.cdf")
>  library(makecdfenv)
>  env =  make.cdf.env("arabidopsistlgF.cdf")
>  cel.files=list.files(pattern=".CEL$")
>  data=ReadAffy(filenames=cel.files)
>  pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))
>  temp=rma(data)

I got the error below:
******
Note: You did not specify a download type.  Using a default value of: Source
This will be fine for almost all users

Error in getCdfInfo(object) : Could not obtain CDF environment, 
problems encountered:
Specified environment specified did not contain arabidopsis_tlgF_4x
Library - package arabidopsistlgf4xcdf not installed
Data for package affy did not contain arabidopsis_tlgF_4x
Bioconductor - arabidopsistlgf4xcdf not available
*********
So I made a copy of "arabidopsistlgF.cdf", renamed it to
"arabidopsistlgF4x.cdf", and continued:

>   env =  make.cdf.env("arabidopsistlgF4x.cdf")
>  cel.files=list.files(pattern=".CEL$")
>  data=ReadAffy(filenames=cel.files)
>
>  pname<- cleancdfname(whatcdf("J_HpaII_Wt_10uM.CEL"))

I got an error again:
********
Error in whatcdf("J_HpaII_Wt_10uM.CEL") : Could not open file 
J_HpaII_Wt_10uM.CEL
********
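
A "Could not open file" error can simply mean that the file is not in
the current working directory under exactly that name. Below is a small
base-R check, as a sketch only. The helper name find_cel_files is made
up for illustration.

```r
# Hypothetical helper: list CEL files in a directory and return full paths.
find_cel_files <- function(dir = getwd()) {
  # "\\.CEL$" escapes the dot; ignore.case also accepts names ending in ".cel"
  list.files(dir, pattern = "\\.CEL$", full.names = TRUE, ignore.case = TRUE)
}

cel.files <- find_cel_files()
print(getwd())                                  # directory R is looking in
print(basename(cel.files))                      # CEL files actually found there
"J_HpaII_Wt_10uM.CEL" %in% basename(cel.files)  # is the expected file present?
```

If the last line gives FALSE, the file name or the working directory is
the first thing to check before looking at the CDF environment.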

I thought I might need gcrma for CEL file normalization, so I tried:

>  library(gcrma)
Loading required package: matchprobes
>  Data <- ReadAffy()
>  eset <- gcrma(Data)

I got an error again:
********
Computing affinities[1] "Checking to see if your internet connection works..."
Warning message:
unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Source 
does not seem to have a valid repository, skipping
Warning messages:
1: Failed to read replisting at 
http://www.bioconductor.org/repository/devel/package/Source in: 
getReplisting(repURL, repFile, method = method)
2: unable to connect to 'www.bioconductor.org' on port 80.
Note: http://www.bioconductor.org/repository/devel/package/Win32 does 
not seem to have a valid repository, skipping
Note: You did not specify a download type.  Using a default value of: Source
This will be fine for almost all users

Error in getCDF(cdfpackagename) : Environment arabidopsistlgf4xcdf 
was not found in the Bioconductor repository.
In addition: Warning message:
Failed to read replisting at 
http://www.bioconductor.org/repository/devel/package/Win32 in: 
getReplisting(repURL, repFile, method = method)
********
Q2. I cannot read my CEL files now. Our CDF file is named
"arabidopsistlgF.cdf", but the CIF file is named
"arabidopsistlgF_4x.cif". Do I need to use the same name for the CIF
and CDF files? I ask because the CEL file includes the CIF file name.
And how can I start reading the CEL files?
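
One possible approach, sketched under the assumption (taken from the
error messages earlier in this message) that affy looks for an installed
package called arabidopsistlgf4xcdf: build the CDF package once under
that name. packagename is an argument of make.cdf.package(); newer
makecdfenv versions may also require a species argument. The wrapper
name build_matching_cdf_package is hypothetical.

```r
# Hypothetical wrapper: build a CDF package whose name matches what the
# CEL files ask for, so it only has to be created and installed once.
build_matching_cdf_package <- function(cdf_file, package_name, ...) {
  if (!requireNamespace("makecdfenv", quietly = TRUE)) {
    stop("the makecdfenv package is required")
  }
  makecdfenv::make.cdf.package(cdf_file, packagename = package_name, ...)
}

# Not run here (needs the CDF file in the working directory):
# build_matching_cdf_package("arabidopsistlgF.cdf", "arabidopsistlgf4xcdf")
# Then, once, in a terminal:  R CMD INSTALL arabidopsistlgf4xcdf
# After that, in any new R session:
#   library(affy); Data <- ReadAffy(); eset <- rma(Data)
```

If this assumption holds, the CDF and CIF files themselves would not
need to be renamed; only the package name would have to match.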

Q3. I would also like to read and normalize a large number of CEL
files. Could you please suggest which package is better for reading
and normalizing data from a custom affy array? And which is better:
rma (Robust Multi-Array Average expression measure) or gcrma
(Background adjustment using sequence information)?

Q4. If our array has over 3 million data points, how long should
reading and normalization take for one array (does it depend on
machine power)? Do you have any estimate of the computation time?
Reading the CDF file alone takes me about 20 minutes.

Thank you very much,
Junshi
-- 
***********************************************************
***********************************************************
Junshi Yazaki

The Salk Institute for Biological Studies


