[BioC] Affxparser: how to create CEL-file from AffyBatch

Henrik Bengtsson hb at stat.berkeley.edu
Wed Jul 2 23:42:02 CEST 2008


Hi.

On Wed, Jul 2, 2008 at 1:36 PM, James W. MacDonald
<jmacdon at med.umich.edu> wrote:
> Hi Guido,
>
> Hooiveld, Guido wrote:
>>
>> Dear list,
>>  My basic understanding of R/BioC doesn't allow me to solve my problem,
>> so I hope someone can point me to the right direction.
>>  Basically, my question is how to extract/create/save the data embedded
>> in an AffyBatch object as individual *.CEL files (using Affxparser).
>>  -->
>> I am using HarshLight to 'correct' a bunch of CEL files (12x MOE430A
>> arrays) for blemish. Running HarshLight goes fine, but I then would like
>> to save the corrected files (as *.CEL) to my HDD.
>> A function called 'Helpers' (available @ the site of the Harshlight
>> developers) normally used for this, doesn't work for me... See below for
>> error (I will also directly contact the authors about this).
>>  Hinted by a recent thread, I tried to give affxparser a try. Main reason
>> for this is that this likely is a more generic way of saving CEL files
>> from an AffyBatch object than using 'Helpers'.
>>  http://article.gmane.org/gmane.science.biology.informatics.conductor/18455
>> <cut/paste>
>> From: James W. MacDonald <jmacdon at ...>
>> Subject: Re: AffyBatch -> CEL File
>> Newsgroups: gmane.science.biology.informatics.conductor
>> Date: 2008-06-16 14:18:16 GMT
>>
>> Hi Markus,
>>  Markus Schmidberger wrote:
>>
>>>
>>> Hello,
>>>
>>> is there any function or package to create a CEL File (or CEL files) from
>>> an AffyBatch?
>>>
>>
>> Seems to me you could use createCel() and updateCel() from the affxparser
>> package.
>>  Best,
>> Jim
>> </cut/paste>
>> But now I am getting lost....
>> After checking the Affxparser help pages I tried this:
>>
>> # Note: next 4 lines directly copied from help pages Affxparser:
>> > celFiles <- list.files(pattern="[.](c|C)(e|E)(l|L)$")
>> > if (length(celFiles) == 0)
>> +     break;
>> > celFile <- celFiles[1]
>> > celFile
>> [1] "A23_mIntestine_KOWY7.CEL"
>>
>> # usage createCel should be: "createCel(outFile, hdr, overwrite=TRUE)"
>> > hdr <- readCelHeader(celFile)
>> > createCel("celGuido.CEL", hdr, overwrite=TRUE)
>> Warning message:
>> In createCel("celGuido.CEL", hdr, overwrite = TRUE) :
>>  Could not find a CDF file for this chip type: MOE430A
>> > traceback()
>> No traceback available
>>
>> In other words, I cannot get the first step to work [createCel()],
>> let alone the 2nd [updateCel()].
>>
>
> Where exactly is the error? I see a warning that you don't have the CDF
> file, and I assume you know how to get that. But a warning is not an error,
> so what exactly is the problem?

I can confirm that this warning is not a serious problem; createCel()
tries to validate that the CEL header 'hdr' is consistent with the
corresponding CDF.  The CEL header refers to chip type 'MOE430A' and
therefore it tries to locate the CDF *file* named 'MOE430A.CDF' (see
?findCdf for affxparser on how it is found).  If the CDF cannot be
found, the CEL header is not validated, but assumed to be correct.
[in the next release the warning message will be more specific about
this].
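
To check up front whether a CDF can be located, something along the
following lines should work.  This is only a sketch: check_cdf() is a
made-up helper name, and the only affxparser call it relies on is
findCdf(), which returns NULL when no matching file is found (see
?findCdf for the actual search rules).

```r
# Hypothetical helper (not part of affxparser): report whether a CDF file
# for the given chip type can be located by affxparser::findCdf().
check_cdf <- function(chipType) {
  if (!requireNamespace("affxparser", quietly = TRUE)) {
    stop("The affxparser package is not installed")
  }
  pathname <- affxparser::findCdf(chipType)
  if (is.null(pathname)) {
    # Same situation that triggers the createCel() warning.
    message("No CDF file found for chip type: ", chipType)
    return(invisible(NULL))
  }
  pathname
}

# Usage (not run): check_cdf("MOE430A")
```

If this returns NULL, createCel() will still write the file; it just
skips the consistency check against the CDF.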

So, have a look in the current directory.  You will find a file
'celGuido.CEL' there after calling createCel().  This is a valid CEL
file, but all its elements are zero.  Next step is to use updateCel()
to fill it with values.  You have to make sure that the order of the
AffyBatch probe intensities is the one you want for the CEL file.  If
not, you can either remap it yourself or use argument 'writeMap' to
specify this.
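
The two steps above could be wrapped roughly as follows.  This is a
minimal sketch, not a tested writeCel() method: write_cels() and its
arguments are hypothetical names, and it assumes that each column of
the intensity matrix (e.g. intensity(ab) from affy) holds one value
per cell, in the same order as the cell indices of the CEL file.  If
that assumption does not hold, reorder the rows first or pass a
'writeMap'.

```r
# Hypothetical helper (not part of affxparser): write each column of an
# intensity matrix to its own CEL file via createCel() + updateCel().
write_cels <- function(intensities, templateCel, outFiles, writeMap = NULL) {
  stopifnot(is.matrix(intensities), ncol(intensities) == length(outFiles))

  # Reuse the header of an existing CEL file of the same chip type.
  hdr <- affxparser::readCelHeader(templateCel)

  # Each column must provide exactly one value per cell on the array.
  stopifnot(nrow(intensities) == hdr$total)

  for (k in seq_along(outFiles)) {
    # Step 1: create an all-zero CEL file (warns if no CDF is found).
    affxparser::createCel(outFiles[k], header = hdr, overwrite = TRUE)
    # Step 2: fill in the intensities for all cells.
    affxparser::updateCel(outFiles[k], intensities = intensities[, k],
                          writeMap = writeMap)
  }
  invisible(outFiles)
}

# Hypothetical usage with the objects from this thread (not run):
#   library(affy)
#   outFiles <- paste("hl-", sampleNames(ab), sep = "")
#   write_cels(intensity(ab), celFiles[1], outFiles)
```

It is worth reading one of the new files back with readCel() and
comparing its intensities against the corresponding AffyBatch column
(allowing for rounding, since CEL files store single-precision values)
before trusting the cell order.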

In the bigger picture: Since this is a feature of interest for more
people, "someone" should write a generic writeCel() method for the
AffyBatch class (that also does all the validation etc).  It's on my
todo list, but very far down, so don't count on me to do this.  Also
have a look at aroma.affymetrix, which stores all your probe
signals automatically as CEL files.

Cheers

/Henrik

>
>>  Therefore, to get me further, I would appreciate if someone could
>> provide some lines of code on how to extract/save multiple CEL files
>> from an AffyBatch object.
>>  Many thanks in advance,
>> Guido
>>    HarshLight:
>> http://www.bioconductor.org/packages/2.2/bioc/html/Harshlight.html
>> For 'Helpers':
>> http://asterion.rockefeller.edu/Harshlight/index2.html
>>  Full Code:
>> library(affy)
>> library(Harshlight)
>> library(Helpers)
>> library(affxparser)
>>
>> data <- ReadAffy()
>>
>> > data
>> AffyBatch object
>> size of arrays=712x712 features (9 kb)
>> cdf=MOE430A (22690 affyids)
>> number of samples=12
>> number of genes=22690
>> annotation=moe430a
>> notes=
>>
>> > ab <- Harshlight(data, na.sub = FALSE)
>> [1] "Generating Error Images"
>> [1] "Initializing Harshlight"
>> [1] "Analyzing chip number 1"
>> [1] "Analyzing chip number 2"
>> [1] "Analyzing chip number 3"
>> [1] "Analyzing chip number 4"
>> [1] "Analyzing chip number 5"
>> [1] "Analyzing chip number 6"
>> [1] "Analyzing chip number 7"
>> [1] "Analyzing chip number 8"
>> [1] "Analyzing chip number 9"
>> [1] "Analyzing chip number 10"
>> [1] "Analyzing chip number 11"
>> [1] "Analyzing chip number 12"
>> [1] "Substituting values"
>>
>> > ab
>> AffyBatch object
>> size of arrays=712x712 features (9 kb)
>> cdf=MOE430A (22690 affyids)
>> number of samples=12
>> number of genes=22690
>> annotation=moe430a
>> notes=
>>
>> > library(Helpers)
>> > WriteAbatch(ab,prefix="hl")
>> [1] "handling file: A23_mIntestine_KOWY7.CEL"
>> The file does not look like a CEL file in TXT format.
>> End of file reached unexpectedly. Perhaps this file is truncated.
>> [1] "An error occurre while trying to write to: hl-A23_mIntestine_KOWY7.CEL"
>> [1] "handling file: A23_mIntestine_KOWY8.CEL"
>> The file does not look like a CEL file in TXT format.
>> # <snip>; the same error for all 12 arrays
>>
>> I then continued with using Affxparser as described above.
>>
>> > sessionInfo()
>> R version 2.7.0 (2008-04-22)
>> i386-pc-mingw32
>>
>> locale:
>> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>>
>> attached base packages:
>> [1] tools     stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] Helpers_0.2-1        affydata_1.11.3      Biobase_2.0.1        affxparser_1.12.2    Harshlight_1.10.0
>>  [6] altcdfenvs_2.2.0     hypergraph_1.12.0    graph_1.18.1         Biostrings_2.8.4     makecdfenv_1.18.0
>> [11] matchprobes_1.12.0   affy_1.18.0          preprocessCore_1.2.0 affyio_1.8.0
>>
>> loaded via a namespace (and not attached):
>> [1] cluster_1.11.10
>>
>>
>>
>> ------------------------------------------------
>> Guido Hooiveld, PhD
>> Nutrition, Metabolism & Genomics Group
>> Division of Human Nutrition
>> Wageningen University
>> Biotechnion, Bomenweg 2
>> NL-6703 HD Wageningen
>> the Netherlands
>> tel: (+)31 317 485788
>> fax: (+)31 317 483342
>> internet: http://nutrigene.4t.com
>> email: guido.hooiveld at wur.nl
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>


