[BioC] Processing exon array in bioconductor

James W. MacDonald jmacdon at med.umich.edu
Thu Dec 18 22:06:03 CET 2008

Hi Jennifer,

Barb, Jennifer (NIH/CIT) [E] wrote:
> Hello All, I am fairly new to processing exon array CEL files using
> the Affy package in Bioconductor and I was trying to obtain RMA
> values for 6 chips on the Human Exon 1.0 st array and I received the
> following error message after the rma command was run:
>> eset<-rma(data)

The affy package is not designed to work with the exon arrays. There are 
three packages designed to analyze these arrays: oligo, xps, and 
exonmap. I just wrote a blurb that will appear (or maybe already has 
appeared) on the BioC website, showing what each of them requires. The 
relevant portion looks like this:

Affymetrix Exon ST Arrays:

oligo
  - Requires a pdInfoPackage built using pdInfoBuilder
  - This package collates cdf, probe, annotation data together
  - These packages are available from Bioconductor via biocLite()

exonmap
  - Requires installation of MySQL and Ensembl core database tables
  - Requires specially modified cdf (available at
    http://xmap.picr.man.ac.uk/download/) and affy package

xps
  - Requires installation of ROOT
  - Uses data files from Affymetrix (.CDF, .PGF, .CLF, .CSV) directly

The part about the pdInfoPackage being available appears not to be quite 
true as yet. We built them, but there were a couple of technical aspects 
that need to get fixed before they will be downloadable via biocLite(). 
And I have no idea how long that might take, although if Marc Carlson 
sees this he may chime in.
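
Once a pdInfoPackage for the array is installed (however you get it), the 
oligo route would look roughly like the sketch below. This is untested 
against your data: the function name exon_rma and the directory path are 
just placeholders, the package name pd.huex.1.0.st.v2 is my assumption 
for what the HuEx-1_0-st-v2 package will be called, and the target 
argument to rma() may not exist in every oligo version, so check ?rma in 
your installation.

```r
# Sketch only: assumes oligo and a pdInfoPackage for HuEx-1_0-st-v2
# (assumed here to be called pd.huex.1.0.st.v2) are already installed.
exon_rma <- function(cel_dir, target = "core") {
  # Collect the CEL files from a directory (cel_dir is a placeholder path).
  cel_files <- list.files(cel_dir, pattern = "\\.cel$",
                          ignore.case = TRUE, full.names = TRUE)
  if (length(cel_files) == 0) {
    stop("no CEL files found in ", cel_dir)
  }
  # read.celfiles() reads the chip type from the CEL headers and loads
  # the matching pdInfoPackage, so no CDF environment is involved.
  raw <- oligo::read.celfiles(cel_files)
  # For exon arrays, 'target' chooses the summarization level
  # (an assumption about your oligo version; see ?rma).
  oligo::rma(raw, target = target)
}

# Usage (not run):
# eset <- exon_rma("path/to/cel/files")
# head(Biobase::exprs(eset))
```

The point is that oligo's rma() replaces the affy call that failed for 
you; the CEL files are read with read.celfiles() rather than ReadAffy().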

In the interim you could build your own using pdInfoBuilder. However, 
for the exon arrays you will need a 64-bit Linux box with something like 
8 GB RAM to use either oligo or exonmap, due to the amount of memory 
these things require. You might be able to use xps on Windows with, say, 
4 GB RAM, but I am not sure about that.
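
For what it's worth, building the package yourself might look something 
like the following. Treat it as a sketch under assumptions rather than a 
recipe: the function name build_exon_pdinfo and all of its arguments are 
hypothetical, the input files are the PGF, CLF, and probeset/transcript 
annotation CSV files you would download from Affymetrix, and the seed 
class and slot names are as I understand the pdInfoBuilder 
documentation, so compare them with the vignette for your version.

```r
# Sketch only: assumes pdInfoBuilder is installed and that the four
# Affymetrix library/annotation files have been downloaded locally.
build_exon_pdinfo <- function(pgf, clf, probeset_csv, transcript_csv,
                              author, email, genomebuild,
                              dest_dir = ".") {
  # Fail early if any input file is missing.
  inputs <- c(pgf, clf, probeset_csv, transcript_csv)
  missing_files <- inputs[!file.exists(inputs)]
  if (length(missing_files) > 0) {
    stop("missing input file(s): ", paste(missing_files, collapse = ", "))
  }
  if (!requireNamespace("pdInfoBuilder", quietly = TRUE)) {
    stop("pdInfoBuilder is not installed")
  }
  # The seed object describes the inputs and the package metadata.
  seed <- methods::new("AffyExonPDInfoPkgSeed",
                       pgfFile = pgf, clfFile = clf,
                       probeFile = probeset_csv,
                       transFile = transcript_csv,
                       author = author, email = email,
                       biocViews = "AnnotationData",
                       genomebuild = genomebuild,
                       organism = "Human", species = "Homo sapiens")
  # Writes the package source tree under dest_dir; this is the step
  # that needs the large amount of memory.
  pdInfoBuilder::makePdInfoPackage(seed, destDir = dest_dir)
}

# Usage (not run; every argument value is yours to supply):
# build_exon_pdinfo(pgf, clf, probeset_csv, transcript_csv,
#                   author = "Your Name", email = "you@example.org",
#                   genomebuild = "your genome build")
```

The resulting directory is an ordinary source package, so it still has 
to be installed (R CMD INSTALL, or install.packages() with repos = NULL) 
before oligo can find it.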

Another R solution is aroma.affymetrix, which has its own mailing list; 
you can google it to find more information.



> Error in getCdfInfo(object) :
>   Could not obtain CDF environment, problems encountered:
> Specified environment does not contain HuEx-1_0-st-v2
> Library - package huex10stv2cdf not installed
> Bioconductor - could not connect
> In addition: Warning message:
> In readLines(biocURL) :
>   cannot open: HTTP status was '503 Service Unavailable'
>
> I am under the impression that exon arrays have different library
> files than the traditional cdf files used for the IVT expression
> arrays.  The 4 library files for the exon array exist with the
> following extensions:  .mps, .clf, .pgf, .qcc.  Is there something
> special needed for processing exon arrays in bioconductor? Thank you,
>  Jennifer

James W. MacDonald, M.S.
Hildebrandt Lab
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
