[BioC] Experiment export in Gene Expression Omnibus (GEO) SOFT format

Sean Davis sdavis2 at mail.nih.gov
Fri May 26 12:29:48 CEST 2006




On 5/26/06 3:17 AM, "Henrik Hornshøj Jensen" <HenrikH.Jensen at agrsci.dk>
wrote:

> Thank you for clearing this up.
> To me it seems obvious to do the SOFT export in R as well.

The main problem with doing so is that the raw data will typically not be
included if the export is done from R.  The raw data is, in my mind, much
more important than any normalized or processed data: re-normalizing raw
data is easy, while normalized data is of very limited use (likely only to
the project at hand).

> Perhaps you could send the perl/R scripts you have been using.

I could, but they are not in a "distributable form".  We have plans to make
them slightly more useful and general, but we don't really have a goal of
releasing them.  Again, generality is a difficult-to-attain goal.
Essentially, what we do is construct the SOFT format header from a template
and fill the template from an Excel spreadsheet--R or perl could be used for
this.  After the header, we concatenate the raw tab-delimited text file,
then do the same for every data file associated with the experiment.  SOFT
is nice in that all of this text is simply concatenated.  Examples of the
types of headers that one needs to fill are in the batch deposit guide on
the GEO website.
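
As a rough illustration of the template-and-concatenate approach described
above (this is not our actual script), a minimal Python sketch might look
like the following.  The function names, the metadata keys, and the
particular header fields are assumptions made for the example; the real
field list should be taken from the batch deposit guide.  The sketch also
assumes the spreadsheet row has already been read into a dictionary.

```python
from string import Template

# Hypothetical header template. The fields shown here are only a small,
# illustrative subset; consult the GEO batch deposit guide for the real list.
SAMPLE_HEADER = Template(
    "^SAMPLE = $name\n"
    "!Sample_title = $title\n"
    "!Sample_platform_id = $platform\n"
)


def soft_sample(meta, raw_table_text):
    """Fill the header template from one metadata row, then append the raw
    tab-delimited table between the table begin/end markers."""
    header = SAMPLE_HEADER.substitute(meta)
    # Make sure the table ends with exactly one newline before the end marker.
    body = raw_table_text.rstrip("\n") + "\n"
    return header + "!sample_table_begin\n" + body + "!sample_table_end\n"


def soft_experiment(samples):
    """Concatenate the SOFT text of every (metadata, raw table) pair."""
    return "".join(soft_sample(meta, raw) for meta, raw in samples)
```

In use, each metadata dictionary would come from one row of the spreadsheet
(for example, exported to CSV first) and each raw table from the
corresponding data file read in as text; the string returned by
soft_experiment() would then be written out as a single file.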

Sean


