[BioC] xps - creating ROOT scheme files
beniltoncarvalho at gmail.com
Wed Jun 16 01:52:53 CEST 2010
If you load the 'ff' package prior to reading your CEL files, 'oligo'
(the latest) will use its resources to better manage RAM.
This is still experimental, but has shown great results for genotyping
(via 'crlmm' package using the 'crlmm2' function).
Anyways, once you run rma() on this ExpressionFeatureSet object that
uses 'ff', the expression matrix in the resulting object is an 'ff'
matrix. Because the expression matrix has dimensions much smaller than
the raw data, it is possible that one may be able to handle that in
RAM. In this case, I'd say to use:
expressionMatrix = exprs(rmaResult)[,]
## note the [,] at the end of the line
This is the point at which RAM is used without restrictions, and
possibly where one may run out of RAM.
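
To illustrate the difference between the disk-backed object and an in-RAM copy, here is a minimal sketch that uses only the 'ff' package on toy data. The object names and dimensions are made up for illustration, and the toy matrix merely stands in for exprs(rmaResult); it is not output from oligo.

```r
library(ff)  # assumes the 'ff' package is installed

## Toy disk-backed matrix standing in for exprs(rmaResult).
## The dimensions are hypothetical.
onDisk <- ff(0, dim = c(1000, 6))
onDisk[, 1] <- rnorm(1000)   # values are written to the backing file

## Subsetting with empty indices returns an ordinary R matrix,
## i.e. the whole thing now lives in RAM.
inRam <- onDisk[, ]

class(onDisk)      # classes of the disk-backed object
is.matrix(inRam)   # whether the copy is a regular matrix
```

Subsetting a range of rows instead, e.g. onDisk[1:100, ], pulls only that block into RAM, which is one way to work through the matrix piece by piece.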
If RAM is still an issue after preprocessing, I'd say to write the
expression matrix to disk (say a tab-delim file) and process batches
afterwards. Something along the lines of:
write.table.ffdf(as.ffdf(exprs(rmaResult)), file="results.txt", quote=FALSE, sep="\t")
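
As a rough sketch of what "process batches afterwards" could look like, the following reads a tab-delimited file a few rows at a time using base R only. The toy matrix, the batch size, and the assumed file layout (a header line of array names, then one row per probeset with its ID in the first column) are illustrative assumptions, not something oligo or ff guarantees.

```r
## Toy stand-in for the exported expression matrix (hypothetical data).
toy <- matrix(round(rnorm(50), 3), nrow = 10,
              dimnames = list(paste0("probeset", 1:10),
                              paste0("array", 1:5)))
path <- tempfile(fileext = ".txt")
write.table(toy, file = path, quote = FALSE, sep = "\t")

## Read the file back in small batches through an open connection.
con <- file(path, open = "r")
header <- strsplit(readLines(con, n = 1), "\t")[[1]]
batchSize <- 4                      # hypothetical batch size
rowMeansAll <- numeric(0)
repeat {
  lines <- readLines(con, n = batchSize)
  if (length(lines) == 0) break     # end of file reached
  batch <- read.table(text = lines, sep = "\t", row.names = 1,
                      col.names = c("id", header))
  ## Per-batch work goes here; as an example, keep the row means.
  rowMeansAll <- c(rowMeansAll, rowMeans(batch))
}
close(con)
```

Only one batch is held in memory at a time, so the peak RAM use is governed by batchSize rather than by the number of probesets.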
This is all valid for the latest oligo package, running on R-2.11.x.
I'm happy to hear back from you on this (even off-list) and get some
suggestions for further improvement.
On 16 June 2010 00:36, Mark Cowley <m.cowley0 at gmail.com> wrote:
> Hi Benilton,
> I'm using oligo in a processing pipeline which runs out of RAM with >15 exon arrays, so I was considering switching to xps. However, I'm intrigued by your comment re 'if you think you have enough RAM for the expression matrix'. Are you implying that you can use oligo on quite large numbers of exon arrays, and that it's only when you create the expression object that you'll run out of RAM?
> Mark Cowley, PhD
> Peter Wills Bioinformatics Centre
> Garvan Institute of Medical Research, Sydney, Australia
> On 15/06/2010, at 9:11 PM, Benilton Carvalho wrote:
>> Dear John,
>> Indeed, xps is one solution. If you're willing to try another
>> approach, you can also use the oligo package. It's experimental, as
>> described in one of the vignettes, but the following should get you
>> ## enable large dataset management
>> ## set a place for temp files
>> ## otherwise it'll use the current dir
>> cels = list.celfiles()
>> raw = read.celfiles(cels)
>> core = rma(raw, target="core")
>> exprsCore = exprs(core)
>> ## save in a tab-delim file
>> write.table.ffdf(as.ffdf(exprsCore), file="core.txt", sep="\t", quote=FALSE)
>> ## if you think you have enough RAM for the expression matrix
>> expression = exprsCore[,]
>> In case you observe anything unexpected, let me know.
>> On 15 June 2010 11:43, John Coulthard <bahhab at hotmail.com> wrote:
>>> Dear list
>>> I've got 6x2 Human Exon 1.0 ST arrays to analyse and 3 GB of RAM, so I believe I need to use xps.
>>> The xps vignette, appendix A.1. says...
>>> "we need to create ROOT scheme files directly from the Affymetrix source files, which need to be downloaded
>>> first from the Affymetrix web site."
>>> I think the web page I need to download from is...
>>> but there is no file that ends with ‘annot.csv’. Should I rename one of the other annotation files? Which one?
>>> The CDF file on this page says 'unsupported' so maybe I should be looking for CLF-, PGF-files, but they're not on this page either.
>>> Can anyone help me out with what to download from where?
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor