[BioC] affy: expresso in separated steps

cstrato cstrato at aon.at
Thu Oct 15 22:22:55 CEST 2009


Dear Kenji,

Maybe you could use the xps package. Its function "express" is similar to
expresso, lets you do the normalization stepwise, and can save interim
results as text files for reuse. See e.g. the recent vignette:
http://bioconductor.org/packages/2.5/bioc/vignettes/xps/inst/doc/xpsPreprocess.pdf
and the script in xps/examples/script4xpsPreprocess.R

Best regards
Christian
_._._._._._._._._._._._._._._._._._
C.h.r.i.s.t.i.a.n   S.t.r.a.t.o.w.a
V.i.e.n.n.a           A.u.s.t.r.i.a
e.m.a.i.l:        cstrato at aon.at
_._._._._._._._._._._._._._._._._._


Leonardo K. Shikida wrote:
> Hi James
>
> thanks for the fast answer
>
> I am afraid I can't do that. The idea here is to reuse some other
> normalization methods (not implemented in R), so I would somehow have
> to save the intermediate results, run the external normalization
> method, and then load the normalized data back in to perform
> summarization, etc.
>
> The problem, as you've pointed out, is that affy abstracts the
> internal data structure to make my life easier. My work will probably
> need to deal with this internal structure somehow.
>
> Maybe I could just save the object, export PM and MM data as CSV,
> perform the normalization, then restore the object using the load
> command and overwrite its PM and MM data with the normalized CSV
> files...
>
> Sounds like a horrible way to deal with this situation :-) so I am
> open to better ideas...
>
> []
>
> Kenji
>
>
>
> On Thu, Oct 15, 2009 at 5:00 PM, James W. MacDonald
> <jmacdon at med.umich.edu> wrote:
>   
>> Hi Kenji,
>>
>> Leonardo K. Shikida wrote:
>>     
>>> Hi
>>>
>>> I'd like to know how to perform affy expresso in separate steps
>>>
>>> for example
>>>
>>> what I'd like is
>>>
>>> CEL data => bg correction => save corrected data into a file X
>>> load file X => normalization => save normalized data into file Y
>>> load file Y => summarization => save summarized data into file Z
>>>       
>> I wouldn't save things in files. The objects designed to contain your data
>> are pretty complex, but are designed to make manipulation of your data
>> simple. If you write out to files you increase the complexity of dealing
>> with your data and lose all of the nice functions designed to make your life
>> simpler.
>>
>> You can instead keep your data in an AffyBatch (until you summarize) and
>> just save the objects as you go through your process. For instance:
>>
>> dat <- ReadAffy()
>> bgdat <- bg.correct(dat, method)
>>
>> ## for methods see bgcorrect.methods()
>>
>> normdat <- normalize(bgdat, method)
>>
>> ## for methods see normalize.methods(dat)
>>
>> eset <- computeExprSet(normdat, summary.method = method,
>>                        pmcorrect.method = pmmethod)
>>
>> ## for summary and pmcorrect methods see
>> express.summary.stat.methods()
>> pmcorrect.methods()
>>
>>
>>     
>>> and so on
>>>
>>> it's not clear to me
>>>
>>> [1] how to access these intermediary datasets. Should I save both
>>> pm(Data) and mm(Data)?
>>> [2] if the only thing I need is the intermediary dataset or if I need
>>> anything else such as platform info (CDF files for example)
>>>       
>> You will need a cdf package. If you are using a commercially available chip
>> and just want to use the 'regular' Affy cdf, then you don't need to do
>> anything. If you don't have the required package it will be downloaded for
>> you. If you want to use a different cdf, there is the cdfname argument to
>> ReadAffy (if BioC has these cdfs; an example would be the MBNI cdfs). If the
>> chip isn't commercial, you will need to get the cdf from Affy, create a
>> package with makecdfenv, and then build and install it yourself.
>>
>> Best,
>>
>> Jim
>>
>>
>>     
>>> I hope my question is clear
>>>
>>> thanks in advance
>>>
>>> Kenji
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>       
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>>     
>

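The checkpointing approach James describes in the quoted thread (keep the
data in an R object and save that object between steps) could look roughly
like the sketch below. It uses base R only so that it runs without CEL
files. The helper name run_step, the file names, the stand-in matrix and
the two placeholder step functions are all assumptions made for
illustration; with affy the steps would be calls such as bg.correct(),
normalize() and computeExprSet() on an AffyBatch, as in the quoted code.

```r
# Hypothetical helper: load the object saved by the previous step,
# apply one processing step, save the result and return it invisibly.
run_step <- function(infile, outfile, step) {
  obj <- readRDS(infile)
  res <- step(obj)
  saveRDS(res, outfile)
  invisible(res)
}

# Stand-in data: a small probes-by-arrays intensity matrix.
# With affy this would be the AffyBatch returned by ReadAffy().
raw <- matrix(rexp(20, rate = 1 / 200) + 50, nrow = 5,
              dimnames = list(paste0("probe", 1:5), paste0("array", 1:4)))

file_raw  <- file.path(tempdir(), "raw.rds")
file_bg   <- file.path(tempdir(), "bg.rds")
file_norm <- file.path(tempdir(), "norm.rds")
saveRDS(raw, file_raw)

# Placeholder steps (NOT affy's algorithms):
# shift every array so that its minimum is 1 ...
bg_step <- function(x) sweep(x, 2, apply(x, 2, min), "-") + 1
# ... then scale every array to the overall median.
norm_step <- function(x) sweep(x, 2, apply(x, 2, median) / median(x), "/")

# Each step reads the previous checkpoint and writes its own.
run_step(file_raw, file_bg, bg_step)
run_step(file_bg, file_norm, norm_step)

normdat <- readRDS(file_norm)
```

Because saveRDS() stores the complete R object, an AffyBatch saved this way
would keep its annotation and cdf name, which is the advantage James points
out over writing plain text files.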

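If the normalization really has to happen outside R, the CSV round trip
that Kenji proposes in the quoted message could be sketched as follows.
This is only a sketch under assumptions: a small made-up matrix stands in
for pm(abatch), the file names are invented, and the external tool is
replaced by a plain file copy. The affy help page for the AffyBatch class
describes replacement methods for the PM and MM intensities, so it is worth
checking there before relying on the write-back shown in the last comment.

```r
# Stand-in for pm(abatch): probes in rows, arrays in columns.
pm_mat <- matrix(c(120, 340, 95, 410, 150, 300, 88, 390), nrow = 4,
                 dimnames = list(paste0("probe", 1:4), c("a.CEL", "b.CEL")))

csv_out <- file.path(tempdir(), "pm_raw.csv")
csv_in  <- file.path(tempdir(), "pm_normalized.csv")

# Export for the external tool; the row names carry the probe identifiers.
write.csv(pm_mat, csv_out, row.names = TRUE)

# Placeholder for the external normalization: the file is simply copied.
file.copy(csv_out, csv_in, overwrite = TRUE)

# Re-import; check.names = FALSE keeps the array names unchanged.
pm_new <- as.matrix(read.csv(csv_in, row.names = 1, check.names = FALSE))
storage.mode(pm_new) <- "double"

# Guard against reordered or dropped probes and arrays before overwriting.
stopifnot(identical(rownames(pm_new), rownames(pm_mat)),
          identical(colnames(pm_new), colnames(pm_mat)))

# With affy loaded, the write-back would then be:  pm(abatch) <- pm_new
```

The check on row and column names matters because the external tool may
sort or filter rows, and overwriting intensities by position alone would
then silently assign values to the wrong probes.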
