[BioC] reduce soychip cel file size

Dianjing Guo djguo at vbi.vt.edu
Fri Nov 5 17:00:41 CET 2004

Hi Laurent,

Thanks so much for your great suggestion. I used the following expresso 
command and it worked! But I have one more question regarding the 
normalize.method choice: should I use "quantiles" or "quantiles.robust"? 
What is the difference between them? Here is my command:

 > eset <- expresso(data, normalize.method="quantiles",
 +                  bgcorrect.method="pmonly", summary.method="medianpolish")
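
For reference, the general idea behind quantile normalization can be sketched in base R. This is only an illustration under stated assumptions, not the code that affy runs: `quantile_normalize` and the toy matrix are made-up names and data, rows are taken to be probes, columns are taken to be arrays, and missing values are not handled. The "quantiles.robust" variant is, as far as I understand it, the same idea with a reference distribution that can down-weight or leave out unusual arrays; its help page is the place to check the actual arguments.

```r
# Plain quantile normalization in base R (illustrative only; not the affy code).
# Rows are probes, columns are arrays. Ties are broken by order of appearance.
quantile_normalize <- function(x) {
  x <- as.matrix(x)
  # Rank of each value within its own column (array).
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean across arrays of the k-th smallest value.
  reference <- rowMeans(apply(x, 2, sort))
  # Give every value the reference value that matches its rank.
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Made-up intensities for 4 probes on 3 arrays.
toy <- matrix(c(5, 2, 3, 4,
                4, 1, 4, 2,
                3, 4, 6, 8),
              nrow = 4,
              dimnames = list(paste0("probe", 1:4), paste0("array", 1:3)))
normalized <- quantile_normalize(toy)
```

After this step every column of `normalized` holds the same set of values, only in a different order, which is what forces the arrays onto a common intensity distribution.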

Also, I'd like to thank Robert Gentleman, Vince Carey, and Holger 
Schwender for their kind help with this issue.

Laurent Gautier wrote:

> Robert Gentleman wrote:
>> On Thu, Nov 04, 2004 at 02:08:09PM -0500, Dianjing Guo wrote:
>>> We constantly experience problems with the rma function on the 
>>> soybean chip. Since a possible reason is that the chip is too large, 
>>> I wonder whether there is a way to reduce the cel file size by taking 
>>> only part of the raw intensity info for normalization. Can anyone 
>>> comment or advise on that?
>>   That does not seem like a very good idea.
> That's right. This is _really_ not a good idea, unless you really know 
> the guts of the 'affy' package (there is a rewrite of some of the 
> package on its way that will make this kind of trick easier, 
> but we are not there yet).
>>   I have not seen any postings that suggest that size is the issue;
>>   have you made them?
> According to Dian-Jing's previous post, the segfault occurs when the 
> summary values are computed. I do not think that size is the issue 
> either: the tough part for memory usage is usually the handling of 
> probe level data.
> Robert is probably right: there is a memory leak or an array 
> out-of-bounds problem. At first sight I think that the problem comes 
> from somewhere in 'do_RMA' (file rma2.c), but it is hard to tell 
> (the comment on line 410 hints at an out-of-bounds issue, but it refers 
> to a value '200' that I cannot see anywhere).
> If Dian-Jing is not into all that, the use of 'expresso' (see my 
> previous mail) is segfault safe (currently at the cost of a bit of 
> memory usage, but this will improve very soon).
>>   None of this needs to be mysterious in any way.
>>   You should 1) make sure you have an up-to-date R and an up-to-date
>>   version of the package. If you get errors such as segmentation
>>   faults, then you can use    R -d gdb    provided you have compiled R
>>   with the -g option (and if not, then you will need to recompile it).
>>   From there you can track down the source of the bug and it can be
>>   fixed.
>>   For other bugs (such as problems in R code) there are options such
>>   as using debug etc.
>>   It is generally much better to figure out what is wrong, and why
>>   than to invent rather peculiar one-off solutions.
>>   Robert
>>> Many thanks,
>>> Dianjing
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
