[BioC] Large number of CEL files!!!

Adaikalavan Ramasamy ramasamy at cancer.org.uk
Wed Mar 9 00:23:18 CET 2005


Do you want to plot the data before or after preprocessing? The maximum
number of values is roughly 242 million (= 55000 x 200 x 22) before
preprocessing and 11 million (= 55000 x 200) after. Also, do you want to
investigate the distribution of each array/column, or look at the overall
distribution?
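
To put those counts in memory terms, here is a rough back-of-the-envelope
calculation. It assumes every value is stored as an 8-byte double and ignores
any object overhead; the variable names, and the reading of the three factors
as probesets, arrays and probes per probeset, are my own for illustration.

```r
# Rough memory estimate, assuming 8-byte doubles and no object overhead.
n_probesets <- 55000   # approximate number of probesets per array
n_arrays    <- 200     # number of CEL files
n_probes    <- 22      # assumed number of probes per probeset

raw_values       <- n_probesets * n_arrays * n_probes  # before preprocessing
processed_values <- n_probesets * n_arrays             # after preprocessing

bytes_per_value <- 8
raw_mb       <- raw_values       * bytes_per_value / 2^20
processed_mb <- processed_values * bytes_per_value / 2^20

cat(sprintf("raw: %.0f MB, processed: %.0f MB\n", raw_mb, processed_mb))
```

Under these assumptions the raw data alone is well over 1 GB, while the
pre-processed matrix is under 100 MB.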

With my Pentium 4 at 1.6 GHz and 512 MB of RAM, I can do a hist() or
boxplot() on the pre-processed dataset:

 mat <- matrix( rnorm(55000*200), ncol=200 )   # simulated 55000 x 200 dataset
 library(fields)                               # provides bplot()
 system.time( bplot(mat) )
[1] 16.21  1.23 23.83  0.00  0.00

But the real problem is that with so many data points on the graph, each
array becomes difficult to see.

I think it would be better to read in, say, 25-50 arrays at a time and
plot their distributions. Besides being less memory intensive, the
graphics may be spaced well enough for you to look at.
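
A minimal sketch of the batching idea, using a simulated matrix in place of
real data (and fewer rows, to keep it quick). The batch size of 50 and all
variable names are just my choices for illustration. With real CEL files you
would presumably read each batch inside the loop instead, e.g. with
ReadAffy(filenames = ...) from the affy package, rather than index into a
matrix that is already in memory.

```r
# Simulated stand-in for a pre-processed matrix (rows = probesets, cols = arrays).
set.seed(1)
n_arrays <- 200
mat <- matrix(rnorm(5000 * n_arrays), ncol = n_arrays)
colnames(mat) <- paste0("array", seq_len(n_arrays))

# Split the column indices into consecutive batches of 50 arrays.
batch_size <- 50
batches <- split(seq_len(n_arrays), ceiling(seq_len(n_arrays) / batch_size))

# Draw one boxplot page per batch. pdf(NULL) discards the output;
# use e.g. pdf("boxplots.pdf") to keep one page per batch.
pdf(NULL)
for (idx in batches) {
  boxplot(mat[, idx], outline = FALSE, las = 2, cex.axis = 0.5,
          main = sprintf("Arrays %d to %d", min(idx), max(idx)))
}
invisible(dev.off())
```

Each page then shows only 50 boxes, and only one batch needs to be held in
memory at a time if the reading is moved inside the loop.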

Regards, Adai



On Tue, 2005-03-08 at 11:10 -0800, Hrishikesh Deshmukh wrote:
> Hi All,
> 
> I have 200 CEL files and I want to use Bioconductor to
> read these files and then do simple things like
> hist() and boxplot()! I think I will run into memory
> issues!
> Any suggestions as to how to handle this problem?
> 
> Thanks in advance.
> Hrishi
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
>


