[BioC] Can't normalize 300+ HuGene arrays in xps

cstrato cstrato at aon.at
Tue Aug 31 21:51:34 CEST 2010


Dear Mike,

Thank you very much for your efforts.

The error message explains the other error messages: because ROOT can no 
longer write to the file, it tries to create the trees in RAM instead.

However, since you have 95 GB free on your hard disk, it is not quite clear 
why the trees can no longer be written to the ROOT file. At the moment 
my only hint is that the hard disk could be too fragmented. In any case 
I will ask the ROOT developers whether they have an explanation.
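
In the meantime, here is a minimal diagnostic sketch you could run from R on 
Windows to see where the space is actually running out (the drive letters are 
only examples from this thread, please adjust them to your setup):

tempdir()                      # where R writes temporary files in this session
Sys.getenv(c("TMP", "TEMP"))   # the system temp location that R inherits
shell("dir F:\\")              # last line reports bytes free on the drive holding dataRMA.root
shell("dir C:\\")              # same check for the drive you run R and ROOT from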

Best regards
Christian


On 8/31/10 11:15 AM, Mike Walter wrote:
> Hi Christian,
>
> Changing the name of the file to "dataRMA" didn't help. So I captured the beginning of the error message just after the calculation of the common mean had finished. It says:
>
> SysError in<TFile::WriteBuffer>: error writing to file F:/Auswertung/GENEPI_combined/dataRMA.root (-1) (No space left on device)
>
> So it seems to be a storage problem. However, there are still 95 GB free on my hard disk (F:). The drive where I run ROOT and R has approx. 11 GB of free disk space. I will try another drive with more than 200 GB of space and see if the error still occurs.
>
> Best, Mike
>
> -----Original Message-----
> From: cstrato<cstrato at aon.at>
> Sent: 30.08.2010 21:37:44
> To: Mike Walter<michael_walter at email.de>
> Subject: Re: [BioC] Can't normalize 300+ HuGene arrays in xps
>
>> Dear Mike,
>>
>> First, I am glad to hear that the stepwise approach did finally work.
>>
>> Thank you also for sending me the screenshot which repeats the following
>> message many times:
>>
>> This error is symptomatic of a Tree created as a memory-resident Tree
>> Instead of doing:
>>     TTree *T = new TTree(...);
>>     TFile *f = new TFile(...);
>> you should do:
>>     TFile *f = new TFile(...);
>>     TTree *T = new TTree(...);
>>
>> Since I always create TFile first, before creating new TTree(s), this
>> means that for some reason the connection to TFile got lost, so that the
>> trees are kept in RAM. If you have only 6 trees this is no problem, but
>> with 324 trees you get this error message. Sadly, the beginning of the
>> error messages is lost, so I do not know whether TFile was created
>> or not.
>>
>> Thus, at the moment I have no idea what the reason for this problem
>> might be; until now this error has never been reported.
>>
>> I would really appreciate it if you could try to run rma() with
>> 'filename = "dataRMA"' instead of 'filename = "tmpdt_dataRMA"' and let
>> me know if the problem remains.
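>>
>> A minimal sketch of that call, reusing the objects from your earlier
>> script (data.xps and exonlevel as defined there); only the filename changes:
>>
>> ## same rma() call as before, but writing to dataRMA.root instead
>> data.rma <- rma(data.xps, "dataRMA", background="antigenomic",
>>                 normalize=TRUE, exonlevel=exonlevel, verbose=TRUE)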
>>
>> Best regards
>> Christian
>>
>>
>> On 8/30/10 1:23 PM, Mike Walter wrote:
>>> Dear Christian,
>>>
>>> Thanks for your help. To answer your questions first: I normally use RGui, and my free disk space was ~100 GB. I also tried the add.data=FALSE option, without success.
>>>
>>> So I did the RMA normalization with 6 arrays in RTerm as you proposed. This worked fine. I then tried to run RMA on all arrays in RTerm. Here, I got thousands of error messages after the "computing common mean" step had finished for all arrays. After approx. 20 min of error messages scrolling over my screen, Windows ended R, so I couldn't copy any output. I made a screenshot, which is attached (although it might not make it into the BioC list).
>>>
>>> Therefore, I tried the stepwise approach in RTerm. To my great surprise, now everything worked fine. There was no error when I started the quantile normalization with the same code as before (except for verbose=TRUE). The median polish afterwards also worked. The output of RTerm is pasted below.
>>>
>>> So again, thank you very much for your help.
>>>
>>> Kind regards,
>>>
>>> Mike
>>>
>>>
>>>> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = $
>>> + tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = TRUE)
>>> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>   in<READ>   mode...
>>> Creating new file<F:/Auswertung/GENEPI_combined/quantile.root>...
>>> Opening file<F:/Auswertung/GENEPI_combined/bkgd_correct.root>   in<READ>   mode...
>>>
>>> Preprocessing data using method...
>>>    Normalizing raw data...
>>>    normalizing data using method...
>>>    setting selector mask for typepm<9216>
>>>    finished filling<324>   arrays.
>>>    computing common mean...
>>>    finished filling<324>   trees.
>>>    preprocessing finished.
>>>> save.image("F:/Auswertung/GENEPI_combined/GENEPI_all_stepwise.RData")
>>>> data.mp = summarize.rma(data.norm, filename = "medianpolish", filedir = getw$
>>> +   update = FALSE, option = "transcript", exonlevel = exonlevel, xps.scheme =$
>>> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>   in<READ>   mode...
>>> Creating new file<F:/Auswertung/GENEPI_combined/medianpolish.root>...
>>> Opening file<F:/Auswertung/GENEPI_combined/quantile.root>   in<READ>   mode...
>>> Preprocessing data using method...
>>>    Converting raw data to expression levels...
>>>    summarizing with<medianpolish>...
>>>    setting selector mask for typepm<9216>
>>>    setting selector mask for typepm<9216>
>>>    calculating expression for<28829>   of<33664>   units...Finished.
>>>    expression statistics:
>>>    minimal expression level is<3.11771>
>>>    maximal expression level is<20015.1>
>>>    preprocessing finished.
>>> Opening file<X:/affy/QC_Scripts/xps/schemes/Scheme_HuGene10stv1r4_na30_hg19.root>   in<READ>   mode...
>>> Opening file<F:/Auswertung/GENEPI_combined/medianpolish.root>   in<READ>   mode...
>>>
>>> Opening file<F:/Auswertung/GENEPI_combined/medianpolish.root>   in<READ>   mode...
>>>
>>> Exporting data from tree<*>   to file<F:/Auswertung/GENEPI_combined/medianpolish.txt>...
>>> Reading entries from<HuGene-1_0-st-v1.ann>   ...Finished
>>> <28829>   of<28829>   records exported.
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: cstrato<cstrato at aon.at>
>>> Sent: 27.08.2010 21:05:46
>>> To: Mike Walter<michael_walter at email.de>
>>> Subject: Re: [BioC] Can't normalize 300+ HuGene arrays in xps
>>>
>>>> Dear Mike,
>>>>
>>>> In case your problem turns out to be a memory-related problem, you
>>>> can use rma(..., add.data=FALSE, ...), which will prevent filling slot
>>>> "data" with the expression levels. You can then import all normalized
>>>> data, or parts thereof, using "export.expr()" or "root.expr()", as the
>>>> help files show.
>>>>
>>>> Thus you could first run rma and then import the results in a separate step:
>>>>
>>>> ## rma
>>>>> data.rma<- rma(data.xps, "tmpdt_dataRMA", background="antigenomic",
>>>> normalize=T, exonlevel=exonlevel,  add.data=FALSE, verbose = TRUE)
>>>>
>>>> ## import subset of trees:
>>>> ds<- export.expr(data.rma, treenames=c("name1.mdp","name3.mdp", etc),
>>>> treetype="mdp", varlist="fUnitName:fSymbol:fLevel", outfile="tmp.txt",
>>>> as.dataframe=TRUE)
>>>>
>>>> ## use subset of trees
>>>>> sub.rma<- root.expr(scheme.test3, "tmpdt_dataRMA.root", "mdp",
>>>> c("name1.mdp", "name2", etc))
>>>>> str(sub.rma)
>>>>
>>>> Maybe after starting a new R session you will be able to import all
>>>> trees with "treenames='*'".
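>>>>
>>>> For example (a sketch only; the output file name below is just a placeholder):
>>>>
>>>> ## same export.expr() call as above, but importing all trees at once
>>>> ds.all<- export.expr(data.rma, treenames="*", treetype="mdp",
>>>> varlist="fUnitName:fSymbol:fLevel", outfile="allTrees.txt",
>>>> as.dataframe=TRUE)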
>>>>
>>>> Please let me know if this could solve your problem.
>>>>
>>>> Best regards
>>>> Christian
>>>>
>>>>
>>>> On 8/27/10 3:35 PM, Mike Walter wrote:
>>>>> Hi all,
>>>>>
>>>>> I have a set of 324 HuGene 1.0 arrays that I'd like to normalize all in one batch on a "normal" Windows computer. I already normalized the arrays successfully with xps in two sets of 180 and 144 samples. When I apply the code below to put all the samples together, my R session just crashes.
>>>>>
>>>>> library(xps)
>>>>> memory.limit(size=3000) # I modified my boot.ini to allow more memory; at least I hope it works (see the check after this code block).
>>>>> exonlevel=rep((8192+1024),3)
>>>>> scheme="Scheme_HuGene10stv1r4_na30_hg19.root"
>>>>> gene.scheme<- root.scheme(paste("X:/affy/QC_Scripts/xps/schemes",scheme,sep="/"))
>>>>> data.xps = root.data(gene.scheme, paste(getwd(),"Genepi_all_cel.root",sep="/"))
>>>>> data.rma<- rma(data.xps, "tmpdt_dataRMA", background="antigenomic", normalize=T,
>>>>>                        exonlevel=exonlevel, verbose = FALSE)
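>>>>>
>>>>> To check whether the boot.ini change actually took effect, a small sketch using the Windows-only memory functions available in this R version:
>>>>>
>>>>> memory.limit()        # current limit in MB; should report about 3000 if the switch works
>>>>> memory.size(max=TRUE) # maximum amount of memory obtained from the OS so far in this session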
>>>>>
>>>>>
>>>>> Thus, I tried to do the RMA stepwise. I succeeded with the background correction, but got an error when trying to do the quantile normalization:
>>>>>
>>>>> data.bkgd = bgcorrect.rma(data.xps, filename = "bkgd_correct",
>>>>>                        filedir = getwd(), tmpdir = "", update = FALSE,
>>>>>                        select = "antigenomic", exonlevel = exonlevel, verbose = FALSE)
>>>>>
>>>>> data.norm = normalize.quantiles(data.bkgd, filename = "quantile", filedir = getwd(),
>>>>>                         tmpdir = "", update = FALSE, exonlevel = exonlevel, verbose = FALSE)
>>>>>
>>>>> OR
>>>>>
>>>>>
>>>>> data.norm = normalize(data.bkgd, "quantile", filedir=getwd(), tmpdir="",
>>>>>                        method="quantile", select="pmonly", option="transcript:together:none",
>>>>>                        logbase="0", params=c(0.0), exonlevel=exonlevel)
>>>>>
>>>>>
>>>>> In both cases the output is "Fehler in .local(object, ...) : error in function ‘Normalize’" ("Fehler in" is German for "Error in"). I guess it is only a wrong option somewhere. I also tried exonlevel="metacore+affx" with the same result. Can anyone give me a hint as to what might be missing?
>>>>>
>>>>> Thank you very much.
>>>>>
>>>>> Best,
>>>>> Mike
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.10.1 (2009-12-14)
>>>>> i386-pc-mingw32
>>>>>
>>>>> locale:
>>>>> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
>>>>> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
>>>>> [5] LC_TIME=German_Germany.1252
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>> other attached packages:
>>>>> [1] xps_1.6.4
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] tools_2.10.1


