[R] Re: how to avoid a script from hanging up

jim holtman jholtman at gmail.com
Mon Aug 3 00:14:21 CEST 2009


You can use save/save.image to save the objects in your workspace that
you might need in order to recover.  I don't think an environment
variable set inside R will carry over to the next execution of an R
session.  It is probably best to create a parameter file that you can
read in at startup to determine what to do next.
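
For example, a minimal sketch of that idea (untested; the file name, the
'done' counter and the 100-iteration checkpoint interval are placeholders
you would adapt to your own loop):

  checkpoint_file <- "progress.RData"
  if (file.exists(checkpoint_file)) {
      load(checkpoint_file)              # restores the pruned d1 and 'done'
  } else {
      done <- 0                          # first run: start from scratch
  }
  while (length(d1) > 0) {
      # ... process d1[1] against d2/d3 as in your loop ...
      d1   <- d1[-which(d1[] == d1[1])]  # prune the entries just handled
      done <- done + 1
      if (done %% 100 == 0) {            # checkpoint every 100 iterations
          save(d1, done, file = checkpoint_file)
      }
  }
  file.remove(checkpoint_file)           # clean finish; next run starts fresh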

On Sun, Aug 2, 2009 at 5:27 PM, <mauede at alice.it> wrote:
> Thank you very much.
> The instructions you suggested let the script itself decide whether to
> exit spontaneously.
> What I am still missing is how to prevent the script from restarting from
> scratch.
> I'll try to explain my problem a little better.
> Please assume I have 3 huge data.frames called d1, d2, d3.
> My script scans d1 first, then searches d2 and d3 for fields that match d1.
> A very simplified version of my script looks like the following:
>
>  while (length(d1) > 0) {   # loop over the MirBase miRNAs list
>    if (length(which(d2[] == d1[1])) > 0) {
>      tmp <- d2[which(d2[] == d1[1])]
>      FromWhere <- "d2"
>      <PROCESS ALL DATA in tmp>
>    } else if (length(which(d3[] == d1[1])) > 0) {
>      tmp <- d3[which(d3[] == d1[1])]
>      FromWhere <- "d3"
>      <PROCESS ALL DATA in tmp>
>    } else {
>      missing_miRNA <- d1[1]
>      cat("\n miRNA: ", missing_miRNA, "  NOT FOUND IN d2 OR IN d3 \n")
>      flush(stdout())
>      d1 <- d1[-which(d1[] == d1[1])]
>      next
>    }
>  }
>
> Please notice also that the sub-step <PROCESS ALL DATA in tmp> ends with
> instructions that remove from d1 all the entries with the same name,
> roughly as sketched below; maybe that is not clear from my concise version.
> In reality d1, d2, d3 have many fields. d1 can have many rows whose
> "miRNA_ID" field is the same, but whose other fields contain different
> pointers.
> To free some memory, I remove all the rows that pertain to the current
> miRNA_ID, regardless of whether that miRNA has been found in d2 or d3 or not.
> Perhaps I could apply a similar dynamic pruning to d2 as well, but I first
> have to check all the links carefully to avoid deleting data
> that may pertain to different miRNAs.
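>
> Roughly (an untested sketch, assuming d1 really is a data.frame with a
> miRNA_ID column, unlike in my over-simplified snippet above):
>
>   current_id <- d1$miRNA_ID[1]
>   d1 <- d1[d1$miRNA_ID != current_id, ]   # drop every row for this miRNA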
>
> In short, even if the script handles its own exit through the instructions
> you suggested, and even saves its variables, when it is run again it will
> restart processing d1, d2, d3 from scratch.
> What I am dreaming of is a re-entrant process. Maybe I should use some
> environment (system) variables, not just program variables, to save the
> information needed to make the next run a resumption rather than a re-run.
> Is it possible to access / create environment variables from inside an R
> script?
>
> Regards,
> Maura
>
>
> -----Original Message-----
> From: jim holtman [mailto:jholtman at gmail.com]
> Sent: Sun 02/08/2009 21.54
> To: mauede at alice.it
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] how to avoid a script from hanging up
>
> You can use 'try' to catch errors and take corrective action.
> 'memory.size' and 'proc.time' will give you information on the memory
> usage of your application and the CPU time that has been used.
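>
> For example, something along these lines (a rough, untested sketch;
> 'one_query()' stands for whatever BioMart call occasionally fails, and the
> 1500 Mb limit is an arbitrary threshold you would tune):
>
>   res <- try(one_query(), silent = TRUE)   # trap the BioMart exception
>   if (inherits(res, "try-error")) {
>     cat("query failed; will retry this entry on the next run\n")
>   }
>   cat("elapsed:", proc.time()["elapsed"], "sec ",
>       "memory:", memory.size(), "Mb\n")
>   if (memory.size() > 1500) {              # close to the point where R hangs
>     save.image("state.RData")              # freeze the workspace and quit
>     q("no")
>   }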
>
> On Sun, Aug 2, 2009 at 2:02 PM, <mauede at alice.it> wrote:
>> I am submitting this problem to the R forum, rather than the
>> Bioconductor forum, because it has more to do with programming style than
>> with any bioinformatics content.
>> I have implemented an R script that extracts many strings by querying
>> 3 bioinformatics databases in the same loop cycle. Ideally, the script
>> should perform as many cycles as necessary to extract all the available
>> data of interest.
>> Inevitably it triggers a BioMart exception after running many cycles in a
>> row. The exception seems to be independent of the script instructions,
>> because if I restart the script from the point where it was interrupted it
>> runs for another while, extracting even the data at which the exception
>> had occurred with no problem at all.
>> Sometimes, though, the script stops responding and hangs, even though no
>> exception has apparently occurred, and the only way to regain control is
>> to kill the R process. That way I lose track of how many data items have
>> been processed and stored to disk files (unless I count them manually ...
>> there are thousands ..). If I restart the script, it starts processing the
>> data strings from scratch again. I guess it may be a memory problem, as the
>> task manager (Windows/XP) shows that the hung R script is taking more than
>> 70% of the available RAM.
>> I wonder whether there is any system command to make the script aware of
>> its own memory requirements and running time.
>> Ideally the script should be able to trap the exception, monitor its
>> current RAM / CPU time usage, and exit on its own after saving the current
>> program state, so that when rerun it would not restart from scratch but
>> would pick up from where it exited.
>> Maybe this is asking too much of a non-compiled language?
>>
>> Thank you in advance,
>> Maura
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



