[Rd] save() and interrupts

Henrik Bengtsson hb at stat.berkeley.edu
Mon Apr 16 21:58:15 CEST 2007


On 4/16/07, Luke Tierney <luke at stat.uiowa.edu> wrote:
> On Mon, 16 Apr 2007, Bill Dunlap wrote:
>
> > On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
> >
> >> On 4/15/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> >>> On Sun, 15 Apr 2007, Henrik Bengtsson wrote:
> >>>
> >>>> are there any (cross-platform) specs on what the saved file is if
> >>>> save() is interrupted, e.g. by a user interrupt?   It could be
> >>>> non-existing, empty, partly written, or completed.
> >>>
> >>> My understanding is that you cannot user interrupt compiled code unless it
> >>> is set up to check interrupts.  Version 2 saves are done via the internal
> >>> saveToConn, and I don't see any calls to R_CheckUserInterrupt there. So
> >>> you only need to worry about user interrupts in the R code, and that has
> >>> an on.exit action to close the connection (which should be executed even
> >>> if you interrupt).  Which suggests that the file will be
> >>>
> >>> non-existent
> >>> empty
> >>> complete
> >>>
> >>> and the first two depend on interrupting in the millisecond or less before
> >>> the compiled code gets called.
> >>
> >> I'll put it on my todo list to investigate how to make save() more
> >> robust against interrupts before calling the internal code.  One
> >> option is to use tryCatch().  However, that does not handle user
> >> interrupts in rapid succession, e.g. an interrupt sent while the
> >> "interrupt" handler is running will still interrupt the function.  So,
> >> tryCatch() alone will only lower the risk of incomplete or empty files.  For data
> >> written to files, one alternative is to check for files of zero size
> >> in the on.exit() statement and remove such.
> >>
> >> /Henrik
> >>>
> >>> For other forms of interrupts, e.g. a Unix kill -9, the file state could
> >>> be anything.
> >>>
> >>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> >>> ...
> >
> > You could change the code to write to a temporary
> > file (in the directory you want the result in) and
> > when you successfully finish writing to the file
> > you rename it to the permanent name.  (On an interrupt
> > you remove the temp file, and on 'kill -9' the only
> > bad effect is the space used by the partially written
> > temp file.)  This has the added advantage that you don't
> > overwrite an existing save file by the given name until
> > you know a suitable replacement is ready.
> >
> > Perhaps we need a connection type that encapsulates this.
> >
> > ----------------------------------------------------------------------------
> > Bill Dunlap
> > Insightful Corporation
> > bill at insightful dot com
> > 360-428-8146
> >
> > "All statements in this message represent the opinions of the author and do
> > not necessarily reflect Insightful Corporation policy or position."
>
> We do this with save.image.  Since save is a little more general it is
> a bit less obvious what the right way to do this sort of thing is, or
> whether there is a single right way.  I think if I was concerned about
> this I would write something around the current save for particular
> kinds of connections rather than changing save itself.  The main
> reason for taking a different route with save.image is that that gets
> called implicitly by q().
>
> [our current ability to manage user interrupts is not ideal--hopefully
> we can make a bit of progress on this soon.]

I was thinking about this last night:  It would be useful to have a
feature/construct to evaluate an R expression atomically where user
interrupts will *not have an effect until afterwards*, cf. calls to
native code.  This would solve the problem of getting interrupts while
in a tryCatch(..., interrupt=..., finally=...).  Of course this
requires caution by the programmer, but it is also unlikely to be used
by someone who does not know what the risks are.  I do not know the
different signals available, but one could consider such atomic calls
to be protected against different levels of signals.  In addition, one
could have an optional threshold of the number of interrupt signals it
takes to (even) interrupt an atomic evaluation.
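
For the file case in particular, Bill's temp-file suggestion above can
be sketched roughly as follows.  This is only a sketch under stated
assumptions: safeSave() is a hypothetical name, not an existing R
function, and it assumes that file.rename() can replace an existing
file on the platform in question.  It narrows the window for a bad
file, but it does not remove the problem of an interrupt arriving
while the on.exit() code itself is running.

```r
## Hypothetical wrapper around save(); not part of R.
## Writes to a temporary file in the target directory and renames it
## to the final name only after save() has returned successfully.
safeSave <- function(list, file, envir = parent.frame()) {
  force(envir)  # resolve the caller's environment up front

  ## Same directory as the target, so the rename stays on one file system.
  tmp <- tempfile(pattern = "save-", tmpdir = dirname(file))

  ## Runs on normal exit, error, or user interrupt.  After a successful
  ## rename the temporary file no longer exists, so this is a no-op.
  on.exit(if (file.exists(tmp)) file.remove(tmp))

  save(list = list, file = tmp, envir = envir)

  ## Assumption: renaming over an existing file works on this platform.
  if (!file.rename(tmp, file))
    stop("could not rename '", tmp, "' to '", file, "'")

  invisible(file)
}
```

Usage would be, e.g., safeSave("x", file = "x.RData").  An existing
"x.RData" is left untouched unless the new file was written in full.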

/Henrik

>
> Best,
>
> luke
>
> --
> Luke Tierney
> Chair, Statistics and Actuarial Science
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>     Actuarial Science
> 241 Schaeffer Hall                  email:      luke at stat.uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>
