[Rd] save.image Non-responsive to Interrupt

Martin Maechler maechler at stat.math.ethz.ch
Tue May 2 15:28:48 CEST 2023


>>>>> Ivan Krylov 
>>>>>     on Tue, 2 May 2023 14:59:36 +0300 writes:

    > On Sat, 29 Apr 2023 00:00:02 +0000
    > Dario Strbenac via R-devel <r-devel using r-project.org> wrote:

    >> Could save.image() be redesigned so that it promptly responds to
    >> Ctrl+C? It prevents the command line from being used for a number of
    >> hours if the contents of the workspace are large.

    > This is ultimately caused by serialize() being non-interruptible. A
    > relatively simple way to hang an R session for a long-ish time would
    > therefore be:

    > f <- xzfile(nullfile(), 'a+b')
    > x <- rep(0, 1e9) # approx. 8 gigabytes, adjust for your RAM size
    > serialize(x, f)
    > close(f)

    > This means that calling R_CheckUserInterrupt() between saving
    > individual objects is not enough: R also needs to check for interrupts
    > while saving sufficiently long vectors.

    > Since the serialize() infrastructure is carefully written to avoid
    > resource leaks on allocation failures, it looks relatively safe to
    > liberally sprinkle R_CheckUserInterrupt() where it makes sense to do
    > so, i.e. once per WriteItem() (which calls itself recursively and
    > non-recursively) and once per every downstream for loop iteration.
    > Valgrind doesn't show any new leaks if I apply the patch, interrupt
    > serialize() and then exit. R also passes make check after the applied
    > patch.

    > Do these changes make sense, or am I overlooking some other problem?

Thank you, Ivan!

They do make sense ... but:

On the other hand, in the past we have had to *disable* R_CheckUserInterrupt()
in parts of R's code because it was too expensive
{see current src/main/{seq.c,unique.c} for a series of
 R_CheckUserInterrupt() calls commented out for such speed-loss reasons},

so adding these may need a lot of care when we simultaneously
want to remain efficient for "morally valid" uses of serialization,
where we really don't want to pay too much of a premium.

{{ Saving the whole user workspace is not "valid" in that sense,
   in my view.  I tell all my (non-beginner) RStudio-using
   students that they should turn *off* the automatic saving and
   loading at session end / beginning and, for reproducibility,
   only saveRDS() [or save()] a few precious objects *explicitly*.
}}
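
As a minimal sketch of the explicit-saving workflow described above (the
object names and temporary file paths are made up for illustration, not
taken from the thread), a couple of chosen objects can be written and
restored without ever serializing the whole workspace:

```r
# Hypothetical stand-ins for "a few precious objects".
results <- list(estimate = c(a = 1.5, b = -0.25), n = 100L)
settings <- c(seed = 42, iterations = 1000)

# saveRDS() stores exactly one object per file; the name is chosen on reading.
results_path <- tempfile(fileext = ".rds")
saveRDS(results, results_path)
results_restored <- readRDS(results_path)

# save() stores several named objects; load() restores them under their
# original names, here into a fresh environment rather than the workspace.
bundle_path <- tempfile(fileext = ".RData")
save(results, settings, file = bundle_path)
restored_env <- new.env()
loaded_names <- load(bundle_path, envir = restored_env)

# Both round trips should give back what was saved.
stopifnot(identical(results_restored, results))
stopifnot(identical(restored_env$settings, settings))
```

Only the named objects are serialized here, so the time during which the
session cannot be interrupted is bounded by the size of those objects
rather than by everything that happens to be in the workspace.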

Again, we don't want to punish people who know what they are
doing, just because other R users manage to hang their R session
by too little thinking ... 

Your patch adds 15 such interrupt-checking calls, which may
really be too many -- I'm not claiming they are: with our
recursive objects it is surely not easy to determine the
"minimally necessary" set of such calls.

In addition, we may still consider adding an extra optional
argument, say `check.interrupt`,
which could default to TRUE when save.image() is called
but, e.g., not when serialize() is called directly.

Martin

    > -- 
    > Best regards,
    > Ivan
    > [DELETED ATTACHMENT external: Rd_IvanKrylov_interrupt-serialize.patch, text/x-patch]
    > ______________________________________________
    > R-devel using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel


