[Rd] tempdir() may be deleted during long-running R session

Martin Maechler maechler at stat.math.ethz.ch
Wed Apr 26 10:21:39 CEST 2017


>>>>>   <frederik at ofb.net>
>>>>>     on Tue, 25 Apr 2017 21:13:59 -0700 writes:

    > On Tue, Apr 25, 2017 at 02:41:58PM +0000, Cook, Malcolm wrote:
    >> Might this combination serve the purpose: 
    >> * R session keeps an open handle on the tempdir it creates, 
    >> * whatever tempdir harvesting cron job the user has be made sensitive enough not to delete open files (including open directories)

I also agree that the above would be ideal - if possible.

    > Good suggestion but doesn't work with the (increasingly popular)
    > "Systemd":

    > $ mkdir /tmp/somedir
    > $ touch -d "12 days ago" /tmp/somedir/
    > $ cd /tmp/somedir/
    > $ sudo systemd-tmpfiles --clean
    > $ ls /tmp/somedir/
    > ls: cannot access '/tmp/somedir/': No such file or directory

Something like your example is what I'd expect is always a
possibility on some platforms, all of course depending on low-level
things such as  root/sysadmin/...  "permission" to clean up etc.

Jeroen mentioned the fact that tempdir()s can also disappear
for other reasons {his was multicore child processes
.. buggily(?) implemented}.
Further reasons may be race conditions / user code bugs / user
errors, etc.
Note that the R process which created the tempdir on startup
always has the permission to remove it again.  But you can also
think of a full file system, etc.

Current R-devel's   tempdir(check = TRUE)   would create a new
one or give an error (and then the user should be able to use
    Sys.setenv(TMPDIR = ...)
to point to a directory she has write permission for).
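
A minimal sketch of that behaviour, assuming an R version in which
tempdir() has the 'check' argument (R-devel at the time of writing).
The unlink() call only simulates an external cleaner; it is not part
of the proposal:

```r
# Assumes tempdir() supports the 'check' argument.
old <- tempdir()

# Simulate "big brother" removing the per-session directory.
unlink(old, recursive = TRUE)
stopifnot(!dir.exists(old))

# check = TRUE verifies the directory and recreates one if needed.
new <- tempdir(check = TRUE)
stopifnot(dir.exists(new))

# Files can again be created in the (re)created directory.
f <- tempfile("foo", tmpdir = new)
writeLines("hello", f)
stopifnot(file.exists(f))
```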

Gabe's point of course is important too: If you have a long
running process that uses a tempfile,
and if  "big brother"  has removed the full tempdir() you will
be "unhappy" in any case.
Trying to prevent big brother from doing that in all cases seems
"not easy" in any case.

I did want to provide an easy solution to the OP situation:
Suddenly tempdir() is gone, and quite a few things stop working
in the current R process {he mentioned  help(), e.g.}.
With the new   tempdir(check=TRUE)  facility, code could be changed
to replace

   tempfile("foo")

either by
   tempfile("foo", tmpdir=tempdir(check=TRUE))

or by something like

   tryCatch(tempfile("foo"),
            error = function(e)
                tempfile("foo", tmpdir=tempdir(check=TRUE)))

or be even more sophisticated.
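
One "more sophisticated" variant, as a sketch only: the failure may
surface only when the file is actually written (tempfile() itself may
just return a name), so the write can be wrapped as well.  The helper
name write_temp() is hypothetical and not part of R:

```r
# Hypothetical helper: write 'lines' to a fresh temporary file and
# retry once after tempdir(check = TRUE) if the first attempt fails.
write_temp <- function(lines, pattern = "foo") {
  attempt <- function(dir) {
    path <- tempfile(pattern, tmpdir = dir)
    writeLines(lines, path)   # errors if 'dir' no longer exists
    path
  }
  tryCatch(
    suppressWarnings(attempt(tempdir())),
    error = function(e) attempt(tempdir(check = TRUE))
  )
}

# Usage: works even after the session directory has been removed.
unlink(tempdir(), recursive = TRUE)
p <- write_temp("some text")
```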

We could also consider allowing   check =  TRUE | NA | FALSE

and make  NA  the default and have that correspond to
check = TRUE  but additionally do the equivalent of
   warning("tempdir() has become invalid and been recreated")
in case the tempdir() had been invalid.
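
As a thought experiment, the three-valued semantics could be emulated
at R level roughly as follows.  The wrapper name tempdir_tri() is
hypothetical, and the validity test (exists and is writable) is only
an assumption about what "invalid" should mean here:

```r
# Hypothetical wrapper emulating  check = TRUE | NA | FALSE.
tempdir_tri <- function(check = NA) {
  # FALSE: current behaviour, no checking at all.
  if (identical(check, FALSE)) return(tempdir())

  td <- tempdir()
  # Assumed notion of "valid": directory exists and is writable.
  was_valid <- dir.exists(td) && file.access(td, mode = 2) == 0

  td <- tempdir(check = TRUE)   # recreate if necessary

  # NA: behave like TRUE, but tell the user when recreation happened.
  if (is.na(check) && !was_valid)
    warning("tempdir() has become invalid and been recreated")
  td
}
```

With this, tempdir_tri() warns after the directory has been removed,
tempdir_tri(TRUE) recreates it silently, and tempdir_tri(FALSE) keeps
the present behaviour.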

    > I would advocate just changing 'tempfile()' so that it recreates the
    > directory where the file is (the "dirname") before returning the file
    > path. This would have fixed the issue I ran into. Changing 'tempdir()'
    > to recreate the directory is another option.
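
For comparison, the alternative suggested in the quoted paragraph
could be sketched at user level like this.  The name tempfile_mkdir()
is hypothetical; the sketch assumes tempfile() returns a path even
when its directory is missing, and it recreates the same directory
rather than choosing a new one:

```r
# Hypothetical wrapper: make sure dirname() of the returned path
# exists before handing the path back to the caller.
tempfile_mkdir <- function(pattern = "file", tmpdir = tempdir(),
                           fileext = "") {
  path <- tempfile(pattern, tmpdir = tmpdir, fileext = fileext)
  # No-op if the directory exists; otherwise recreate it.
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  path
}

# Usage after the session directory has been removed:
unlink(tempdir(), recursive = TRUE)
p <- tempfile_mkdir("foo")
writeLines("x", p)
```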

In the end I had decided that

      tempfile("foo", tmpdir = tempdir(check = TRUE))

is actually better self-documenting than

      tempfile("foo", checkDir = TRUE)

which was my first inclination.

Note again that currently, the checking is _off_ by default.
I've just provided a tool -- which was relatively easy and
platform independent! --- to do more (real and thought)
experiments.

Martin


