[Rd] [R-pkg-devel] Run garbage collector when too many open files

luke-tierney using uiowa.edu
Tue Aug 7 17:07:28 CEST 2018


In R 3.5 and later you should not need to gc() -- that should happen
automatically within the connections code.

Nevertheless, I would recommend redesigning your approach to avoid
hanging onto open file connections as these are a scarce resource.
You can keep around your temporary files without having them open and
only open/close them on access, with the close run in an on.exit or a
tryCatch/finally clause.
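
As a rough illustration of the open-on-access idea, here is a minimal sketch in plain R. The helper with_file() is hypothetical (it is not part of lvec, ldat, or base R), and it assumes the data can be reached through an ordinary R file connection rather than a C++ handle:

```r
# Hypothetical helper: only the path is kept around; a connection is
# held open just for the duration of a single access.
with_file <- function(path, mode, fun) {
  con <- file(path, open = mode)
  on.exit(close(con), add = TRUE)  # runs even if fun() signals an error
  fun(con)
}

path <- tempfile(fileext = ".bin")

# Write ten integers; the connection is closed when with_file() returns.
with_file(path, "wb", function(con) writeBin(1:10, con))

# Read the first three back; again no connection stays open afterwards.
first_three <- with_file(path, "rb", function(con) {
  readBin(con, what = "integer", n = 3L)
})
print(first_three)

# The temporary file itself can live as long as needed; remove it when done.
unlink(path)
```

The same shape works with tryCatch(..., finally = close(con)) in place of on.exit. Either way the number of simultaneously open files stays bounded by the number of accesses in flight, not by the number of live objects.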

Best,

luke

On Tue, 7 Aug 2018, Jan van der Laan wrote:

> Dear Uwe,
>
> (When replying to your message, I sent the reply to r-devel and not 
> r-package-devel, as Martin Maechler suggested that this thread would be a 
> better fit for r-devel.)
>
> Thanks. In the example below I used rm() explicitly, but in general users 
> wouldn't do that.
>
> One of the reasons for the large number of file handles is that sometimes 
> unnamed temporary objects are created. For example:
>
>> library(ldat)
>> library(lvec)
>>
>> a <- lvec(10, "integer")
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214753f2af0'
>> b <- as_rvec(a[1:3])
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec32146a50f383'
> OPENFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214484b652c'
>> print(b)
> [1] 0 0 0
>>
>>
>> gc()
> CLOSEFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec3214484b652c'
> CLOSEFILE '/tmp/RtmpVqkDsw/file32145169fb06/lvec32146a50f383'
>          used (Mb) gc trigger (Mb) max used (Mb)
> Ncells  796936 42.6    1442291 77.1  1168576 62.5
> Vcells 1519523 11.6    4356532 33.3  4740854 36.2
>
>
> For debugging, I log when files are opened and closed. The call a[1:3] (which 
> creates a slice of a) creates two temporary objects [1]. These are only 
> deleted when I explicitly call gc() or at some other arbitrary moment.
>
> I hope this illustrates the problem better.
>
>
> Best,
> Jan
>
>
> [1] One improvement would be to create fewer temporary files; often these 
> contain only very little information that is better kept in memory. But that 
> is only a partial solution.
>
>
>
>
> On 07-08-18 15:24, Uwe Ligges wrote:
>> Why not add functionality that lets users delete an object and run its cleanup code?
>> 
>> Best,
>> Uwe Ligges
>> 
>> 
>> 
>> On 07.08.2018 14:26, Jan van der Laan wrote:
>>> 
>>> 
>>> In my package I open temporary files from C++; handles to them 
>>> are returned to R through vptr objects. The files are deleted when the 
>>> corresponding R object is deleted and the garbage collector runs:
>>> 
>>> a <- lvec(10, "integer")
>>> rm(a)
>>> 
>>> Then when the garbage collector runs the file is deleted. However, on some 
>>> platforms (probably with lower limits on the maximum number of file 
>>> handles a process can have open), I run into the problem that the garbage 
>>> collector doesn't run often enough. In this case that means that another 
>>> package of mine using this package generates an error when its tests are 
>>> run.
>>> 
>>> The simplest solution is to add some calls to gc() in my tests. But a more 
>>> general/automatic solution would be nice.
>>> 
>>> I thought about something along the lines of
>>> 
>>> robust_lvec <- function(...) {
>>>    tryCatch({
>>>      lvec(...)
>>>    }, error = function(e) {
>>>      gc()
>>>      lvec(...) # duplicated code
>>>    })
>>> }
>>> 
>>> i.e. try to open a file; when that fails, call the garbage collector and 
>>> try again. However, this introduces duplicated code (in this case only one 
>>> line, but that can be more), and doesn't help if it is another function 
>>> that tries to open a file.
>>> 
>>> Is there a better solution?
>>> 
>>> Thanks!
>>> 
>>> Jan
>>> 
>>> ______________________________________________
>>> R-package-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

