[Rd] Possible changes to connections

Roger Peng rdpeng at gmail.com
Wed May 30 22:28:19 CEST 2007


In a previous version of the 'filehash' package, the 'filehashDB1'
class had a slot for an open connection corresponding to the database
file.  I quickly learned that if the R object ever got removed or
reassigned I was left hanging with an open file connection.

If I remember correctly, I resorted to creating an environment in the
R object which stored the connection number for the database file
connection.  Then I registered a finalizer for that environment which
grabbed the connection via 'getConnection' and then closed the
connection.
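
Roughly, the idea can be sketched as follows.  This is a reconstruction
for illustration only, not the actual 'filehash' code: the helper name
'open_db' and the field name 'con_num' are hypothetical, and the file
mode is an assumption.

```r
# Hypothetical helper: open a database file and tie the lifetime of the
# connection to an environment by registering a finalizer on it.
open_db <- function(path) {
  con <- file(path, open = "a+b")   # assumed mode: read/append, binary

  handle <- new.env()
  # Store the connection *number*, not the connection object itself
  handle$con_num <- as.integer(con)

  reg.finalizer(handle, function(e) {
    # Recreate the connection object from its number, then close it.
    # getConnection() signals an error if no such connection exists,
    # so treat that case as "already closed".
    con <- tryCatch(getConnection(e$con_num), error = function(err) NULL)
    if (!is.null(con) && isOpen(con)) close(con)
  }, onexit = TRUE)

  handle
}

db <- open_db(tempfile())
rm(db)          # the file connection is still open at this point ...
invisible(gc()) # ... and is only closed once the finalizer has run
```

Note that the connection stays open between 'rm(db)' and the next
garbage collection, and that connection numbers can be reused, so a
stale number may end up referring to a different connection.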

I eventually abandoned this approach since it was error-prone and I
often ran into strange, difficult-to-reproduce situations where the R
object representing the database had been removed but the file
connection was still open because garbage collection had not yet
occurred.  I would have very much preferred a system where the file
connection was automatically closed once any references to it were
gone.

-roger

On 5/30/07, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> When I originally implemented connections in R 1.2.0, I followed the model
> in the 'Green Book' closely.  There were a number of features that forced
> a particular implementation, and one was getConnection() that allows one
> to recreate a connection object from a number.
>
> I am wondering if anyone makes use of this, and if so for what?
>
> It would seem closer to the R philosophy to have connection objects that
> get garbage collected when no R object refers to them.  This would allow
> for example
>
> readLines(gzfile("foo.gz"))
>
> which currently leaks a connection slot as the connection cannot be closed
> (except via closeAllConnections() or getConnection()) without an R object
> being returned.
>
> The correct usage currently is
>
> readLines(con <- gzfile("foo.gz")); close(con)
>
> which is a little awkward but more importantly seems little understood.
>
> Another issue is that the current connection objects can be saved and
> restored but refer to a global table that is session-specific so they lose
> their meaning (and perhaps gain an unintended one).
>
> What I suspect is that very few users are aware of the Green Book
> description and so we have freedom to make some substantial changes
> to the implementation.  Both issues suggest that connection objects should
> be based on external pointers (which did not exist way back in 1.2.0).
>
> [I know there is a call to getConnection in package gtools, but the return
> value is unused!]
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/


