[Rd] S4 and connection slot [Sec=Unclassified]

Martin Morgan mtmorgan at fhcrc.org
Mon Jun 29 21:12:13 CEST 2009


Stavros Macrakis wrote:
> On Mon, Jun 29, 2009 at 9:19 AM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> 
>     ...I'm not sure that including a connection in a slot is going to be
>     a good idea, though -- a connection has reference-like semantics, so
>     you can end up with multiple objects pointing to the same connection.
>     Also, when an object is garbage collected the connection will not be
>     closed automatically.
> 
> 
> I'm not sure I understand your point here.  Having multiple objects
> refer to the same connection seems like a perfectly reasonable and
> useful thing, and garbage collection should work just fine -- when *all*
> of the objects referring to that connection are unreachable, then the
> connection is unreachable and will itself be garbage collected.  No?

I meant that, in practice, there were enough additional pitfalls that I
wouldn't choose this route (placing a connection in an S4 slot) for
myself. These little experiments were enough to make me cautious...

setOldClass(c("file", "connection"))

## Attempt one -- prototype
setClass("Element",
         representation=representation(conn="file"),
         prototype=prototype(conn=file()))

## oops, all instances share a single connection

close(new("Element")@conn)
## oops, all new instances now have a closed (invalid) connection


## Attempt two -- initialize
setClass("Element",
         representation=representation(conn="file"))

setMethod(initialize, "Element", function(.Object, ..., conn=file()) {
    callNextMethod(.Object, ..., conn=conn)
})

new("Element")
## oops, connection created but not closed; gc() closes (eventually)
## but with an ugly warning
## > gc()
##            used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells   717240  38.4    1166886  62.4  1073225  57.4
## Vcells 37333395 284.9   63274729 482.8 60051033 458.2
## > gc()
##            used  (Mb) gc trigger  (Mb) max used  (Mb)
## Ncells   715906  38.3    1166886  62.4  1073225  57.4
## Vcells 37335626 284.9   63274729 482.8 60051033 458.2
## Warning messages:
## 1: closing unused connection 3 ()

setClass("ElementX", contains="Element")
## oops, two connections opened (!)
## > showConnections()
##   description class  mode text   isopen   can read can write
## 3 ""          "file" "w+" "text" "opened" "yes"    "yes"
## 4 ""          "file" "w+" "text" "opened" "yes"    "yes"
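
For what it's worth, the gc() warning in attempt two goes away if whoever creates the object also closes the connection explicitly. A minimal sketch, assuming a hypothetical class "Element2" and a hypothetical helper withElement() (neither is part of the experiments above):

```r
library(methods)

setOldClass(c("file", "connection"))

## Hypothetical class: same shape as "Element", no prototype connection
setClass("Element2", representation = representation(conn = "file"))

## Hypothetical helper: create the object, run FUN on it, and always
## close the connection, even if FUN signals an error
withElement <- function(FUN) {
    elt <- new("Element2", conn = file())   # anonymous file, opened "w+"
    on.exit(close(elt@conn))
    FUN(elt)
}

## usage: write a line, then read it back through the same connection
res <- withElement(function(elt) {
    writeLines("hello", elt@conn)
    readLines(elt@conn)
})
```

This only moves the responsibility for closing to the caller; it does nothing about copies of the object that outlive the call.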

And, while completely expected, the action-at-a-distance of references has
the usual risks:

> x <- y <- new("Element")
> close(x@conn)
> y
An object of class "Element"
Slot "conn":
Error in summary.connection(x) : invalid connection

One place I know where S4 and connections are used effectively is in the
AnnotationDbi package in Bioconductor (I did not participate in
developing this package, so am probably misrepresenting its
implementation). Here there is a database connection, but it is stored
in an environment in an S4 class. This part of the class is used
essentially as a singleton -- AnnotationDbi creates packages containing
instances, and users load the package and hence a single instance; the
interface never exposes the connection itself to copying. The database
functionality exposed to the user is read-only, and querying the
connection does not change its state (other than opening it if it was
closed). I think the underlying connection has C code to tidy itself up
appropriately (and quietly) when eventually garbage collected. So yes,
connections and multiple references to them can be useful.
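
The general shape of that design (connection held in an environment, reopened on demand, never handed to the user) might look roughly like the sketch below. This is not AnnotationDbi's code; the class "ConnHolder" and the functions newConnHolder(), addLines(), getLines() and closeHolder() are all made up for illustration, and an anonymous file stands in for the database connection.

```r
library(methods)

## Hypothetical class: the connection lives in an environment, not a slot
setClass("ConnHolder", representation = representation(env = "environment"))

newConnHolder <- function() {
    env <- new.env(parent = emptyenv())
    env$conn <- NULL                      # nothing opened until first use
    new("ConnHolder", env = env)
}

## Internal helper: return the connection, opening one if absent or closed
.conn <- function(x) {
    env <- x@env
    ok <- !is.null(env$conn) &&
        tryCatch(isOpen(env$conn), error = function(e) FALSE)
    if (!ok)
        env$conn <- file(open = "w+")     # anonymous temporary file
    env$conn
}

## Public interface: lines in, lines out; the connection is never returned
addLines <- function(x, txt) {
    writeLines(txt, .conn(x))
    invisible(x)
}

getLines <- function(x) {
    con <- .conn(x)
    seek(con, 0, rw = "read")             # rewind the read position
    readLines(con)
}

closeHolder <- function(x) {
    env <- x@env
    if (!is.null(env$conn)) {
        try(close(env$conn), silent = TRUE)
        env$conn <- NULL
    }
    invisible(x)
}
```

Because copies of a ConnHolder share one environment, closing through one copy leaves the others in a consistent state: the next access simply opens a fresh connection instead of failing with "invalid connection". Nothing here tidies up at garbage collection; that would still need an explicit closeHolder() or a finalizer.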

Martin


>            -s
>


