[Rd] "Default" accessor in S4 classes

John Chambers jmc at r-project.org
Tue Jan 8 17:59:48 CET 2013


To respond a little more directly to what you seem to be asking for:

You would like an "automatic" conversion from your class (you don't give us its name, let's call it frameHDF for now) to "data.frame".

In R (and in OOP generally) this sounds like inheritance: you want a frameHDF to be valid wherever a data.frame is wanted.

_IF_ that is really a good idea, it can be done by using setIs() to define the correspondence.  Then methods for data.frame objects (formal methods at least) will convert the argument automatically.
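
A minimal sketch of the setIs() route, under stated assumptions: the class name frameHDF is the placeholder used above, the filename slot is borrowed from the quoted message, and loadAll() is a hypothetical stand-in for code that would really read the HDF5 file. The generic nRows() is also invented purely to show a formal method defined for data.frame being applied to a frameHDF.

```r
library(methods)

# Placeholder class; the name and slot are assumptions for illustration.
setClass("frameHDF", representation(filename = "character"))

# Hypothetical loader: a real one would read the file named in the slot.
loadAll <- function(obj) data.frame(x = 1:3, y = c(2, 4, 6))

# Declare that a frameHDF "is a" data.frame, with an explicit conversion.
setIs("frameHDF", "data.frame",
      coerce  = function(from) loadAll(from),
      replace = function(from, value) stop("frameHDF is read-only in this sketch"))

# An invented generic with a formal method for data.frame only.
setGeneric("nRows", function(x) standardGeneric("nRows"))
setMethod("nRows", "data.frame", function(x) nrow(x))

big <- new("frameHDF", filename = "/path/to/file")
is(big, "data.frame")   # the relation declared by setIs()
nRows(big)              # inherited method; the argument is coerced first
```

Note that every such call would load the whole dataset, which is the concern raised in the following paragraphs.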

(As noted previously, the simple assignment operation doesn't check types.)

However, this doesn't sound like such a good idea.  The point of your class is to handle objects too large for ordinary data frames.  Converting automatically sounds like a recipe for unpleasant surprises.

A more cautious approach would be for the user to explicitly state when a conversion is needed.  The general tool for defining this is setAs(), which is very similar to setIs() but does not make the conversion automatic: the user calls as(x, "data.frame") to get the conversion.
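
A minimal sketch of the explicit route, again with assumed names: frameHDF is the placeholder class, the filename slot comes from the quoted message, and loadAll() is a hypothetical loader standing in for real HDF5 reading code.

```r
library(methods)

# Placeholder class; the name and slot are assumptions for illustration.
setClass("frameHDF", representation(filename = "character"))

# Hypothetical loader: a real one would read the file named in the slot.
loadAll <- function(obj) data.frame(x = 1:3, y = c(2, 4, 6))

# Define the conversion, but do not make frameHDF inherit from data.frame.
setAs("frameHDF", "data.frame", function(from) loadAll(from))

big <- new("frameHDF", filename = "/path/to/file")

df <- as(big, "data.frame")   # conversion happens only on request
class(df)                     # "data.frame"
is(big, "data.frame")         # FALSE: no inheritance was declared
```

With this design the expensive load happens only where the user has written as() explicitly.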

The online documentation for these two functions says more, as does section 9.3 of my 2008 book, which is referenced in the documentation.

One more comment.  Your HDF5 objects are likely to have reference semantics--any changes made are seen by all the functions using that object.  This is different from R's functional semantics as in S4 classes, and the differences can cause incorrect results in some situations. The more recent reference classes (?ReferenceClasses) were designed to mimic the behavior of classes in C++, Java, and similar languages.  (They are used in Rcpp to import C++ classes.)
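
To illustrate the difference, here is a small sketch contrasting the two semantics. The class names ValueCounter and RefCounter and the helper bumpValue() are invented for the example and have nothing to do with HDF5.

```r
library(methods)

# Functional semantics: an S4 object is copied when modified in a function.
setClass("ValueCounter", representation(n = "numeric"))
bumpValue <- function(obj) {
  obj@n <- obj@n + 1   # changes the local copy only
  obj
}

# Reference semantics: all references share one underlying object.
RefCounter <- setRefClass("RefCounter",
  fields  = list(n = "numeric"),
  methods = list(
    bump = function() {
      n <<- n + 1      # modifies the shared object in place
      invisible(.self)
    }
  ))

v  <- new("ValueCounter", n = 0)
v2 <- bumpValue(v)
v@n        # 0: the caller's object is unchanged
v2@n       # 1: only the returned copy was updated

r     <- RefCounter$new(n = 0)
alias <- r             # a second reference, not a copy
alias$bump()
r$n        # 1: the change is visible through every reference
```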

John


On Jan 7, 2013, at 3:23 PM, Chris Jewell wrote:

> Hi All,
> 
> I'm currently trying to write an S4 class that mimics a data.frame, but stores data on disc in HDF5 format.  The idea is that the dataset is likely to be too large to fit into a standard desktop machine, and by using subscripts, the user may load bits of the dataset at a time.  eg:
> 
>> myLargeData <- LargeData("/path/to/file")
>> mySubSet <- myLargeData[1:10, seq(1,15,by=3)]
> 
> I've therefore defined my LargeData class thus
> 
>> LargeData <- setClass("LargeData", representation(filename="character"))
>> setMethod("initialize","LargeData", function(.Object,filename) { .Object@filename <- filename; .Object })
> 
> I've then defined the "[" method to call a C++ function (Rcpp), opening the HDF5 file, and returning the required rows/cols as a data.frame.
> 
> However, what if the user wants to load the entire dataset into memory?  Which method do I overload to achieve the following?
> 
>> fullData <- myLargeData
>> class(fullData)
> [1] "data.frame"
> 
> or apply transformations:
> 
>> myEigen <- eigen(myLargeData)
> 
> In C++ I would normally overload the "double" or "float" operator to achieve this -- can I do the same thing in R?
> 
> Thanks,
> 
> Chris
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


