[Rd] Question re:S4 classes and design; clashing classes?

Wed, 20 Mar 2002 12:00:52 -0800 (PST)

On Wed, 20 Mar 2002, John Chambers wrote:

> It wasn't entirely clear what your design goals were, but the general
> question of comparing functional method languages (S, CLOS, Dylan) and
> OOP languages (Java, etc) is of much interest to me, so here are some
> general comments.  Maybe we can then iterate on your example, with or
> without r-devel.

I'll keep it on r-devel until someone (anyone?) screams.  I suspect it might be useful, at least for a bit.

> The basic difference in approach is easy to state:
> 
>  Functional languages define methods based on the function and a
> signature (a list matching classes to formal arguments);  OOP languages
> attach all methods to a class definition.

I'll try to clarify the issues by typing through them, and then provide the problem I'd like to address, which was ill-stated in my last email. 

Issues:

1. I've got an object-oriented toolkit which constructs visualization pipelines (i.e. sets of views connected in a data processing network).  These are constructed using an OOP paradigm, realized in Java, with the goal being flexible addition of new (higher-level) interactive visualizations for exploration (of both data and visualization approaches).

2. I really want to connect this with an interactive data analysis language.  S (via R) is a no-brainer, in this respect.  However, it's a functional programming language.

3. I'd like to do so in such a way as to play off the strengths of both approaches (I'll label the first "the computing with data approach" and the second "the OOP approach").

4. Note that there are two intertwined goals, here.  First, I'd like to be able to use the strengths of S, in the sense of being able to provide the right thing for the right set of inputs, (i.e. "plot" does the right thing).   So I'd like to continue to have this interactive strength.  Second, I'd like to be able to develop new visualizations rapidly, by subclassing, to use the strengths of Java and the flexibility of the classes that we designed for Orca.

So this leads me to pose the following hypothesis:  since I'm thinking (in the OOP approach) of visualizations as subclassing numerical summaries (this could be controversial, but those who know me understand where I'm coming from), I'd also like to be able to "intertwine" the extensibility approaches.  For me, this would mean being able to build visualization summaries on top of the numerical summaries, and to provide a "feedback loop": applying numerical methods both to visualization objects (with resulting changes to the visualization) and to subsets of the data extracted from various points of the visualization pipeline.  I'm rambling, but I'm thinking of a possibly cyclic graph, in the sense of the original ViSta workmaps, which would provide some form of audit trail and reproducibility as needed.

> The distinction becomes relevant if a function does "multiple dispatch";
> that is, defining methods that depend on more than one argument.  (In
> that sense, S3 methods made no use of the functional nature of the
> language.)

Exactly.  Since I'm considering S4 methods, this is critical.  Especially since I'd like to do some form of interplay, if possible.
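To make the multiple-dispatch point concrete for myself, here is a minimal sketch in plain S4.  The generic render() and the view classes are names I made up for illustration; nothing here is Orca or Omegahat code.  The method is chosen from the classes of both arguments, not just the first:

```r
library(methods)

# Hypothetical view classes; they only carry a title.
setClass("ScatterView", representation(title = "character"))
setClass("SeriesView",  representation(title = "character"))

# Hypothetical generic that dispatches on both of its arguments.
setGeneric("render", function(data, view) standardGeneric("render"))

setMethod("render", signature("matrix", "ScatterView"),
          function(data, view) paste("scatter of", ncol(data), "columns"))

setMethod("render", signature("matrix", "SeriesView"),
          function(data, view) paste("one series per column:", ncol(data)))

setMethod("render", signature("ts", "SeriesView"),
          function(data, view) paste("series of length", length(data)))

scatter <- new("ScatterView", title = "a")
series  <- new("SeriesView",  title = "b")

r1 <- render(matrix(0, 5, 3), scatter)
r2 <- render(matrix(0, 5, 3), series)
r3 <- render(ts(1:10), series)
```

The same matrix gets different treatment depending on the view it is paired with, which is the part a single-argument (S3-style) dispatch cannot express directly.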

> In that sense, I read your description as essentially OOP.  You talk of
> "Orca objects having methods", but in a functional language the question
> is how the functions that do the computation depend on the objects that
> are their arguments.

This is what I'm trying to understand -- i.e. how to do both, in a flexible and extensible manner, from the R command line.

> Nothing wrong with the OOP view, but it means that your question goes
> more to how, and if, the underlying R code uses functions.  If it's just
> basically an interface to an OOP definition, then it can stay in that
> form.  (And I'd like to discuss in separate mail using the Omegahat OOP
> package to formalize the R side.)

Please do.  I'm busy, but it would help to ground my thinking about this general problem.

> Functional methods would become relevant if the R software involved
> multiple objects.  Not knowing the innards of how things work, let me
> make up an example.
> 
> If the current view included, say, an object describing a wireframe that
> was displayed around the points, then one might have useful classes of
> wireframe objects, then a function that stepped the display to the next
> view could have methods that depended on both the current data and the
> wireframe:
>   step(data, wireframe)
> (Definitely not asserting this is a relevant example, but perhaps the
> general distinction is clear.)

I'm not sure.  I think that what I'm trying to do is to explore the possibility of systems integration at the user level.

So, what I'd like to be able to do is more like the following (note that this is a fake example; while the components are there, it isn't quite working yet):

grandtour(x)

setMethod("grandtour", "matrix",
          function(x) {
              # create the Java-side grandtour pipeline, then hand it the data
              gt <- .JNew(org.orca.viz.grandtour)
              gt$initData(x)
          })

Note that the grandtour class is a pipeline of lots of things.  So, I'd like to do something similar with time series, so that:

setMethod("grandtour", "ts",
          function(x) {
              gt <- .JNew(org.orca.viz.grandtour)
              # the argument to the compiler call is a placeholder, not real code
              gt$insertPipe(knownLocation,
                            .JNew(org.omegahat.compiler(new presentation code in Java)))
              gt$initData(x)
          })
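Since the two fake methods above can't run without the Java side, here is a sketch of the same dispatch pattern that runs in plain R.  The helper newTour(), the stage names, and the value of knownLocation are all stand-ins I invented; a list plays the role of the Java grandtour object:

```r
library(methods)

# Hypothetical stand-in for the Java-side grandtour object: a plain list
# recording the pipeline stages and the data, so no Java bridge is needed.
newTour <- function(x, pipes = c("project", "scatterDisplay")) {
  list(pipes = pipes, data = x)
}

# Assumed constant: the new stage goes in after the first pipe.
knownLocation <- 1

setGeneric("grandtour", function(x) standardGeneric("grandtour"))

setMethod("grandtour", "matrix", function(x) newTour(x))

# The "ts" method inserts an extra (made-up) presentation stage.
setMethod("grandtour", "ts", function(x) {
  gt <- newTour(x)
  gt$pipes <- append(gt$pipes, "timeSeriesPresentation", after = knownLocation)
  gt
})

m  <- grandtour(matrix(rnorm(20), ncol = 4))
tt <- grandtour(ts(rnorm(20)))
```

Here m$pipes keeps the default two stages, while tt$pipes has the extra stage in the middle.  Note that the generic has to be declared with setGeneric() before the setMethod() calls.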

It may look like I'm answering my own question, since Duncan and John have provided all the technology I need to work it out today.  Still, I'd like a tighter means of building the code in the second .JNew statement, i.e. the compiler code.

I could simply provide a functional mapping over the Java classes, but then I seem to lose (Duncan, correct me if I'm wrong) the advantage of understanding the Java interfaces and classes.  Yes, I can hack it out.  Yes, I am hacking it out.  But it doesn't have the right aesthetic feel to it.

Now, to provide more context: we can treat knownLocation as a constant (it is either fixed or can be discovered via introspection).  What I'd like to be able to do is more like:

.JNew(subclass(org.orca.viz.grandtour)$setfinalDisplayPipe(TimeSeriesDisplayPipe))

i.e. something like: subclass the grandtour, replacing the finalDisplay pipe with a TimeSeriesDisplayPipe.

Note that Orca examples do look sort of functional-like (since we construct DAGs for the visualization pipeline).  So this would just mean that we need to replace a componentPipe (or insert a new componentPipe between two others).
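As a rough R-only illustration of replacing or inserting a component pipe, here is a sketch in which the pipeline is just a named list of functions applied in order.  All names (runPipeline, the stage names) are hypothetical, and copying a list is of course a much weaker notion than subclassing the Java class:

```r
# Apply each stage in order, feeding the result of one into the next.
runPipeline <- function(pipes, x) Reduce(function(acc, f) f(acc), pipes, x)

basePipes <- list(
  center  = function(x) sweep(x, 2, colMeans(x)),
  project = function(x) x[, 1:2, drop = FALSE],
  display = function(x) list(kind = "scatter", points = x)
)

# Replace the final display stage with a time-series display.
tsPipes <- basePipes
tsPipes$display <- function(x) list(kind = "timeseries", points = x)

# Insert a new stage between "project" and "display".
withScale <- append(basePipes,
                    list(scale = function(x) x / max(abs(x))),
                    after = 2)

x <- matrix(1:12 + 0, ncol = 3)
tsResult    <- runPipeline(tsPipes, x)
scaleResult <- runPipeline(withScale, x)
```

The base pipeline is left untouched in both cases, which is the property I'd want from the subclassing version as well.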

Now, for a part that is incomplete for me; I still need to think through what I want with:

y <- grandtour(as.matrix(x))

Here, y will be the link to the grandtour, to the original data, and to any subsetted data pieces.

So, while I could let y be a closure/list of components to work with, another possibility, which "might" be useful, is for y to both represent the R "matrix" class (for data operations), as well as the Java "orca" class (for visualization extension operations).
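One way the second possibility might look on the R side is an S4 class that extends "matrix" and carries a handle to the visualization in a slot.  This is only a sketch: the class name TourMatrix and the slot name view are invented, and a list stands in for what would really be a reference to the Java object:

```r
library(methods)

# Hypothetical class: the data part is a matrix, and the extra slot
# holds a stand-in for the reference to the Java-side view.
setClass("TourMatrix", contains = "matrix",
         representation(view = "list"))

x <- matrix(1:6 + 0, nrow = 2)
y <- new("TourMatrix", x,
         view = list(javaClass = "org.orca.viz.grandtour"))

isMat <- is(y, "matrix")      # y can be used where a matrix is expected
dims  <- dim(y)               # data operations see the matrix part
ref   <- y@view$javaClass     # visualization operations use the slot
```

Methods written for "matrix" would then apply to y by inheritance, while methods that need the visualization can be defined on "TourMatrix" itself.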

best,
-tony
