[Rd] Suggestion: Dimension-sensitive attributes

Heinz Tuechler tuechler at gmx.at
Thu Jul 9 11:48:20 CEST 2009


At 11:14 09.07.2009, SIES 73 wrote:
> > If "objattr", "dimattr" and "cellattr" are lists, they would offer
> > safe places for all attributes that should be kept on subsetting.
>
>My proposed design would be that:
>
>         * "objattr" would be a list of attributes (just preserved on
>           subsetting)
>         * "dimattr" would be a list with as many elements as array
>           dimensions. Each element can be any object whose length
>           matches the corresponding array dimension's length and that
>           can be itself subsetted with "[": so it could be a vector,
>           a list, a data frame...
>         * "cellattr" would be any object whose dimensions match the
>           array dimensions: another array, a data frame...
>
> > In my view this would be very useful, because that way a general
> > solution for data description, like variable names, variable labels,
> > units, ... could be reached.
>
>Indeed, that's the objective: attaching 
>user-defined metadata that is automatically 
>synchronized with subsetting operations to the actual data.
>
>I've had dozens of use cases in my own R programs that needed this type
>of pattern, and seen it implemented in different ways in several classes
>(xts, timeSeries, AnnotatedDataFrame, etc.). As you point out, this could
>offer a unified design for a common need.
>
>Enrique
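
To make the three levels quoted above concrete, here is a minimal sketch
for the two-dimensional case. It is only an illustration under
assumptions: the class name "annotated", the constructor annotated() and
the helper sub_along() are made up for this example and are not part of
R or of any package mentioned in this thread. Only the attribute names
"objattr", "dimattr" and "cellattr" come from the proposal.

```r
# Hypothetical sketch, not an existing API: a matrix carrying the three
# proposed attributes, with a "[" method that keeps them in sync.
annotated <- function(x, objattr = list(), dimattr = NULL, cellattr = NULL) {
  stopifnot(is.matrix(x))
  attr(x, "objattr")  <- objattr   # whole-object level
  attr(x, "dimattr")  <- dimattr   # one element per dimension
  attr(x, "cellattr") <- cellattr  # same dimensions as x
  class(x) <- "annotated"
  x
}

# Subset one per-dimension element; data frames are subsetted by row.
sub_along <- function(a, idx) {
  if (is.null(a)) return(NULL)
  if (is.data.frame(a)) a[idx, , drop = FALSE] else a[idx]
}

# Only the two-index form x[i, j] is handled; "drop" is ignored so that
# the result always stays a matrix.
"[.annotated" <- function(x, i, j, ..., drop = FALSE) {
  oa <- attr(x, "objattr")
  da <- attr(x, "dimattr")
  ca <- attr(x, "cellattr")
  # Translate any kind of index (numeric, logical, character) to positions.
  ri <- seq_len(nrow(x)); names(ri) <- rownames(x)
  ci <- seq_len(ncol(x)); names(ci) <- colnames(x)
  if (!missing(i)) ri <- ri[i]
  if (!missing(j)) ci <- ci[j]
  ri <- unname(ri); ci <- unname(ci)
  y <- unclass(x)[ri, ci, drop = FALSE]   # plain matrix subsetting
  attr(y, "objattr") <- oa                # copied unchanged
  if (!is.null(da))
    attr(y, "dimattr") <- list(sub_along(da[[1]], ri), sub_along(da[[2]], ci))
  if (!is.null(ca))
    attr(y, "cellattr") <- ca[ri, ci, drop = FALSE]
  class(y) <- "annotated"
  y
}

m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), c("x", "y", "z")))
am <- annotated(m,
  objattr  = list(source = "example"),
  dimattr  = list(c("row A", "row B"), c("cm", "kg", "s")),
  cellattr = matrix(c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE), nrow = 2))
s <- am["b", c(1, 3)]
```

After the last line, attr(s, "dimattr") and attr(s, "cellattr") describe
only the selected row and columns, while attr(s, "objattr") is carried
over as it was. Single-index subsetting such as am[5] and higher
dimensional arrays would need extra handling that is left out here.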


For my personal use it was sufficient to create a 
class called "documented" with a corresponding 
subsetting method and one attribute, also called 
"documented". This attribute may contain 
'varlabel', 'varname', 'value.labels', 
'missing.values', 'code.ordered', 'comment', ...
It is copied on subsetting.
I think attributes concerning parts of an object, e.g. dimensions,
should stay in this object-related attribute and be extracted on
subsetting. Since subsetting an object leads to a new object, that new
object could then have its own, new persistent attribute.
The more difficult part may be the binding of objects.
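
The "documented" class described above could look roughly like the
following. This is a reconstruction for illustration, not the actual
code: the helper name document() is an assumption, and the entries
stored in the attribute are just examples taken from the list above.

```r
# Hypothetical helper: attach a "documented" attribute (a list of
# descriptive entries) and add the class "documented".
document <- function(x, ...) {
  attr(x, "documented") <- list(...)
  class(x) <- c("documented", setdiff(oldClass(x), "documented"))
  x
}

# Subsetting method: let the next method do the actual subsetting, then
# copy the whole "documented" attribute and the class to the result.
"[.documented" <- function(x, ...) {
  doc <- attr(x, "documented")
  cls <- oldClass(x)
  y <- NextMethod("[")
  attr(y, "documented") <- doc
  oldClass(y) <- cls
  y
}

weight <- document(c(70.5, 82.1, NA, 64.0),
                   varname  = "weight",
                   varlabel = "Body weight",
                   comment  = "self-reported")
w2 <- weight[2:3]   # still carries attr(w2, "documented")
```

Here the attribute is copied as a whole. Entries that refer to parts of
the object would additionally have to be subsetted inside the method,
and nothing in this sketch addresses c(), rbind() or cbind().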

Heinz




>-----Original Message-----
>From: Heinz Tuechler [mailto:tuechler at gmx.at]
>Sent: Thursday, 09 July 2009 10:56
>To: Bengoechea Bartolomé Enrique (SIES 73); Tony Plate; r-devel at r-project.org
>Cc: Henrik Bengtsson
>Subject: Re: [Rd] Suggestion: Dimension-sensitive attributes
>
>At 10:01 09.07.2009, SIES 73 wrote:
> >I've also had several use cases where I needed "cell-like" attributes,
> >that is, attributes that have the same dimensions as the original array
> >and are subsetted in the same way --along all its dimensions.
> >
> >So we're talking about a way to add metadata to matrices/arrays at 3
> >possible levels:
> >
> >         1) at the "whole object" level: attributes that are not
> >            dropped on subsetting
> >         2) at the "dimension" level: attributes that behave like
> >            "dimnames", i.e. subsetted along each dimension
> >         3) at the "cell" level: attributes that are subsetted in the
> >            same way as the original array
> >
> >My proposal would be simpler than Tony's suggestion: like "dimnames",
> >just have reserved attribute names for each case, say "objdata",
> >"dimdata", and "celldata" (or "objattr", "dimattr" and "cellattr").
>
>If "objattr", "dimattr" and "cellattr" are lists, they would offer safe
>places for all attributes that should be kept on subsetting. In my view
>this would be very useful, because that way a general solution for data
>description, like variable names, variable labels, units, ... could be
>reached.
>
>
> >On the other hand, Tony's pattern would allow as many attributes of
> >each type as necessary (some multiplicity is already possible with the
> >simpler design as dimdata or celldata could be lists of lists), at the
> >cost of a more complex scheme of attributes that needs to be "parsed"
> >each time.
> >
> >On Tony's suggestion, "attr.keep.on.subset" and "attr.dimname.like"
> >(and possibly "attr.cell.like") could be kept on a single list with 3
> >elements, something like:
> >
> > > attr(x, "attr.subset.with") <- list(object=..., dims=..., cells=...)
> >
> >Would something like this make sense for R-core --either for standard
> >arrays or as a new class-- or would it be better implemented in a
> >package?
> >
> >Enrique
> >


