[R] How to make attributes persist after indexing?

Heinz Tuechler tuechler at gmx.at
Thu May 25 11:26:27 CEST 2006


Thank you, Marc, for this helpful explanation. I noticed the class "labelled"
in Hmisc, but have not yet studied it closely.
My example was just to see whether attributes persist, but my main motivation
is to use value labels for numerical variables and factors, to solve my
problem of how to represent a metric categorical variable.
From another discussion (attributes of a data.frame, Tue 22 Nov 2005) I
understood that there is very little interest in labelling variables, and I
assume even less in labelling variable values.
In my work, however, categorical variables with an inherent metric (e.g.
risk scores) occur frequently, and I think they could best be treated as
numeric with attached value.labels, so that they can be converted easily to
factors if needed for use in a function.
So I have to see how convenient it is to introduce a "labelled.labelled"
class.
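
A minimal sketch of the idea, assuming an attribute called "value.labels"
and a helper called as_labelled_factor (both names are hypothetical and are
not taken from Hmisc or base R):

```r
# Hypothetical sketch: a numeric risk score carrying value labels.
# "value.labels" is an attribute name chosen for illustration only.
score <- c(0, 2, 1, 2, 0)
attr(score, "value.labels") <- c(low = 0, medium = 1, high = 2)

# Hypothetical helper: convert such a vector to a factor on demand,
# using the label values as levels and the label names as level labels.
as_labelled_factor <- function(x, ordered = TRUE) {
  vl <- attr(x, "value.labels")
  if (is.null(vl)) stop("x has no 'value.labels' attribute")
  factor(as.vector(x), levels = unname(vl), labels = names(vl),
         ordered = ordered)
}

mean(score)                     # still usable as a plain number
f <- as_labelled_factor(score)  # factor view for functions that need one
table(f)
```

Plain indexing such as score[1:2] drops the "value.labels" attribute, which
is why a class with its own "[" method would be needed.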

Thanks,

Heinz



At 21:37 24.05.2006 -0500, Marc Schwartz wrote:
>On Thu, 2006-05-25 at 00:43 +0100, Heinz Tuechler wrote:
>> Thank you for your answer, Gabor. I will see if I have understood it.
>> 
>> Heinz
>> 
>> At 11:31 24.05.2006 -0400, Gabor Grothendieck wrote:
>> >You could create your own child class with its own [ method.
>> >
>> >"[.myfactor" <- function(x, ...) {
>> >       attr <- attributes(x)
>> >       x <- NextMethod("[")
>> >       attributes(x) <- attr
>> >       x
>> >}
>> >
>> >gx <- structure(fx, class = c("myfactor", class(fx)))
>> >attributes(gx[1])
>> >
>
>Heinz,
>
>What Gabor has proposed is essentially what Frank Harrell does in the
>Hmisc package (which I referenced), though in a more generic fashion
>with respect to the attributes that are saved and reset.
>
>By using:
>
>  gx <- structure(fx, class = c("myfactor", class(fx)))
>
>you are taking an object 'fx' and adding a child class attribute called
>"myfactor" to it. So it is in effect an object that retains the
>attributes of the original class of 'fx', plus the new class attribute
>'myfactor'.
>
>As a parallel, for example, consider that a square is a child class of a
>rectangle. A square inherits all of the attributes of a rectangle, plus
>the additional attribute that all four sides are of equal length. So for
>example, if 'x' is a square, you might see:
>
>> x
>[1] 4 4 4 4
>attr(,"class")
>[1] "square"    "rectangle"
>
>as compared to a rectangle 'y':
>
>> y
>[1] 4 2 4 2
>attr(,"class")
>[1] "rectangle"
>
>
>Note the two class attributes for 'x', with 'square' preceding
>'rectangle' in the order. The order is important, because in R, function
>methods are dispatched based upon the class attributes in the order that
>they appear in the vector.
>
>So...back to 'gx'.
>
>When the generic "[" function is called with "gx" (i.e., gx[1]), the first
>class attribute 'myfactor' is picked up from 'gx'. Then, the method that
>Gabor has presented, "[.myfactor", is dispatched (executed). Its name is
>the generic function name "[" followed by ".myfactor", the class the
>method is defined for.
>
>In that function, the first thing that takes place is that the
>attributes of the 'x' argument are saved in 'attr'. 
>
>  attr <- attributes(x)
>
>This would include any new attributes that you have defined and added,
>such as labels and comments.
>
>The next thing that happens is that the generic "[" is now called again:
>
>  x <- NextMethod("[")
>
>but this time, the method dispatched is based upon the "Next" entry in
>the class vector. In this case, that is whatever the original class of
>'fx' was, which could be an atomic vector, a factor, a matrix or a data
>frame, for example.
>
>You can see the methods available for "[" by using:
>
>   methods("[")
>
>The appropriate method for "[" is then executed, resulting in 'x', which
>is the subset version of 'gx'. 
>
>Then:
>
>  attributes(x) <- attr
>
>restores the original attributes saved in 'attr' to 'x'.
>
>Then, finally, the 'x' object is returned to the calling environment.
>
>So, in effect, you have created a new subset function "[" that retains
>your new attributes. The great thing about this is that, once the method
>is defined in your working environment, all you have to do is add the
>newly associated class attribute to the objects you want to subset, and R
>does the rest transparently.
>
>Thus:
>
># Add the new method
>"[.myfactor" <- function(x, ...) {
>       attr <- attributes(x)
>       x <- NextMethod("[")
>       attributes(x) <- attr
>       x
>}
>
># Create fx
>fx <- factor(1:5, ordered = TRUE)
>attr(fx, 'comment') <- 'Comment for fx'
>attr(fx, 'label') <- 'Label for fx'
>attr(fx, 'testattribute') <- 'just for fun'
>
>> attributes(fx)
>$levels
>[1] "1" "2" "3" "4" "5"
>
>$class
>[1] "ordered" "factor"
>
>$comment
>[1] "Comment for fx"
>
>$label
>[1] "Label for fx"
>
>$testattribute
>[1] "just for fun"
>
>
># Create gx, which is identical to fx, with the 
># additional class attribute
>gx <- structure(fx, class = c("myfactor", class(fx)))
>
>> attributes(gx)
>$levels
>[1] "1" "2" "3" "4" "5"
>
>$class
>[1] "myfactor" "ordered"  "factor"   # <- Note change here
>
>$comment
>[1] "Comment for fx"
>
>$label
>[1] "Label for fx"
>
>$testattribute
>[1] "just for fun"
>
>
># Show the structure of fx[1]
># Note that other attributes are lost
>> str(fx[1])
> Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1
>
>
># Show the structure of gx[1]
># Note that attributes are retained
>> str(gx[1])
> Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1
> - attr(*, "comment")= chr "Comment for fx"
> - attr(*, "label")= chr "Label for fx"
> - attr(*, "testattribute")= chr "just for fun"
>
>
>HTH,
>
>Marc Schwartz
>
>
>
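
One caveat with the "[.myfactor" method quoted above: it puts back every
saved attribute, including ones such as names whose length must match the
result. For a factor that has names, the restore step would then try to
attach the full-length names to the shorter subset and fail. A possible
variant, sketched under the assumption that only the extra attributes
(label, comment and the like) need to survive, is shown here. The class name
"myfactor2" and the helper keep_attrs are hypothetical.

```r
# Sketch only: "myfactor2" and keep_attrs() are hypothetical names.
# Attributes that subsetting already handles, and that must match the
# length or shape of the result, are left as the next method set them.
keep_attrs <- function(old, new) {
  structural <- c("names", "dim", "dimnames", "levels", "class")
  extra <- old[setdiff(names(old), structural)]
  for (nm in names(extra)) attr(new, nm) <- extra[[nm]]
  new
}

"[.myfactor2" <- function(x, ...) {
  saved <- attributes(x)    # everything, including the extra attributes
  cls <- oldClass(x)        # full class vector, child class first
  x <- NextMethod("[")      # let the next method do the subsetting
  x <- keep_attrs(saved, x) # put back only the non-structural attributes
  class(x) <- cls           # make sure the child class is still present
  x
}

# Usage: a named ordered factor with one extra attribute
fx <- factor(1:5, ordered = TRUE)
names(fx) <- letters[1:5]
attr(fx, "label") <- "Label for fx"
gx <- structure(fx, class = c("myfactor2", class(fx)))
str(gx[2:3])
```

The list of structural attributes is a judgment call and may need extending
for other classes.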


