[Rd] Easily switchable factor levels

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Wed Feb 23 23:23:58 CET 2011


I've recently been working with some California county-level data. The
counties can be referred to by FIPS codes, e.g. F060102, by friendly
names such as "Del Norte County", by names without 'County' on the end,
or by names with 'CA' on the end ("Del Norte County, CA"). Different data
sets use slightly different forms, and putting them all together is a
pain.

 So I was wondering about ways to attach multiple sets of level codes
to a factor. It would work something like this:

 > foo=multifactor(sample(letters,5),levels=letters,levelname="lower")
 > foo
 [1] m u i z b
 Levels: a b c d ... z
 > levels(foo,"upper") = LETTERS
 > uselevels(foo,"upper")
 > foo
 [1] M U I Z B
 Levels: A B C D ... Z
 > uselevels(foo,"lower")
 > foo
 [1] m u i z b
 Levels: a b c d ... z

In this way you could easily switch your levels from M and F to Male
and Female, or Hommes et Dames, without having to do levels(foo) =
something and hope to get the ordering right every time. Just do it
once and keep the multiple sets of level labels in the object.
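
A rough sketch of how such an object might be put together, purely as an
illustration: it stores the integer codes once, plus a named list of
label sets and the name of the set currently in use. The constructor
multifactor() and uselevels() follow the names used above; addlevels()
is a hypothetical stand-in for the levels(foo,"upper") = ... form, and
the internal fields (codes, sets, active) are assumptions. Because R
copies on modification, the result of uselevels() has to be assigned
back rather than changing foo in place.

```r
# Hypothetical sketch; none of these functions exist in base R.
multifactor <- function(x, levels, levelname) {
  sets <- list(levels)
  names(sets) <- levelname
  structure(list(codes = match(x, levels),  # integer codes, as in a factor
                 sets = sets,               # named list of label sets
                 active = levelname),       # label set currently shown
            class = "multifactor")
}

# Register another label set; it needs one label per existing level.
addlevels <- function(f, levelname, value) {
  stopifnot(length(value) == length(f$sets[[1]]))
  f$sets[[levelname]] <- value
  f
}

# Select which label set is used for display and conversion.
uselevels <- function(f, levelname) {
  stopifnot(levelname %in% names(f$sets))
  f$active <- levelname
  f
}

as.character.multifactor <- function(x, ...) x$sets[[x$active]][x$codes]

print.multifactor <- function(x, ...) {
  print(as.character(x), quote = FALSE)
  cat("Levels:", x$sets[[x$active]], "\n")
  invisible(x)
}

foo <- multifactor(c("m", "u", "i", "z", "b"), levels = letters,
                   levelname = "lower")
foo <- addlevels(foo, "upper", LETTERS)
foo <- uselevels(foo, "upper")
as.character(foo)   # "M" "U" "I" "Z" "B"
```

The ordering only has to be right once, when each label set is
registered; after that, switching is a matter of changing one name.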

I'd even throw in a function to print out all the level codes:

 > levels(foo,all=TRUE)
     upper lower
 [1] A     a
 [2] B     b

etc
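
One possible way to produce such a listing, assuming the label sets are
held as a named list of equal-length character vectors (the helper name
alllevels() and the list layout are assumptions made for this sketch):

```r
# Hypothetical helper: 'sets' is a named list of label vectors,
# one vector per label set, all of the same length.
alllevels <- function(sets) {
  stopifnot(length(unique(lengths(sets))) == 1)
  # One column per label set, one row per level.
  as.data.frame(sets, stringsAsFactors = FALSE)
}

sets <- list(upper = LETTERS, lower = letters)
tab <- alllevels(sets)
head(tab, 2)
```

A data frame keeps the rows aligned, so it also doubles as a lookup
table between the different naming schemes.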

I can see assorted problems coding this up to cope with dropping
levels when making subsets... and possibly problems when code does
character matching of levels and expects them to be unchanged...
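
To make the subsetting problem concrete, here is a small sketch of what
dropping unused levels would have to do: every label set must be cut
down at the same positions, and the codes renumbered to match. The
representation (a codes vector plus a named list of label sets) and the
function name drop_unused() are assumptions for illustration only.

```r
# Hypothetical representation of a subset: three codes left over
# from a 26-level object with two label sets.
codes <- c(13L, 21L, 9L)
sets <- list(lower = letters, upper = LETTERS)

drop_unused <- function(codes, sets) {
  used <- sort(unique(codes))               # level positions still in use
  list(codes = match(codes, used),          # renumber codes to 1..k
       sets = lapply(sets, function(s) s[used]))  # trim every set alike
}

res <- drop_unused(codes, sets)
res$sets$upper              # "I" "M" "U"
res$sets$upper[res$codes]   # "M" "U" "I"
```

If only the active label set were trimmed, the others would fall out of
step with the codes, which is exactly the ordering mistake the idea is
meant to avoid.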

Has anyone bothered to write anything like this yet? Or is the
application a bit too rare to be worth it?

Barry


