[R] Classification

Karl Ove Hufthammer karl at huftis.org
Fri Nov 20 11:26:09 CET 2009


On Fri, 20 Nov 2009 10:43:19 +0100 smu <ml at z107.de> wrote:
> x <- c(3,5,7,3,9,7)
> > as.numeric(as.factor(x))
> [1] 1 2 3 1 4 3

While that is my preferred solution too, this may be easier to 
understand:

 match(x,sort(unique(x)))

(It is basically what 'factor' does.)
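As a quick check of that claim, here is a minimal sketch that compares the two approaches on the vector from the quoted message. The names `via_factor` and `via_match` are only illustrative. Note that `match` returns integers while `as.numeric` returns doubles, so the comparison is made on values rather than with `identical`.

```r
x <- c(3, 5, 7, 3, 9, 7)

# Codes from the factor route: levels are the sorted unique values
via_factor <- as.numeric(as.factor(x))

# Codes from the match route: position of each value in the sorted unique values
via_match <- match(x, sort(unique(x)))

print(via_factor)
print(via_match)

# Same values element by element (the storage types differ: double vs. integer)
print(all(via_factor == via_match))
```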

The question wasn't quite clear, though. Should the first occurring 
number get the category 1, or should the *lowest* number get this 
category? I.e., what should be the result of

x <- c(5,3,7,3,9,7)

Should it be 1, 2, 3, 2, 4, 3 or 2, 1, 3, 1, 4, 3?

The factor and match methods above give the second solution. To get the 
first solution, just remove the 'sort':

match(x,unique(x))
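A small sketch of both orderings side by side, using the second example vector. The variable names are made up for illustration, and the `factor(x, levels = unique(x))` line is an additional variant not mentioned above; it should give the same first-occurrence codes.

```r
x <- c(5, 3, 7, 3, 9, 7)

# Categories numbered by order of first occurrence
by_first_seen <- match(x, unique(x))

# Categories numbered by sorted value (lowest value gets 1)
by_value <- match(x, sort(unique(x)))

# Factor-based variant of the first-occurrence ordering:
# the levels are supplied explicitly in order of appearance
by_first_seen_factor <- as.integer(factor(x, levels = unique(x)))

print(by_first_seen)
print(by_value)
print(by_first_seen_factor)
```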

-- 
Karl Ove Hufthammer



