[R] recode data according to quantile breaks

arun smartpink111 at yahoo.com
Tue Feb 19 20:03:11 CET 2013


Hi Alain,

Try this:
df.breaks <- data.frame(id = df[, 1],
                        sapply(df[, -1], function(x)
                          findInterval(x, quantile(x), rightmost.closed = TRUE)),
                        stringsAsFactors = FALSE)
df.breaks
#   id a b c
#1 x01 1 1 1
#2 x02 1 1 1
#3 x03 2 2 2
#4 x04 3 3 3
#5 x05 4 4 4
#6 x06 4 4 4
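
One detail worth checking on real data: findInterval() treats each interval as closed on the left, so a value that falls exactly on the 25% break is coded 2, whereas the question asks for ">25%" to start the second group. The example data has no value exactly on a break, so both readings agree there. A minimal sketch of a right-closed variant using cut(); the helper name quartile_bin is hypothetical, and cut() will stop with an error if the quantile breaks are not unique (heavily tied data):

```r
# Example data from the original question
df <- data.frame(id = c("x01", "x02", "x03", "x04", "x05", "x06"),
                 a = c(1, 2, 3, 4, 5, 6),
                 b = c(2, 4, 6, 8, 10, 12),
                 c = c(1, 3, 9, 12, 15, 18))

# Hypothetical helper: quartile bins closed on the right, (lower, upper],
# with the minimum included in the first bin
quartile_bin <- function(x) {
  cut(x, breaks = quantile(x), include.lowest = TRUE, labels = FALSE)
}

df.cut <- data.frame(id = df$id, lapply(df[-1], quartile_bin),
                     stringsAsFactors = FALSE)

# A vector whose values sit exactly on the breaks shows the difference
x <- c(1, 2, 3, 4, 5)                                   # quantile(x) is 1, 2, 3, 4, 5
fi <- findInterval(x, quantile(x), rightmost.closed = TRUE)  # 1 2 3 4 4
ct <- quartile_bin(x)                                        # 1 1 2 3 4
```

For the example data, df.cut holds the same codes as df.breaks above.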
A.K.



----- Original Message -----
From: D. Alain <dialvac-r at yahoo.de>
To: Mailinglist R-Project <r-help at r-project.org>
Cc: 
Sent: Tuesday, February 19, 2013 5:01 AM
Subject: [R] recode data according to quantile breaks

Dear R-List, 

I would like to recode my data according to quantile breaks, i.e. all data within the range of 0%-25% should get a 1, >25%-50% a 2 etc.
Is there a nice way to do this for all columns in a data frame?

e.g.

df <- data.frame(id=c("x01","x02","x03","x04","x05","x06"),a=c(1,2,3,4,5,6),b=c(2,4,6,8,10,12),c=c(1,3,9,12,15,18))

df
   id a  b  c
1 x01 1  2  1
2 x02 2  4  3
3 x03 3  6  9
4 x04 4  8 12
5 x05 5 10 15
6 x06 6 12 18

# I can do it in a very complicated way


apply(df[-1],2,quantile)
       a    b    c
0%   1.0  2.0  1.0
25%  2.2  4.5  4.5
50%  3.5  7.0 10.5
75%  4.8  9.5 14.2
100% 6.0 12.0 18.0
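
The breaks printed above look rounded for display (2.2, 4.8 and 14.2, presumably from a reduced digits setting), so typing them in by hand is a little fragile. A minimal sketch, assuming the default quantile type, that keeps the unrounded breaks in a variable and compares against those instead; the names breaks.a and a.first are only illustrative:

```r
# Column a from the example data
a <- c(1, 2, 3, 4, 5, 6)

# Unrounded quartile breaks (default type 7 interpolation)
breaks.a <- quantile(a)
print(breaks.a, digits = 7)

# Compare against the stored break rather than a typed-in, rounded value
a.first <- a <= breaks.a[["25%"]]
```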

#then 

df$a[df$a<=2.2]<-1
...

#result should be


df.breaks

id  a b c
x01 1 1 1
x02 1 1 1
x03 2 2 2
x04 3 3 3
x05 4 4 4
x06 4 4 4



But there must be a way to do it more elegantly, something like


df.breaks<- apply(df[-1],2,recode.by.quantile)
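
A function with that name does not exist in base R as far as I know, but one possible sketch is below. It assumes the bins should follow findInterval() semantics (closed on the left, with the maximum folded into the last bin); the probs argument is an added assumption so that other splits, such as deciles, can be requested:

```r
# Hypothetical helper matching the name used in the question
recode.by.quantile <- function(x, probs = seq(0, 1, 0.25)) {
  findInterval(x, quantile(x, probs = probs), rightmost.closed = TRUE)
}

# Example data from the question
df <- data.frame(id = c("x01", "x02", "x03", "x04", "x05", "x06"),
                 a = c(1, 2, 3, 4, 5, 6),
                 b = c(2, 4, 6, 8, 10, 12),
                 c = c(1, 3, 9, 12, 15, 18))

# lapply() keeps the result a data frame and leaves the id column alone
df.breaks <- df
df.breaks[-1] <- lapply(df[-1], recode.by.quantile)

# apply() also works, but returns a matrix without the id column
m.breaks <- apply(df[-1], 2, recode.by.quantile)
```

Assigning into df.breaks[-1] avoids rebuilding the data frame by hand, which may be convenient when there are several non-numeric columns to carry along.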

Can anyone help me with this?


Best wishes 


Alain      