[R] how to apply the function cut( ) to many columns in a data.frame?

Liaw, Andy andy_liaw at merck.com
Thu Mar 1 16:30:29 CET 2007


From: Chuck Cleland
> 
> ahimsa campos-arceiz wrote:
> > Dear useRs,
> > 
> > In a data.frame (df) I have several columns (x1, x2, x3....xn) 
> > containing data as a continuous numerical response:
> > 
> > df
> >  var   x1   x2   x3
> >    1  143  147  137
> >    2   93   93  117
> >    3  164   39  101
> >    4  123  118   97
> >    5   63  125   97
> >    6  129   83  124
> >    7  123   93  136
> >    8  123   80   79
> >    9   89  107  150
> >   10   78   95  121
> > 
> > I want to classify the values in the columns x1, x2, etc., into bins
> > of fixed width (0-5, 5-10, ...). For one vector I can do it easily
> > with the function cut:
> > 
> >> df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
> >> df$x1
> >  [1] 145 95  165 125 65  130 125 125 90  80
> > 40 Levels: 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 ... 200
> > 
> > However if I try to use a subset of my data.frame:
> > 
> > df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> > 
> > Error in cut.default(df[, 3:4], br = 5 * (0:40), labels = 5 * (1:40)) :
> >         'x' must be numeric
> > 
> > 
> > How can I make this work when I want to apply the function cut( )
> > to many columns in a data.frame?
> 
>   You have an answer within your question - use one of the 
> various "apply" functions.  For example:
> 
> lapply(df[,3:4], function(x){cut(x, br=5*(0:40), labels=5*(1:40))})

Or perhaps a bit more simply:

lapply(df[, 3:4], cut, br=5*(0:40), labels=5*(1:40))

and if a data frame is desired as output, wrap the above in
as.data.frame().

(Just keep in mind that a data frame is essentially a list of columns,
which is why lapply() works on it column by column.)
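
To make the two options above concrete, here is a minimal sketch. It
assumes example data shaped like the original post; the seed and the
names brks, labs, binned and binned_df are only illustrative and do not
come from the thread.

```r
# Hypothetical example data, mirroring the layout in the original post
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(var = 1:10,
                 x1 = rnorm(10, mean = 100, sd = 25),
                 x2 = rnorm(10, mean = 100, sd = 25),
                 x3 = rnorm(10, mean = 100, sd = 25))

brks <- 5 * (0:40)   # bin edges 0, 5, ..., 200
labs <- 5 * (1:40)   # each bin is labelled by its upper edge

# lapply() treats the data frame as a list and returns a list of
# factors, one per selected column (here columns 3 and 4: x2 and x3)
binned <- lapply(df[, 3:4], cut, breaks = brks, labels = labs)

# Option 1: a new data frame holding only the binned columns
binned_df <- as.data.frame(binned)

# Option 2: write the factors back into the original columns
df[, 3:4] <- binned

str(df)
```

Values outside the range covered by the breaks (here 0 to 200) would
come back as NA, so the breaks may need widening for other data.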

Andy

 
> ?lapply
> ?sapply
> ?apply
> 
> > I guess that I might have to use something like for ( ) (which I'm
> > not familiar with), but maybe you know a straightforward method to
> > use with data.frames.
> > 
> > 
> > Thanks a lot!
> > 
> > Ahimsa
> > 
> > *********************************************
> > 
> > # data
> > var <- 1:10
> > x1 <- rnorm(10, mean=100, sd=25)
> > x2 <- rnorm(10, mean=100, sd=25)
> > x3 <- rnorm(10, mean=100, sd=25)
> > df <- data.frame(var,x1,x2,x3)
> > df
> > 
> > # classifying the values of the vector df$x1 into bins of width 5
> > df$x1 <- cut(df$x1, br=5*(0:40), labels=5*(1:40))
> > df$x1
> > 
> > # trying it on a subset of the data.frame
> > df[,3:4] <- cut(df[,3:4], br=5*(0:40), labels=5*(1:40))
> > df[,3:4]
> 
> --
> Chuck Cleland, Ph.D.

