[R] Binning Question

David Winsemius dwinsemius at comcast.net
Tue Apr 13 06:06:21 CEST 2010


On Apr 12, 2010, at 9:07 PM, Noah Silverman wrote:

> Hi,
>
> I'm trying to set up some complicated binning with statistics and could
> use a little help.
>
> I've found the bin2 function from the ash package, but it doesn't do
> everything I need.  My intention is to copy some of their code and then
> modify it as needed.
>
> I have a vector of two columns:
>
> head(data)
>              r1          r2
> [1,]  0.03516559  0.03102128
> [2,]  0.02162539  0.14847034
> [3,]  0.02210339  0.06539623
> [4,] -0.07547792 -0.08859678
> [5,]  0.03655620  0.05412436
> [6,]  0.06513828  0.06053050
>
>
> I'd like to create a two-dimensional list of bins with the frequency
> counts for each bin.  The bin2 function does this.  Then it gets
> interesting.
>
> I'd like to add a column to my vector that has the "bin label" for the
> bin that row would belong to.  (I can see how to do this with lots of
> nasty loops and greater-than, less-than calculations, but that gets
> messy.)  There must be an easier way.

Let's say you used the example from the bin2 help page:

library(ash) # provides bin2()

dat <- as.data.frame(matrix( rnorm(200), 100 , 2)) # bivariate normal, n=100

ab <- matrix( c(-5,-5,5,5), 2, 2) # interval [-5,5) x [-5,5)
nbin <- c( 20, 20) # 400 bins
bins <- bin2(as.matrix(dat), ab, nbin) # bin counts,ab,nskip (bin2 expects a matrix)

# Row i of ab holds the limits for column i of dat; right=FALSE gives
# left-closed intervals [a,b), matching the intervals bin2 uses
dat$r1.cat <- cut(dat[,1], breaks=seq(ab[1,1], ab[1,2], length.out=nbin[1]+1),
                  right=FALSE)
dat$r2.cat <- cut(dat[,2], breaks=seq(ab[2,1], ab[2,2], length.out=nbin[2]+1),
                  right=FALSE)
dat$bicat <- with(dat, paste( as.numeric(r1.cat), as.numeric(r2.cat), sep="."))

Or leave off the as.numeric if you want the labels to be more "cut"-like,
i.e. interval labels rather than integer bin indices.
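
If you would rather not depend on ash at all, the same row labels and the
bin counts can be had from base R. The following is only a minimal sketch
under some assumptions: the seed is arbitrary, the names brk1, brk2, i1, i2
and counts are made up for illustration, and I have not checked the counts
against the nc component that bin2 returns, although I would expect them to
agree for points inside the limits.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
dat  <- as.data.frame(matrix(rnorm(200), 100, 2))
ab   <- matrix(c(-5, -5, 5, 5), 2, 2)         # row i = limits for column i
nbin <- c(20, 20)

# Equally spaced break points for each dimension
brk1 <- seq(ab[1, 1], ab[1, 2], length.out = nbin[1] + 1)
brk2 <- seq(ab[2, 1], ab[2, 2], length.out = nbin[2] + 1)

# findInterval() gives the integer index of the left-closed bin [a, b)
# that each value falls in (0 or nbin + 1 for values outside the limits)
i1 <- findInterval(dat[, 1], brk1)
i2 <- findInterval(dat[, 2], brk2)

# One label per row, e.g. "11.9"
dat$bicat <- paste(i1, i2, sep = ".")

# Cross-tabulate the indices to get an nbin[1] x nbin[2] matrix of counts;
# fixing the factor levels keeps the empty bins in the table
counts <- table(factor(i1, levels = seq_len(nbin[1])),
                factor(i2, levels = seq_len(nbin[2])))
```

findInterval skips the factor step, so it is a bit more direct than cut when
all you want is the integer bin index.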


>
> So, if I made 10 bins for each column (r1,r2), I'd have 100 bins.
> (bin1, bin2, bin3, etc.)  I want to label each ROW in my data set with
> the bin it would belong to.  (I intend to do more work with them after
> this, but this is a start.  Each row gets transformed depending on the
> bin it belongs to, etc.)
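
Once each row carries a bin label, a per-bin transformation does not need
loops either. Here is a hedged sketch using ave(), assuming 10 bins per
column over [-5,5) and using "subtract the bin mean of r1" as a stand-in for
whatever transformation you have in mind. The column names bin and
r1.centered are invented for the example.

```r
set.seed(2)                                   # arbitrary seed, for reproducibility only
dat <- as.data.frame(matrix(rnorm(200), 100, 2))
names(dat) <- c("r1", "r2")

brk <- seq(-5, 5, length.out = 11)            # 10 bins per column, 100 bins in all

# Combine the two one-dimensional bin factors into a single 2-D bin label;
# drop = TRUE discards the bins that contain no rows
dat$bin <- interaction(cut(dat$r1, brk, right = FALSE),
                       cut(dat$r2, brk, right = FALSE),
                       drop = TRUE)

# ave() applies FUN within each bin and returns the results in row order
dat$r1.centered <- ave(dat$r1, dat$bin, FUN = function(v) v - mean(v))

# Number of rows in each occupied bin
bin.sizes <- table(dat$bin)
```
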
>
> Thanks,
>
> -N