[R] discretizing xy data

Rubén Roa-Ureta rroa at udec.cl
Tue Nov 4 13:51:36 CET 2008


Rubén Roa-Ureta wrote:
> Jon A wrote:
>> Hello,
>> I have a dataset with a continuous independent variable (fish length,
>> range: 30-150 mm) and a binary response (foraging success, 0 or 1). I
>> want to discretize fish length into 5 mm bins and give the proportion
>> of individuals who successfully foraged in each size bin. I have used
>> the cut function to discretize the length values into my desired bins,
>> but I can't figure out how to manipulate my response data in terms of
>> the levels I've created. Any advice on how to achieve my task?
>>
>> Thanks in advance.
>>   
> You have the option of using catspec.
> Here is another, more transparent solution, using hist().
> lb <- 30
> ub <- 150
> bk <- 5
> brks <- seq(lb, ub, by = bk)  # explicit 5 mm bin edges, shared by cut() and hist()
> x <- data.frame(X1 = runif(1000, lb, ub), X2 = rbinom(1000, 1, 0.75))
> x$X3 <- cut(x$X1, breaks = brks, labels = FALSE, include.lowest = TRUE)
> h <- hist(x$X1, breaks = brks, plot = FALSE)
> # y: upper bin edge (X1), sample size (X2), successes (X3, filled in below)
> y <- data.frame(X1 = h$breaks[-1], X2 = h$counts, X3 = 0)
>
> for (i in seq_along(y$X1)) {
>   for (j in seq_along(x$X1)) {
>     if (identical(x$X3[j], i)) y$X3[i] <- y$X3[i] + x$X2[j]
>   }
> }
> sum(x$X2) # check that counting was correct
> sum(y$X3)
> y$X4 <- ifelse(y$X2 > 0, y$X3 / y$X2, NA)  # NA only for empty bins
> plot(y$X1, y$X4, pch = 19)
> abline(v = h$breaks)
BTW, adding the line below after the plot shows the sample size of each
bin just above its proportion:
text(x=y$X1,y=y$X4+.01,format(y$X2),cex=.5)
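
The same table can probably be built without the double loop. What follows is only a sketch under assumptions: the data are simulated as above, and the names len, success, bin and res are made up for illustration. It passes explicit bin edges to cut() and lets table() and tapply() do the counting per bin.

```r
# Sketch: proportion of successes per 5 mm length bin, without explicit loops.
# The simulated data and the names len, success, bin, res are illustrative.
set.seed(1)
lb <- 30; ub <- 150; bk <- 5
brks <- seq(lb, ub, by = bk)                  # explicit bin edges: 30, 35, ..., 150
len <- runif(1000, lb, ub)                    # simulated fish lengths (mm)
success <- rbinom(1000, 1, 0.75)              # simulated foraging success (0/1)
bin <- cut(len, breaks = brks, include.lowest = TRUE)  # factor of 5 mm bins

n    <- as.vector(table(bin))                 # sample size per bin (0 if empty)
succ <- as.vector(tapply(success, bin, sum))  # successes per bin (NA if empty)
res  <- data.frame(upper = brks[-1], n = n, prop = succ / n)
res
```

Since the response is coded 0/1, tapply(success, bin, mean) should give the same proportions in one step; the version above keeps the counts so the sample size per bin stays visible.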
R.


