Tue May 7 18:51:20 CEST 2013

Hello,

First of all, you don't need as.data.frame(cbind(...)). It's much better
to simply call data.frame(...), because cbind() builds a matrix first and
a matrix can hold only one type of data.
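
To illustrate the cbind() point with a mixed-type example (the character
column id below is hypothetical, it is not part of your data):

```r
# Hypothetical data with one character column, for illustration only
a <- as.data.frame(cbind(id = c("a", "b"), k1 = c(1, 2)))
b <- data.frame(id = c("a", "b"), k1 = c(1, 2))

is.numeric(a$k1)  # FALSE: cbind() made a character matrix out of both columns
is.numeric(b$k1)  # TRUE: data.frame() keeps each column's own type
```

With the all-numeric columns in this thread both calls return the same
numbers, so here it is mostly a matter of habit and readability.
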
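As for the recoding, the following function doesn't break the ties at
random, but it gets the job done: values below the median become 0, values
above it become 1, and values equal to the median are set to 0, in order
of appearance, until half of the values are 0.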

df <- data.frame(snr=c(1,2,3,4,5,6,7,8,9,10),
                 k1=c(1,1,4,2,3,2,2,5,2,2),
                 k2=c(1,2,3,2,1,2,1,3,3,2),
                 result=c(4,3,5,4,2,6,4,4,2,3))

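fun <- function(x){
  n <- length(x)
  m <- median(x)
  y <- rep(NA, n)
  y[x < m] <- 0   # strictly below the median
  y[x > m] <- 1   # strictly above the median
  # ties: the first ones (in data order) become 0 until half the values are 0
  w <- which(x == m)
  y[w[seq_len(floor(n/2) - sum(x < m))]] <- 0
  y[is.na(y)] <- 1   # the remaining ties become 1
  y
}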

fun(df$k1)
fun(df$k2)
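
If the ties really should be split at random, as in the original question,
one possible variant is sketched below. The name fun_random is made up for
this example, and the sketch assumes that x has no missing values.

```r
df <- data.frame(snr=c(1,2,3,4,5,6,7,8,9,10),
                 k1=c(1,1,4,2,3,2,2,5,2,2),
                 k2=c(1,2,3,2,1,2,1,3,3,2),
                 result=c(4,3,5,4,2,6,4,4,2,3))

# Sketch only: fun_random is a hypothetical name, not an existing function
fun_random <- function(x){
  n <- length(x)
  m <- median(x)
  y <- rep(NA_real_, n)
  y[x < m] <- 0
  y[x > m] <- 1
  w <- which(x == m)
  # number of ties that must become 0 to reach floor(n/2) zeros
  k <- floor(n/2) - sum(x < m)
  # draw positions within w; sample.int() avoids the special behaviour of
  # sample() when w has length one
  y[w[sample.int(length(w), k)]] <- 0
  y[is.na(y)] <- 1   # the remaining ties become 1
  y
}

set.seed(1)   # only to make the random split reproducible
df.rec <- df
df.rec$k1 <- fun_random(df$k1)
df.rec$k2 <- fun_random(df$k2)
table(df.rec$k1)
table(df.rec$k2)
```

For both k1 and k2 this gives five 0s and five 1s; which of the values
equal to the median end up as 0 depends on the seed.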

Hope this helps,

On 07-05-2013 17:20, D. Alain wrote:
> Dear R-List,
>
> I would like to recode categorical variables into binary data, so that all values above the median are coded 1 and all values below it 0, separating each var into two equally large groups (e.g. good performers = 0 vs. bad performers = 1).
>
> I have not succeeded so far in finding a nice solution to do that in R. I thought there might be a better way than ordering each column and recoding the first 50% into 0 and the second into 1. If I use ifelse I have a problem with cases that share the same rank being all median.
>
> e.g.
> df<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(1,1,4,2,3,2,2,5,2,2),k2=c(1,2,3,2,1,2,1,3,3,2),result=c(4,3,5,4,2,6,4,4,2,3)))
>
> Now I want to recode k1 and k2 so that I have half of the values recoded 0 and half recoded 1, split around the median point. The median of k1 is 2, which would lead to unequal group sizes if 2 were used as the cutoff, so all values k1=2 should be recoded 1 or 0 randomly until both categories have the same length.
>
> something like
>
> df.rec<-as.data.frame(cbind(snr=c(1,2,3,4,5,6,7,8,9,10),k1=c(0,0,1,0,1,1,0,1,0,1),k2=c(0,1,1,0,0,1,0,1,1,0),result=c(4,3,5,4,2,6,4,4,2,3)))
>
> Can anyone help?
>
>
> Best wishes.
> Alain
>
>
>
