[R] Delete rows from dataset

Rui Barradas ruipbarradas at sapo.pt
Wed Jun 6 23:37:54 CEST 2012


Hello,

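Try the following. The function keeps a group as is when exactly one row has score 1, drops the group when none does, and otherwise keeps one randomly chosen score-1 row and removes the others.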

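fun <- function(x){
	one <- which(x$score == 1)  # rows with score equal to 1
	if(length(one) == 0)
		NULL  # sum of score is 0: drop the whole group
	else if(length(one) == 1)
		x     # exactly one such row: keep the group as is
	else
		x[-one[-sample(seq_along(one), 1)], ] # remove all score-1 rows but a randomly sampled one
}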

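dat <- data  # assumption: 'dat' is the matrix called 'data' in the original post
res <- lapply(split(data.frame(dat), dat[, "group"]), fun)
res
do.call(rbind, res)  # NULL entries (dropped groups) are ignored by rbind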
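
To check the idea on the data from the original post, here is a minimal self-contained sketch. The names dat, keep_one and out are my own (assumptions, not taken from the question), and set.seed() is there only so that the random choice can be repeated. It returns a zero-row data.frame for groups without a score of 1, rather than NULL, so every list element has the same type.

```r
# Data from the original post, built directly as a data.frame
dat <- data.frame(
  group  = c(1,1,1,2,2,3,3,3,4,4,4,4,4),
  member = c(1,2,3,1,2,1,2,3,1,2,3,4,5),
  score  = c(0,1,0,0,0,1,0,1,0,1,1,1,0)
)

# keep_one(): hypothetical helper, same logic as fun() above
keep_one <- function(x) {
  one <- which(x$score == 1)                 # rows with score equal to 1
  if (length(one) == 0) return(x[0, ])       # no such row: drop the group
  if (length(one) == 1) return(x)            # exactly one: keep the group as is
  keep <- one[sample.int(length(one), 1)]    # the single score-1 row to keep
  x[-setdiff(one, keep), ]                   # remove the other score-1 rows
}

set.seed(1)  # only for reproducibility of the random choice
out <- do.call(rbind, lapply(split(dat, dat$group), keep_one))
rownames(out) <- NULL

# Each remaining group should now have a score sum of exactly 1
tapply(out$score, out$group, sum)
```

Which score-1 rows survive in groups 3 and 4 depends on the seed, but the number of rows left and the per-group sums do not.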


Hope this helps,

Rui Barradas

ck wrote
> 
> Dear R users, 
> 
> I am working on a big dataset and have got a problem with data cleaning.
> My data set looks like this: 
> 
> data <- cbind (group = c(1,1,1,2,2,3,3,3,4,4,4,4,4), member =
> c(1,2,3,1,2,1,2,3,1,2,3,4,5), score = c(0,1,0,0,0,1,0,1,0,1,1,1,0)) 
> 
> I just want to keep the group in which the sum of score is equal to 1 and
> remove the whole group in which the sum of score is equal to 0. For the
> group in which the sum of the score is greater than 1, e.g., sum of score
> = 3, I want to randomly select two group members with score equal to 1 and
> remove them from the group. Then the data may look like this: 
> 
> newdata <- cbind (group = c(1,1,1,3,3,4,4,4), member = c(1,2,3,2,3,1,3,5),
> score = c(0,1,0,0,1,0,1,0)) 
> 
> Can anybody help me get this done? Thank you in advance. 
> 
> ck
> 


--
View this message in context: http://r.789695.n4.nabble.com/Delete-rows-from-dataset-tp4632427p4632606.html
Sent from the R help mailing list archive at Nabble.com.


