[R] Remove records from a large dataframe

Rui Barradas ruipbarradas at sapo.pt
Mon Oct 22 14:07:44 CEST 2012


Hello,

If I understand correctly, you can drop the loop and use a logical index instead:

# TRUE for every row whose id is NOT listed in bad$id
idx <- !(dat$id %in% bad$id)
# keep only those rows
dat[idx, ]
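
For reference, here is a self-contained sketch of the same idea run on the example data from the question. The names keep and dat.2 are only illustrative (dat.2 mirrors the name used in the original loop), and the counts in the comments follow from that example data.

```r
# Example data from the original question
dat <- data.frame(
  id   = c(1,1,1,1, 2,2,2,2,2, 3,3,3, 4,4),
  year = c(1,1,1,2, 2,2,3,2,2, 2,3,4, 8,8),
  age  = c("Adult",NA,NA,NA, "Adult",NA,NA,NA, "Adult",
           NA,"Adult",NA, NA,"Adult")
)

# IDs to remove
bad <- data.frame(id = c(1, 4))

# One vectorised comparison over the whole id column, no loop or rbind
keep  <- !(dat$id %in% bad$id)
dat.2 <- dat[keep, ]

unique(dat.2$id)  # 2 3
nrow(dat.2)       # 8
```

The loop in the question calls rbind once per ID, which copies the growing data frame each time. The logical index subsets the data frame once, which is where the speed-up should come from on a large data set.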


Also, you are overcomplicating the creation of bad; this would do:

bad <- data.frame(id = c(1,4))
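
As a quick check, the two ways of building bad should give the same data frame. This is only a sketch; bad.long and bad.short are illustrative names, not from the original post.

```r
# Two-step version from the question
bad.long <- data.frame(c(1, 4))
names(bad.long) <- "id"

# One-step version: name the column when creating the data frame
bad.short <- data.frame(id = c(1, 4))

# Compare the two results
same <- isTRUE(all.equal(bad.long, bad.short))
same  # TRUE
```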

Hope this helps,

Rui Barradas
On 22-10-2012 12:04, penguins wrote:
> Hi, I am trying to remove a series of records from a large dataframe. The
> script I have written works fine but takes a long time to run. Can anyone
> suggest a quicker way to do this?
>
> Here is an example of the code I've written. The end result of this bit of
> code would be a dataframe with any records relating to ID 1 or ID 4 removed:
>
> #dataframe
> id <-     c(1,1,1,1,2,2,2,2,2, 3,3,3, 4,4)
> year <- c(1,1,1,2, 2,2,3,2,2, 2,3,4, 8,8)
> age <-  c("Adult",NA,NA,NA, "Adult",NA,NA,NA, "Adult",
>           NA,"Adult",NA, NA,"Adult")
> dat <- data.frame(id, year, age)
> dat.id<-unique(dat$id)
>
> #ID numbers for removal
> bad<- data.frame(c(1,4))
> names(bad)<-"id"
> remove.value<-bad$id
>
>
> good.id<- dat.id[!dat.id%in%remove.value]
>
> #Combine all good ID numbers
> if(exists("dat.2")){ rm(dat.2)}
>
> for(i in good.id){
>      lala<-dat[which(dat$id==i),]
>
>       if(!exists("dat.2")) {
>        dat.2 <- lala } else {
>        dat.2 <- rbind(dat.2, lala)
>        }
> }
>
> Many thanks in advance for any suggestions
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Remove-records-from-a-large-dataframe-tp4646990.html
> Sent from the R help mailing list archive at Nabble.com.
>



