[R] Data Manipulation using R

Stephen Tucker brown_emu at yahoo.com
Wed Apr 18 19:50:29 CEST 2007


...is this what you're looking for?

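donedat <- subset(data, ID < 6000 | ID >= 7000)
# row indices with a negative value in column 3, 4, 6 or 8
neg <- unique(rapply(donedat[c(3, 4, 6, 8)], function(x) which(x < 0)))
# guard the empty case: donedat[-integer(0), ] would return zero rows
findat <- if (length(neg)) donedat[-neg, , drop = FALSE] else donedat

The first line keeps every row whose ID is outside 6000-6999 in a single
subset() call, so the rbind() is not needed. The rapply() call looks through
columns 3, 4, 6 and 8 and finds the row indices of negative values, returning
all of them as one vector; unique() removes duplicated indices, and negative
indexing then drops those rows from donedat. The length() check matters
because indexing with an empty negative vector selects no rows at all.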
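
For comparison, here is a rough sketch of the same two steps written with a
logical mask instead of negative indices. It is not from the original exchange:
the toy data frame, its column names (c2 ... c8) and the name check_cols are
made up for illustration, and treating NA as "not negative" is an assumption,
since the question does not say how missing values should be handled.

```r
# Toy data standing in for the real 220,000 x 17 table (all values invented).
data <- data.frame(
  ID = c(1000, 5999, 6000, 6999, 7000, 9000),
  c2 = c(-1, 2, 3, 4, 5, 6),   # column 2 is not checked, so -1 is allowed
  c3 = c(1, 2, 3, 4, -4, 6),   # -4 should remove the ID 7000 row
  c4 = c(1, 2, 3, 4, 5, 6),
  c5 = c(1, 2, 3, 4, 5, 6),
  c6 = c(1, 2, 3, 4, 5, 6),
  c7 = c(1, 2, 3, 4, 5, 6),
  c8 = c(1, NA, 3, 4, 5, 6)    # NA is kept under the assumption above
)

check_cols <- c(3, 4, 6, 8)    # columns named in the question

# Step 1: drop IDs 6000-6999.
donedat <- data[data$ID < 6000 | data$ID >= 7000, , drop = FALSE]

# Step 2: TRUE for rows with at least one negative value in the checked columns.
# Comparing a data frame with 0 gives a logical matrix; rowSums() counts the TRUEs.
has_neg <- rowSums(donedat[check_cols] < 0, na.rm = TRUE) > 0

findat <- donedat[!has_neg, , drop = FALSE]
print(findat$ID)
```

A logical mask has no empty-index corner case: if no row has a negative value,
!has_neg is all TRUE and every row is kept.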

--- Anup Nandialath <anup_nandialath at yahoo.com> wrote:

> Dear Friends,
> 
> I have a data set with around 220,000 rows and 17 columns. One of the columns
> is an id variable which is grouped from 1000 through 9000. I need to
> perform the following operations. 
> 
> 1) Remove all the observations with IDs between 6000 and 6999
> 
> I tried using this method. 
> 
> remdat1 <- subset(data, ID<6000)
> remdat2 <- subset(data, ID>=7000)
> donedat <- rbind(remdat1, remdat2)
> 
> I checked the first and last entries and found that they did not have ID
> values in the 6000s. Therefore I think that this might be correct, but is
> this the most efficient way of doing this?
> 
> 2) I need to remove observations within columns 3, 4, 6 and 8 when they are
> negative. For instance if the number in column 3 is -4, then I need to
> delete the entire observation. Can somebody help me with this too?
> 
> Thanks and regards
> 
> Anup


