[R] subset data.frame with value != in all columns

Gabor Grothendieck ggrothendieck at myway.com
Thu Feb 3 21:46:02 CET 2005


Do the -99 entries really mean NA?  In that case, I think it
would be clearer to recode your data frame with NAs and then select
out the complete or incomplete rows:

x[x == -99] <- NA

x[complete.cases(x),]   # or na.omit(x)
x[!complete.cases(x),]
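
As a rough, self-contained sketch of the above, here is the recoding approach run on the dummy data from the original question quoted below. The names x_na, with_missing, without_missing and has99 are made up for this illustration, and the recoding assumes that -99 always stands for a missing value. The last line is an alternative that skips the recoding and counts the -99 entries per row with rowSums.

```r
# Dummy data copied from the original question quoted below
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# Recode the sentinel value as NA (assumption: -99 always means "missing")
x_na <- x
x_na[x_na == -99] <- NA

# Rows that had at least one -99, and rows that had none;
# row names are carried along by the row subsetting
with_missing    <- x_na[!complete.cases(x_na), ]
without_missing <- x_na[complete.cases(x_na), ]

# Alternative without recoding: TRUE for rows containing -99
has99 <- rowSums(x == -99) > 0
```

x[has99, ] and x[!has99, ] then give the two data frames with the original -99 values left in place.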


Tim Howard <tghoward <at> gw.dec.state.ny.us> writes:

: 
: apply, of course, does the trick exceptionally well. Thank you,
: everyone, for the help.
: 
: tim
: 
: >>> Chuck Cleland <ccleland <at> optonline.net> 02/03/05 03:10PM >>>
: How about this?
: 
: #extract data.frame of rows with -99 in them
: 
: subset(x, apply(x, 1, function(x){any(x == -99)}))
: 
: #extract data.frame of rows not containing -99 in them
: 
: subset(x, apply(x, 1, function(x){all(x != -99)}))
: 
: hope this helps,
: 
: Chuck Cleland
: 
: Tim Howard wrote:
: > I am trying to extract rows from a data.frame based on the
: > presence/absence of a single value in any column.  I've figured out how
: > to get the positive matches, but the remainder (rows without this
: > value) eludes me.  Mining the help pages and archives brought me,
: > frustratingly, very close, as you'll see below.
: > 
: > My goal: two data frames, one with -99 in at least one column in each
: > row, one with no occurrences of -99. I want to preserve rownames in
: > each.
: > 
: > My questions: 
: > Is there a cleaner way to extract all rows containing a specified
: > value?
: > How can I extract all rows that don't have this value in any col?
: > 
: > #create dummy dataset
: > x <- data.frame(
: > c1=c(-99,-99,-99,4:10),
: > c2=1:10,
: > c3=c(1:3,-99,5:10),
: > c4=c(10:1),
: > c5=c(1:9,-99))
: > 
: > #extract data.frame of rows with -99 in them
: > for(i in 1:ncol(x))
: > {
: > y<-subset(x, x[,i]==-99, drop=FALSE);
: > ifelse(i==1, z<-y, z <- rbind(z,y));
: > }
: > 
: > #various attempts to get rows not containing "-99":
: > 
: > # this attempt was to create, in "list", the exclusion formula for each
: > # column.
: > # Here, I couldn't get subset to recognize "list" as the correct type.
: > # e.g. it works if I paste the value of list in the subset command
: > {
: > for(i in 1:ncol(x)){
: > if(i==1)
: > list<-paste("x[",i,"]!=-99", sep="")
: > else
: > list<-paste(list," ", " & x[",i,"]!=-99", sep="")
: > }
: > y<-subset(x, list, drop=FALSE);
: > }
: > 
: > # this will do it for one col, but if I index more
: > # it returns all rows
: > y <- x[!(x[,3] %in% -99),]
: > 
: > # this also works for one col
: > y<-x[x[,1]!=-99,]
: > 
: > # but if I index more, I get extra rows of NAs
: > y<-x[x[,1:5]!=-99,]
: > 
: > Thanks in advance.
: > Tim Howard
: > 
: > platform i386-pc-mingw32
: > arch     i386           
: > os       mingw32        
: > system   i386, mingw32  
: > status                  
: > major    2              
: > minor    0.1            
: > year     2004           
: > month    11             
: > day      15             
: > language R
: > 
:



