[R] subset data.frame with value != in all columns

Tim Howard tghoward at gw.dec.state.ny.us
Thu Feb 3 20:57:58 CET 2005


I am trying to extract rows from a data.frame based on the
presence/absence of a single value in any column.  I've figured out how
to get the positive matches, but the remainder (rows without this
value) eludes me.  Mining the help pages and archives brought me,
frustratingly, very close, as you'll see below.

My goal: two data frames, one with -99 in at least one column in each
row, one with no occurrences of -99. I want to preserve rownames in
each.

My questions: 
Is there a cleaner way to extract all rows containing a specified
value?
How can I extract all rows that don't have this value in any col?

#create dummy dataset
x <- data.frame(
c1=c(-99,-99,-99,4:10),
c2=1:10,
c3=c(1:3,-99,5:10),
c4=c(10:1),
c5=c(1:9,-99))

#extract data.frame of rows with -99 in them
for(i in 1:ncol(x))
{
y<-subset(x, x[,i]==-99, drop=FALSE);
ifelse(i==1, z<-y, z <- rbind(z,y));
}
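
One possibly cleaner route to the same rows, offered only as a sketch: compare the whole data frame to -99 at once and count the matches in each row. The names `hit`, `with99` and `without99` are made up for illustration, and the sketch assumes the data contain no NAs (otherwise `rowSums` would need `na.rm = TRUE`).

```r
# same dummy dataset as above
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

# x == -99 gives a logical matrix; rowSums counts the TRUEs in each row
hit <- rowSums(x == -99) > 0

with99    <- x[hit, , drop = FALSE]   # rows with at least one -99
without99 <- x[!hit, , drop = FALSE]  # rows with no -99 in any column
```

Unlike the rbind loop, this should keep the rows in their original order and list a row only once even if -99 shows up in several of its columns. Rownames carry over from x.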

#various attempts to get rows not containing "-99":

# this attempt was to create, in "list", the exclusion formula for each
# column.
# Here, I couldn't get subset to recognize "list" as the correct type.
# e.g. it works if I paste the value of list in the subset command
{
for(i in 1:ncol(x)){
if(i==1)
list<-paste("x[",i,"]!=-99", sep="")
else
list<-paste(list," ", " & x[",i,"]!=-99", sep="")
}
y<-subset(x, list, drop=FALSE);
}

# this will do it for one col, but if I index more
# it returns all rows
y <- x[!(x[,3] %in% -99),]

# this also works for one col
y<-x[x[,1]!=-99,]

# but if I index more, I get extra rows of NAs
y<-x[x[,1:5]!=-99,]
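
The extra NA rows in that last attempt probably come from the shape of the index: `x[,1:5]!=-99` is a 10 by 5 logical matrix, and when it is used as a row index it seems to be treated as one long logical vector of 50 values, so the positions past row 10 have nothing to select. A sketch of one way around it is to collapse the matrix to a single TRUE/FALSE per row first. The names `m`, `keep` and `clean` are hypothetical.

```r
# same dummy dataset as above
x <- data.frame(
  c1 = c(-99, -99, -99, 4:10),
  c2 = 1:10,
  c3 = c(1:3, -99, 5:10),
  c4 = 10:1,
  c5 = c(1:9, -99))

m <- x[, 1:5] != -99       # logical matrix, one cell per value in x
dim(m)                     # 10 rows by 5 columns, i.e. 50 values

# collapse to one value per row: TRUE only if every column passes
keep <- apply(m, 1, all)

clean <- x[keep, , drop = FALSE]   # rows with no -99 anywhere
```

Negating the index, `x[!keep, , drop = FALSE]`, should then give the other data frame, the rows holding at least one -99.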

Thanks in advance.
Tim Howard

platform i386-pc-mingw32
arch     i386           
os       mingw32        
system   i386, mingw32  
status                  
major    2              
minor    0.1            
year     2004           
month    11             
day      15             
language R



