[R] subset using noncontiguous variables by name (not index)

Muenchen, Robert A (Bob) muenchen at utk.edu
Sun Aug 26 22:37:53 CEST 2007


Hi All,

I'm using the subset function to select a list of variables, some of
which are contiguous in the data frame, and others of which are not. It
works fine when I use the form:

subset(mydata,select=c(x1,x3:x5,x7) )

In reality, my list is far more complex, so I would like to store it in
a variable and substitute that for c(x1,x3:x5,x7), but I cannot get it
to work. That use of the c function seems to violate R's usual rules,
since x3:x5 is applied to variable names rather than to numbers, so I'm
not sure how it works at all. A small simulation of the problem is below.

If the variable names & orders were really this simple, I could use
indices like 

summary( mydata[ ,c(1,3:5,7) ] ) 

but alas, they are not. 

How does the c function work this way in the first place, and how can I
make this substitution?

Thanks,
Bob

mydata <- data.frame(
  x1=c(1,2,3,4,5),
  x2=c(1,2,3,4,5),
  x3=c(1,2,3,4,5),
  x4=c(1,2,3,4,5),
  x5=c(1,2,3,4,5),
  x6=c(1,2,3,4,5),
  x7=c(1,2,3,4,5)
)
mydata

# This does what I want.
summary( 
  subset(mydata,select=c(x1,x3:x5,x7) ) 
)
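
A possible explanation of why the select= form above works, offered as a sketch rather than a definitive account: for data frames, subset() does not evaluate select in the ordinary way. It builds a list that maps each column name to its position and evaluates the unevaluated expression in that list, so x1 becomes 1 and x3:x5 becomes 3:5. The names dat, nl, and idx below are illustrative only and are not part of the original script.

```r
# Illustrative data frame with the same column names as mydata.
dat <- data.frame(x1 = 1:5, x2 = 1:5, x3 = 1:5, x4 = 1:5,
                  x5 = 1:5, x6 = 1:5, x7 = 1:5)

# Map each column name to its position: x1 -> 1, x2 -> 2, ...
nl <- as.list(seq_along(dat))
names(nl) <- names(dat)

# Evaluate the *unevaluated* expression with names looked up in nl,
# so x3:x5 is really 3:5 at this point.
idx <- eval(quote(c(x1, x3:x5, x7)), nl)
idx                 # 1 3 4 5 7

# The positions pick out the same columns as the subset() call.
names(dat)[idx]
names(subset(dat, select = c(x1, x3:x5, x7)))
```

Seen this way, c() is not doing anything unusual; it only ever receives column positions. The range x3:x5 therefore depends on the order of the columns in the data frame at the time of the call.
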

# Can I substitute myVars?
attach(mydata)
myVars1 <- c(x1,x3:x5,x7)

# Not looking good: with mydata attached, this holds data values, not names.
myVars1

# This doesn't do the right thing.
summary( 
  subset(mydata,select=myVars1 ) 
)

# Total desperation on this attempt:
myVars2 <- "x1,x3:x5,x7"
myVars2

# This doesn't work either (error: undefined columns selected).
summary( 
  subset(mydata,select=myVars2 )
)
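
One way the substitution might be done, sketched on the assumption that subset() evaluates select against column positions: either store the column names as a character vector, or store the expression unevaluated with quote() and splice it into the call with bquote(). The names dat, myVarsChr, myVarsExpr, res1, and res2 are illustrative and not part of the original script.

```r
# Illustrative data frame with the same column names as mydata.
dat <- data.frame(x1 = 1:5, x2 = 1:5, x3 = 1:5, x4 = 1:5,
                  x5 = 1:5, x6 = 1:5, x7 = 1:5)

# Option 1: a character vector of names works with plain indexing.
myVarsChr <- c("x1", "x3", "x4", "x5", "x7")
res1 <- dat[, myVarsChr]

# Option 2: keep the x3:x5 range notation by storing the expression
# unevaluated, then splicing it into the subset() call before evaluating.
myVarsExpr <- quote(c(x1, x3:x5, x7))
res2 <- eval(bquote(subset(dat, select = .(myVarsExpr))))

names(res1)
names(res2)
```

Option 1 gives up the range notation but does not depend on column order. Option 2 keeps the notation, so x3:x5 means whatever columns lie between x3 and x5 when the call is evaluated.
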



=========================================================
Bob Muenchen (pronounced Min'-chen), Manager 
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230 
FAX: (865) 974-4810
Email: muenchen at utk.edu
Web: http://oit.utk.edu/scc
News: http://listserv.utk.edu/archives/statnews.html


