[R] Selecting A List of Columns

Sparks, John James jspark4 at uic.edu
Fri May 17 08:51:56 CEST 2013


Dear R Helpers,

I need help with a slightly unusual situation in which I am trying to
select some columns from a data frame.  I know how to use the subset()
function with a vector of column names, as in:


x<-as.data.frame(matrix(c(1,2,3,
        1,2,3,
        1,2,2,
        1,2,2,
        1,1,1),ncol=3,byrow=TRUE))

all.cols<-colnames(x)
to.keep<-all.cols[1:2]

Kept<-subset(x,select=to.keep)
Kept

However, if I want to select some columns based on a selection of the most
important variables from a random forest then I find myself stuck.  The
example below demonstrates the problem.


library(randomForest)

data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars,importance=TRUE)
Importance<-data.frame(mtcars.rf$importance)
Importance



MSEImportance<-head(Importance[order(Importance$X.IncMSE,decreasing=TRUE),],3)
MSEVars<-row.names(MSEImportance)
MSEVars<-data.frame(MSEVars,stringsAsFactors = FALSE)
colnames(MSEVars)<-"Vars"

NodeImportance<-head(Importance[order(Importance$IncNodePurity,decreasing=TRUE),],3)
NodeVars<-row.names(NodeImportance)
NodeVars<-data.frame(NodeVars,stringsAsFactors = FALSE)
colnames(NodeVars)<-"Vars"


ImportantVars<-rbind(MSEVars,NodeVars)
ImportantVars<-unique(ImportantVars)
nrow(ImportantVars)
ImportantVars<-as.character(ImportantVars)
ImportantVars
CarsVarsKept<-subset(mtcars,select=ImportantVars)
Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected

Any help on how to select these columns from the data frame would be most
appreciated.

--John J. Sparks, Ph.D.
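
One possible cause, offered as a sketch and not a confirmed diagnosis: at
the point where as.character() is called, ImportantVars is still a
one-column data frame, and as.character() on a data frame deparses the
whole column into a single string instead of returning one element per
name.  The example below uses three variable names that are assumed purely
for illustration (they are not actual output from the random forest above)
and needs only the built-in mtcars data.

```r
# Illustrative stand-in for the one-column data frame built in the post.
# The three names are assumed for this example, not real importance output.
ImportantVars <- data.frame(Vars = c("wt", "disp", "hp"),
                            stringsAsFactors = FALSE)

# as.character() on a data frame deparses each column into one string,
# so the result has length 1 and does not match any column name.
bad <- as.character(ImportantVars)
length(bad)

# Extracting the column itself gives a plain character vector of names.
good <- ImportantVars$Vars
length(good)

# subset() accepts that vector in select=; mtcars[, good] would also work.
CarsVarsKept <- subset(mtcars, select = good)
names(CarsVarsKept)
```

If this is indeed what is happening, replacing
as.character(ImportantVars) with ImportantVars$Vars in the code above
should be enough; building the names directly with
unique(c(row.names(MSEImportance), row.names(NodeImportance))) would avoid
the intermediate data frames altogether.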


