[R] grep on vectors?

Chuck White chuckwhite8 at charter.net
Tue Jun 30 18:53:24 CEST 2009


Input: a data frame with 300+ columns for a regression. It consists of sets of factors whose names share the same structure. For example, aa1, aa2, aa3 could be one set of factors.

After reading in the data frame, I would like to compute the density (% of nonzero values) for certain groups of factors and delete the factors that fall below a density threshold. I would like to use regular expressions to specify the factor names.
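A minimal sketch of the density step described above, assuming the columns are numeric and that "density" means the fraction of nonzero entries per column. The toy data frame, the column names, and the 0.5 threshold are hypothetical and only serve the illustration:

```r
# Hypothetical toy data standing in for the 300+ column data frame
data.df <- data.frame(
  aaa1 = c(0, 0, 0, 1),   # 25% nonzero
  aaa2 = c(1, 2, 0, 3),   # 75% nonzero
  bbb1 = c(0, 0, 0, 0),   # 0% nonzero
  y    = c(5, 6, 7, 8)    # not in any pattern group, so always kept
)
threshold <- 0.5  # assumed cutoff, as a fraction of nonzero entries

# Logical flag per column: does the name belong to a group of interest?
in.group <- grepl("^aaa|^bbb", names(data.df))

# Fraction of nonzero entries in every column
col.density <- colMeans(data.df != 0)

# Drop only the grouped columns whose density falls below the threshold
to.drop <- in.group & col.density < threshold
data.df <- data.df[, !to.drop, drop = FALSE]
```

With this toy input only aaa2 and y should remain. Logical indexing is used here instead of negative integer indices so that nothing unexpected happens when no column matches.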

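density.factor = c("^aaa","^bbb")
density.faccol=c()
for(fac in density.factor){
    # collect the indices of the columns whose names match this pattern
    density.faccol=c(density.faccol,grep(fac,names(data.df)))
}
# guard against an empty match: data.df[,-integer(0)] would select no columns at all
if(length(density.faccol) > 0) data.df=data.df[,-density.faccol,drop=FALSE]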

Is there a way to avoid the for loop? The following seems to work:
  lapply(density.factor,grep,names(data.df))
However, that produces a list of index vectors (one per pattern), which need to be merged. In the above example there are 2 regular expressions and therefore two list elements, but in the general case there will be many more.

Questions: (i) how do I merge the list elements into a single vector, and (ii) is there a better way to achieve the "vectorized" grep?
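Both questions can be sketched in a few lines of base R. This is one possible approach, not necessarily the best one; the column names below are hypothetical stand-ins for names(data.df):

```r
# Hypothetical column names standing in for names(data.df)
col.names <- c("aaa1", "aaa2", "bbb1", "ccc1", "y")
density.factor <- c("^aaa", "^bbb")

# (i) flatten the per-pattern results into a single index vector;
#     unique() removes columns that match more than one pattern
idx.unlist <- unique(unlist(lapply(density.factor, grep, col.names)))

# (ii) join the patterns with "|" so that grep runs only once
idx.single <- grep(paste(density.factor, collapse = "|"), col.names)
```

The two results contain the same indices, although unlist() returns them in pattern order rather than column order, so sort() may be needed if the order matters. The single-pattern variant assumes that joining the expressions with "|" does not change their meaning, which holds for simple anchored prefixes like these.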

Thanks.



