[R] Variable substitution in grep pattern

Alberto Fornasier gimli at email.it
Thu Jan 29 20:07:05 CET 2004


Hi everybody.
I'm working with a data frame containing many character vectors, in which
each observation is made of one or more unique values.
Example:

> Licenza[56:58]
[1] BSD License, GNU Library or Lesser General Public License (LGPL)
[2] Qt Public License (QPL)
[3] GNU General Public License (GPL)
66 Levels:  ... Zope Public License

As you can see, each observation can have one or more licenses associated
with it.
I want to build a vector with the number of times every element (e.g.
"BSD License") occurs in the vector, by itself or in association with
others (i.e. I want to count the elements containing "BSD License" as
well as those containing "BSD License, GNU Library or Lesser General
Public License (LGPL)", and so on).

I've tried to use a "for" loop as follows:

> for(i in Licenza.elenco) {
+ Licenza.elenco.prova[Licenza.elenco==i] <-
+   length(grep(".*i.*",as.character(Licenza)))}

Here Licenza.elenco is a character vector containing all the unique
values I need to match (e.g. BSD License, Qt Public License (QPL), GNU
General Public License (GPL)).
However, R substitutes the variable as I expect only in the first place
(the index). Inside the quoted pattern the "i" is taken literally, so grep
matches all strings containing the letter "i", that is 100% of the vector,
except NAs of course.
After running the above code I get:

> Licenza.elenco.prova
[1] 2235 2235 2235

I've tried escaping the variable name and enclosing it in brackets, but
nothing works as I want.
I'm sure I'm doing something wrong, but what?
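
For reference, a minimal sketch of one way to get the value of the loop
variable into the pattern instead of the literal letter "i": pass the
variable itself to grep rather than a quoted string that mentions it. The
sample data below is hypothetical and only stands in for the real Licenza
factor. The use of fixed = TRUE is an assumption about what is wanted here:
it makes grep treat each value as a literal string, so the parentheses in
names like "Qt Public License (QPL)" are not read as regex grouping.

```r
# Hypothetical sample data standing in for the real Licenza factor
Licenza <- factor(c(
  "BSD License, GNU Library or Lesser General Public License (LGPL)",
  "Qt Public License (QPL)",
  "GNU General Public License (GPL)",
  "BSD License",
  NA
))

# Unique values to look for, as in Licenza.elenco from the question
Licenza.elenco <- c("BSD License",
                    "Qt Public License (QPL)",
                    "GNU General Public License (GPL)")

# For each value, count the observations that contain it.
# The variable lic is passed as the pattern, so its value is used;
# fixed = TRUE matches it as a plain substring, not as a regex.
Licenza.elenco.prova <- sapply(Licenza.elenco, function(lic) {
  length(grep(lic, as.character(Licenza), fixed = TRUE))
})

print(Licenza.elenco.prova)
```

With this sample data, "BSD License" is counted twice (once alone and once
in combination), and the NA observation is never matched. If a regular
expression is really needed, the pattern could instead be built with
paste(".*", i, ".*", sep = ""), but then any special characters in the
license names would have to be escaped first.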

Thanks in advance

Alberto Fornasier



