[R] sub question

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sun Feb 1 00:01:45 CET 2009

Gabor Grothendieck wrote:
> On Sat, Jan 31, 2009 at 4:46 PM, Wacek Kusnierczyk
> <Waclaw.Marcin.Kusnierczyk at idi.ntnu.no> wrote:
>> David Hajage wrote:
>>> Thank you, it's perfect.
>> to extend the context, if you were to solve the problem in perl, the
>> regex below would work in perl 5.10, but not in earlier versions of
>> perl;  another approach is to replace the unwanted leading characters
>> with equally many replacement characters at once.
>> $string = 'aabaab';
>> # perl 5.10
>> $string =~ s/a|(*COMMIT)(*FAIL)/c/g;
>> # $string is 'ccbaab'
>> # any recent perl
>> $string =~ s/^a*/'c' x length $&/e;
>> # $string is 'ccbaab'
>> i don't know how (if) the latter could be done in r.
> This seems quite analogous:
> library(gsubfn)
> s <- "aabaab"
> gsubfn("^a*", ~ paste(rep("c", nchar(x)), collapse = ""), s)[[1]]

indeed, as does the following variant:

gsubfn("^a*", ~ gsub(".", "c", x), s)[[1]]

with some additional boring pedantry wrt. ?gsubfn, which says:

" If 'replacement' is a formula instead of a function then a one
     line function is created whose body is the right hand side of the
     formula and whose arguments are the left hand side separated by
     '+' signs (or any other valid operator).  The environment of the
     function is the environment of the formula.  If the arguments are
     omitted then the free variables found on the right hand side are
     used in the order encountered.  "

to my little mind, all of 'paste', 'rep', 'nchar', and 'x' in the
example above are *free variables* on the right of the formula.  you
might want to specify what 'free variable' means here, so that it will
be clear why the following still work:

s <- "aabaab"

x <- 'fooo'
gsubfn("^a*", ~ paste(rep("c", nchar(x)), collapse = ""), s)[[1]]
# "ccbaab", not "ccccbaab"

gsubfn("^a*", ~ paste(paste, collapse = ""), s)[[1]]
# "aabaab" (the match is passed through unchanged), rather than a
# coercion error as in paste(paste)

gsubfn("^a*", ~ paste(version, collapse = ""), s)[[1]]
# "aabaab" (again the match itself), rather than rubbish containing the
# content of version

it seems that 'free variables' are those that do not appear in an
operator position.
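
a minimal sketch of that reading, in base r only.  it assumes, without having checked the gsubfn source, that the argument list is derived roughly the way all.vars() does it, i.e., from symbols that are not in function-call position.  the helper name f is made up for illustration.

```r
# symbols in call position (paste, rep, nchar) are skipped by all.vars()
all.vars(~ paste(rep("c", nchar(x)), collapse = ""))
# "x"
all.vars(~ paste(paste, collapse = ""))
# "paste"
all.vars(~ paste(version, collapse = ""))
# "version"

# hypothetical hand-written equivalent of the second formula: the argument
# named 'paste' holds the match, while the call paste(...) still finds the
# function, because r skips non-function bindings when looking up a call
f <- function(paste) paste(paste, collapse = "")
f("aa")
# "aa"
```

under this assumption a global x, paste, or version is never consulted: the symbol becomes an argument of the generated function and is bound to the matched text.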

