[R] Doubt about pattern

Peter J. Acklam pjacklam at online.no
Fri Jan 30 14:34:43 CET 2004


Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

>Peter J. Acklam wrote:
>
> > Marcelo Luiz de Laia wrote:
> >
> > > > files <- dir(pattern="*.sens")
> >
> > That's not even a valid regular expression in most applications,
> > but "dir" does allow it, for some reason.  Anyway, I think you
>
> It is a valid regex in GNU's regex code as used by R, and all the GNU
> and non-GNU applications I tried accepted it.  `*' matches itself when
> not used as a repetition qualifier.  (I tried several greps, including
> those claiming strict POSIX compliance.)
>
> So, can you please list the `most applications' you tried or give a
> reference for your assertion?

I should have said "many applications", not "most applications".
I use Solaris, and I don't know of a single Solaris application that
allows it, including egrep, oawk, and nawk.  And no version of Perl allows it.

GNU tools allow it, but in an inconsistent way.  GNU Emacs treats `*x' as
`\*x' (match a literal star and a literal x), but GNU grep (version 2.5)
treats `*x' as `.*x' (match anything up to and including the first x),
which is something quite different.

It's a mess, it's inconsistent, and it's not portable.  I suggest people
stop using "*..." and use "\*..." or ".*..." depending on what is wanted.
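
To make the two alternatives concrete, here is a minimal sketch in R.  The
file names are made up for illustration, and glob2rx() is assumed to be
available (it is part of the utils package in current versions of R, where
it converts a shell-style wildcard into an anchored regular expression).

```r
# Hypothetical file names, used only for illustration.
files <- c("run1.sens", "run2.sens", "sensors.txt", "a*.sens", "notes.sens.bak")

# Names ending in ".sens": escape the dot and anchor at the end of the name.
ends_in_sens <- grep("\\.sens$", files, value = TRUE)

# The same intent written as a shell-style glob, then converted to a regex.
glob_regex <- glob2rx("*.sens")
same_via_glob <- grep(glob_regex, files, value = TRUE)

# A literal star followed by ".sens": escape the star explicitly.
literal_star <- grep("\\*\\.sens", files, value = TRUE)
```

Note that the backslashes are doubled because they appear inside R string
literals.  With dir() the same patterns would go in the "pattern" argument,
for example dir(pattern = "\\.sens$").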

Peter

-- 
Peter J. Acklam - pjacklam at online.no - http://home.online.no/~pjacklam



