[Rd] gregexpr - match overlap mishandled (PR#13391)

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Fri Dec 19 10:32:13 CET 2008


Prof Brian Ripley wrote:
> Please do your own homework: the help page says
>
>      For 'gregexpr' a list of the same length as 'text' each element of
>      which is an integer vector as in 'regexpr', except that the
>      starting positions of every (disjoint) match are given.
>                                   ^^^^^^^^
>
> If that is still not clear enough for you, please ask your supervisor
> for remedial help.
>

i must say that, even knowing what is meant, i have trouble finding it
written in the docs.  someone who is looking for help because they do not
yet know the answer will most certainly get confused here.

the point is that gregexpr returns the starting positions of *mutually
non-overlapping*, or *pairwise disjoint*, matches (which still does not
tell the whole story, since it does not say which of the possible
non-overlapping sets is chosen).  the expression 'every (disjoint) match'
is nonsense, as it suggests that only those matches which are individually
disjoint (and what would this mean?) are reported, as if, of all possible
matches found, only those that passed some disjointness test in a
grep-like filter were left.
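
to make the distinction concrete, here is a small sketch.  the text and
pattern are made up for illustration (they are not taken from the
original report), and the lookahead variant is only one possible way of
listing overlapping occurrences, assuming perl-style regular expressions
are acceptable:

```r
# sketch only: the text and the pattern are invented for illustration
x <- "ababa"

# 'aba' occurs at positions 1 and 3, but the two occurrences overlap
# (they share the 'a' at position 3), so only the first is reported
m <- gregexpr("aba", x)
starts <- as.integer(m[[1]])              # 1
lens <- attr(m[[1]], "match.length")      # 2

# a zero-width lookahead consumes no characters, so successive matches
# cannot overlap, and every position where 'aba' begins is reported
la <- gregexpr("(?=aba)", x, perl = TRUE)
overlapping <- as.integer(la[[1]])        # 1 3
```

the occurrence at position 3 is a perfectly good match of the pattern; it
is left out only because it overlaps a match that was already taken.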

i'd humbly suggest that whenever users are referred to specific man
pages, those pages be reassessed and improved where necessary; this one
would be a good candidate.

vQ


