[Rd] [g]sub behaviour with NA (PR#6451)

maechler at stat.math.ethz.ch maechler at stat.math.ethz.ch
Thu Jan 22 17:51:55 MET 2004


Thank you for the report.

Yes, a slight inconsistency, but not a bug, really.

>>>>> "JonS" == JonS  <JonS at swintons.net>
>>>>>     on Wed, 21 Jan 2004 17:36:40 +0100 (CET) writes:


    JonS> Attempting to substitute a NA causes an
    JonS> error in sub.

    >> sub(x=NA,pattern="x",replacement="y")
    JonS> Error in sub(pattern, replacement, x,
    JonS> ignore.case, extended) : invalid argument

    >> sub(x=NA,pattern=NA,replacement="y")
    JonS> [1] NA

But the point is that not all NAs are the same:
a plain 'NA' is logical, so one could argue
it should give an error in either case.

First note the following:

> str(sub(x=NA,pattern=NA, replacement="y"))
 chr NA
> str(cNA <- as.character(NA))
 chr NA
> str(sub(x=cNA,pattern=cNA, replacement="y"))
 chr NA
> str(sub(x=cNA,pattern="x", replacement="y"))
 chr NA

So sub() works fine for character NAs.
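
To make the type distinction concrete, here is a small sketch using only
base R. The variable names (lNA, cNA, res) are mine, chosen for
illustration:

```r
# A bare NA is a logical constant; as.character() gives the character NA.
lNA <- NA
cNA <- as.character(NA)

typeof(lNA)   # "logical"
typeof(cNA)   # "character"

# Both are missing values, yet they are not identical because the types differ.
is.na(lNA) && is.na(cNA)
identical(lNA, cNA)

# With a character NA as 'x', sub() returns a character NA.
res <- sub(pattern = "x", replacement = "y", x = cNA)
```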

    JonS> The help page for sub says only For
    JonS> 'regexpr' it is an error for 'pattern' to be
    JonS> 'NA', otherwise 'NA' is permitted and
    JonS> matches only itself.  so that this behaviour
    JonS> is undocumented.

    JonS> I believe that sub(x=NA,pattern,replacement)
    JonS> should always be NA.

We could extend sub() / gsub() to coerce `x' to character,
in which case this would work,
or we could make them give an error, because a plain NA is logical,
not character.

The first option would be in line with the fact that logical NA
is already allowed for `pattern' in sub/gsub.
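
As a rough sketch of the two alternatives, written at user level rather
than as a change to sub() itself: sub_chr and sub_strict are hypothetical
helper names, not functions in R, and the error message text is only
illustrative.

```r
# Option 1 (hypothetical wrapper): coerce 'x' to character before calling sub().
sub_chr <- function(pattern, replacement, x, ...) {
  sub(pattern, replacement, as.character(x), ...)
}

# Option 2 (hypothetical wrapper): refuse anything that is not character.
sub_strict <- function(pattern, replacement, x, ...) {
  if (!is.character(x)) stop("'x' must be a character vector")
  sub(pattern, replacement, x, ...)
}

r1 <- sub_chr("x", "y", NA)            # logical NA is coerced to character NA
r2 <- sub_chr("x", "y", c("xa", NA))   # non-missing elements are still substituted

# Under option 2 a logical NA is rejected; capture the error for inspection.
r3 <- tryCatch(sub_strict("x", "y", NA), error = function(e) "error")
```

With option 1 the result for a missing `x' is always a character NA,
which is what the report asks for; option 2 makes the type mismatch
explicit instead.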

Opinions?


