[R] issue with nzchar() ?

R. Michael Weylandt michael.weylandt at gmail.com
Mon Aug 6 17:27:00 CEST 2012


On Mon, Aug 6, 2012 at 9:53 AM, Liviu Andronic <landronimirc at gmail.com> wrote:
> On Mon, Aug 6, 2012 at 4:48 PM, Liviu Andronic <landronimirc at gmail.com> wrote:
>> string, something that I find strange. At best NA is the equivalent of
>> an empty string.

Certainly not to my mind, unless you think that zero and NA should be
the same for integers and doubles as well. NA (in whatever form) is,
to my mind, _unknown_, which is very different from a known 0.
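
A small sketch of that distinction, using only base R (the vector x
here is made up for illustration, not taken from the original post):

```r
# Illustrative vector: one ordinary string, one missing value, one empty string.
x <- c("a", NA, "")

is.na(x)     # FALSE  TRUE FALSE : only the missing element is NA
x == ""      # FALSE    NA  TRUE : comparing NA with "" is itself unknown
nzchar(x)    #  TRUE  TRUE FALSE : by default nzchar() treats NA as non-empty

# The same holds for numbers: a missing integer is not a known zero.
NA_integer_ == 0L   # NA
```

So a test for "empty" and a test for "missing" answer different
questions, and code that wants to lump them together has to say so
explicitly.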

>> In this sense, if you Hmisc::describe() the vector
>> you get, as I would expect, that in the context of character vectors
>> NA and '' values are considered together:
>>
>
> By the way, same question holds for nchar(): Should NA values be
> reported as 2-char strings, or as 0-char empty/missing values?
>> x <- c(letters, NA, '')
>> nchar(x)
>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 0
>

I'm not sure why that's the case, but it's documented on the help page
(under value):

 For ‘nchar’, an integer vector giving the sizes of each element,
     currently always ‘2’ for missing values (for ‘NA’).

so I don't see any bug.

My guess is that it's this way for backward compatibility with a time when
there probably wasn't a proper NA_character_ (that's the parser
literal for a character NA) and missing strings really were just "NA" (the
string) -- perhaps in some far distant R 3.0 we'll see
nchar(NA_character_) = NA_integer_
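
In the meantime, if you want missing values handled explicitly, a
wrapper along these lines might do. This is only a sketch: nchar_na()
and is_blank() are hypothetical helper names, not functions from base R
or Hmisc, and what plain nchar() returns for a missing string may depend
on your R version, which is why the NA positions are overwritten by hand.

```r
# Hypothetical helper: character counts with NA propagated instead of
# whatever nchar() reports for a missing string.
nchar_na <- function(x) {
  x <- as.character(x)
  n <- nchar(x)
  n[is.na(x)] <- NA_integer_   # force missing inputs to a missing count
  n
}

# Hypothetical helper: TRUE where an element is either missing or empty,
# i.e. the "NA and '' considered together" view, made explicit.
is_blank <- function(x) {
  x <- as.character(x)
  is.na(x) | !nzchar(x)
}

x <- c(letters, NA, "")
tail(nchar_na(x), 3)   # 1 NA 0
tail(is_blank(x), 3)   # FALSE TRUE TRUE
```

That keeps the choice (NA as unknown versus NA as blank) visible in
your own code rather than relying on what nchar() happens to return.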

Best,
Michael

>
> Liviu
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


