[R] Characters vectors, NA's and "" in merges

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Sep 26 14:41:11 CEST 2001


On Wed, 26 Sep 2001, David Kane wrote:

> I often use merge with dataframes that contain character vectors which have
> elements that are sometimes "NA" (meaning the string NA, not the same thing,
> obviously, as NA in a numeric or factor vector). For example, the stock ticker
> for Nabisco was "NA". Unfortunately (for me), it seems like merge insists on
> inserting "NA" for missing values. My question: Is there some way around this?

> Here is a simple example:
>
> > version
>          _
> platform sparc-sun-solaris2.6
> arch     sparc
> os       solaris2.6
> system   sparc, solaris2.6
> status
> major    1
> minor    3.0
> year     2001
> month    06
> day      22
> language R
>
> > a <- data.frame(x = 1:4)
> > b <- data.frame(x = 1:3, y = c("NA", "a", "b"))

Take a look.  b$y is a factor with levels "a" and "b", and a missing first
value.
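
A minimal sketch of that inspection, for anyone following along. The names y_levels and y_missing are only illustrative, and stringsAsFactors = TRUE is an assumption added so that a current R still converts the strings to a factor, as the R 1.3.0 of this thread did by default. What the levels and the is.na() flags show depends on the R version, so no output is given here.

```r
# Rebuild the second data frame from the example, forcing factor conversion.
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = TRUE)

# Overall structure: column classes and a preview of the values.
str(b)

# The levels the factor ended up with, and which elements count as missing.
y_levels  <- levels(b$y)
y_missing <- is.na(b$y)

print(y_levels)
print(y_missing)
```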

> > merge(a, b, all.x = TRUE)
>   x  y
> 1 1 NA
> 2 2  a
> 3 3  b
> 4 4 NA
>
> Rows 1:3 are what I expect them to be. Row 4 is "wrong" in the sense that
> dataframe b did not contain a row for x = 4. Of course, there is a sense that
> *any* value, including "", that is placed in row 4 is potentially
> misleading. Perhaps I am misunderstanding the meaning of "NA" in a character
> vector (i.e., I am not allowed to have "real" values that are that string).

That is the correct answer. Because you asked for all.x=TRUE, you
got a missing value there in row 4 col 2.
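
If the aim is to tell "no row in b" apart from whatever y happens to hold, one possible approach is to carry a flag column through the merge. This is only a sketch: the column name in_b is hypothetical, and stringsAsFactors = FALSE is an assumption for current R versions.

```r
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = FALSE)

# Hypothetical helper column: TRUE for every row that really exists in b.
b$in_b <- TRUE

# Merge on the shared column x, keeping all rows of a.
m <- merge(a, b, all.x = TRUE)

# Rows of a without a partner in b get NA in every column that comes
# from b, including the flag, so a missing flag means "no match".
m$in_b <- !is.na(m$in_b)

print(m)

# The x values that had no matching row in b.
unmatched_x <- m$x[!m$in_b]
print(unmatched_x)
```

The flag does not depend on how the string "NA" in y is interpreted, so it separates row 4 from row 1 either way.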

> If there were some way (an "nomatch" argument?) that the user could specify
> what missing values are used for character strings, then I would be
> fine. Again, I suspect that my real problem is not understanding how to specify
> "NA" -- meaning Nabisco's ticker symbol -- in a character vector.

You cannot avoid it being taken as the missing value, AFAIK.
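
For readers on a current R rather than the 1.3.0 of this thread: there, the character string "NA" and the missing value NA_character_ are distinct, so is.na() can separate the two after the merge. The sketch below assumes such a version and keeps y as character with stringsAsFactors = FALSE. The final replacement with "" only emulates the "nomatch" idea from the question; it is not an argument of merge.

```r
a <- data.frame(x = 1:4)
b <- data.frame(x = 1:3, y = c("NA", "a", "b"), stringsAsFactors = FALSE)

m <- merge(a, b, all.x = TRUE)

# On a current R only the unmatched row is missing; the ticker "NA" is not.
missing_before <- is.na(m$y)
print(missing_before)   # FALSE FALSE FALSE TRUE on a current R

# Emulate a user-chosen fill value for the unmatched rows.
m$y[is.na(m$y)] <- ""
print(m)
```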

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



