[Rd] Future plans for raw data type?

David Hinds dhinds at sonic.net
Wed Sep 28 18:27:40 CEST 2005


On Wed, Sep 28, 2005 at 10:59:39AM +0100, Prof Brian Ripley wrote:
> 
> They _are_ `first class data types', atomic vectors, just like integers. 
> The intent remains that their contents should not be interpreted, just as 
> in the Green Book.  One consequential difference from other atomic vectors 
> is that there is no notion of NA for raw elements.

That's reasonable.  I should have known to be more specific in saying
what I meant by "first class data type", but drawing the line at
interpreting the contents of a raw seems fine.
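
For reference, a minimal sketch of the "no NA for raw" point, assuming a
reasonably recent R (the variable name 'r' is just for illustration):

```r
# Coercing NA to raw has no NA value to land on; as.raw() warns and
# stores 00 instead.  The warning is suppressed here to keep output short.
r <- suppressWarnings(as.raw(c(1, NA, 255)))

typeof(r)        # storage type of the vector
is.atomic(r)     # raw vectors are atomic, like integer vectors
any(is.na(r))    # no raw element can be NA
```

The NA in the input is silently replaced by a zero byte once the warning is
ignored, which is why anything that needs to store NA in a raw vector has no
good answer.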

> It is desirable that they can be members of data frames, which is why they 
> _can_ be:
> 
> >y <- charToRaw("test")
> >z <- data.frame(y)

Hmmm, that's interesting; I wonder how I missed the fact that this
worked.  Somehow I managed to only try things that didn't, even though
the obvious case does work.  Here are some things that are broken:

  x <- data.frame(a=1:10)
  x$b <- as.raw(1:10)
  x[[2]] <- as.raw(1:10)
  x <- data.frame(as.raw(1:10))
  x[1,]
  x[1,1]
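
A small sketch for checking which of these operations fail on a given R
build.  The helper 'works' is hypothetical (not part of R), and the outcome
depends on the R version, so no expected output is shown:

```r
# Hypothetical helper: TRUE if evaluating 'expr' raises no error.
works <- function(expr) !inherits(try(expr, silent = TRUE), "try-error")

x <- data.frame(a = 1:10)
results <- c(
  dollar_assign  = works(x$b <- as.raw(1:10)),
  bracket_assign = works(x[[2]] <- as.raw(1:10)),
  construct      = works(x <- data.frame(as.raw(1:10))),
  row_extract    = works(x[1, ]),
  cell_extract   = works(x[1, 1])
)
print(results)
```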

> Given that you cannot interpret raw elements, you cannot unambiguously 
> coerce logical to raw.  In particular there is no way to coerce logical NA 
> to raw.  So what should ifelse(NA, yes, no) be?  There is no good answer, 
> which is why the status quo is desirable.  (as.raw warns if you attempt 
> this.)
>
> You are vague as to which `automatic coercions' you think could be added, 
> but at least this one was deliberately not added.

Because of how ifelse() is implemented, for type 'X', it requires both
'X' <- logical and logical <- 'X' coercions.  The 'X' <- logical
coercion is used for handling NA elements in 'test' even if there are
none.  The logical <- 'X' coercion is required due to how ifelse()
constructs the result vector from 'yes' and 'no'.  The logical <- raw
coercion seems unambiguous.  Arguably, ifelse() should not care about
the ability to represent NA if there are no NA values in 'test', and
could do:

    if (any(nas)) ans[nas] <- NA

instead of:

    ans[nas] <- NA

I think this is a little bit more consistent with how ifelse() handles
the 'yes' and 'no' arguments, as well (they are only evaluated if they
are actually used).  But as you say, ifelse() is pretty easy to work
around.
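
To make the idea concrete, here is a rough sketch of an ifelse() variant
with the guarded NA assignment.  This is not the base R implementation:
'ifelse_sketch' is a hypothetical name, and starting the result from 'yes'
(rather than from the logical 'test') is an extra assumption made here so
that no logical <- raw coercion is needed either.  Unlike the real
ifelse(), it always evaluates 'yes' and drops the attributes of 'test'.

```r
# Sketch only: an ifelse()-like function that never touches NA unless
# 'test' actually contains one.
ifelse_sketch <- function(test, yes, no) {
    test <- as.logical(test)
    nas <- is.na(test)
    # Start from 'yes' so the result has the type of the inputs.
    ans <- rep(yes, length.out = length(test))
    sel <- !nas & !test
    if (any(sel)) ans[sel] <- rep(no, length.out = length(test))[sel]
    # The guarded assignment: skipped when there is nothing to mark as NA.
    if (any(nas)) ans[nas] <- NA
    ans
}

ifelse_sketch(c(TRUE, FALSE, TRUE), as.raw(1), as.raw(2))
ifelse_sketch(c(TRUE, NA, FALSE), 1, 2)
```

With raw arguments and no NA in 'test', the NA assignment is never reached,
so the inability to represent NA in a raw vector never comes up.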

> Digging around I did find one unanticipated problem.  If z is a list z$a 
> <- raw_vector works but z[["a"]] <- raw_vector does not.  The reason is 
> that for atomic vectors the latter first coerces the rhs to a list and 
> then extracts the first element.  Which is clearly wasteful (and not 
> documented), and I will take a closer look at it for 2.3.0, but I've added 
> sticking plaster for 2.2.0.

I think this is related to the problems I described above, and I
suspect that your fix is the same as mine (i.e. handle "case 1924" in
subassign.c).
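
A short sketch for comparing the two list assignment forms on whatever R
version is at hand.  The names 'rv', 'z' and 'ok' are illustrative, and
whether the second form succeeds depends on the R version, so the result is
captured rather than assumed:

```r
rv <- as.raw(1:3)
z <- list()

z$a <- rv                                   # the form reported to work

# The form reported to fail; capture the outcome instead of stopping.
ok <- !inherits(try(z[["b"]] <- rv, silent = TRUE), "try-error")
ok

# If both succeeded, the stored elements should be the same raw vector.
if (ok) identical(z$a, z[["b"]])
```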

-- Dave


