[Rd] Future plans for raw data type?

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Sep 28 11:59:39 CEST 2005


On Tue, 27 Sep 2005 dhinds at sonic.net wrote:

> I've been working with raw vectors quite a bit and was wondering if
> the R team might comment on where they see raw vector support going in
> the long run.  Is the intent that 'raw' will eventually become a first
> class data type on the same level as 'integer'?  Or should 'raw' have
> more limited support, by design?

They _are_ `first class data types', atomic vectors, just like integers. 
The intent remains that their contents should not be interpreted, just as 
in the Green Book.  One consequential difference from other atomic vectors 
is that there is no notion of NA for raw elements.
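
As a minimal sketch of those two points, assuming a reasonably current R (the names r and na_as_raw are illustrative only, not from the original message):

```r
# Raw vectors are atomic, with their own storage type.
r <- charToRaw("test")          # the bytes of the string "test"
is.atomic(r)                    # TRUE
typeof(r)                       # "raw"

# There is no raw NA: coercing NA warns and falls back to the zero byte.
na_as_raw <- suppressWarnings(as.raw(NA))
identical(na_as_raw, as.raw(0)) # TRUE
any(is.na(r))                   # FALSE, no raw element can be missing
```

The warning is suppressed here only to keep the output quiet; in normal use it is the signal that the coercion lost information.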

This means that there are basically no plans to add support for 
manipulation of raw vectors.  We have already gone quite a lot further 
than S does, and quite a few things have been considered undesirable (see 
below).

> For example, with very minor changes to subassign.c to implement some
> automatic coercions, raw vectors can become arguments to ifelse() and
> can be members of data frames.  Would this be desirable?

It is desirable that they can be members of data frames, which is why they 
_can_ be:

> y <- charToRaw("test")
> z <- data.frame(y)

format() was not handling raw until recently, but now does.  Thus z can 
now be printed.  (Again, it is somewhat dubious that one should be able to 
format/print raw vectors as that imposes an interpretation, but it is 
convenient.)
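
A slightly extended version of the example above, as a sketch assuming a current R: the column keeps its raw type inside the data frame, and format() renders each byte as a two-digit hex string.

```r
y <- charToRaw("test")
z <- data.frame(y)

is.raw(z$y)        # TRUE: the column is stored as raw, not converted
nrow(z)            # 4, one row per byte

# format() imposes a hex interpretation on each byte for display purposes.
hex <- format(y)
hex                # "74" "65" "73" "74"
print(z)
```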

ifelse() is coded in a peculiar way that needs logical to be coercible 
(for some values of 'test') to a common mode for 'yes' and 'no'. 
Alternatives are given on its help page.

Given that you cannot interpret raw elements, you cannot unambiguously 
coerce logical to raw.  In particular there is no way to coerce logical NA 
to raw.  So what should ifelse(NA, yes, no) be?  There is no good answer, 
which is why the status quo is desirable.  (as.raw warns if you attempt 
this.)
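
One way to get an ifelse()-like result without any logical-to-raw coercion is plain indexed assignment. This is only a sketch, assuming 'test' contains no NA and that 'yes' and 'no' have the same length as 'test'; the variable names are illustrative.

```r
test <- c(TRUE, FALSE, TRUE)
yes  <- as.raw(c(0x01, 0x02, 0x03))
no   <- as.raw(c(0xf1, 0xf2, 0xf3))

# Refuse NA in 'test' explicitly, since there is no raw value to map it to.
stopifnot(!anyNA(test))

out <- no                 # start from the 'no' values (already raw)
out[test] <- yes[test]    # overwrite where 'test' is TRUE
out                       # 01 f2 03
```

The result never passes through a logical vector, so the question of what NA should become simply has to be answered by the caller before the assignment.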

You are vague as to which `automatic coercions' you think could be added, 
but at least this one was deliberately not added.

Digging around I did find one unanticipated problem.  If z is a list, 
z$a <- raw_vector works but z[["a"]] <- raw_vector does not.  The reason is 
that for atomic vectors the latter first coerces the rhs to a list and 
then extracts the first element.  That is clearly wasteful (and not 
documented), and I will take a closer look at it for 2.3.0, but I've added 
sticking plaster for 2.2.0.
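
The two assignment forms can be compared directly. This is a sketch assuming an R version that includes the fix; on such a version both forms are expected to store the raw vector unchanged (the element name "b" is illustrative).

```r
raw_vector <- charToRaw("test")
z <- list()

z$a      <- raw_vector    # the form reported as working
z[["b"]] <- raw_vector    # the form reported as failing before the fix

identical(z$a, z[["b"]])  # TRUE when both forms store the vector as-is
is.raw(z[["b"]])          # TRUE
```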

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


