[Rd] Package digest broken under R v2.4.0 devel

Henrik Bengtsson hb at stat.berkeley.edu
Fri Jul 28 17:52:04 CEST 2006


Found the reason for the bug.  Patch available online:

  source("http://www.braju.com/R/patches/digest.R")

In digest(), the .Call() statement takes the serialized object,
converted to a string, as its second argument:

    val <- .Call("digest", as.character(object), as.integer(algoint),
        as.integer(length), PACKAGE = "digest")

This relies on 'object' being a single character string, not a
vector.  Try object <- "a" and object <- c("a", "b"), and you'll get
the same result; apparently only the first element is used.
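
One way to make that assumption explicit is to check the argument at the
R level before it reaches .Call().  The sketch below is only an
illustration: check_single_string() is a hypothetical helper and is not
part of the digest package.

```r
# Hypothetical guard (not part of digest): fail early unless 'object'
# is exactly one character string.
check_single_string <- function(object) {
  if (!is.character(object) || length(object) != 1L) {
    stop("'object' must be a single character string, got ",
         typeof(object), " of length ", length(object))
  }
  invisible(object)
}

check_single_string("a")                    # passes silently

# A character vector of length two is rejected instead of being
# silently truncated to its first element.
res <- try(check_single_string(c("a", "b")), silent = TRUE)
inherits(res, "try-error")
```

With a check like this, a change in what serialize() returns would
surface as an error rather than as identical digests.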

To generate the 'object' string, digest() first calls serialize().
Now, in R v2.3.1 serialize(input, connection=NULL, ascii=TRUE) returns a
single string, but in R v2.4.0 it returns a raw vector.  This is [of
course ;)] documented:

From ?serialize in R v2.3.1:
For 'serialize', 'NULL' unless 'connection=NULL', when the result is
stored in the first element of a character vector (but is not a normal
character string unless 'ascii = TRUE'

From ?serialize in R v2.4.0:
For serialize, NULL unless connection=NULL, when the result is stored
in a raw vector.
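
The consequence can be checked in R v2.4.0 or later.  The sketch below
assumes, as suggested above, that the C routine only looks at the first
element of the character vector it receives.  An ASCII serialization
always starts with the format marker "A", so the first element of
as.character() on the raw vector is the same for every input:

```r
# Serialize two unrelated objects in ASCII format (R >= 2.4.0).
s1 <- serialize(1, connection = NULL, ascii = TRUE)
s2 <- serialize(rnorm(10), connection = NULL, ascii = TRUE)

is.raw(s1)                  # the result is a raw vector, not a string

# as.character() on a raw vector gives one two-digit hex string per
# byte, so the result is a long character vector.
head(as.character(s1), 3)

# The first byte is the format marker "A" (hex 41) whatever the input.
identical(as.character(s1)[1], as.character(s2)[1])
```

If only that first element is hashed, every object yields the same
digest, which matches the output in the original report.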

So the quick and dirty fix of digest() is to do:

 object <- serialize(object, connection=NULL, ascii=TRUE)
 object <- paste(object, collapse="")

This should work in either R version.  I've made this patch available
online. Just call:

  source("http://www.braju.com/R/patches/digest.R")
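
To see what the two-line fix produces without installing the patch, it
can be wrapped in a small function.  This is a sketch only:
to_single_string() is a hypothetical name, and the comments describe the
behaviour in R v2.4.0 or later, where paste() collapses the raw vector
into one string of hex digits.

```r
# Hypothetical wrapper around the two-line fix.
to_single_string <- function(object) {
  object <- serialize(object, connection = NULL, ascii = TRUE)
  # paste() coerces the raw vector with as.character() and joins the
  # per-byte hex strings into a single character string.
  paste(object, collapse = "")
}

a <- to_single_string(1)
b <- to_single_string(2)

length(a) == 1L   # one string, as the .Call() statement expects
a != b            # different inputs now give different strings
```

Note that in R v2.4.0 the collapsed string is a hex rendering of the
serialized bytes, so the resulting digests need not equal those computed
under R v2.3.1.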

It's possible that it is faster to serialize to a 'textConnection'.
However, it might be even faster if your internal code, i.e.
.Call("digest", ...), accepted vectors, so that this does not have to
be done at the R level?

Cheers

Henrik

On 7/27/06, Henrik Bengtsson <hb at stat.berkeley.edu> wrote:
> [cc:ing to the maintainer of digest]
>
> FYI, package 'digest' (v0.2.1 2005/11/04 04:45:53) generates the same
> output regardless of input with R v2.4.0 devel (2006-07-25 r38698).
> Starting a vanilla R session you get:
>
> > library(digest)
> > digest(1)
> [1] "3416a75f4cea9109507cacd8e2f2aefc"
> > digest(2)
> [1] "3416a75f4cea9109507cacd8e2f2aefc"
> > digest(rnorm(10))
> [1] "3416a75f4cea9109507cacd8e2f2aefc"
>
> It works as expected with R v2.3.1 patched (2006-07-25 r38698):
> > library(digest)
> > digest(1)
> [1] "577e0eb2f3253fc5a8c4a287f7c10e7f"
> > digest(2)
> [1] "75eb91f4559682af50c21212d0dc013b"
>
> digest() uses serialize() internally, but it has nothing to do with
> that.  I managed to track it down to the call to .Call("digest", ...).
>
> BTW, thanks for a very useful package.
>
> Henrik
>


