[Rd] Save and serialize

Henrik Bengtsson hb at biostat.ucsf.edu
Tue Feb 8 01:21:50 CET 2011


On Mon, Feb 7, 2011 at 3:15 PM, Hadley Wickham <hadley at rice.edu> wrote:
> Thanks to you both for the information - that's exactly the level of
> detail I was looking for.  I ask because I want to play around with a
> function to automatically cache expensive operations to disk, in a way
> that can be lazy loaded on the next run.

So starting with digest v0.3.0 (April 2007), the output of digest() can
be considered consistent across R versions (in addition to across R
sessions).
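
For the caching use case, here is a minimal sketch of how a stable digest() value could serve as the cache key. It assumes the digest package is installed and R >= 2.15 (for saveRDS() and paste0()). The helper name cache_eval() and its arguments are hypothetical, made up for illustration; it loads eagerly with readRDS() and does not attempt the lazy-loading part.

```r
# Hypothetical helper (not part of digest or base R): compute a value once,
# store it on disk under a key derived from digest(), and reuse it later.
cache_eval <- function(key_obj, compute, cache_dir = tempdir()) {
  # digest() serializes key_obj and hashes it (md5 hex string by default)
  key  <- digest::digest(key_obj)
  path <- file.path(cache_dir, paste0("cache-", key, ".rds"))

  if (file.exists(path)) {
    return(readRDS(path))      # cache hit: skip the expensive computation
  }

  value <- compute()           # cache miss: compute and store
  saveRDS(value, path)
  value
}

# Usage: the second call with the same key object reads from disk.
slow_mean <- function() { Sys.sleep(0.1); mean(1:1000) }
args <- list(fn = "slow_mean", n = 1000)
first  <- cache_eval(args, slow_mean)
second <- cache_eval(args, slow_mean)
```

Hashing the inputs rather than the result means the key is known before the expensive step runs, which is what makes the lookup worthwhile.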


FYI, recently in R-devel serialize(), which digest() relies on, gained
a 'version' argument reserved for future usage.  From NEWS:

- serialize() and unserialize() are no longer described as
‘experimental’. The interface is now regarded as stable, although the
serialization format may well change in future releases. (serialize()
has a new argument version which would allow the current format to be
written if that happens.)

I've tested, and this argument was introduced in such a way that
the serialized object is identical to what it was before (R <= 2.12.x).  Thus,
digest() will also generate the same output in R v2.13.0.  At some
point, we will add an option to digest() for specifying what 'version'
value should be passed to serialize(), but it doesn't sound like it is
too urgent to add that.  Any updates to digest() will also be backward
compatible, so as long as you use digest() you shouldn't have to worry
about consistency.
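
As a rough illustration of the 'version' argument, the sketch below serializes an object with the format version pinned and splits off the leading header bytes. It assumes R >= 2.13.0 and the default binary (XDR) format. The 14-byte header layout in the comments is an assumption based on the R-internals description of serialization format version 2, not something the NEWS entry states.

```r
x <- list(a = 1:3, b = "text")

# Serialize to a raw vector, pinning the serialization format version.
ser <- serialize(x, connection = NULL, version = 2)

# Assumed header layout for format version 2 (14 bytes in total):
#   2 bytes  format marker "X\n" (binary XDR)
#   4 bytes  serialization format version
#   4 bytes  version of R that wrote the data
#   4 bytes  minimal version of R that can read the data
header  <- ser[1:14]
payload <- ser[-(1:14)]

format_marker  <- rawToChar(header[1:2])  # format marker as a string
format_version <- header[3:6]             # big-endian integer

# The round trip restores the original object.
stopifnot(identical(unserialize(ser), x))
```

Only the header records which R version wrote the data, so a hash computed over the payload alone does not change just because R was upgraded.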

/Henrik

>
> Hadley
>
> On Mon, Feb 7, 2011 at 4:06 PM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
>> Also, if it adds any value to what you are looking for, the output
>> of serialize() also has header information, cf. R-devel thread 'Small
>> inconsistency in serialize() between R versions and implications on
>> digest()' started March 7, 2007:
>>
>>  http://www.mail-archive.com/r-devel@r-project.org/msg07931.html
>>
>> It caused us some headaches when trying to generate identical output
>> of the same input using different versions of R.  It was solved in
>> that thread.  See code for digest::digest() on how to skip/ignore that
>> header.
>>
>> /Henrik
>>
>>
>> On Mon, Feb 7, 2011 at 1:51 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
>>> On Mon, 7 Feb 2011, Hadley Wickham wrote:
>>>
>>>> Hi all,
>>>>
>>>> Is there any relationship between save and serialize?  Do they use the
>>>> same algorithm?
>>>
>>> See the R-internals manual: there is more info in the R-devel version, not
>>> least because saveRDS() is added to the mix.
>>>
>>> But basically serialize() and saveRDS() use the same format, and save()
>>> writes a header and then serializes a pairlist of the objects given.
>>>
>>> 'The same algorithm' is somewhat misleading here: strictly no, as they
>>> manage to use four entry points to the code base.
>>>
>>>>
>>>> Hadley
>>>>
>>>> --
>>>> Assistant Professor / Dobelman Family Junior Chair
>>>> Department of Statistics / Rice University
>>>> http://had.co.nz/
>>>>
>>>> ______________________________________________
>>>> R-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>> --
>>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>>> University of Oxford,             Tel:  +44 1865 272861 (self)
>>> 1 South Parks Road,                     +44 1865 272866 (PA)
>>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>>
>>>
>>
>>
>
>
>
>


