[Rd] undesirable rounding off due to 'read.table' (PR#8974)

Hin-Tak Leung hin-tak.leung at cimr.cam.ac.uk
Tue Jun 13 16:22:15 CEST 2006


overeem at knmi.nl wrote:
> Full_Name: Aart Overeem
> Version: 2.2.0
> OS: Linux
> Submission from: (NULL) (145.23.254.155)
> 
> 
> Construct a data frame consisting of several variables using 'data.frame' and
> 'cbind' and write it to a file with 'write.table'. The file consists of headers
> and values such as 12.4283675334551 (i.e. 13 digits after the decimal point).
> If this data frame is read back with 'read.table(filename, skip = 1)' or
> 'read.table(filename, header = TRUE)', the values only have 7 digits after the
> decimal point, e.g. 12.42837. So the reading rounds off the values. This is not
> mentioned in the manual. Although the values still have many digits after the
> decimal point, rounding off is, in my view, never desirable.

Hmm, this is probably due to conversion by the scanf family of functions
(I don't know the precise location or mechanism of R doing it, so this is
a guess). My manpage for sscanf mentions it:

        f       Matches an optionally signed floating-point number;
                the next pointer must be a pointer to float.
        e       Equivalent to f.
        g       Equivalent to f.
        E       Equivalent to f.
        a       (C99) Equivalent to f.

So printf/fprintf/sprintf and scanf/sscanf/fscanf are not symmetrical:
"%f" prints a double (floats are promoted to double through varargs) but
scans into a float, so you lose precision, going from about 15 significant
digits (double) down to about 7 (float). It is a generic pitfall of
ANSI C's printf/scanf, not specific to R.

Why don't you use save() or save.image() instead for saving and
reloading a data.frame? It is *much faster*, produces a much smaller
file, and is also more accurate. Just my two cents.

HTL



More information about the R-devel mailing list