[Rd] Undocumented behavior of as.integer() (PR#2430)

Philippe Grosjean phgrosjean@sciviews.org
Wed Jan 8 14:21:02 2003


The problem is that this truncation can lead to bugs in code using X[y] with
y being a vector of doubles, and these bugs are very difficult to track down.
R users are taught that they do not need to bother too much about the storage mode.
For instance, in the 'Introduction to R' manual, p.8:

"For most purposes the user will not be concerned if the "numbers" in a
numeric vector are integers, reals or even complex. Internally calculations
are done as double precision real numbers, or double precision complex
numbers if the input data are complex."

This is correct: "for _most_ purposes". A language that manages type
conversion automatically is nice, but it is also fair to place warnings
where there are exceptions to those "_most_ purposes". That is why I
consider that a warning is required. An example to illustrate it would be
useful too (but perhaps more useful in Extract.Rd than in as.integer.Rd).
Something like:

>ind <- 10 * (73.1 + (10 * seq(14)) - 76)
>ind
>mat <- seq(1420)
>submat1 <- mat[ind]        #  NOT what I expected
>submat2 <- mat[round(ind)] #  much better

which is derived from Tom Blackwell's example discussed on R-help a few days
ago.
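
For readers who want to see the effect outside R, here is a minimal sketch in C. It assumes that as.integer() and [] behave like a plain C double-to-integer conversion (as the reply quoted below suggests) and that doubles are IEEE 754. The helper names scaled(), index_truncated() and index_rounded() are hypothetical, made up for this illustration only.

```c
/* Same arithmetic as the R example: 10 * (73.1 + look). */
double scaled(double look)
{
    return 10.0 * (73.1 + look);
}

/* Truncate toward zero, as a plain C conversion from double does. */
long index_truncated(double x)
{
    return (long)x;
}

/* Round half away from zero before converting; a simple idiom,
   lround() from <math.h> would be the library alternative. */
long index_rounded(double x)
{
    return (long)(x < 0 ? x - 0.5 : x + 0.5);
}
```

With look = -66 (the first element of (10 * seq(14)) - 76), scaled() returns a value just below 71, so index_truncated() gives 70 while index_rounded() gives 71.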

Best,

Philippe

-----Original Message-----
From: Martin Maechler [mailto:maechler@stat.math.ethz.ch]
Sent: Wednesday 8 January 2003 11:50
To: phgrosjean@sciviews.org
Cc: R-bugs@biostat.ku.dk
Subject: Re: [Rd] Undocumented behavior of as.integer() (PR#2430)


>>>>> "PhGr" == Philippe Grosjean <phgrosjean@sciviews.org>
>>>>>     on Wed, 8 Jan 2003 11:24:40 +0100 (MET) writes:

    PhGr> as.integer() truncates doubles toward zero, as Splus
    PhGr> does (at least v. 6.1 under Windows does). Thus:

(fortunately this is not OS dependent!)

    >> look <- (10 * seq(14)) - 76
    >> 10 * (73.1 + look)
    PhGr>[1] 71 171  271  371  491  586  681  791  886  981 1101 1201 1301 1401
    >> as.integer(10 * (73.1 + look))
    PhGr>[1] 70 170  270  370  490  586  681  791  886  981 1101 1201 1301 1401

    PhGr> ... It is not documented in R! I propose appending the
    PhGr> following to as.integer.Rd:

I agree the doc should mention it.
I disagree with the warning section.
In R, our code really just uses something like

   int asInt(double x) { return x; }

which makes use of C's  "implicit casting".
I now looked in Kernighan & Ritchie (2nd Ed.; {3rd Ed would be better})
and found (p.197)

   >>  "A6.3 Integer and Floating"
   >>
   >>  When a value of floating type is converted to integral type,
   >>  the fractional part is discarded. ...................

Hence this is (fortunately!) part of the C standard.
But I really think any decent programming language would do it
like that (many would not allow implicit coercion though..).
That's the reason I think the warning is not necessary; I'd
rather mention it by the way.
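
To make the K&R rule concrete, here is a small sketch in C. asInt() mirrors the one-liner quoted above; it is an illustration, not a copy of the actual R source. asIntRounded() is a hypothetical helper added only for comparison.

```c
/* Mirrors the one-liner above; an illustration, not the actual R source. */
int asInt(double x)
{
    /* Explicit cast; the implicit conversion in "return x;" does the same.
       The behavior is undefined in C if the truncated value does not fit
       in an int. */
    return (int)x;
}

/* Hypothetical helper for comparison: round half away from zero. */
int asIntRounded(double x)
{
    return (int)(x < 0 ? x - 0.5 : x + 0.5);
}
```

Here asInt(2.7) is 2 and asInt(-2.7) is -2 (toward zero, not toward minus infinity), whereas asIntRounded() gives 3 and -3.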

    PhGr> \section{ WARNING }{ During coercion of doubles, real
    PhGr> numbers are not rounded but truncated (the closest
    PhGr> integer towards zero is returned).  Attributes are
    PhGr> deleted.}

    PhGr> And I suggest adding the previous example in the
    PhGr> corresponding section in as.integer.Rd. Moreover, the
    PhGr> subset operation [] uses as.integer() and
    PhGr> consequently, can suffer from the same syndrome. A
    PhGr> WARNING section in Extract.Rd would be welcome too.

"suffer" and "syndrome"  are not appropriate here IMHO.

Martin Maechler <maechler@stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><