[Rd] type.convert and doubles

Martin Maechler maechler at stat.math.ethz.ch
Sat Apr 19 15:00:10 CEST 2014

>>>>> McGehee, Robert <Robert.McGehee at geodecapital.com>
>>>>>     on Thu, 17 Apr 2014 19:15:47 -0400 writes:

    >> This is all application specific and
    >> sort of beyond the scope of type.convert(), which now behaves as it
    >> has been documented to behave.

    > That's only a true statement because the documentation was changed to reflect the new behavior! The new feature in type.convert certainly does not behave according to the documentation as of R 3.0.3. Here's a snippet:
    > The first type that can accept all the
    > non-missing values is chosen (numeric and complex return values
    > will represented approximately, of course).

    > The key phrase is in parentheses, which reminds the user to expect a possible loss of precision. That important parenthetical was removed from the documentation in R 3.1.0 (among other changes).

    > Putting aside the fact that this introduces a large amount of unnecessary work rewriting SQL and data import code and SQL packages, my biggest conceptual problem is that I can no longer rely on a particular function call returning a particular class. In my example querying stock prices, about 5% of prices came back as factors and the remaining 95% as numeric, so we had random errors popping up throughout the morning.

    > Here's a short example showing how the new behavior can be unreliable. I pass a character representation of a uniformly distributed random variable to type.convert. 90% of the time it is converted to "numeric" and 10% of the time to a "factor" (in R 3.1.0). In the 10% of cases in which type.convert converts to a factor, the leading non-zero digit is always a 9. So if you were expecting a numeric value, then 1 in 10 times you may have a bug in your code that didn't exist before.

    >> options(digits=16)
    >> cl <- NULL; for (i in 1:10000) cl[i] <- class(type.convert(format(runif(1))))
    >> table(cl)
    > cl
    >  factor numeric
    >     990    9010


Murray's point is valid, too.

But in my view, with the reasoning we have seen here,
*and* with the well-known software design principle of
 "least surprise" in mind,
I also do think that the default for type.convert() should be what
it has been for > 10 years now.
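
For code that must get a particular class back whatever type.convert()
decides, one option is to coerce explicitly on the caller's side. What
follows is only a minimal sketch under stated assumptions:
as_double_or_stop() is a hypothetical helper name, not part of base R,
and it assumes the input is a character vector that should hold nothing
but numbers and NA markers. It relies only on as.numeric(), which has
always returned doubles (with the usual loss of precision).

```r
# Hypothetical helper (not in base R): always return a double vector,
# or fail loudly, instead of silently returning a factor.
as_double_or_stop <- function(x, na.strings = "NA") {
  x <- as.character(x)
  # Entries that are explicitly marked as missing
  is_na <- is.na(x) | x %in% na.strings
  out <- rep(NA_real_, length(x))
  # as.numeric() yields NA (with a warning) for strings it cannot parse
  out[!is_na] <- suppressWarnings(as.numeric(x[!is_na]))
  # A parse failure is an NA that is neither a marked missing value nor NaN
  bad <- !is_na & is.na(out) & !is.nan(out)
  if (any(bad))
    stop("non-numeric input: ", paste(unique(x[bad]), collapse = ", "))
  out
}

# Usage: long decimal strings of the kind used in the example above
x <- format(runif(5), digits = 16)
class(as_double_or_stop(x))
```

The trade-off is that the caller decides up front that the column is
numeric, so this does not replace the type guessing that type.convert()
is meant to do.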

