[Rd] log(i, base=i) not giving 1

peter dalgaard pdalgd at gmail.com
Wed Sep 3 12:56:27 CEST 2014


Two notes:

1) For version number comparisons, we could just add a small fuzz like floor(log(x,8)+.Machine$double.eps) and begone with the issue.

2) I did see some other spots where we use log10() to compute field widths for decimal representation of numbers. It might be good to check whether there are cases of log10(10^n) < n and if so, apply fuzz as above. (Ensuring that both (log10(10^n) + fuzz > n) and (log10(10^n-1) + fuzz < n) could get tricky, but maybe the problem just isn't there.)

-p

On 03 Sep 2014, at 10:47 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:

>>>>>> Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>>>    on Wed, 3 Sep 2014 06:46:47 +0100 writes:
> 
>> On 02/09/2014 22:43, Ben Bolker wrote:
>>> On 14-09-02 08:48 AM, Martin Maechler wrote:
>>>>>>>>> peter dalgaard <pdalgd at gmail.com>
>>>>>>>>> on Tue, 2 Sep 2014 13:43:21 +0200 writes:
>>>> 
>>>>> Impressive. Never ceases to amaze me what computers can do these days. ;-)
>>>> 
>>>> Indeed,
>>>> 
>>>>> It's even more impressive given that we have
>>>> 
>>>> 
>>>>> static double logbase(double x, double base)
>>>>> {
>>>>> #ifdef HAVE_LOG10
>>>>> if(base == 10) return x > 0 ? log10(x) : x < 0 ? R_NaN : R_NegInf;
>>>>> #endif
>>>>> #ifdef HAVE_LOG2
>>>>> if(base == 2) return x > 0 ? log2(x) : x < 0 ? R_NaN : R_NegInf;
>>>>> #endif
>>>>> return R_log(x) / R_log(base);
>>>>> }
>>>> 
>>>>> which, except possibly for base-10 and base-2, would seem to be quite hard to convince to return anything other than 1 if x == base....
>>>> 
>>>> Amazingly indeed, it does:  From the few platforms I can try
>>>> here, I only see the problem
>>>> on 32 bit Linux, both an (old) ubuntu 12.04.5 and Fedora 19.
>>>> 
>>>>> i <- 2:99; i[log(i, base=i) != 1]
>>>> [1]  5  8 14 18 19 25 58 60 64 65 66 67 75 80 84 86 91
>>>> 
>>>> so '8' is not so special (as Ben suggested) and also not the
>>>> only power of two for which this happens :
>>>> 
>>>>> i <- 2^(1:20); i[log(i, base=i) != 1]
>>>> [1] 8    64   128  4096  8192 16384
>>>> 
>>>> Interestingly, it does not happen on 32-bit Windows, nor on any
>>>> 64-bit version of R I've tried.
>>>> Yes, it is amazing that with computer arithmetic, we can't
>>>> even guarantee that   U / U = 1  exactly.
>>>> 
>>>> Is it basically free to check for x == base in there, so we
>>>> could change to
>>>> 
>>>> return (x == base) ? 1. : R_log(x) / R_log(base);
>>>> 
>>>> ?
>>> 
>>> Should I submit a formal bug report about this, or should I assume it
>>> has now been sufficiently noted by R-core?
> 
>> It is under consideration by the author (not me).
> 
>>> Opinions about whether it is better to fix in logbase (according to
>>> Martin's suggestion) or in version.R (e.g. BDR's suggestion of using
>>> log2(x)/3 or some other way that makes the function less sensitive) or both?
> 
>> Yet again, it is better to write code which does not make false 
>> assumptions about machine arithmetic.  Fixing one very rare case at the 
>> expense of speed for all others is not a good idea.
> 
> Well, of course, that's why I asked about the expense. If it is
> neglible, as it may well be here,  we have in other circumstances
> opted for more platform-independence with very slight penalties,
> e.g., when wrapping system library functions by our own.
> 
> After all, arithmetic in R *has* been somewhat more portable
> between platforms than arithmetic in C (C++, Fortran,..) exactly
> because of that.
> For the present case, I'm not making a strong argument, and find
> it ok to keep the current implementation, though possibly we
> should mention the issue in the documentation then. 
> 
>> Studying http://www.validlab.com/goldberg/addendum.html is a good idea 
>> for those unfamiliar with the pitfalls of i86 FPUs.  It has an example 
>> essentially the same as this one right at the top.
> 
> Indeed.  
> OTOH, with this reasoning:
> 
>> Also, Ben Bolker needs to change his compiler options to ones better 
>> than the defaults in his unstated Linux distro.   Why anyone is using 
>> ix86 Linux nowadays is hard to understand when x86_64 is (on all but the 
>> very smallest machines) faster and more reliable.
> 
> one could argue that the platforms where  log(i, base=i) != 1
> should become much more rare in the future, and when 99.x % of
> implementations provide a feature, chances are very high that user
> code will make this assumption {after all, R core code just did
> for a while!} in cases that are untested and occuring very rarely,
> exactly the dangerous kinds of bugs.
> 
> As R code--even R packages--will continue to be written mostly by
> people who do not know about possible FP pitfalls (*) and would
> attribute them to their "software", i.e. in this case R in this case, 
> I still see a reason for special casing, possibly even nicely
> #ifdef'ed  , not causing any penalty on "-ffloat-store" or
> x86_64 platforms ?
> 
> (*) Our 99.x % of R users would lack  too much background
>    context to understand the problem even when urged to "learn"
>    about it in some way.  They will always blame R in such cases.
> 
> Martin
> 
> 
>>> 
>>> Ben Bolker
>>> 
>>>> 
>>>>> On 02 Sep 2014, at 03:27 , Ben Bolker <bbolker at gmail.com> wrote:
>>>> 
>>>>>> log(8, base=8L)-1
>>>>>> log(8, base=8)-1
>>>>>> logvals <- setNames(log(2:25,base=2:25)-1,2:25)
>>>>>> logvals[logvals!=0]  ## 5,8,14,18,19,25 all == .Machine$double.eps/2
>>>> 
>>>>> --
>>>>> Peter Dalgaard, Professor,
>>>>> Center for Statistics, Copenhagen Business School
>>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>>> Phone: (+45)38153501
>>>>> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>>>> 
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>> 
>>> 
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
> 
> 
>> -- 
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Emeritus Professor of Applied Statistics, University of Oxford
>> 1 South Parks Road, Oxford OX1 3TG, UK
> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



More information about the R-devel mailing list