[Rd] mean relative differences from all.equal() (PR#9276)

Thu Oct 5 04:57:41 CEST 2006

On Wed, 2006-10-04 at 20:22 -0500, Marc Schwartz wrote:
> On Thu, 2006-10-05 at 03:10 +0200, bchristo at email.arizona.edu wrote:
> > Full_Name: Brad Christoffersen
> > Version: 2.3.1
> > OS: Windows XP
> > Submission from: (NULL) (128.196.193.132)
> > 
> > 
> > Why is the difference between two numbers so different from the "mean relative
> > difference" output from the all.equal() function?  Is this an artifact of the
> > way R stores numerics?  I could not find this problem as I searched through the
> > submitted bugs. But I am brand new to R so I apologize if there is something
> > obvious I'm missing here.
> > 
> > rm(list=ls(all=TRUE))  ## Remove all objects that could hinder w/ consistent output
> > a <- 204
> > b <- 203.9792
> > all.equal(a,b)
> > [1] "Mean relative  difference: 0.0001019608"
> > a - b
> > [1] 0.0208
> 
> Read the Details section of ?all.equal, which states:
> 
> Numerical comparisons for scale = NULL (the default) are done by first
> computing the mean absolute difference of the two numerical vectors. If
> this is smaller than tolerance or not finite, absolute differences are
> used, otherwise relative differences scaled by the mean absolute
> difference.
> 
> If scale is positive, absolute comparisons are made after scaling
> (dividing) by scale
> 
> 
> Thus on R version 2.4.0 (2006-10-03):
> 
> > all.equal(a, b, scale = 1)
> [1] "Mean scaled  difference: 0.0208"
> 
> 
> Please do not report doubts about behavior as bugs.  Simply post a query
> on r-help first. If it is a bug, somebody will confirm it and you can
> then report it as such.
> 
> BTW, time to upgrade...Go Wildcats!
> 
> HTH,
> 
> Marc Schwartz

[OFFLIST and PRIVATE]

Brad,

A couple of comments.

First, welcome to R. I hope that you enjoy it and find it of value.

If you are not used to open source software and communities (e.g.,
Linux), you will find that this community, unlike commercial paid
support forums, tends to be direct with respect to comments. Don't take
it personally.

Be aware that nobody is getting paid to support R. It is developed and
supported on a voluntary basis by a large body of folks, mainly those
known as "R Core". Some of them have quite literally risked their
academic careers and livelihood to facilitate R's existence.

You will, over time, get a flavor for the nature of the community and
the interchange that takes place. As a result of the voluntary nature of
the community, there is an a priori expectation that you will have put
forth reasonable efforts to avail yourself of the various support
resources before posting. This is especially true of bug reports, since
a member of R Core has to manually manage the handling and resolution of
each one.

A good place to start is to review the R Posting Guide:

http://www.r-project.org/posting-guide.html

which covers many of these issues and how to go about getting support
via the various sources provided.

That all being said, you will find that R's support mechanisms and
resources are second to none and I would challenge any commercial
software vendor to provide a comparable level of support and expertise.

With respect to your specific question above and how the result is
obtained:

> (a - b) / a
[1] 0.0001019608

Here, 'a' (the first argument, the "target") is used as the scaling
factor, since you only passed single values. If these were vectors of
values, the scaling factor would be the mean of the absolute values of
the first vector.
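
To make the arithmetic concrete, here is a minimal sketch that reproduces
the calculation by hand. The scalar values are the ones from your report;
the vectors 'x' and 'y' are made up purely for illustration, and the
formula is my reading of all.equal.numeric(), not a quote of its source.

```r
# Values from the original report.
a <- 204
b <- 203.9792

# Relative difference: the mean absolute difference, scaled by the
# mean absolute value of the first argument (the "target").
rel_diff <- mean(abs(a - b)) / mean(abs(a))
print(rel_diff)

# With scale = 1 the comparison is absolute: the mean absolute
# difference is simply divided by the supplied scale.
abs_diff <- mean(abs(a - b)) / 1
print(abs_diff)

# Hypothetical vectors (not from the report) to show that the same
# formula applies element-wise and is then averaged.
x <- c(204, 100, 50)
y <- c(203.9792, 100.5, 49)
rel_diff_vec <- mean(abs(x - y)) / mean(abs(x))
print(rel_diff_vec)

# Loosening the tolerance above the relative difference makes the
# two values compare as equal.
print(isTRUE(all.equal(a, b, tolerance = 1e-3)))
```

Note that all.equal() returns either TRUE or a character string that
describes the difference, so wrap it in isTRUE() when using it inside
an if() statement.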

As a result of R's open source nature, you have access to all of the
source code that is R. You can download the source tarball (archive)
from one of the CRAN mirrors, if you so desire.

In this case, the actual function that is used is called
all.equal.numeric(). This is a consequence of how R uses 'dispatch
methods' after a call to a 'generic' function, such as all.equal(). If
you are not familiar with these terms, the available R documentation is
a good place to start, if you should decide to pursue moving into that
level of detail. If you have experience in other programming languages,
this may be second nature already.
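
If the terms are new, the following small sketch may help. The generic
'describe' and its two methods are hypothetical names that I made up for
illustration; only all.equal() and all.equal.numeric() are real R
functions here.

```r
# A hypothetical generic function: UseMethod() picks a method based on
# the class of the first argument.
describe <- function(x, ...) UseMethod("describe")

# Method used for numeric arguments.
describe.numeric <- function(x, ...) "numeric method"

# Fallback used when no more specific method exists.
describe.default <- function(x, ...) "default method"

print(describe(204))      # dispatches to describe.numeric
print(describe("text"))   # dispatches to describe.default

# The same mechanism is at work for all.equal(): with numeric arguments,
# calling the generic and calling the numeric method directly should give
# the same result.
a <- 204
b <- 203.9792
print(identical(all.equal(a, b), all.equal.numeric(a, b)))
```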

In many cases, R's functions are written in R itself. Others are written
in FORTRAN and/or C that is compiled and linked to R via various calling
mechanisms. Since R is an interpreted language, you can have easy access
to many of the functions within the R console.

Thus, at the R command prompt, you can type:

> all.equal.numeric

[Note: without the parentheses]

which will then display a representation of the function's source code,
enabling you to review how the function works. If you desire to become a
better R user/programmer, this approach provides a reasonable way to see
how functions are coded and to investigate algorithms and techniques.
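
Beyond printing the whole function, a few standard helpers let you look at
pieces of it. This is just a sketch of one way to poke around; the
variable names 'f' and 'src' are my own.

```r
# Typing a function's name without parentheses returns the function
# object itself, which can be assigned and inspected like any other value.
f <- all.equal.numeric
print(is.function(f))

# formals() lists the arguments and their defaults; here we only look
# at the argument names.
print(names(formals(f)))

# deparse() converts the function to a character vector of source lines,
# so you can view just the first few rather than the whole listing.
src <- deparse(f)
cat(head(src, 5), sep = "\n")
```

You can also run methods(all.equal) to list the methods that are
available for the generic.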

I hope that the above is helpful.

Best regards,

Marc