[Rd] confusing all.equal output

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Fri Mar 3 18:01:06 CET 2023


>>>>> peter dalgaard 
>>>>>     on Thu, 2 Mar 2023 19:47:59 +0100 writes:

    > I believe the wording goes back to Martin Maechler many
    > moons ago (AFAICT towards the end of the last millennium.)
    > We might leave it to him to change it?
    > - Peter D.

Thank you, Peter.

Yes, this is *very* old.  I could claim that R users seem to get
more and more confused over time, because nobody had ever
complained for a quarter of a century .. (;-) ;-)

I know I had been inspired by the all.equal() implementation of
S-PLUS version 3.x (x = 4, IIRC) at the time, but then I also think
that I have to take the "full blame" on this :

Trying to think like myself "yesterday, when I was young ..",
I guess the argument for using  is.NA  was that I considered it
helpful to the inexperienced S / R user at the time:
Everybody has seen 'NA' before (and they see it in their objects
in this case) but only somewhat more experienced useRs would
know about is.na(). .. and it may be that at the time I found it
"slick" to combine the "NA" and "is.na" into  "is.NA" ...

About the other wording and how the mismatches should be counted, I
have no recollection.

But indeed, already in 1999, i.e., before R 1.0.0 existed,
that part of the code was

    out <- is.na(target)
    if(any(out != is.na(current)))
        return(paste("`is.NA' value mismatches:", sum(is.na(current)),
                     "in current,", sum(out), " in target"))

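To see what that check reports, here is a minimal sketch that wraps the
fragment above in a function.  The name old_na_check and the final TRUE
for the no-mismatch case are assumptions added for illustration only;
this is not the actual all.equal() source.

```r
# Hypothetical wrapper around the 1999 fragment quoted above.
# `old_na_check` is an illustrative name, not a function in R.
old_na_check <- function(target, current) {
    out <- is.na(target)
    if (any(out != is.na(current)))
        return(paste("`is.NA' value mismatches:", sum(is.na(current)),
                     "in current,", sum(out), " in target"))
    TRUE  # assumed result when the NA patterns agree
}

target  <- c(1, NA, 3, NA)
current <- c(1, 2, NA, NA)

# NA patterns differ at positions 2 and 3, yet each vector has two NAs
msg <- old_na_check(target, current)
print(msg)
# [1] "`is.NA' value mismatches: 2 in current, 2  in target"
```

Note that the two numbers are the NA counts of each argument, not the
number of positions where the NA patterns differ, which is probably
part of why the message reads as confusing.
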
- - - 

Ok, now I need to commit a (completely orthogonal) change to
all.equal.numeric()  which has been lying around with me for
at least a year ... so I can start looking at your proposed
changes ...
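
For reference, the wording proposed further down in this thread
("`NA` found in `target` but not in `current` at position ...") could
be sketched roughly as follows.  The helper name na_mismatch_msg, the
equal-length requirement and the mirrored message for the opposite
case are assumptions of this sketch, not anything that exists in R.

```r
# Hypothetical helper sketching the proposed wording; not part of base R.
na_mismatch_msg <- function(target, current) {
    stopifnot(length(target) == length(current))  # assumption of the sketch
    na_t <- is.na(target)
    na_c <- is.na(current)
    bad <- which(na_t != na_c)        # positions where the NA patterns differ
    if (length(bad) == 0L) return(TRUE)
    i <- bad[1L]                      # report only the first mismatch
    if (na_t[i]) {
        sprintf("`NA` found in `target` but not in `current` at position %d", i)
    } else {
        sprintf("`NA` found in `current` but not in `target` at position %d", i)
    }
}

msg2 <- na_mismatch_msg(c(1, NA, 3, NA), c(1, 2, NA, NA))
print(msg2)
# [1] "`NA` found in `target` but not in `current` at position 2"
```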

Martin


    >> On 2 Mar 2023, at 19:30 , avi.e.gross using gmail.com wrote:
    >> 
    >> I think if you step back, you can ask what the purpose of
    >> an error message is and who designs it.
    >> 
    >> Is the message for the developer or others on their team,
    >> or is it something an end user who knows nothing about R
    >> will see?
    >> 
    >> This reminds me a bit of legal mumbo jumbo that puts
    >> many readers off because it keeps talking about "the
    >> party of the first part" or "the plaintiff" instead of
    >> using somewhat straighter talk.
    >> 
    >> The scenario is that you are comparing two things. Their
    >> names are not things like "target" or "current" so even
    >> other programmers not involved in your code will pause
    >> and wonder.
    >> 
    >> One view is to use phrases like first and second
    >> arguments/lists/whatever.  You might talk about the one
    >> on the left (but using LHS is a bit opaque) versus the
    >> one on the right.
    >> 
    >> But sometimes it can be too verbose. Sometimes the error
    >> message is being generated not where everything is clear.
    >> 
    >> So ideally you could say:
    >> 
    >> WARNING Danger Will Robinson.  Comparing two things for
    >> equality.  Result finds mismatches.  NAs were found
    >> on the (left or right) that were not matched on the other
    >> side.  Number of such found: 2
    >> 
    >> If you had a Systems Engineer write detailed requirements
    >> that included something a bit better than the example and
    >> the programmer was able to supply the data using the
    >> words and guidelines, it might fit some needs but maybe
    >> not satisfy other programmers. But there are human
    >> factors people whose job it is to help choose among
    >> alternatives and although they may not choose well,
    >> letting a programmer come up with whatever they feel like
    >> is generally worse.
    >> 
    >> Yes, in their microcosm centered on a dozen lines of
    >> code, "current" and "target" may have meaning. But are
    >> they the intended user of the product?
    >> 
    >> -----Original Message-----
    >> From: R-devel <r-devel-bounces using r-project.org> On Behalf Of Antoine Fabri
    >> Sent: Thursday, March 2, 2023 12:23 PM
    >> To: peter dalgaard <pdalgd using gmail.com>
    >> Cc: R-devel <r-devel using r-project.org>
    >> Subject: Re: [Rd] confusing all.equal output
    >> 
    >> Good points. I don't mind the terminology, since target
    >> and current are the names of the arguments. Since the
    >> function is already designed to stop at the first failing
    >> check, we might not need to enumerate or count the
    >> mismatches; instead we could have "`NA` found in `target`
    >> but not in `current` at position <FIRST_MISMATCH>"
    >> 
    >> ______________________________________________
    >> R-devel using r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel

    > -- 
    > Peter Dalgaard, Professor, Center for Statistics,
    > Copenhagen Business School Solbjerg Plads 3, 2000
    > Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23
    > Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
