[R] Why does match() treat NaN's as comparables; Bug or Feature?

Bert Gunter bgunter.4567 at gmail.com
Sun Feb 28 04:06:05 CET 2016


(on list, since others might not have gotten it either).

OK, I get it now. It was I who misunderstood.

But isn't the bug in the **misuse** of match() in ecdf() (by failing
to specify the nomatch argument)? Jeff says comparisons with NaN
should return an unordered result, which they do, AFAICS:

> NaN < 0
[1] NA
> NaN > 0
[1] NA

match() just does its thing:

> match(c(NA,NaN),c(1,2,NA,3,4,NaN,5))
[1] 3 6
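
To make the contrast concrete, here is a small sketch (reusing the x from Jason's post) of how the comparison operators and match() differ on NaN. ?match documents that, for real values, NaN is regarded as matching any other NaN but not NA. The variable names eq, pos and idx are only for illustration.

```r
x <- c(1, 2, 3, NaN, 4, 5)

# Elementwise comparison with NaN is unordered, so every result is NA
eq <- x == NaN
print(eq)

# match() looks up positions in the table, and NaN is treated as
# matching NaN, so this finds the NaN stored at position 4
pos <- match(NaN, x)
print(pos)

# incomparables = NaN excludes NaN from matching; nomatch sets the value
# returned for anything that could not be matched (0 here instead of NA)
idx <- match(x, x, nomatch = 0L, incomparables = NaN)
print(idx)
```

So the two notions of "equal" are simply different, and a caller that wants the unordered behaviour has to ask for it.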

It's up to the caller to use it correctly, which apparently ecdf() fails to do.
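
As a rough sketch of what guarding against this on the caller's side could look like: the wrapper below returns NA wherever the query point is NA or NaN, rather than a cumulative probability. ecdf_na_safe is a made-up name for illustration, not anything in stats, and this is not a claim about how ecdf() itself should be patched.

```r
# Hypothetical helper (not part of stats): build an ECDF whose evaluation
# propagates missing query points instead of assigning them a probability.
ecdf_na_safe <- function(x) {
  # ecdf() sorts x, and sort() drops NA/NaN by default,
  # so the step function is built from the non-missing values only
  Fn <- stats::ecdf(x)
  function(q) {
    out <- rep(NA_real_, length(q))
    ok <- !is.na(q)            # is.na() is TRUE for both NA and NaN
    out[ok] <- Fn(q[ok])       # evaluate only at well-defined points
    out
  }
}

x <- c(1, 2, 3, NaN, 4, 5)
res <- ecdf_na_safe(x)(x)
print(res)
```

With the x from the original post, the NaN entry comes back as NA while the other five values get their usual cumulative proportions.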

Am I missing something here?

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sat, Feb 27, 2016 at 3:49 PM, Jason Thorpe <jdthorpe at gmail.com> wrote:
> The bug is that NaN is not part of any cumulative distribution...
>
> -Jason
>
> On Feb 27, 2016 3:34 PM, "Bert Gunter" <bgunter.4567 at gmail.com> wrote:
>>
>> If I understand you correctly, the "bug" is that you do not understand
>> match(). See inline comment below and note carefully the "Value"
>> section of ?match.
>>
>> Cheers,
>> Bert
>>
>>
>> On Sat, Feb 27, 2016 at 2:52 PM, Jason Thorpe <jdthorpe at gmail.com> wrote:
>> > For some reason `match()` treats `NaN`'s as comparables by default:
>> >
>> >> x <- c(1,2,3,NaN,4,5)
>> >> match(x,x)
>> > [1] 1 2 3 4 5 6
>> >
>> > which I can override when using `match()` directly:
>> >
>> >> match(x,x,incomparables=NaN)
>> > [1]  1  2  3 NA  5  6
>> >
>> > but not necessarily when calling a function that uses `match()`
>> > internally:
>> >
>> >> stats::ecdf(x)(x)
>> > [1] 0.2 0.4 0.6 0.8 0.8 1.0
>> >
>> > Obviously there are workarounds for any given scenario, but the bigger
>> > problem is that this behavior causes difficult-to-discover bugs.  For
>> > example, the behavior of stats::ecdf is definitely a bug introduced by
>> > its use of `match()` (unless you think NaN == 4 is correct).
>>
>> No, you misunderstand. match() returns the POSITION of the match, and
>> clearly NaN in the 4th position of table = x matches NaN in x, e.g.:
>>
>> > match(c(x,NaN),x)
>> [1] 1 2 3 4 5 6 4
>>
>>
>>
>> >
>> > Is there a good reason that NaN's are treated as comparables by match(),
>> > or is this a bug?
>> >
>> > For reference, I'm using R version 3.2.3
>> >
>> > -Jason


