[R] [External] Weird behaviour of order() when having multiple ties

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Tue Feb 1 18:19:47 CET 2022


Stefan,

You are thinking of sorting, and indeed if you sort, you get what you expect:

> sort(c(2,3,4,1,1,1,1,1))
[1] 1 1 1 1 1 2 3 4

BUT order() does not return any of the values. NONE. It returns the indexes of the values in sorted order, that is, the positions to pick from the original vector. You can then, if you choose, use that index to sort one or more vectors.

Imagine you have two independent vectors (not a data.frame) containing names and ages. You want to rearrange the names in alphabetical order but do not want to lose the ages. When done you want the correspondence to remain.

> Names <- c("Marie", "Jacques", "Zelda", "Jean")
> Ages <- c(12, 32, 3, 102)
> (index <- order(Names))
[1] 2 4 1 3
> (sortedNames <- Names[index])
[1] "Jacques" "Jean"    "Marie"   "Zelda"  
> (sortedAges <- Ages[index])
[1]  32 102  12   3

By getting the order in which you should take entries from the original, you can apply that order to both of the vectors, or to anything else linked to them, such as their birthdays. Yes, many people avoid this by simply connecting all the vectors in a data.frame, but under the hood, the code you use to manipulate things will often do something similar to actually make what you want happen.
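
As a minimal sketch of the data.frame route (the name `people` is made up for illustration and is not from the original post), the same index can reorder whole rows at once:

```r
# Hypothetical data frame built from the two vectors above
Names <- c("Marie", "Jacques", "Zelda", "Jean")
Ages <- c(12, 32, 3, 102)
people <- data.frame(Names = Names, Ages = Ages)

# order() gives the row positions in alphabetical order of Names
index <- order(people$Names)

# Indexing the rows with that vector keeps each name next to its age
sortedPeople <- people[index, ]
print(sortedPeople)
```

The row names in the printed result still show the original positions, which is the same index that order() returned.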

In your example, you can get the sorted version like this:

> # Get vector of indices and print
> (index <- order(c(2,3,4,1,1,1,1,1)))
[1] 4 5 6 7 8 1 2 3
> 
> # Use the index on the vector to rearrange the order and print
> (c(2,3,4,1,1,1,1,1)[index])
[1] 1 1 1 1 1 2 3 4
> 
> # Use the index reversed to print in descending order.
> (c(2,3,4,1,1,1,1,1)[rev(index)])
[1] 4 3 2 1 1 1 1 1

And note you can print it forward and backward without calling sort() twice, not that this is important!
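
The same reading applies to the vector from the original post. Here is a small sketch, with `x`, `idx` and `idxDown` as my own names for illustration. The result 5 6 4 3 2 1 says "take element 5 first, then 6, then 4, and so on", and doing exactly that gives the sorted values. With decreasing = TRUE that particular vector is already in decreasing order, so I would expect the index to come out as 1 through 6, which may be why that case looked right.

```r
# The vector from the original post
x <- c(0.6, 0.5, 0.3, 0.2, 0.1, 0.1)

# Positions to take from x, NOT the sorted values themselves
idx <- order(x)
print(idx)

# Applying the index reproduces what sort() returns
print(x[idx])
stopifnot(identical(x[idx], sort(x)))

# Same idea for descending order
idxDown <- order(x, decreasing = TRUE)
print(idxDown)
stopifnot(identical(x[idxDown], sort(x, decreasing = TRUE)))
```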

I hope that clarifies why the name is "order" and not "sort". 


-----Original Message-----
From: Stefan Fleck <stefan.b.fleck using gmail.com>
To: Richard M. Heiberger <rmh using temple.edu>
Cc: r-help using r-project.org <r-help using r-project.org>
Sent: Sun, Jan 30, 2022 3:07 pm
Subject: Re: [R]  [External] Weird behaviour of order() when having multiple ties

it's not about the sort order of the ties, shouldn't all the 1s in
order(c(2,3,4,1,1,1,1,1)) come before 2,3,4? because that's not what
happening

On Sun, Jan 30, 2022 at 9:00 PM Richard M. Heiberger <rmh using temple.edu> wrote:

> when there are ties it doesn't matter which is first.
> in a situation where it does matter, you will need a tiebreaker column.
> ------------------------------
> *From:* R-help <r-help-bounces using r-project.org> on behalf of Stefan Fleck <
> stefan.b.fleck using gmail.com>
> *Sent:* Sunday, January 30, 2022 4:16:44 AM
> *To:* r-help using r-project.org <r-help using r-project.org>
> *Subject:* [External] [R] Weird behaviour of order() when having multiple
> ties
>
> I am experiencing a weird behavior of `order()` for numeric vectors. I
> tested on 3.6.2 and 4.1.2 for windows and R 4.0.2 on ubuntu. Can anyone
> confirm?
>
> order(
>   c(
>     0.6,
>     0.5,
>     0.3,
>     0.2,
>     0.1,
>     0.1
>   )
> )
> ## Result [should be in order]
> [1] 5 6 4 3 2 1
>
> The sort order is obviously wrong. This only occurs if i have multiple
> ties. The problem does _not_ occur for decreasing = TRUE.
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.r-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



