# [R] Why is merge sorting even when sort = F?

DIGHE, NILESH [AG/2362] nilesh.dighe at monsanto.com
Thu Mar 9 13:46:33 CET 2017

Using the "join" function from the plyr package preserves the data order:
library(plyr)
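
The call itself did not survive in this archived copy. As a minimal sketch, assuming `grades2` is a one-column data frame of grades and `info` is a lookup table keyed on `grade` (both invented here for illustration, since the thread's own definitions are missing), `plyr::join` with `type = "left"` keeps the rows of the left-hand table in their original order. The join only runs if plyr is installed:

```r
# Hypothetical stand-ins for the thread's data; the original definitions were lost.
grades2 <- data.frame(grade = c("C", "A", "B", "A", "C"),
                      stringsAsFactors = FALSE)
info <- data.frame(grade = c("A", "B", "C"),
                   desc  = c("Excellent", "Good", "Poor"),
                   fail  = c(FALSE, FALSE, TRUE),
                   stringsAsFactors = FALSE)

if (requireNamespace("plyr", quietly = TRUE)) {
  # Left join: every row of grades2 is kept, in the order it appears in grades2.
  joined <- plyr::join(grades2, info, by = "grade", type = "left")
  print(joined)
}
```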

Nilesh
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Dimitri Liakhovitski
Sent: Wednesday, March 08, 2017 12:45 PM
To: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
Cc: r-help <r-help at r-project.org>
Subject: Re: [R] Why is merge sorting even when sort = F?

The point is that sort = TRUE that doesn't sort is plain confusing.
Instead, the option should have been something like efficient = TRUE or FALSE. At least then no one would stupidly expect sort = TRUE to sort and sort = FALSE to NOT sort.

On Wed, Mar 8, 2017 at 12:51 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> If you are still wondering, try re-reading my answer. FALSE is more efficient, TRUE is sorted. Lack of sorting has nothing to do with preserving order.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 8, 2017 8:55:06 AM PST, Dimitri Liakhovitski <dimitri.liakhovitski at gmail.com> wrote:
>>Thank you. I was just curious why sort=FALSE had no impact.
>>Wondering what it is there for then...
>>
>>On Wed, Mar 8, 2017 at 11:43 AM, Jeff Newmiller
>><jdnewmil at dcn.davis.ca.us> wrote:
>>> Merging is not necessarily an order-preserving operation, but sorting
>>> can make the operation more efficient. The sort=TRUE argument forces
>>> the result to be sorted, but sort=FALSE is not a promise that order
>>> will be preserved. (I think the imperfect sorting occurs when there
>>> are multiple keys but am not sure.) You can add columns to the input
>>> data that let you restore some semblance of the original ordering
>>> afterward, or you can roll your own possibly-less-efficient merge
>>> using match and indexing:
>>>
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On March 8, 2017 8:07:27 AM PST, Dimitri Liakhovitski
>>> <dimitri.liakhovitski at gmail.com> wrote:
>>>>Hello!
>>>>I have a vector 'grades' and a data frame 'info':
>>>>
>>>>  desc = c("Excellent", "Good", "Poor"),
>>>>  fail = c(F, F, T)
>>>>)
>>>>
>>>>I want to get the info for all grades I have in info:
>>>>
>>>>This solution resorts everything in the order of column 'grade':
>>>>
>>>>Could you please explain why this solution also resorts - despite sort = FALSE?
>>>>merge(grades2, info, by = "grade", all.x = T, all.y = F, sort = FALSE)
>>>>
>>>>Thanks a lot!
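
Jeff's two suggestions above (carry an index column through `merge` and re-sort on it afterward, or build the lookup with `match` and indexing) can be sketched in base R. The data below is hypothetical: only the `desc` and `fail` columns appear in the original post, and the `grade` values and the `row_id` helper column are assumptions made for illustration.

```r
# Hypothetical data; the grade values are invented, desc/fail follow the post.
grades <- c("C", "A", "B", "A", "C")
info <- data.frame(grade = c("A", "B", "C"),
                   desc  = c("Excellent", "Good", "Poor"),
                   fail  = c(FALSE, FALSE, TRUE),
                   stringsAsFactors = FALSE)

# Approach 1: add an index column, merge, then restore the original order.
x <- data.frame(grade = grades, row_id = seq_along(grades),
                stringsAsFactors = FALSE)
merged <- merge(x, info, by = "grade", all.x = TRUE, sort = FALSE)
by_index <- merged[order(merged$row_id), c("grade", "desc", "fail")]
rownames(by_index) <- NULL

# Approach 2: look up each grade's row in info with match() and index by it.
idx <- match(grades, info$grade)  # NA where a grade has no entry in info
by_match <- data.frame(grade = grades,
                       info[idx, c("desc", "fail")],
                       row.names = NULL,
                       stringsAsFactors = FALSE)

print(by_index)
print(by_match)
```

Both results have one row per element of `grades`, in the original order. The `match` version assumes the keys in `info` are unique, since `match` returns only the first hit for each grade.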

--
Dimitri Liakhovitski
