[R] Why is merge sorting even when sort = F?

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Wed Mar 8 19:45:23 CET 2017


I understood your answer.
The point is that sort = FALSE that still reorders the rows is plain confusing.
Instead, the option should have been something like efficient = TRUE
or FALSE. At least then no one would stupidly expect sort = TRUE to
sort and sort = FALSE to NOT sort.
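
For anyone finding this thread later, here is a minimal sketch of the two
workarounds suggested in the quoted reply below, using the data from my
original question. The helper column name 'row_id' and the object names
'x', 'merged' and 'looked_up' are my own, chosen only for illustration.
The match() version assumes each grade appears only once in 'info'.

```r
# Data from the original question.
grades2 <- data.frame(grade = c(1, 2, 2, 3, 1))
info <- data.frame(
  grade = 3:1,
  desc = c("Excellent", "Good", "Poor"),
  fail = c(FALSE, FALSE, TRUE)
)

# Option 1: tag each row with its original position, merge, then
# restore that order explicitly instead of relying on sort = FALSE.
# 'row_id' is a hypothetical helper column, not something merge() needs.
x <- grades2
x$row_id <- seq_len(nrow(x))
merged <- merge(x, info, by = "grade", all.x = TRUE, sort = FALSE)
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL

# Option 2: look up the matching row of 'info' for each grade.
# match() returns the first match only, so this assumes info$grade is unique.
looked_up <- info[match(grades2$grade, info$grade), ]
rownames(looked_up) <- NULL
```

Both 'merged' and 'looked_up' should then follow the row order of
'grades2'. Option 1 keeps the usual merge() semantics (all.x and so on)
at the cost of an extra column and an explicit order() call.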

On Wed, Mar 8, 2017 at 12:51 PM, Jeff Newmiller
<jdnewmil at dcn.davis.ca.us> wrote:
> If you are still wondering, try re-reading my answer. FALSE is more efficient, TRUE is sorted. Lack of sorting has nothing to do with preserving order.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 8, 2017 8:55:06 AM PST, Dimitri Liakhovitski <dimitri.liakhovitski at gmail.com> wrote:
>>Thank you. I was just curious why sort=FALSE had no impact.
>>Wondering what it is there for then...
>>
>>On Wed, Mar 8, 2017 at 11:43 AM, Jeff Newmiller
>><jdnewmil at dcn.davis.ca.us> wrote:
>>> Merging is not necessarily an order-preserving operation, but sorting
>>> can make the operation more efficient. The sort=TRUE argument forces
>>> the result to be sorted, but sort=FALSE is not a promise that order
>>> will be preserved. (I think the imperfect sorting occurs when there are
>>> multiple keys but am not sure.) You can add columns to the input data
>>> that let you restore some semblance of the original ordering afterward,
>>> or you can roll your own possibly-less-efficient merge using match and
>>> indexing:
>>>
>>> info[ match( grades2$grade, info$grade ), ]
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On March 8, 2017 8:07:27 AM PST, Dimitri Liakhovitski <dimitri.liakhovitski at gmail.com> wrote:
>>>>Hello!
>>>>I have a data frame 'grades2' and a data frame 'info':
>>>>
>>>>grades2 <- data.frame(grade = c(1,2,2,3,1))
>>>>info <- data.frame(
>>>>  grade = 3:1,
>>>>  desc = c("Excellent", "Good", "Poor"),
>>>>  fail = c(FALSE, FALSE, TRUE)
>>>>)
>>>>
>>>>I want to get the info for all grades I have in grades2:
>>>>
>>>>This solution re-sorts everything in the order of column 'grade':
>>>>merge(grades2, info, by = "grade", all.x = TRUE, all.y = FALSE)
>>>>
>>>>Could you please explain why this solution also re-sorts, despite
>>>>sort = FALSE?
>>>>merge(grades2, info, by = "grade", all.x = TRUE, all.y = FALSE, sort = FALSE)
>>>>
>>>>Thanks a lot!



-- 
Dimitri Liakhovitski


