# [R] Trying to understand how to sort a DF on two columns

Stephen Ellison
Wed Aug 14 16:10:32 CEST 2019

```
> I want to sort a DF, temp, on two columns, patid and time. I have searched
> the internet and found code that I was able to modify to get my data sorted.
> Unfortunately I don't understand how the code works. I would appreciate it
> if someone could explain to me how the code works. Among other
> questions, despite reading, I don't understand how with() works, nor what it
> does in the current setting.
>
> code:
data4xsort<-temp[
with( temp, order(temp[,"patid"], temp[,"time"])),
]

With apologies for brevity-induced brusqueness:

1) You don't need 'with' in the code. You could say
data4xsort<- temp[order(temp[,"patid"], temp[,"time"]), ]
or
data4xsort<- temp[order(temp$patid, temp$time), ]

2) If you _did_ use 'with', you could say
data4xsort<- temp[with(temp, order(patid,time)), ]

Basically, 'with(x, ...)' says 'look in x first for anything named in the "..." part'.
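
# A minimal sketch of that lookup rule, assuming a made-up data frame 'x'
# (the names x, p and q are only for illustration; they are not from your data):
x <- data.frame(p = c(10, 20, 30), q = c(1, 2, 3))
with(x, p + q)    # 'p' and 'q' are found as columns of x
x$p + x$q         # the same calculation, spelled out without 'with'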

3. order. order is a bit of a mindbender. It gives you the numeric indices you need to convert an unsorted object into a sorted object.
If we said
a <- c(2,3,1)
order(a)
by default, we get back
# [1] 3 1 2

These are indexes into a that put the elements of a in ascending order. a[3] is 1, a[1] is 2 and so on.
So if we say
oo <- order(a)
a[oo]

we get
[1] 1 2 3
... which is a, in ascending order. And to do that, we used oo as indexes in a.
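
# As a cross-check (a small sketch, using the same 'a' as above):
# indexing a by order(a) should give the same thing as sort(a).
a <- c(2,3,1)
identical(a[order(a)], sort(a))
# [1] TRUE
# and order(a, decreasing=TRUE) gives the indices for descending order instead
a[order(a, decreasing=TRUE)]
# [1] 3 2 1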

4. For a data frame, you generally want to sort rows into a particular order. So let's say we have a data frame like
d <- data.frame(a=c(2,3,1,3,1,2), b=c(1,2,2,1,1,2))
d
  a b
1 2 1
2 3 2
3 1 2
4 3 1
5 1 1
6 2 2

We can say
oo.d <- with(d, order(a, b)) #which says 'look in d to find a and b'
#We could also have said oo.d <- order(d$a, d$b)

This gives us the row numbers of d, arranged to give us the row ordering we asked 'order' to generate.
Now, if we say
d[oo.d, ]     #where we need the empty second index so that the first is treated as a row index
# we get d, with rows sorted by a first and then b:
  a b
5 1 1
3 1 2
1 2 1
6 2 2
4 3 1
2 3 2

#You might notice that the default row numbers from d - the left hand column above - are now identical to oo.d;
# this is particular to default row numbers, though.
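
# A small sketch of that caveat, assuming a copy of d with letters as row names
# (d2 and oo.d2 are made-up names, just for illustration):
d <- data.frame(a=c(2,3,1,3,1,2), b=c(1,2,2,1,1,2))
d2 <- d
rownames(d2) <- letters[1:6]
oo.d2 <- with(d2, order(a, b))
oo.d2                   # still the numeric row positions: 5 3 1 6 4 2
rownames(d2[oo.d2, ])   # but the row names are now "e" "c" "a" "f" "d" "b"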

5. If you want to pack that into one line without assigning the ordering to oo.d, it goes (for example)
d[ with(d, order(a, b)), ]

... which is pretty much what your code is doing.

The only thing I've missed is that when you wrap something like
order(temp[,"patid"], temp[,"time"])
in 'with', 'with' is not doing anything useful for you.
temp[,"patid"] has already told R where to look for patid,
so R doesn’t need to look anywhere else.
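
# If you want to check that for yourself, here is a sketch with a small
# made-up 'temp' (the values are invented; only the column names come from your post):
temp <- data.frame(patid = c(2, 1, 2, 1), time = c(1, 2, 0, 1))
s1 <- temp[with(temp, order(temp[,"patid"], temp[,"time"])), ]   # your original form
s2 <- temp[order(temp[,"patid"], temp[,"time"]), ]               # without 'with'
s3 <- temp[with(temp, order(patid, time)), ]                     # letting 'with' do the lookup
identical(s1, s2) && identical(s2, s3)   # all three give the same sorted data frame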

Does that help?

Steve Ellison
