[R] Trying to understand how to sort a DF on two columns

David Winsemius dw|n@em|u@ @end|ng |rom comc@@t@net
Tue Aug 13 04:43:42 CEST 2019


On 8/12/19 7:20 PM, Sorkin, John wrote:
> I want to sort a DF, temp, on two columns, patid and time. I have searched the internet and found code that I was able to modify to get my data sorted. Unfortunately I don't understand how the code works. I would appreciate it if someone could explain to me how the code works. Among other questions, despite reading, I don't understand how with() works, nor what it does in the current setting.
>
> code:
> data4xsort<-temp[
>    with( temp, order(temp[,"patid"], temp[,"time"])),
> ]
>
> Thank you,
> John


The `order`-function returns an integer vector of the same length as its 
inputs (which must all be the same length; `order` stops with an error 
rather than recycling). The numbers are the positions of the items in 
the order they would appear if sorted smallest to largest. Ties in the 
first argument are broken by the later arguments, and there are further 
arguments (`decreasing`, `na.last`) that control direction and the 
placement of missing values. So when used in an indexing application as 
seen here, it results in the dataframe returned with its rows in 
ascending order based primarily on its first argument, patid, and in 
case of ties on the second argument, time. If you put a minus sign in 
front of a numeric argument, the ordering for that argument is largest 
to smallest.
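
A small sketch of what `order` returns, using made-up data (the values 
and the extra `value` column are invented for illustration; only the 
column names patid and time come from the original post):

```r
# Hypothetical data frame; not the poster's actual data
temp <- data.frame(patid = c(2, 1, 2, 1),
                   time  = c(5, 9, 3, 4),
                   value = c("a", "b", "c", "d"))

# Row positions in sorted order: by patid first, ties broken by time
idx <- order(temp$patid, temp$time)
idx            # 4 2 3 1

# Using those positions as a row index returns the sorted data frame
temp[idx, ]

# A minus sign on a numeric key reverses that key only:
# patid still ascending, time descending within each patid
idx_desc <- order(temp$patid, -temp$time)
idx_desc       # 2 4 1 3
```

The index vector is a permutation of the row numbers, which is why it 
can be dropped straight into the row position of `temp[ , ]`.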


If that is code you are getting from elsewhere, you should realize that 
it is somewhat redundant and you should question the level of R skills 
of its author.  In this code `with` is doing absolutely nothing. The 
`with(...)` is not needed because the arguments already name the 
dataframe explicitly, so there is no ambiguity about where the columns 
come from.  A more compact expression if you were going to use `with` 
would be:

data4xsort<-temp[ with( temp, order(patid, time)), ]
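
To check that the `with` in the original code changes nothing, here is 
a quick comparison on invented data (the values are made up; the result 
names `a`, `b` and `c0` are just for this illustration):

```r
# Hypothetical data; only the column names come from the original post
temp <- data.frame(patid = c(2, 1, 2, 1), time = c(5, 9, 3, 4))

a  <- temp[with(temp, order(temp[, "patid"], temp[, "time"])), ]  # as posted
b  <- temp[with(temp, order(patid, time)), ]                      # compact with()
c0 <- temp[order(temp[, "patid"], temp[, "time"]), ]              # with() removed

identical(a, b)    # TRUE
identical(a, c0)   # TRUE
```

All three give the same sorted dataframe. `with` only earns its keep in 
the second form, where it lets you write bare column names.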

But using `with` carries risks because there is sometimes confusion about which environment it will find the named objects or columns in.

Safer would be not to use it in this situation.
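
One way that confusion can show up, sketched with invented data (the 
stray vector `visit` is hypothetical and is deliberately not a column 
of `temp`):

```r
# Hypothetical data; only the column names patid and time come from the post
temp <- data.frame(patid = c(2, 1, 2, 1), time = c(5, 9, 3, 4))

# A vector that happens to sit in the workspace, NOT a column of temp
visit <- c(1, 2, 3, 4)

# with() looks in temp first, then in the calling environment, so a name
# that is not a column is silently taken from the workspace
idx_with <- with(temp, order(patid, visit))
idx_with       # 2 4 1 3, no warning that 'visit' came from outside temp

# Explicit indexing fails loudly instead of guessing
res <- tryCatch(order(temp[, "patid"], temp[, "visit"]),
                error = function(e) "error")
res            # "error" (undefined columns selected)
```

With explicit indexing a mistyped or missing column name is an error 
you see at once; with `with` it may quietly sort on something else.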

Your headers suggest you are using Outlook. Surely there must be a way to specify a plain text format for outgoing emails. This is a plain text mailing list.

David.

>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
>
> 	[[alternative HTML version deleted]]


