[R] Adding SORT to UNIQUE

Duncan Murdoch murdoch.duncan at gmail.com
Mon Dec 20 18:51:03 CET 2021


On 20/12/2021 12:32 p.m., Martin Maechler wrote:
>>>>>> Rui Barradas
>>>>>>      on Mon, 20 Dec 2021 17:05:33 +0000 writes:
> 
>      > Hello,
>      > Package stringr has functions str_sort and str_order, both with an
>      > argument 'numeric' that will sort the numbers correctly.
>      > Maybe that's what you are looking for, see the example below.
> 
> 
>      > x <- sample(sprintf("ab%d", 1:20))     # shuffle the vector
>      > stringr::str_sort(x, numeric = TRUE)   # sort considering the numbers
> 
> Again:
> There's really no need to use non-base R here (and in almost all
> such questions about string handling!)
> as Avi Gross' answer shows.

That gives a different sort order:

  stringr::str_sort(x, numeric = TRUE)

gives

   [1] "ab1"  "ab2"  "ab3"  "ab4"  "ab5"  "ab6"  "ab7"  "ab8"  "ab9"  "ab10"
  [11] "ab11" "ab12" "ab13" "ab14" "ab15" "ab16" "ab17" "ab18" "ab19" "ab20"

(with the numbers in order), while sort(x) gives

   [1] "ab1"  "ab10" "ab11" "ab12" "ab13" "ab14" "ab15" "ab16" "ab17" "ab18"
  [11] "ab19" "ab2"  "ab20" "ab3"  "ab4"  "ab5"  "ab6"  "ab7"  "ab8"  "ab9"

with the characters in order.  I don't think the "numeric" option is 
available in base R (though of course you could write a function to do 
it, so there's no *need*, but it's certainly more convenient to use the 
stringr function if that's the order you want).
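For illustration, here is a minimal sketch of what such a base R function might look like. The name sort_numeric_suffix is hypothetical (it is not a base R or stringr function), and the sketch assumes every element is an optional text prefix followed by a run of digits, as in the "ab%d" example above. It is not a general replacement for stringr::str_sort(x, numeric = TRUE), which handles more cases.

```r
# Hypothetical helper, not part of base R or stringr.
# Assumes each element looks like <prefix><digits>, e.g. "ab10".
sort_numeric_suffix <- function(x) {
  # Text before the trailing run of digits
  prefix <- sub("[0-9]+$", "", x)
  # Trailing digits as a number; drop everything up to the last non-digit
  number <- as.numeric(sub("^.*[^0-9]", "", x))
  # Order by prefix first (locale-dependent collation), then by number
  x[order(prefix, number)]
}

x <- sample(sprintf("ab%d", 1:20))   # shuffle the vector
sort_numeric_suffix(x)               # "ab1" "ab2" ... "ab20"

# It combines with unique() in the usual way:
sort_numeric_suffix(unique(c("ab10", "ab2", "ab10")))   # "ab2" "ab10"
```

Elements without trailing digits get NA as their number and end up last within their prefix group, so real data may need more care than this.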

Duncan Murdoch


> 
> 
>      > Hope this helps,
> 
>      > Rui Barradas
> 
> 
>      > At 16:58 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>      >> Hi,
>      >>
>      >>
>      >> I am running a simple set of commands to review entries in data frame
>      >> columns. Here is the working code.
>      >>
>      >> Data <- read.csv("./input/Source.csv", header=T)
>      >> describe(Data)
>      >> summary(Data)
>      >> unique(Data[1])
>      >> unique(Data[2])
>      >> unique(Data[3])
>      >> unique(Data[4])
>      >>
>      >> I would like to sort the unique entries. The data in the various
>      >> columns are not all defined as numbers; some are text. I realize 1 and 10
>      >> will not sort properly, as the column is not defined as a number, but
>      >> I want to see what I have in the columns in sorted order.
>      >>
>      >> QUESTION
>      >> What is the best process to sort unique output, please?
>      >>
>      >>
>      >> Thanks.
> 
>      > ______________________________________________
>      > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>      > https://stat.ethz.ch/mailman/listinfo/r-help
>      > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>      > and provide commented, minimal, self-contained, reproducible code.
> 