[R] Sorting of character vectors

Rui Barradas ruipbarradas at sapo.pt
Tue Nov 8 14:43:07 CET 2016


Hello,

What does your sessionInfo() show?
For me it works as expected:

 > sort(c("-", "+"))
[1] "-" "+"
 > sort(c("+", "-"))
[1] "-" "+"
 > x5 <- c("+Aa","-Ab")
 > sort(x5)
[1] "-Ab" "+Aa"
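
Since the result depends on the collation locale, it may help to compare against the "C" locale, where strings are compared byte by byte from left to right. This is only a sketch: it switches LC_COLLATE temporarily and restores it afterwards, and x4 is copied from your message.

```r
# x4 as defined in the original question
x4 <- c("-Aa", "+Ab")

# Remember the current collation locale so it can be restored afterwards
old_collate <- Sys.getlocale("LC_COLLATE")

# In the "C" locale strings are compared byte by byte, left to right
Sys.setlocale("LC_COLLATE", "C")
sorted_c <- sort(x4)

# Restore the previous collation locale
Sys.setlocale("LC_COLLATE", old_collate)

print(sorted_c)
# [1] "+Ab" "-Aa"    ("+" is 0x2B, "-" is 0x2D)
```

If this differs from what sort(x4) gives in your usual session, the difference comes from LC_COLLATE.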

 > sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=Portuguese_Portugal.1252  LC_CTYPE=Portuguese_Portugal.1252
[3] LC_MONETARY=Portuguese_Portugal.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Portugal.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
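
If you want a strict left-to-right comparison without touching the locale, one possibility (a sketch, assuming R >= 3.3.0, where sort() and order() accept method = "radix") is the radix method, which is documented to collate character vectors as in the C locale. The vectors are the ones from your message.

```r
# Vectors from the original question
x4 <- c("-Aa", "+Ab")
x5 <- c("+Aa", "-Ab")

# method = "radix" always collates character vectors as in the C locale,
# regardless of the current LC_COLLATE setting
s4 <- sort(x4, method = "radix")
s5 <- sort(x5, method = "radix")
o4 <- order(x4, method = "radix")

print(s4)  # [1] "+Ab" "-Aa"
print(s5)  # [1] "+Aa" "-Ab"
print(o4)  # [1] 2 1
```

The order of "+" and "-" here simply follows their byte values, so there is no multi-level aspect to the comparison.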

Hope this helps,

Rui Barradas


On 08-11-2016 12:18, Pascal A. Niklaus wrote:
> I just got caught by the way in which character vectors are sorted.
>
> It seems that on my machine "sort" (and related functions like "order")
> only considers punctuation characters (at least "+" and "-" here)
> when the remaining characters do not differ:
>
>  > x1 <- c("-A","+A")
>  > x2 <- c("+A","-A")
>  > sort(x1)    # sorting is according to "-" and "+"
> [1] "-A" "+A"
>  > sort(x2)
> [1] "-A" "+A"
>
>  > x3 <- c("-Aa","-Ab")
>  > x4 <- c("-Aa","+Ab")
>  > x5 <- c("+Aa","-Ab")
>  > sort(x3)
> [1] "-Aa" "-Ab" # here the "+" and "-" are ignored
>  > sort(x4)
> [1] "-Aa" "+Ab"
>  > sort(x5)
> [1] "+Aa" "-Ab"
>
> I understand from the help that this depends on how characters are
> collated, and that this scheme follows the multi-level comparison in
> Unicode (http://www.unicode.org/reports/tr10/).
>
> However, what I need is a strict left-to-right comparison of the sort
> provided by strcmp or wcscmp in glibc. The particular ordering of
> special characters is not so important, but there should be no
> "multi-level" aspect to the sorting.
>
> Is there a way to achieve this in R?
>
> Thanks for your help
>
> Pascal
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


