[Rd] internal string comparison (Scollate)

Kevin Ushey kevinushey at gmail.com
Thu Mar 27 20:01:46 CET 2014


I too think it would be useful if R exported some version of its
string sorting routines. Sorting strings in a way that respects the
user's locale and environment, and doing so portably, is not trivial.
R holds a fast, portable, well-tested solution, and I think package
developers would be very appreciative if some portion of it were
exposed at the C level.

If not `Scollate`, other candidates might be the more generic
`sortVector` or the more string-specific (and NA-respecting) `scmp`.
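
To make the request concrete, here is a rough sketch of the kind of
routine a package author currently has to write by hand. It is not
R's `scmp` or `Scollate`: it uses only standard C `strcoll`, a NULL
pointer stands in for `NA_STRING`, sorting NA last is a choice made
for illustration, and the names `na_aware_strcoll`, `cmp_for_qsort`
and `sort_strings` are hypothetical. It also ignores ICU entirely.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of R's API: compare two C strings under
 * the current LC_COLLATE locale. NULL stands in for R's NA_STRING and is
 * ordered after every non-NULL string (an assumption for this sketch). */
static int na_aware_strcoll(const char *a, const char *b)
{
    if (a == NULL && b == NULL) return 0;
    if (a == NULL) return 1;    /* NA sorts after any string */
    if (b == NULL) return -1;   /* any string sorts before NA */
    return strcoll(a, b);       /* locale-aware comparison */
}

/* Adapter for qsort: each element is a `const char *`. */
static int cmp_for_qsort(const void *pa, const void *pb)
{
    const char *a = *(const char *const *)pa;
    const char *b = *(const char *const *)pb;
    return na_aware_strcoll(a, b);
}

/* Sort n strings in place using the named collation locale. If the
 * locale is unavailable, setlocale() fails and the previous collation
 * stays in effect. */
static void sort_strings(const char **x, size_t n, const char *locale)
{
    setlocale(LC_COLLATE, locale);
    qsort(x, n, sizeof *x, cmp_for_qsort);
}
```

Calling `sort_strings(v, n, "")` would pick up the collation locale
from the user's environment. What a sketch like this cannot do is
follow whatever collator R itself is using, which is exactly why an
exported entry point would help.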

I understand that the volunteers at R Core have limited time and
resources, and that exposing an API imposes an additional maintenance
burden on an already thinly stretched team, but this is a situation
where R users and package authors alike could benefit. Or, if there
are other reasons why exporting such routines is neither possible nor
advisable, it would be very informative to know why.

Thanks,
Kevin

On Thu, Mar 27, 2014 at 11:08 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> On 26 March 2014 at 19:09, Romain François wrote:
> | That's one part of the problem. Indeed I'd rather use something rather than
> | copy and paste it and run the risk of being outdated. The answer to that is
>
> We all would. But "they" won't let us by refusing to create more API access points.
>
> | testing though. I can develop a test suite that can let me know I'm out of
>
> Correct.
>
> | date and I need to copy and paste some new code, etc ... Done that before, this
> | is tedious, but so what.
> |
> | The other part of the problem (the real part of the problem actually) is that,
> | at least when R is built with ICU support, Scollate will depend on the
> | collator pointer in util.c
> | https://github.com/wch/r-source/blob/trunk/src/main/util.c#L1777
> |
> | And this can be controlled by the base::icuSetCollate function. Of course the
> | collator pointer is not public.
>
> So the next (and even less pleasant) answer is to build a new package which
> links to (or worse yet, embeds) libicu.
>
> As you want ICU behaviour, you will need ICU code.
>
> Dirk
>
> --
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


