[R] Mixed sorting/ordering of strings acknowledging roman numerals?

Henrik Bengtsson hb at biostat.ucsf.edu
Wed Aug 27 02:24:47 CEST 2014


Hi,

Does anyone know of an implementation/function that sorts strings that
*contain* Roman numerals (I, II, III, IV, V, ...), treating the numerals
as numbers?  In 'gtools' there is mixedsort(), which does this for strings
that contain (decimal) numbers.  I'm looking for a "mixedsortroman()"
function that does the same but with Roman numerals, e.g.

## DECIMAL NUMBERS
> x <- sprintf("chr %d", 12:1)
> x
 [1] "chr 12" "chr 11" "chr 10" "chr 9"  "chr 8"
 [6] "chr 7"  "chr 6"  "chr 5"  "chr 4"  "chr 3"
[11] "chr 2"  "chr 1"

> sort(x)
 [1] "chr 1"  "chr 10" "chr 11" "chr 12" "chr 2"
 [6] "chr 3"  "chr 4"  "chr 5"  "chr 6"  "chr 7"
[11] "chr 8"  "chr 9"

> gtools::mixedsort(x)
 [1] "chr 1"  "chr 2"  "chr 3"  "chr 4"  "chr 5"
 [6] "chr 6"  "chr 7"  "chr 8"  "chr 9"  "chr 10"
[11] "chr 11" "chr 12"


## ROMAN NUMERALS
> y <- sprintf("chr %s", as.roman(12:1))
> y
 [1] "chr XII"  "chr XI"   "chr X"    "chr IX"
 [5] "chr VIII" "chr VII"  "chr VI"   "chr V"
 [9] "chr IV"   "chr III"  "chr II"   "chr I"

> sort(y)
 [1] "chr I"    "chr II"   "chr III"  "chr IV"
 [5] "chr IX"   "chr V"    "chr VI"   "chr VII"
 [9] "chr VIII" "chr X"    "chr XI"   "chr XII"

> mixedsortroman(y)
 [1] "chr I"    "chr II"   "chr III"  "chr IV"
 [5] "chr V"    "chr VI"   "chr VII"  "chr VIII"
 [9] "chr IX"   "chr X"    "chr XI"   "chr XII"

The latter is what I'm looking for.

Before hacking together something myself (e.g. identify Roman numeral
substrings, translate them to decimal numbers, use gtools::mixedsort()
to sort the strings, and then translate the numbers back to Roman
numerals), I'd like to hear if someone has already implemented this or
knows of a package that does this.
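
For reference, here is a minimal sketch of the kind of hack I have in mind,
in base R only.  The names mixedorderroman() and mixedsortroman() are
hypothetical (they do not exist in 'gtools').  It assumes the numerals are
upper-case, space-delimited tokens, and instead of calling
gtools::mixedsort() it zero-pads the translated values so that a plain
string ordering works.  Because the original strings are indexed by the
resulting order, nothing needs to be translated back.

```r
# Hypothetical helper (not part of gtools): order strings so that
# space-delimited, upper-case Roman numeral tokens sort by their value.
mixedorderroman <- function(x) {
  to_key <- function(s) {
    tokens <- strsplit(s, " ", fixed = TRUE)[[1]]
    # Only all-caps tokens made of Roman digits are candidates; this keeps
    # lower-case words such as "mix" from being read as numerals.
    candidate <- grepl("^[IVXLCDM]+$", tokens)
    values <- rep(NA_integer_, length(tokens))
    # as.roman() warns and returns NA for malformed numerals such as "IC".
    values[candidate] <- suppressWarnings(
      as.integer(utils::as.roman(tokens[candidate]))
    )
    ok <- !is.na(values)
    # Zero-pad so that plain string comparison agrees with numeric order.
    tokens[ok] <- sprintf("%04d", values[ok])
    paste(tokens, collapse = " ")
  }
  keys <- vapply(x, to_key, character(1), USE.NAMES = FALSE)
  # method = "radix" compares strings as in the C locale.
  order(keys, method = "radix")
}

mixedsortroman <- function(x) x[mixedorderroman(x)]

y <- sprintf("chr %s", as.roman(12:1))
sorted <- mixedsortroman(y)
print(sorted)
```

For 'y' above this should give the "chr I", ..., "chr XII" ordering I am
after.  Caveats: a stand-alone capital "I" used as a word would also be
read as 1, and any decimal numbers in the strings are still compared as
text; passing the translated keys to gtools::mixedorder() instead of
order() would presumably take care of the latter.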

Thanks,

Henrik


