[Rd] Use of 'match' in end part of 'levels<-.factor'

Suharto Anggono suharto_anggono at yahoo.com
Thu Nov 29 17:27:48 CET 2012

match(xlevs[x], nlevs)
is equivalent to
match(xlevs, nlevs)[x]
The latter has an advantage: each element of 'xlevs' is matched against 'nlevs' only once. In the former, the same element is matched repeatedly if it is selected multiple times by 'x'.
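
A small check of the equivalence may help here. This is only a sketch: the values of 'xlevs', 'nlevs' and 'x' are made up for illustration, and it assumes 'nlevs' contains no NA (with an NA in 'nlevs', an NA code in 'x' could match in the first form but not in the second).

```r
# Hypothetical data, chosen only for illustration.
xlevs <- c("a", "b", "c", "d")            # old levels
nlevs <- c("d", "a", "b")                 # new levels; "c" has no counterpart
x <- c(1L, 3L, 3L, 2L, 4L, 1L, NA)        # integer codes, with repeats and an NA

# Former: every selected element is matched, repeats included.
former <- match(xlevs[x], nlevs)

# Latter: each level is matched once, then the result is indexed by the codes.
latter <- match(xlevs, nlevs)[x]

identical(former, latter)
```

In this example the first form passes seven strings to match(), the second only four.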

In the end part of the code of the function 'levels<-.factor', there is
y <- match(xlevs[x], nlevs)
It is still there in R 2.15.2. I suggest changing it to
y <- match(xlevs, nlevs)[x]

On the other hand,
match(xlevs[x], nlevs)
is more efficient than
match(xlevs, nlevs)
if xlevs[x] is short compared to xlevs. In 'levels<-.factor', a compromise might be to use something like
y <- if (length(x) <=
match(xlevs[x], nlevs) else
match(xlevs, nlevs)[x]
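
The condition in the snippet above is cut off in the archived message. As a rough sketch of how such a compromise could look, the helper below picks a form by comparing length(x) with length(xlevs). Both the name 'match_codes' and that threshold are assumptions made for illustration, not the wording of the original proposal.

```r
# Hypothetical helper. The threshold length(xlevs) is an assumption,
# because the condition in the archived message is truncated.
match_codes <- function(x, xlevs, nlevs) {
    if (length(x) <= length(xlevs))
        match(xlevs[x], nlevs)      # few codes: match only the selected levels
    else
        match(xlevs, nlevs)[x]      # many codes: match each level once, then index
}

# Made-up example data.
xlevs <- c("a", "b", "c")
nlevs <- c("c", "a")
short_x <- c(3L, 1L)                      # shorter than xlevs
long_x <- c(1L, 2L, 3L, 3L, 2L, 1L)       # longer than xlevs

short_res <- match_codes(short_x, xlevs, nlevs)
long_res <- match_codes(long_x, xlevs, nlevs)
```

Either branch returns the same codes for a given input (assuming no NA in 'nlevs'), so the choice only affects how many elements are passed to match().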
