[R] please comment on my function

Sam Steingold sds at gnu.org
Fri Sep 14 21:13:41 CEST 2012


> * jim holtman <wubygzna at tznvy.pbz> [2012-09-14 13:10:37 -0400]:
>
> more than half the time is in 'tolower' and 'nchar', so it is not all
> 'sub's problem.

aha, thanks!

> This version runs a little faster since it does not need the 'tolower':
>
> canonicalize.language <- function (s) {
>   # s <- tolower(s)
>   long <- nchar(s) == 5
>   s[long] <- sub("^([[:alpha:]]{2})[-_][[:alpha:]]{2}$","\\1",s[long])
>   s[nchar(s) != 2 & s != "c"] <- "unknown"
>   s
> }

But without the tolower() call it no longer converts "EN" to "en", so it
does not work for my purposes.
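
One possible middle ground, as a rough sketch only and not benchmarked here: keep the tolower() call but run the expensive string functions on the distinct values instead of on every element. This assumes the input is a character vector without NAs in which the same codes repeat many times; if most values are distinct it gains nothing. The name canonicalize.language.uniq is made up for illustration.

```r
canonicalize.language.uniq <- function (s) {
  # Assumption: s is a character vector with no NAs and many repeated values.
  u <- unique(s)        # distinct input strings
  idx <- match(s, u)    # position of each element of s within u
  u <- tolower(u)       # "EN" -> "en", done once per distinct value
  long <- nchar(u) == 5
  # "en_us" or "en-us" -> "en"
  u[long] <- sub("^([[:alpha:]]{2})[-_][[:alpha:]]{2}$", "\\1", u[long])
  # anything that is not a 2-letter code or "c" becomes "unknown"
  u[nchar(u) != 2 & u != "c"] <- "unknown"
  u[idx]                # expand back to the original length and order
}

canonicalize.language.uniq(c("EN", "en_US", "fr-CA", "C", "english"))
# [1] "en"      "en"      "fr"      "c"       "unknown"
```

The result should be the same as with tolower() applied to the whole vector, since every step depends only on the value of each string.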

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://thereligionofpeace.com http://mideasttruth.com
http://iris.org.il http://honestreporting.com http://memri.org
Life is like Tetris: failures accumulate, successes fade.



