[R] recursive relevel

baptiste auguie ba208 at exeter.ac.uk
Fri Jan 9 16:49:51 CET 2009


Thanks Thierry,

A quick test shows almost equivalent timing with the modification of  
relevel() suggested earlier:


> relevel <-
> function (x, ref, ...)
> {
>     lev <- levels(x)
>     if (is.character(ref))
>         ref <- match(ref, lev)
>     if (any(is.na(ref)))
>         stop("'ref' must be an existing level")
>     nlev <- length(lev)
>     if (any(ref < 1 | ref > nlev))
>         stop(gettextf("ref = %d must be in 1:%d", ref, nlev),
>             domain = NA)
>     factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
> }

> > system.time(relevel(y, c("D", "B")))
>    user  system elapsed
>   5.972   0.258   6.395
> >
> > system.time(order.factor3(y, c("D", "B")))
>    user  system elapsed
>   5.962   0.274   6.459
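
For reference, a minimal sketch of how the modified function behaves when several reference levels are given. It is restated under a hypothetical name, relevel_multi, so that base::relevel is not masked, and the range-check message is simplified. The small factor ff is for illustration only.

```r
# Sketch: same logic as the modified relevel() quoted above, under a
# hypothetical name (relevel_multi) so that base::relevel stays untouched.
relevel_multi <- function(x, ref, ...) {
    lev <- levels(x)
    if (is.character(ref))
        ref <- match(ref, lev)          # level names -> positions
    if (any(is.na(ref)))
        stop("'ref' must be an existing level")
    nlev <- length(lev)
    if (any(ref < 1 | ref > nlev))
        stop("all 'ref' indices must be in 1:", nlev)
    # requested levels first, in the order given, then the remaining ones
    factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
}

ff <- factor(c("a", "b", "c", "d"))
levels(relevel_multi(ff, c("c", "d")))   # "c" "d" "a" "b"
```

The data values are unchanged; only the order of the levels differs.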


It's always good to learn other options, though.

Thanks,

baptiste

On 9 Jan 2009, at 15:50, ONKELINX, Thierry wrote:

> Dear Baptiste,
>
> You can avoid the recursion, and it will run about twice as fast.
>
>> order.factor <- function (x, ref)
> +  {
> +  last.index <- length(ref) # convenience for matlab's end keyword
> +  if(last.index == 1) return(relevel(x, ref)) # end case, normal case
> +  my.new.list <- list(x=relevel(x, ref[last.index]), ref=ref[-last.index])
> +  return(do.call(order.factor, my.new.list)) # recursive call
> +  }
>>
>> order.factor2 <- function(x, ref){
> +     factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
> + }
>> order.factor3 <- function(x, ref){
> +     factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])),
> +            labels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
> + }
>>
>> x <- factor(sample(LETTERS[1:5], 10000000, replace = TRUE))
>> y <- factor(sample(LETTERS[1:20], 10000000, replace = TRUE))
>> system.time(order.factor(x, c("D", "B")))
>   user  system elapsed
>   5.69    0.38    6.09
>> system.time(order.factor2(x, c("D", "B")))
>   user  system elapsed
>   3.90    0.20    4.12
>> system.time(order.factor3(x, c("D", "B")))
>   user  system elapsed
>   3.26    0.19    3.46
>> system.time(order.factor(y, c("D", "B")))
>   user  system elapsed
>  17.43    0.39   17.84
>> system.time(order.factor3(y, c("D", "B")))
>   user  system elapsed
>   8.25    0.17    8.46
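
A small check, on much smaller data than in the timings above, that the recursive and non-recursive versions agree. This is only a sketch: order.factor is restated with a direct recursive call instead of do.call(), and x.small and g are hypothetical test factors. It also shows one case where the two can differ: order.factor2 sorts the remaining levels, whereas relevel() keeps them in their existing order.

```r
# Recursive version, condensed (direct recursive call instead of do.call)
order.factor <- function(x, ref) {
    last.index <- length(ref)
    if (last.index == 1) return(relevel(x, ref))
    order.factor(relevel(x, ref[last.index]), ref[-last.index])
}

# Non-recursive version, as quoted above
order.factor2 <- function(x, ref) {
    factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
}

set.seed(1)
x.small <- factor(sample(LETTERS[1:5], 1000, replace = TRUE),
                  levels = LETTERS[1:5])

rec  <- order.factor(x.small, c("D", "B"))
flat <- order.factor2(x.small, c("D", "B"))
levels(rec)                                   # "D" "B" "A" "C" "E"
identical(levels(rec), levels(flat))          # TRUE
all(as.character(rec) == as.character(flat))  # TRUE

# When the existing levels are not in sorted order, the results differ:
g <- factor(c("b", "a", "c"), levels = c("c", "b", "a"))
levels(order.factor(g, "a"))    # "a" "c" "b"  (existing order kept)
levels(order.factor2(g, "a"))   # "a" "b" "c"  (remaining levels sorted)
```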
>
>
> HTH,
>
> Thierry
>
>
> ----------------------------------------------------------------------------
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for  
> Nature and Forest
> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,  
> methodology and quality assurance
> Gaverstraat 4
> 9500 Geraardsbergen
> Belgium
> tel. + 32 54/436 185
> Thierry.Onkelinx at inbo.be
> www.inbo.be
>
> To call in the statistician after the experiment is done may be no  
> more than asking him to perform a post-mortem examination: he may be  
> able to say what the experiment died of.
> ~ Sir Ronald Aylmer Fisher
>
> The plural of anecdote is not data.
> ~ Roger Brinner
>
> The combination of some data and an aching desire for an answer does  
> not ensure that a reasonable answer can be extracted from a given  
> body of data.
> ~ John Tukey
>
> -----Original message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of baptiste auguie
> Sent: Friday 9 January 2009 15:11
> To: R R-help
> Subject: [R] recursive relevel
>
> Dear list,
>
> I'm having second thoughts after solving a very trivial problem: I
> want to extend the relevel() function to reorder an arbitrary number
> of levels of a factor in one go. I could not find a trivial way of
> using the code obtained by getS3method("relevel","factor"). Instead, I
> thought of solving the problem in a recursive manner (possibly after
> reading Paul Graham's essays on Lisp too recently). Here is my attempt:
>
>>
>> order.factor <- function (x, ref)
>> {
>>      last.index <- length(ref) # convenience for matlab's end keyword
>>      # end case, normal case of relevel
>>      if(last.index == 1) return(relevel(x, ref))
>>      # creating a list with updated parameters,
>>      # going through the list in reverse order
>>      my.new.list <- list(x=relevel(x, ref[last.index]),
>>                          ref=ref[-last.index]) # chop the vector from its last level
>>      return(do.call(order.factor, my.new.list)) # recursive call
>> }
>>
>> ff <- factor(c("a", "b", "c", "d"))
>> ff
>> relevel(ff, levels(ff)[1])
>> # that's the usual case: you want to put a level first
>> relevel(ff, levels(ff)[2])
>>
>> order.factor(x=ff, ref=c("a", "b"))
>> order.factor(x=ff, ref=c("c"))
>> # that's my wish: put c and d in that order as the first two levels
>> order.factor(x=ff, ref=c("c", "d"))
>>
>
>
> I'm hoping this can be improved in several aspects:
>
> - there is probably already a better function I missed or overlooked
> (I'd still be curious about the following points, though)
>
> - after reading a few threads, it appears that some recursive
> functions are fragile in some sense, and I'm not sure what this means
> in practice. (Should I use Recall, somehow?)
>
> - it's probably quite slow for large data.frames
>
> - I could not think of a good name, this one might clash with some S3
> method perhaps?
>
> - any other thoughts welcome!
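
On the Recall question above, a minimal sketch of what it could look like. Recall() calls the function that is currently being evaluated, so the recursion does not depend on the name the function is bound to. The second name, move_levels_first, is hypothetical and only there to show the effect of renaming.

```r
order.factor <- function(x, ref) {
    n <- length(ref)
    if (n == 1) return(relevel(x, ref))   # end case: plain relevel()
    # Recall() re-invokes the current function, whatever it is called
    Recall(relevel(x, ref[n]), ref[-n])
}

# Rebind the function to another (hypothetical) name and remove the old one.
# A version that called order.factor() by name would now fail.
move_levels_first <- order.factor
rm(order.factor)

ff <- factor(c("a", "b", "c", "d"))
levels(move_levels_first(ff, c("c", "d")))   # "c" "d" "a" "b"
```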
>
>
> Best wishes,
>
> Baptiste
> _____________________________
>
> Baptiste Auguié
>
> School of Physics
> University of Exeter
> Stocker Road,
> Exeter, Devon,
> EX4 4QL, UK
>
> Phone: +44 1392 264187
>
> http://newton.ex.ac.uk/research/emag
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




