[R] recursive relevel

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Fri Jan 9 15:50:57 CET 2009


Dear Baptiste,

You can avoid the recursion altogether. In the timings below, the non-recursive versions run roughly 1.5 to 2 times as fast.

>  order.factor <- function (x, ref)
+  {
+  last.index <- length(ref) # convenience for matlab's end keyword
+  if(last.index == 1) return(relevel(x, ref)) # end case, normal case  
+  my.new.list <- list(x=relevel(x, ref[last.index]), ref=ref[-last.index])
+  return(do.call(order.factor, my.new.list)) # recursive call
+  }
> 
> order.factor2 <- function(x, ref){
+     factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
+ }
> order.factor3 <- function(x, ref){
+     factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])), labels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
+ }
> 
> x <- factor(sample(LETTERS[1:5], 10000000, replace = TRUE))
> y <- factor(sample(LETTERS[1:20], 10000000, replace = TRUE))
> system.time(order.factor(x, c("D", "B")))
   user  system elapsed 
   5.69    0.38    6.09 
> system.time(order.factor2(x, c("D", "B")))
   user  system elapsed 
   3.90    0.20    4.12 
> system.time(order.factor3(x, c("D", "B")))
   user  system elapsed 
   3.26    0.19    3.46 
> system.time(order.factor(y, c("D", "B")))
   user  system elapsed 
  17.43    0.39   17.84 
> system.time(order.factor3(y, c("D", "B")))
   user  system elapsed 
   8.25    0.17    8.46 
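
The two non-recursive functions above sort the levels that are not in ref, whereas relevel() leaves them in their existing order. The following is only a sketch of a possible variant that keeps the existing order and checks ref before rebuilding the factor. The name order.factor4 and the validation step are assumptions made for illustration; this variant was not part of the timings above.

```r
# Hypothetical variant; 'order.factor4' is an illustrative name only.
order.factor4 <- function(x, ref) {
    lev <- levels(x)
    ref <- unique(ref)  # duplicated levels would make factor() fail
    # stop early if 'ref' names something that is not a level of x
    missing.ref <- setdiff(ref, lev)
    if (length(missing.ref) > 0) {
        stop("'ref' contains values that are not levels of 'x': ",
             paste(missing.ref, collapse = ", "))
    }
    # 'ref' first, then the remaining levels in their current order
    factor(x, levels = c(ref, setdiff(lev, ref)))
}

ff <- factor(c("a", "b", "c", "d"))
levels(order.factor4(ff, c("c", "d")))
# [1] "c" "d" "a" "b"
```

setdiff() keeps the order of its first argument, so the untouched levels stay as they were. For a factor whose levels are already in alphabetical order, the result should match that of order.factor2.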


HTH,

Thierry 


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of baptiste auguie
Sent: Friday 9 January 2009 15:11
To: R R-help
Subject: [R] recursive relevel

Dear list,

I'm having second thoughts after solving a very trivial problem: I
want to extend the relevel() function to reorder an arbitrary number
of levels of a factor in one go. I could not find a trivial way of
using the code obtained by getS3method("relevel","factor"). Instead, I
thought of solving the problem in a recursive manner (possibly after
reading Paul Graham's essays on Lisp too recently). Here is my attempt:

>
> order.factor <- function (x, ref)
> {
>     last.index <- length(ref) # convenience for matlab's end keyword
>     if(last.index == 1) return(relevel(x, ref)) # end case, normal case of relevel
>     # creating a list with updated parameters,
>     # going through the list in reverse order
>     my.new.list <- list(x=relevel(x, ref[last.index]),
>                         ref=ref[-last.index]) # chop the vector from its last level
>     return(do.call(order.factor, my.new.list)) # recursive call
> }
>
> ff <- factor(c("a", "b", "c", "d"))
> ff
> relevel(ff, levels(ff)[1])
> relevel(ff, levels(ff)[2]) # that's the usual case: you want to put a level first
>
> order.factor(x=ff, ref=c("a", "b"))
> order.factor(x=ff, ref=c("c"))
> order.factor(x=ff, ref=c("c", "d")) # that's my wish: put c and d in that order as the first two levels
>


I'm hoping this can be improved in several aspects:

- there is probably already a better function I missed or overlooked  
(I'd still be curious about the following points, though)

- after reading a few threads, it appears that some recursive  
functions are fragile in some sense, and I'm not sure what this means  
in practice. (Should I use Recall, somehow?)

- it's probably quite slow for large data.frames

- I could not think of a good name, this one might clash with some S3  
method perhaps?

- any other thoughts welcome!
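
On the Recall question in the list above, a minimal sketch of how the same recursion might look with base R's Recall(), which stands in for the name of the function it is called from. The name order.factor.recall is hypothetical and used only for illustration. The logic is otherwise the same as in order.factor.

```r
# Hypothetical rewrite of order.factor using Recall() instead of
# do.call() on the function's own name.
order.factor.recall <- function(x, ref) {
    last.index <- length(ref)
    # end case: a single reference level is plain relevel()
    if (last.index == 1) return(relevel(x, ref))
    # move the last reference level to the front, then recurse on the rest
    Recall(relevel(x, ref[last.index]), ref[-last.index])
}

ff <- factor(c("a", "b", "c", "d"))
levels(order.factor.recall(ff, c("c", "d")))
# [1] "c" "d" "a" "b"

# The recursion does not depend on the name the function is bound to:
renamed <- order.factor.recall
rm(order.factor.recall)
levels(renamed(ff, c("c", "d")))
# [1] "c" "d" "a" "b"
```

Without Recall(), the renamed copy would fail here, because the body would still look up order.factor.recall by name. Note that the documentation for Recall() warns that it does not work when the function is passed as an argument, for example to the apply family.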


Best wishes,

Baptiste
_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag





