[R] Weird behavior of aggregate() function

Bastien.Ferland-Raymond at mffp.gouv.qc.ca
Mon Jan 26 17:30:59 CET 2015


Hello list,

I have found a weird behavior of the aggregate() function when it is used with character vectors. I think the problem has to do with the conversion of characters to factors.

I'm trying to aggregate a character vector using a homemade function.  My function returns all the possible pairs of modalities observed.


Reproducible code:

#######
### my grouping variable
gr <- c("A","A","B","B","C","C","C","D","D","E","E","E")
### my variable
vari <- c("rs2","rs2","mj2","mj1","rs1","rs1","rs2","mj1","mj1","rs1","mj1","mj2")

### what the table would look like
cbind(gr,vari)

###  My function that gives every possible pair of values (my real function can go up to length(TE)==5, but for the sake of the example, I've reduced it here)
faire.paires <- function(TE){
  gg <- rbind(c(TE[1],TE[2]),
              c(TE[1],TE[3]))
  gg <- gg[rowSums(is.na(gg))==0,,drop=FALSE]
  gg
}

###  The function gives exactly what I want when I run it on a specific entry
faire.paires(TE = vari[gr=="B"])

###  But with aggregate(), everything is transformed into integers
res <- aggregate(list(TE = vari), by=list(gr),faire.paires)
res
str(res)

###  it's as if it converts to factors and then loses the key telling me
###  which integer means which modality


###  if I give it factors directly:
res2 <- aggregate(list(TE = as.factor(vari)), by=list(gr),faire.paires)
res2
str(res2)

###  does not fix the problem.
############

Any idea?

I know my function may not be the best or most efficient way to do this. However, I'm still puzzled as to
why aggregate() gives me this weird output.
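
For reference, here is a minimal sketch of one possible workaround. It assumes the integers really are factor codes: rbind() discards the factor class and keeps only the internal codes, and depending on the R version aggregate() may convert the character column to a factor before calling the function. The names faire.paires.chr and pairs.by.group are hypothetical and only serve the illustration; this is not meant as the definitive fix.

```r
# Sketch only: assumes the integer output comes from factor codes.
gr   <- c("A","A","B","B","C","C","C","D","D","E","E","E")
vari <- c("rs2","rs2","mj2","mj1","rs1","rs1","rs2","mj1","mj1","rs1","mj1","mj2")

# rbind() drops the factor class and keeps only the internal integer codes
f <- factor(c("mj2", "mj1"))
codes <- rbind(f[1], f[2])
is.integer(codes)

# Hypothetical variant of faire.paires() that converts to character first,
# so it returns labels whether it receives characters or factors
faire.paires.chr <- function(TE) {
  TE <- as.character(TE)
  gg <- rbind(c(TE[1], TE[2]),
              c(TE[1], TE[3]))
  gg[rowSums(is.na(gg)) == 0, , drop = FALSE]
}

# split() + lapply() applies the function per group without going
# through aggregate()'s data-frame conversion; the result is a list
# with one character matrix per group
pairs.by.group <- lapply(split(vari, gr), faire.paires.chr)
pairs.by.group$B
```

The result is a list rather than a data frame, which may or may not suit the rest of the analysis. The same as.character() conversion could also be tried inside the function passed to aggregate().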

Best regards,

Bastien Ferland-Raymond, M.Sc. Stat., M.Sc. Biol.
Division des orientations et projets spéciaux
Direction des inventaires forestiers
Ministère des Forêts, de la Faune et des Parcs 


