[R] Order of factor levels

William Dunlap wdunlap at tibco.com
Tue Jan 12 02:52:49 CET 2016


I left out the example:

> set.seed(1)
> df <- data.frame(x1 = rpois(1000,4), x2 = rpois(1000,8))
> helper_fun <- function(x) {
+     cut(x, breaks = unique(quantile(x, seq(0, 1, 1/10), na.rm = TRUE)),
+              include.lowest = TRUE)
+ }
> df2 <- data.frame(lapply(df, helper_fun))
> lapply(df2, levels)
$x1
[1] "[0,2]"  "(2,3]"  "(3,4]"  "(4,5]"  "(5,6]"  "(6,7]"  "(7,14]"

$x2
[1] "[1,4]"   "(4,5]"   "(5,6]"   "(6,7]"   "(7,8]"   "(8,9]"   "(9,10]"
[8] "(10,12]" "(12,18]"
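
To make the difference concrete, here is a minimal sketch on made-up data. The names brks, via_lapply, via_vapply and restored are introduced only for this illustration, and fixed breaks stand in for the quantile-based ones above. Note the explicit stringsAsFactors = TRUE: since R 4.0.0, data.frame() no longer converts strings to factors by default, and this thread predates that change.

```r
# Made-up data, chosen so that every interval below is populated.
df <- data.frame(a = c(1, 5, 9, 11, 15), b = c(2, 6, 9, 12, 17))
brks <- c(0, 4, 8, 10, 12, 18)  # hypothetical fixed breaks

# Route 1: lapply() returns a list of factors and data.frame() keeps them
# unchanged, so the levels stay in the order of the breaks.
via_lapply <- data.frame(lapply(df, cut, breaks = brks))
levels(via_lapply$a)

# Route 2: vapply() with a character template returns a character matrix.
# When data.frame() turns its columns back into factors, the levels are
# the sorted distinct strings, so "(10,12]" ends up ahead of "(4,8]".
via_vapply <- data.frame(
  vapply(df, function(x) as.character(cut(x, breaks = brks)),
         character(nrow(df))),
  stringsAsFactors = TRUE
)
levels(via_vapply$a)

# If a character round trip cannot be avoided, supplying the original
# levels explicitly puts the order back.
restored <- factor(as.character(via_lapply$a),
                   levels = levels(via_lapply$a))
identical(levels(restored), levels(via_lapply$a))
```

The first route is the one recommended here; the last step is only a fallback for code that has to pass through character vectors.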


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jan 11, 2016 at 11:34 AM, William Dunlap <wdunlap at tibco.com> wrote:

> Don't use vapply() here. Use lapply() instead and leave cut()'s output
> alone.
>
> vapply() combines its outputs into a character matrix, and data.frame()
> then pulls that matrix apart into its columns, so the level order that
> cut() produced is lost along the way. Skipping the matrix intermediary
> solves lots of issues.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Mon, Jan 11, 2016 at 11:24 AM, Guelman, Leo <leo.guelman at rbc.com>
> wrote:
>
>> Dear list,
>>
>> Is there a better way than the one below to keep the order of factor
>> levels created by cut()? Note that I'm simply pasting letters onto the
>> levels before converting to character, so as to keep the desired order
>> of levels. This is not very elegant. I'm converting to character so that
>> I can call the helper function with vapply() from the main function.
>>
>> Removing the line "levels(xc) <- paste(letters[1:nlevels(xc)],
>> levels(xc), sep=":")" would result in factor levels that are not ordered
>> according to x1.
>>
>> set.seed(1)
>> df <- data.frame(x1 = rnorm(1000), x2 = rnorm(1000))
>>
>> main_fun <- function(data) {
>>   data.frame(vapply(data, helper_fun, character(nrow(data))))
>> }
>>
>> helper_fun <- function(x) {
>>   xc <- cut(x, breaks = unique(quantile(x, seq(0, 1, 1/10), na.rm = TRUE)),
>>             include.lowest = TRUE)
>>   levels(xc) <- paste(letters[1:nlevels(xc)], levels(xc), sep=":")
>>   as.character(xc)
>> }
>>
>>
>> res <- main_fun(df)
>> levels(res$x1)
>>  [1] "a:[-3.01,-1.34]"    "b:(-1.34,-0.882]"   "c:(-0.882,-0.511]"
>>  [4] "d:(-0.511,-0.296]"  "e:(-0.296,-0.0353]" "f:(-0.0353,0.245]"
>>  [7] "g:(0.245,0.536]"    "h:(0.536,0.854]"    "i:(0.854,1.32]"
>> [10] "j:(1.32,3.81]"
>>
>> Thanks
>> Leo.
>>
