[R] Re-order levels of a categorical (factor) variable

Bert Gunter gunter.berton at gene.com
Thu Jan 22 03:52:12 CET 2015


Bill/Ravi:

I believe the problem is that the factor is automatically created when
a data frame is created by read.table(). By default, the levels are
lexicographically ordered. The following reproduces the problem and
gives a solution.

> library(lattice)

> z <- data.frame(y = 1:9, x = rep(c("pre", "day2", "day10"), 3),
+                 stringsAsFactors = TRUE) ## needed in R >= 4.0.0, where the default is FALSE
> xyplot(y~x,data=z) ## x axis order is day10, day2, pre

> levels(z$x)
[1] "day10" "day2"  "pre"

> z$x <- factor(as.character(z$x), levels = rev(levels(z$x))) ## explicitly defines level order
> xyplot(y~x,data=z) ##  desired plot
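
For the levels in the original question, the same idea can be sketched as follows. This is only an illustration: the data frame d and its values are made up, and stringsAsFactors = TRUE is set explicitly to mimic the automatic conversion described above.

```r
## Hypothetical data mirroring the original question; values are illustrative.
timeCats <- c("Presurgery", "Day 30", "Day 60", "Day 180", "Day 365")
set.seed(1)
d <- data.frame(T = rep(timeCats, each = 4), Y = rnorm(20),
                stringsAsFactors = TRUE)  # mimic automatic factor creation

levels(d$T)  # sorted as strings, not chronologically

## Re-create the factor with the levels listed in the desired order.
d$T <- factor(d$T, levels = timeCats)
levels(d$T)

boxplot(Y ~ T, data = d)  # boxes now follow the order in timeCats
```

Functions that use the level order, such as boxplot(), xyplot() and table(), should then follow this ordering.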

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Wed, Jan 21, 2015 at 4:36 PM, William Dunlap <wdunlap at tibco.com> wrote:
> Are you sure the levels of T are in the order you think they are?  (Are you
> sure you are using the expected version of T?)  Use print(levels(T)) to
> make sure.
>
> I tried
>    timeCats <- c("Presurgery", "Day 30", "Day 60",  "Day 180", "Day 365")
>    d <- data.frame(T = factor(rep(timeCats, 11:15), levels=timeCats),
>       Y=seq_len(sum(11:15)))
>    boxplot(Y ~ T, data=d)
> and the boxes and labels are in the order given in 'timeCats'.
>
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Jan 21, 2015 at 2:37 PM, Ravi Varadhan <ravi.varadhan at jhu.edu>
> wrote:
>
>> Hi,
>> I have a fairly elementary problem that I am unable to figure out.  I have
>> a continuous variable, Y, repeatedly measured at multiple times, T.  The
>> variable T is, however, coded as a factor variable having these levels:
>> c("Presurgery", "Day 30", "Day 60",  "Day 180", "Day 365").
>> When I plot the boxplot, e.g., boxplot(Y ~ T), it displays the boxes in
>> this order:  c("Day 180", "Day 30", "Day 365", "Day 60",  "Presurgery").
>> Is there a way to control the order of the boxes such that they are
>> plotted in a particular order that I want, for example:  c("Presurgery",
>> "Day 30", "Day 60",  "Day 180", "Day 365")?
>>
>> More generally, is there a simple way to redefine the ordering of the
>> categorical variable such that this ordering will be used in whatever
>> operation is done?  I looked at relevel, reorder, etc., but they did not
>> seem to be applicable to my problem.
>>
>> Thanks for any help.
>>
>> Best,
>> Ravi
>>
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>


