[Rd] reorder [stats] and reorder.factor [lattice]

Deepayan Sarkar deepayan at stat.wisc.edu
Mon Sep 13 21:54:52 CEST 2004


Before it's too late for R 2.0.0, do we have a final decision yet on 
having a reorder method for "factor" in stats?

Deepayan

On Friday 03 September 2004 11:36, Warnes, Gregory R wrote:
> I also have a reorder.factor defined in the gregmisc package.  It has
> a slightly different behavior.
> It allows
>  - sorting the factor level names via 'mixedsort', which sorts mixed
>         numeric/character intelligently so that 'Cis 10mg' comes
>         after 'Cis 5mg' and before 'Taxol 10mg'
>  - reordering by a numeric order
>  - reordering by named factor labels.
>
> Combining it with the version provided by Deepayan we get:
>
> reorder.factor <- function(x,
>                            order,
>                            X,
>                            FUN,
>                            sort=mixedsort,
>                            make.ordered = is.ordered(x),
>                            ... )
>   {
>     constructor <- if (make.ordered) ordered else factor
>
>     if (!missing(order))
>       {
>         if (is.numeric(order))
>           order = levels(x)[order]
>         else
>           order = order
>       }
>     else if (!missing(FUN))
>       order = names(sort(tapply(X, x, FUN, ...)))
>     else
>       order = sort(levels(x))
>
>     constructor( x, levels=order)
>
>   }
>
> Yielding:
> >    # Create a 4 level example factor
> >    trt <- factor( sample( c("PLACEBO","300 MG", "600 MG", "1200 MG"),
> +                   100, replace=TRUE ) )
>
> >    summary(trt)
>
> 1200 MG  300 MG  600 MG PLACEBO
>      18      24      30      28
>
> >    # Note that the levels are not in a meaningful order.
> >
> >
> >    # Change the order to something useful
> >    # default "mixedsort" ordering
> >    trt2 <- reorder(trt)
> >    summary(trt2)
>
>  300 MG  600 MG 1200 MG PLACEBO
>      24      30      18      28
>
> >    # using indexes:
> >    trt3 <- reorder(trt, c(4,2,3,1))
> >    summary(trt3)
>
> PLACEBO  300 MG  600 MG 1200 MG
>      28      24      30      18
>
> >    # using label names:
> >    trt4 <- reorder(trt, c("PLACEBO","300 MG", "600 MG", "1200 MG") )
> >    summary(trt4)
>
> PLACEBO  300 MG  600 MG 1200 MG
>      28      24      30      18
>
> >    # using frequency
> >    trt5 <- reorder(trt, X=as.numeric(trt), FUN=length)
> >    summary(trt5)
>
> 1200 MG  300 MG PLACEBO  600 MG
>      18      24      28      30
>
> >    # drop out the '300 MG' level
> >    trt6 <- reorder(trt, c("PLACEBO", "600 MG", "1200 MG") )
> >    summary(trt6)
>
> PLACEBO  600 MG 1200 MG    NA's
>      28      30      18      24
>
>
>
> -Greg
>
> (the 'mixedsort' function is available in the gregmisc package, or on
> request)
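
For readers without gregmisc at hand, the "mixed" ordering described above can be approximated in base R. This is only a minimal sketch under simple assumptions: the helper name mixed_sort_sketch is hypothetical, it is not the gregmisc implementation, and it looks only at the text before the first number and at that first number.

```r
# Hypothetical helper, NOT gregmisc's mixedsort: order strings by the text
# that precedes the first number, then numerically by that number.
mixed_sort_sketch <- function(x) {
  # First run of digits in each string; NA when the string has no digits
  num <- suppressWarnings(as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", x)))
  # Text before the first digit (the whole string when there are no digits)
  prefix <- sub("[0-9].*$", "", x)
  # Radix ordering compares the text in the C locale, so the result does
  # not depend on the session's collation settings
  x[order(prefix, num, method = "radix")]
}

doses <- c("Taxol 10mg", "Cis 10mg", "Cis 5mg")
sorted_doses <- mixed_sort_sketch(doses)

trt_levels <- c("1200 MG", "300 MG", "600 MG", "PLACEBO")
sorted_levels <- mixed_sort_sketch(trt_levels)

# Rebuilding a factor with the sorted levels
trt_demo <- factor(c("PLACEBO", "300 MG", "1200 MG", "600 MG"),
                   levels = sorted_levels)
```

Under these assumptions sorted_doses puts 'Cis 5mg' before 'Cis 10mg' and 'Taxol 10mg' last, and sorted_levels puts the numeric doses in increasing order ahead of PLACEBO, matching the default ordering in the session above.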
>
> > -----Original Message-----
> > From: r-devel-bounces at stat.math.ethz.ch
> > [mailto:r-devel-bounces at stat.math.ethz.ch]On Behalf Of Deepayan Sarkar
> > Sent: Friday, August 27, 2004 3:32 PM
> > To: Prof Brian Ripley
> > Cc: r-devel at stat.math.ethz.ch
> > Subject: Re: [Rd] reorder [stats] and reorder.factor [lattice]
> >
> > On Friday 27 August 2004 11:17, Prof Brian Ripley wrote:
> > > On Fri, 27 Aug 2004, Deepayan Sarkar wrote:
> > > > It was recently pointed out on the lists that the S-PLUS Trellis
> > > > suite has a function called reorder.factor that's useful in
> > > > getting a useful ordering of factors for graphs. I happily went
> > > > ahead and implemented it, but it turns out that R (not S-PLUS)
> > > > has a generic called reorder (with a method for "dendrogram").
> > > > Naturally, this causes R to think I'm defining a method for
> > > > "factor", and gives a warning during check because of
> > > > mismatching argument names.
> > > >
> > > > Any suggestions as to what I should do? Retaining S
> > > > compatibility doesn't seem to be an option. I could make a
> > > > reorder method for "factor" (which sounds like a good option to
> > > > me), or rename it to something like reorderFactor.
> > >
> > > I am pretty sure you don't want to copy the Trellis call, which
> > > is
> > >
> > > function(Factor, X, Function = mean, ...)
> > >
> > > and suggests it dates from the days when S3 lookup could not
> > > distinguish functions from other objects by context, hence the
> > > capitalization.  Even then, it is inconsistent with tapply etc
> > > which use FUN.
> > >
> > > reorder.factor <- function(x, X, FUN=mean)
> > >
> > > looks about right.  Another problem though: in Trellis
> > > reorder.factor doesn't just reorder the factor, it makes it an
> > > ordered factor.  I don't really see why, especially as the
> > > modelling functions assume that ordered means equally spaced.  If
> > > this is to be used more generally (as Kjetil Halvorsen suggests)
> > > then it should record the scores used to do the ordering in an
> > > attribute.
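
The signature and the "scores" suggestion above can be illustrated with a short sketch. It is an assumption-laden illustration, not the code that lattice or stats uses: the function name reorder_by_score and the attribute name "scores" are chosen here for the example only.

```r
# Hypothetical function: reorder the levels of x by a per-level summary of X
# and record the summaries in an attribute, as suggested above.
reorder_by_score <- function(x, X, FUN = mean, ...) {
  x <- as.factor(x)
  # One score per level; split() groups X by the levels of x
  scores <- sapply(split(X, x), FUN, ...)
  # factor() keeps x ordered only if it already was (ordered = is.ordered(x)),
  # so a plain factor stays a plain factor
  ans <- factor(x, levels = levels(x)[order(scores)])
  # Keep the scores so later code can see what drove the ordering
  attr(ans, "scores") <- scores
  ans
}

dose <- factor(c("a", "b", "c", "a", "b", "c"))
resp <- c(5, 1, 3, 7, 2, 4)

r <- reorder_by_score(dose, resp)
levels(r)          # levels sorted by mean response
attr(r, "scores")  # per-level means, in the original level order
```

With this toy data the per-level means are a = 6, b = 1.5 and c = 3.5, so the levels come out as b, c, a, and the result is still an unordered factor.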
> >
> > Well, the ordered factor issue had come up before, and I currently
> > define this as
> >
> >
> > which is different from the S-PLUS version in 2 ways:
> >
> > 1. for (unordered) factors, it changes the levels, but keeps it a
> >    factor
> >
> > 2. it works for non-factors as well (which is moot if it's going to
> >    be a factor method)
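
Point 1 matters for modelling, which may be easiest to see from the default contrasts. The sketch below assumes the session still has R's default options("contrasts") setting (treatment contrasts for factors, polynomial contrasts for ordered factors); the level names are made up for the example.

```r
# Same three levels, once as a plain factor and once as an ordered factor
f <- factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))
o <- ordered(f)

# Plain factor: one indicator column per non-baseline level
f_cols <- colnames(contrasts(f))

# Ordered factor: linear and quadratic polynomial terms, which contr.poly
# computes from equally spaced scores (1, 2, 3) unless told otherwise
o_cols <- colnames(contrasts(o))

f_cols
o_cols
```

Under the default options, f_cols holds the level names "mid" and "high", while o_cols holds ".L" and ".Q", so turning a factor into an ordered one silently changes how a model formula treats it.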
> >
> >
> > Implementation details aside, this seems to me like a good candidate
> > for stats (although it'll probably be used very little). lattice
> > functions (will) have an alternative way of determining panel order
> > based on panel contents, which makes more sense to me in the
> > plotting context.
> >
> > Deepayan
> >
> > ______________________________________________
> > R-devel at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>


