[R] bug in rle?

William Dunlap wdunlap at tibco.com
Wed Jan 8 18:30:18 CET 2014


If you need an rle for factor data (or lists, or anything else for
which match(), unique(), and x[i] behave consistently), try the
following.  It is based on the S+ version of rle, which is written
entirely in S code.

(It does not work on data.frames because unique() is row-oriented
and match() is column-oriented for data.frames.  Even if that were
changed, it would still need x[ends, ] instead of x[ends] in the
closing statement.)

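myRle <- function (x) 
{
    if (length(x) == 0) {
        list(lengths = integer(0L), values = x)
    }
    else {
        # Map each element to the index of its first occurrence in unique(x).
        x.int <- match(x, unique(x))
        # TRUE at the last element of each run.
        ends <- c(diff(x.int) != 0L, TRUE)
        # Run lengths are the gaps between successive run-end positions.
        list(lengths = diff(c(0L, which(ends))), values = x[ends])
    }
}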
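
As a rough usage sketch (not part of the S+ code), here is the function applied to the factor from the original question. myRle is restated compactly so the block runs on its own. The column is wrapped in factor() explicitly, because data.frame() only converts strings to factors by default in R versions before 4.0.0. The names d and r are just illustrative.

```r
# Compact restatement of myRle so this block is self-contained.
myRle <- function(x) {
    if (length(x) == 0) return(list(lengths = integer(0L), values = x))
    x.int <- match(x, unique(x))
    ends <- c(diff(x.int) != 0L, TRUE)
    list(lengths = diff(c(0L, which(ends))), values = x[ends])
}

# The data from the original question, with an explicit factor column.
d <- data.frame(a = factor(rep(letters[1:3], 4:6)))

r <- myRle(d$a)
r$lengths               # run lengths: 4 5 6
is.factor(r$values)     # the values keep their factor class
as.character(r$values)  # one entry per run: "a" "b" "c"

# Round trip: expanding each value by its run length rebuilds the input.
rebuilt <- rep(r$values, r$lengths)
all(as.character(rebuilt) == as.character(d$a))
```

Unlike rle(c(d$a)), which returns the integer codes, the values here remain a factor with its levels intact.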

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Bert Gunter
> Sent: Wednesday, January 08, 2014 8:56 AM
> To: Prof Brian Ripley
> Cc: r-help at r-project.org
> Subject: Re: [R] bug in rle?
> 
> Thank you Brian for your clear and informative answer. I was
> (obviously!) unaware of this and appreciate the response.
> 
> Best,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> H. Gilbert Welch
> 
> 
> 
> 
> On Wed, Jan 8, 2014 at 8:53 AM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> > On 08/01/2014 16:23, Bert Gunter wrote:
> >>
> >> Is the following a bug?
> >> ##(R version 3.0.2 (2013-09-25)
> >> ## Platform: i386-w64-mingw32/i386 (32-bit))
> >>
> >>
> >> d <- data.frame(a=rep(letters[1:3],4:6))
> >>   rle(d$a)
> >> ##Error in rle(d$a) : 'x' must be an atomic vector
> >>
> >> is.atomic(d$a)
> >> ##[1] TRUE
> >
> >
> > But
> >
> >> is.vector(d$a)
> > [1] FALSE
> >
> > The discrepancies in what a 'vector' is in R are very long standing, but a
> > factor is not a vector.
> >
> >
> >> rle(c(d$a))
> >
> >
> > That loses the class and other attributes, giving a vector.
> >
> >> ## Run Length Encoding
> >> ##  lengths: int [1:3] 4 5 6
> >>   ##  values : int [1:3] 1 2 3
> >>
> >> Cheers,
> >> Bert
> >>
> >> Bert Gunter
> >> Genentech Nonclinical Biostatistics
> >> (650) 467-7374
> >
> >
> >
> > --
> > Brian D. Ripley,                  ripley at stats.ox.ac.uk
> > Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> > University of Oxford,             Tel:  +44 1865 272861 (self)
> > 1 South Parks Road,                     +44 1865 272866 (PA)
> > Oxford OX1 3TG, UK                Fax:  +44 1865 272595
> 



