[R] Replacing elements of a list over a certain threshold

William Dunlap wdunlap at tibco.com
Tue Jun 22 20:44:38 CEST 2010


> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
> Sent: Tuesday, June 22, 2010 11:24 AM
> To: johannes at huesing.name
> Cc: r-help at r-project.org
> Subject: Re: [R] Replacing elements of a list over a certain threshold
> 
> 
> On Jun 22, 2010, at 2:14 PM, Johannes Huesing wrote:
> 
> > Jim Hargreaves <james at ipec.co.uk> [Mon, Jun 21, 2010 at 12:34:01PM  
> > CEST]:
> >> Dear List,
> >>
> >> I have a list of length ~1000 filled with numerics. I need to
> >> replace the elements of this list that are above a certain numerical
> >> threshold with the value of the threshold.
> >>
> >> e.g
> >> example=list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)

Is it essential that the dataset 'example' be a 'list'
and not a 'numeric' object (created in this case by
calling 'c' instead of 'list')?

Lists can contain elements of various types, and there
are usually time and memory penalties for allowing that
flexibility.  Numeric (or character or complex or logical)
vectors contain only one type of element; they generally
use less space than the equivalent list, and processing them
generally takes less time.
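
To make the space difference concrete, here is a minimal sketch
(the object names are made up for illustration, and the exact byte
counts depend on the R version and platform, so none are quoted):

```r
# Illustrative only: compare the storage of a numeric vector and a
# list holding the same values, one scalar per element.
x  <- runif(1000)    # numeric vector
xl <- as.list(x)     # equivalent list

size_x  <- as.numeric(object.size(x))
size_xl <- as.numeric(object.size(xl))
print(c(numeric = size_x, list = size_xl))

# A list keeps each element's own type; c() coerces everything
# to a single common type (character here).
mixed_list   <- list(1, "a", TRUE)
mixed_vector <- c(1, "a", TRUE)
print(sapply(mixed_list, class))
print(class(mixed_vector))
```

The list pays a per-element overhead because every element is a
separate R object, which is also why processing it tends to be slower.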

E.g., here are timings for several of the algorithms
that have been suggested on equivalent list and numeric
objects of length 10^5:
  > x.orig <- runif(10^5, min=0, max=10) # numeric object
  > xl.orig <- as.list(x.orig) # list object, one scalar numeric vector per element
  > x <- x.orig ; system.time(x[x>5] <- 5)
     user  system elapsed
    0.000   0.000   0.004
  > x <- x.orig ; system.time(x <- pmin(x, 5))
     user  system elapsed
    0.010   0.000   0.002
  > xl <- xl.orig ; system.time(xl[xl>5] <- 5)
     user  system elapsed
    0.020   0.000   0.013
  > xl <- xl.orig ; system.time(xl <- pmin(xl, 5))
     user  system elapsed
    0.080   0.000   0.084
  > xl <- xl.orig ; system.time(xl <- lapply(xl, min, 5))
     user  system elapsed
    0.130   0.000   0.135

In addition to the time penalty, it just seems unnatural
to use a list to store numbers when a numeric object could
do the job.
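
If the data do arrive as a list, one possible approach (a sketch only,
assuming every element is a single number; 'capped' and 'capped2'
are names invented here) is to convert once and then work on the
numeric vector:

```r
example <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)
threshold <- 5

# Flatten the list to a numeric vector; this assumes each element
# is a length-1 numeric.
x <- unlist(example)

# Element-wise minimum against the threshold
capped <- pmin(x, threshold)

# The same result using logical subscripting
capped2 <- x
capped2[capped2 > threshold] <- threshold

# Convert back only if a list is really required downstream
capped_list <- as.list(capped)
print(capped)
```

Both forms give the same numeric vector; which one reads better is
largely a matter of taste.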

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> >> threshold=5
> >> <magic code goes here>
> >> example=(1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 4, 3, 2, 1).
> >>
> >
> > lapply(example, min, 5)
> 
> Perhaps wrapped in unlist( ) if a vector is desired.
> 
> The same strategy would work with pmin and probably be faster
> (albeit not a big deal if the list is only 1000 elements long):
> 
> unlist( pmin(example, 5) )
> 
> 
> >
> > -- 
> > Johannes Hüsing
> 
> 
> David Winsemius, MD
> West Hartford, CT