[R] Automatic routine - help

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 23 12:42:06 CEST 2004


On Wed, 23 Jun 2004, Barry Rowlingson wrote:

> Uwe Ligges wrote:
> > Monica Palaseanu-Lovejoy wrote:
> > 
> >> Hi,
> >>
> >> I would like to write a little automatic routine in R, but I am too 
> >> much of a beginner for that. I would appreciate any help with this 
> >> particular problem.
> 
> > If all columns of your data.frame are numeric:
> > 
> > z[z<0] <- 0
> > z[z>1] <- 1
> > 
> 
>   For added fun, you can wrap any of the methods given on the list into 
> a function. For example:
> 
>   hardLimit <- function(z, min=0, max=1){
>     z[z < min] <- min
>     z[z > max] <- max
>     return(z)
>   }
> 
>   Then you can do:
> 
>   z <- hardLimit(z)
> 
>   if you want to overwrite z, or:
> 
>   y <- hardLimit(z)
> 
>   to create a new data frame.
> 
>   Note how the default min and max arguments are 0 and 1, and make the 
> function more flexible. You can also do:
> 
>   x <- hardLimit(z, min=-1)
> 
>   and that sets everything below -1 to the value -1.
> 
> Welcome to the world of R development!

First off, if you do start programming, you need to program the caveats 
stated in the comments too.  So as Uwe said

> > If all columns of your data.frame are numeric:

you need (untested)

hardLimit <- function(z, min=0, max=1)
{
    if(!(is.numeric(z) || all(sapply(z, is.numeric)) )) 
        stop("z is not a numeric vector, array or data frame")
    z[z < min] <- min
    z[z > max] <- max
    z
}

since the code will also work for numeric vectors and arrays.  You then 
also need to check that min < max, since otherwise the order of the two 
assignments matters ....
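
A minimal sketch of that extra check, in the same spirit as the code above.
The name hardLimit2 and the wording of the error messages are only
illustrative, and refusing min > max outright is one possible choice, not
the only one:

```r
# Illustrative variant of hardLimit() with an added bounds check.
# hardLimit2 is a made-up name, not part of any package.
hardLimit2 <- function(z, min = 0, max = 1)
{
    if(!(is.numeric(z) || all(sapply(z, is.numeric))))
        stop("z is not a numeric vector, array or data frame")
    # With min > max the result would depend on the order of the two
    # assignments below, so refuse that case.
    if(min > max) stop("'min' must not exceed 'max'")
    z[z < min] <- min
    z[z > max] <- max
    z
}

hardLimit2(c(-0.5, 0.3, 1.7))   # gives 0.0 0.3 1.0
```

Calling hardLimit2(z, min=2, max=1) then stops with an error instead of
silently returning an order-dependent answer.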


However, if you want to do this at all efficiently for a data frame, start
with my solution rather than Uwe's.  His creates several arrays the size of
the one you started with (two for the logical values, and one for the
intermediate answer) and does a for loop over columns internally at least
four times.

When operating on data frames it is usually best to work column by column, 
hence the

newDF[] <- lapply(DF, some_function_for_one_column)

paradigm.  (Changing just the values keeps all the attributes such as row 
names and col names.)
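
As an illustration of that paradigm applied to the clipping problem, here is
a small sketch.  The helper name clampColumn, its lo/hi arguments and the
example data are all made up for this example, and the use of pmin/pmax is
just one way to write the per-column function; it is not necessarily the
solution referred to above.  Note that newDF has to exist before the
assignment, so it is first created as a copy of DF:

```r
# Clamp one numeric column to [lo, hi]; clampColumn is a made-up helper.
clampColumn <- function(x, lo = 0, hi = 1) pmin(pmax(x, lo), hi)

# Made-up example data with explicit row names.
DF <- data.frame(a = c(-0.5, 0.3, 1.7), b = c(2, 0.5, -1),
                 row.names = c("r1", "r2", "r3"))

newDF <- DF                          # copy, so DF itself is left alone
newDF[] <- lapply(DF, clampColumn)   # replace the values column by column
```

Here newDF$a is 0.0 0.3 1.0 and newDF$b is 1.0 0.5 0.0, with the row names
r1, r2, r3 and the column names a, b carried over unchanged.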


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



