[R] Removing constants from a data frame

Martin Maechler maechler at stat.math.ethz.ch
Mon Sep 20 13:19:41 CEST 2004


>>>>> "AndyL" == Liaw, Andy <andy_liaw at merck.com>
>>>>>     on Mon, 20 Sep 2004 06:22:33 -0400 writes:

    >> From: Kjetil Brinchmann Halvorsen
    >> 
    >> David Forrest wrote:
    >> 
    >> >Suppose I have
    >> >
    >> >x<-data.frame(v1=1:4, v2=c(2,4,NA,7), v3=rep(1,4),
    >> >     v4=LETTERS[1:4],v5=rep('Z',4))
    >> >
    >> >or a much larger frame, and I wish to test for and remove the constant
    >> >numeric columns.
    >> >
    >> >I made:
    >> >
    >> >   is.constant<-function(x){identical(min(x),max(x))}
    >> >
    >> >and
    >> >   apply(x,2,is.constant) # Works for numerics
    >> >   x[,-which(apply(x,2,is.constant))]
    >> >
    >> >I'd really like to be able to delete the constant columns without losing
    >> >my non-numerics.  Ignoring the character columns would be OK.
    >> >
    >> >Any suggestions?
    >> >
    >> >Dave
    >> >  
    >> >
    >> what about defining is.constant as
    >> is.constant <-  function(x) {
    >>    if (is.numeric(x))  identical(min(x), max(x)) else  FALSE }

    AndyL> identical() is probably not the safest thing to use:
    >> x <- c(1, 2, NA)
    >> is.constant(x)
    AndyL> [1] TRUE

    AndyL> For data such as c(1, 1, 1, NA), I should think the
    AndyL> safest answer should be NA, because one really
    AndyL> doesn't know whether that last number is 1 or not.

yes.
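
To spell out what happens in Andy's example: min() and max() both
return NA as soon as one NA is present, and identical() of two NAs is
TRUE.  A minimal sketch of a version that gives NA in the uncertain
case; the name is.constant.na is made up here for illustration and is
not from the thread:

```r
## Why the identical() version says TRUE for c(1, 2, NA):
x <- c(1, 2, NA)
naive <- identical(min(x), max(x))   # both min() and max() are NA

## Hypothetical helper: TRUE / FALSE / NA for a numeric vector
is.constant.na <- function(x) {
    obs <- x[!is.na(x)]                      # the values we actually know
    if (length(obs) == 0) return(NA)         # nothing observed at all
    if (min(obs) != max(obs)) return(FALSE)  # already known to vary
    if (any(is.na(x))) NA else TRUE          # an NA could still differ
}

r1 <- is.constant.na(c(1, 1, 1, NA))
r2 <- is.constant.na(c(1, 2, NA))
r3 <- is.constant.na(c(1, 1, 1))
```

Whether NA or FALSE is the more useful answer for c(1, 1, 1, NA)
depends on what you do with the result afterwards; when it is used to
drop columns, an NA has to be handled explicitly.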

Also note that is.numeric() is not what you want for data.frame
columns, since it returns FALSE for factors and you may want to
remove constant factor columns as well.
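
One possible way to cover factors as well, as a sketch only: it
assumes "constant" means exactly one distinct value and no NAs, and
the name is.const.col is invented here for illustration.  It uses
sapply() over the columns, because apply() first coerces the data
frame to a matrix (a character one if any column is non-numeric).

```r
## Hypothetical helper: works for numeric, character and factor columns.
## unique() counts NA as a value of its own, so a column containing
## NAs next to other values is never reported as constant (conservative).
is.const.col <- function(x) length(unique(x)) == 1

x <- data.frame(v1 = 1:4, v2 = c(2, 4, NA, 7), v3 = rep(1, 4),
                v4 = LETTERS[1:4], v5 = rep("Z", 4))

const <- sapply(x, is.const.col)    # one logical per column
x2 <- x[, !const, drop = FALSE]     # keep only the non-constant columns
```

Indexing with the logical vector !const rather than -which(const)
also behaves sensibly when no column is constant: -which() of an
all-FALSE vector is integer(0), which would select no columns at all.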

Martin Maechler



