[R] eliminating constant variables

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jul 11 02:11:35 CEST 2010


On Sat, Jul 10, 2010 at 6:28 PM, pdb <philb at philbrierley.com> wrote:
>
> Hi all,
>
> I have a large data set and want to immediately build a 'blind' model
> without first examining the data. Now it appears in the data there are a lot
> of fields that are constant or all missing values - which prevents the model
> from being built.
>
> Can someone point me in the right direction as to how I can automatically purge
> my data file of these useless fields?
>

Try this. It will remove constant columns (such as column b below),
all NA columns (such as column a below) and columns which are constant
aside from NAs (such as column d below).  In this example only column
c should survive:

# test data
DF <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1))
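# sd() no longer accepts a data frame in current R, so apply it per column
sd. <- sapply(DF, sd, na.rm = TRUE)
# keep columns whose standard deviation is defined and positive
DF[!is.na(sd.) & sd. > 0]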
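
The standard-deviation test above only makes sense for numeric columns. If the
data may also contain character or factor fields, one possible variant is to
count distinct non-NA values instead. This is only a sketch: the helper name
drop_constant and the extra test columns e and f are made up for illustration
and are not part of the original suggestion.

```r
# Hypothetical helper (not from the original post): keep only the columns
# that have at least two distinct non-NA values.
drop_constant <- function(df) {
  varies <- vapply(
    df,
    function(x) length(unique(x[!is.na(x)])) > 1,
    logical(1)
  )
  df[varies]
}

# test data: the original columns plus a constant character column (e)
# and a varying character column (f), both added for illustration
DF <- data.frame(a = NA, b = 1, c = 1:5, d = c(NA, NA, 1, 1, 1),
                 e = "x", f = letters[1:5])
names(drop_constant(DF))
```

With this test data only columns c and f should survive. Counting distinct
values avoids calling sd() on non-numeric data, at the cost of a unique() call
per column.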


