[R] predict: remove columns with new levels automatically

Andreas Wittmann andreas_wittmann at gmx.de
Tue Nov 24 20:24:23 CET 2009


Dear R-users,

In the following thread

http://tolstoy.newcastle.edu.au/R/help/03b/3322.html

the problem discussed is how to remove rows from the data passed to predict 
when they contain factor levels that are not in the model.

Now I am trying to do this the other way round: I want to remove the columns 
(variables) from the model that will later cause problems at prediction time 
because the new data contain unseen levels.

## example:
set.seed(0)
x <- rnorm(9)
y <- x + rnorm(9)

training <- data.frame(x=x, y=y, z=c(rep("A", 3), rep("B", 3), rep("C", 3)))
## draw the test value of x first, then build y from it
x_new <- rnorm(1)
test <- data.frame(x=x_new, y=x_new+rnorm(1), z="D")

lm1 <- lm(x ~ ., data=training)
## prediction fails with an error because the variable z has the new level "D"
## (wrapped in try() so that the rest of the example still runs)
try(predict(lm1, test))

## solution: the variable z is removed from the model
## the prediction happens without using the information of variable z
lm2 <- lm(x ~ y, data=training)
predict(lm2, test)

How can I automatically detect such variables and fit the model without them?
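
Something along these lines is roughly what I am after. This is only a sketch: 
the helper name vars_with_new_levels is made up here (it is not an existing R 
function), it only looks at factor or character columns, and it assumes the 
response is a single column given by name.

```r
## same data as in the example above
set.seed(0)
x <- rnorm(9)
y <- x + rnorm(9)
training <- data.frame(x=x, y=y, z=c(rep("A", 3), rep("B", 3), rep("C", 3)))
x_new <- rnorm(1)
test <- data.frame(x=x_new, y=x_new+rnorm(1), z="D")

## hypothetical helper: names of categorical predictors whose values in
## 'newdata' include at least one level that never occurs in 'train'
vars_with_new_levels <- function(train, newdata, response) {
  preds <- setdiff(intersect(names(train), names(newdata)), response)
  is_cat <- vapply(train[preds],
                   function(v) is.factor(v) || is.character(v),
                   logical(1))
  cat_vars <- preds[is_cat]
  has_new <- vapply(cat_vars, function(v) {
    new_vals <- na.omit(as.character(newdata[[v]]))
    length(setdiff(new_vals, as.character(train[[v]]))) > 0
  }, logical(1))
  cat_vars[has_new]
}

## variables to drop, and the predictors that remain
bad  <- vars_with_new_levels(training, test, response="x")
keep <- setdiff(names(training), c("x", bad))

## refit without the problematic variables (intercept only if none are left)
rhs <- if (length(keep) > 0) keep else "1"
lm3 <- lm(reformulate(rhs, response="x"), data=training)
predict(lm3, test)
```

Here bad is "z", so lm3 is the same model as lm2 above. Note that this drops 
a variable for all test rows as soon as a single row has a new level.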

Thanks

Andreas



