[R] can predict ignore rows with insufficient info

Peter Whiting pete at sprint.net
Wed Sep 17 00:54:56 CEST 2003


On Tue, Sep 16, 2003 at 04:31:29PM -0400, Thomas W Blackwell wrote:
> Peter  -
> 
> Error !!
> I forgot a "not" in the third line inside the function supported().
> And, my mail editor doesn't balance parentheses, so I don't guarantee
> that my code is even syntactically correct.
> 
> Corrected and re-named version of function:
> 
> unsupported <- function(i,y,d)  {
>    result <- rep(F, dim(d)[1])      # default return value when
>    if (is.factor(d[[i]]))           #  d[[i]] is not a factor.
>      result <- !(d[[i]] %in% unique(d[[i]][ !is.na(d[[y]]) ]))
>    result  }
> 
> tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
> tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
> tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
> 
> x <- predict(g, const[ is.na(const$days) & !tmp.3, ])

This still suffers from the fact that the city factor keeps levels
such as "ALBANY" (c3 in the example below) even though they no longer
occur in the subset, so predict() still rejects them as new levels.
It can be fixed by creating yet another tmp variable and re-creating
the factor with factor(), which drops the unused levels.  Kinda painful
with multiple predictors in addition to city, but it is workable.
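
For the multiple-predictor case, something along these lines might take
some of the pain out.  It is only a sketch: the data frame is made up to
match the example below, drop_unused_levels() is a hypothetical helper
name (not a base R function), and the row filter is simplified to check
city only.

```r
# Toy data shaped like the `const` example in this message (an assumption).
const <- data.frame(
  state = factor(c("s1", "s1", "s2", "s2", "s1")),
  city  = factor(c("c1", "c1", "c2", "c2", "c3")),
  days  = c(1, NA, 1, 1, NA)
)

# Keep rows where days is missing but the city was seen with a known days value.
seen_city <- unique(const$city[!is.na(const$days)])
tmp.4 <- const[is.na(const$days) & const$city %in% seen_city, ]

# Hypothetical helper: re-create every factor column so that levels
# which no longer occur in the subset are dropped.
drop_unused_levels <- function(d) {
  is_fac <- vapply(d, is.factor, logical(1))
  d[is_fac] <- lapply(d[is_fac], factor)
  d
}

tmp.4 <- drop_unused_levels(tmp.4)
levels(tmp.4$city)   # c3 (and c2) should no longer be listed
```

The result can then be passed to predict(g, tmp.4) as before.  In recent
versions of R, droplevels(tmp.4) should do the same job for a whole data
frame without a helper.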

> const
  state city days
1    s1   c1    1
2    s1   c1   NA
3    s2   c2    1
4    s2   c2    1
5    s1   c3   NA
> tmp.1 <- lapply(seq(along=const), unsupported, "days", const)
> tmp.2 <- matrix(unlist(tmp.1[ names(const) != "days" ]), nrow=dim(const)[1])
> tmp.3 <- as.logical(as.vector(tmp.2 %*% rep(1, dim(tmp.2)[2])))
> x <- predict(g, const[ is.na(const$days) & !tmp.3, ])
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) c3
> tmp.4 <- subset(const,is.na(const$days) & !tmp.3)
> x <- predict(g, tmp.4)
Error in model.frame.default(object, data, xlev = xlev) :
        factor city has new level(s) c3
> tmp.4$city=factor(tmp.4$city)
> x <- predict(g, tmp.4)
> 

pete



