[R] Missing variable in new dataframe for prediction

Gabor Grothendieck ggrothendieck at gmail.com
Tue Feb 13 18:31:27 CET 2007


The call to library(splines), which provides ns(), is missing. Also try
replacing the line b <- ... with

 fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
 b <- do.call("gam", list(fo, data = Mydata))

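to dynamically recreate the formula on each iteration of the loop
with the correct name, x2 or x3, inserted. predict.gam can then find
that variable by name in newd, whereas the term ns(Mydata[,i], 3)
makes it look for a variable called Mydata.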
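
For reference, here is a minimal sketch of the whole loop with that change applied. It is an illustration under stated assumptions, not a tested drop-in for the original script: the data are a simplified version of the ?gam example, the names vars_in_bad_formula and preds are made up for this sketch, and newd is restricted to the range [0, 1] of the simulated predictors.

```r
library(mgcv)
library(splines)  # provides ns(), which mgcv does not load by itself

# Simplified simulated data (assumption: same column order as in the
# original post, so that columns 4 and 5 are x2 and x3).
set.seed(0)
n <- 400
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
y <- 2 * sin(pi * x0) + exp(2 * x1) + rnorm(n, 0, 2)
Mydata <- data.frame(y = y, x0 = x0, x1 = x1, x2 = x2, x3 = x3)
rm(y, x0, x1, x2, x3)

# Why the original term fails: the formula refers to the objects
# Mydata and i, so prediction looks for a variable named Mydata.
vars_in_bad_formula <- all.vars(y ~ s(x0) + s(x1) + ns(Mydata[, i], 3))

# New data containing every column a model in the loop may need.
grid <- (0:99) / 99
newd <- data.frame(x0 = grid, x1 = grid, x2 = grid, x3 = grid)

preds <- list()  # hypothetical container, one prediction vector per model
for (i in 4:5) {
  # Build the formula text with the real column name (x2, then x3).
  fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
  b <- do.call("gam", list(fo, data = Mydata))
  preds[[names(Mydata)[i]]] <- predict(b, newd)
}
```

Because each fitted model now refers to x2 or x3 by name, the same newd can be reused on every iteration; storing the results in a list also avoids overwriting pred each time round the loop.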

On 2/13/07, LE TERTRE Alain <a.letertre at invs.sante.fr> wrote:
> Hi,
> I'm using a loop to evaluate several models by taking adjacent variables from my dataframe.
> When I try to get predictions for new values, I get an error message about a missing variable in my new dataframe.
>
> Below is an example adapted from ?gam in the mgcv package
> library(mgcv)
> set.seed(0)
> n<-400
> sig<-2
> x0 <- runif(n, 0, 1)
> x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1)
> x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
> f3 <- function(x) 0*x
> f <- f0(x0) + f1(x1) + f2(x2)
> e <- rnorm(n, 0, sig)
> y <- f + e
> Mydata<-data.frame(y=y,x0=x0,x1=x1,x2=x2,x3=x3)
> remove(list=c("y","x0","x1","x2","x3"))
>
> # Note below the syntax of the 3rd variable required for my loop
> for (i in 4:5){
>  b<-gam(y~s(x0)+ s(x1)+ ns(Mydata[,i], 3), data=Mydata)
>
> newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,x3=(0:399)/30)
> pred <- predict.gam(b,newd)
> }
> Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
>        invalid type (list) for variable 'Mydata'
> In addition: Warning message:
> not all required variables have been supplied in  newdata!
>  in: predict.gam(b, newd)
>
> #Defining the name for the variable as in the gam function doesn't solve the problem
>  newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,"Mydata[,i]"=(0:399)/30)
>
> Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
>        invalid type (list) for variable 'Mydata'
> In addition: Warning message:
> not all required variables have been supplied in  newdata!
>  in: predict.gam(b, newd)
>
> How should I define my new dataset to be able to get my predictions?
>
> Thanks in advance
>
>
> O__ ---- Alain Le Tertre
>  c/ /'_ --- Institut de Veille Sanitaire (InVS)/ Département Santé Environnement
> (*) \(*) -- Responsable de l'unité Systèmes d'Information & Statistiques
> ~~~~~~~~~~ - 12 rue du val d'Osne
> 94415 Saint Maurice cedex FRANCE
> Voice: 33 1 41 79 68 76 Fax: 33 1 41 79 67 68
> email: a.letertre at invs.sante.fr
>


