[R] Missing variable in new dataframe for prediction

Gabor Grothendieck ggrothendieck at gmail.com
Tue Feb 13 18:38:24 CET 2007


Actually this simpler replacement for the b <- ... line would work just as well:

fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
b <- gam(fo, data = Mydata)
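
For reference, here is a minimal sketch of how the whole loop might look with that replacement in place. It is not the poster's exact script: the simulated data are simplified, the prediction grid is kept inside [0, 1) and shortened to 30 rows, and the `preds` list is just an illustrative name for collecting the results.

```r
library(mgcv)     # gam(), s(), and the predict method for gam fits
library(splines)  # ns()

set.seed(0)
n  <- 400
x0 <- runif(n); x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)
y  <- 2 * sin(pi * x0) + exp(2 * x1) + rnorm(n, 0, 2)
Mydata <- data.frame(y = y, x0 = x0, x1 = x1, x2 = x2, x3 = x3)
rm(y, x0, x1, x2, x3)

# New data must contain columns named exactly as in the formula.
grid <- (0:29) / 30
newd <- data.frame(x0 = grid, x1 = grid, x2 = grid, x3 = grid)

preds <- list()  # illustrative container, one entry per model
for (i in 4:5) {
  # Insert the column name (x2, then x3) into the formula text.
  fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
  b  <- gam(fo, data = Mydata)
  preds[[names(Mydata)[i]]] <- predict(b, newdata = newd)
}
```

Each element of `preds` then holds the predictions from one model, named after the variable that went into the ns() term.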

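As far as I can tell, the original formula fails at prediction time because it refers to the variables 'Mydata' and 'i' rather than to a column name, so prediction goes looking for something called 'Mydata'. A small base-R sketch of the difference (`f_bad`, `f_good` and `name` are illustrative names only):

```r
# Formula as written in the original loop: nothing is evaluated yet,
# so the variables it refers to are 'Mydata' and 'i', not x2 or x3.
f_bad <- y ~ s(x0) + s(x1) + ns(Mydata[, i], 3)
all.vars(f_bad)
# "y" "x0" "x1" "Mydata" "i"

# Formula built from the column name: it refers to x2 directly,
# which is a name that can be supplied as a column of newdata.
name   <- "x2"
f_good <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", name))
all.vars(f_good)
# "y" "x0" "x1" "x2"
```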

On 2/13/07, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> The call to library(splines) is missing. Also, try replacing the
> line b <- ... with
>
>  fo <- as.formula(sprintf("y ~ s(x0) + s(x1) + ns(%s, 3)", names(Mydata)[i]))
>  b <- do.call("gam", list(fo, data = Mydata))
>
> to dynamically recreate the formula on each iteration of the loop
> with the correct name, x2 or x3, inserted.
>
> On 2/13/07, LE TERTRE Alain <a.letertre at invs.sante.fr> wrote:
> > Hi,
> > I'm using a loop to evaluate several models by taking adjacent variables from my data frame.
> > When I try to get predictions for new values, I get an error message about a missing variable in my new data frame.
> >
> > Below is an example adapted from ?gam in mgcv package
> > library(mgcv)
> > set.seed(0)
> > n<-400
> > sig<-2
> > x0 <- runif(n, 0, 1)
> > x1 <- runif(n, 0, 1)
> > x2 <- runif(n, 0, 1)
> > x3 <- runif(n, 0, 1)
> > f0 <- function(x) 2 * sin(pi * x)
> > f1 <- function(x) exp(2 * x)
> > f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
> > f3 <- function(x) 0*x
> > f <- f0(x0) + f1(x1) + f2(x2)
> > e <- rnorm(n, 0, sig)
> > y <- f + e
> > Mydata<-data.frame(y=y,x0=x0,x1=x1,x2=x2,x3=x3)
> > remove(list=c("y","x0","x1","x2","x3"))
> >
> > # Note below the syntax of the 3rd variable required for my loop
> > for (i in 4:5){
> >  b<-gam(y~s(x0)+ s(x1)+ ns(Mydata[,i], 3), data=Mydata)
> >
> > newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,x3=(0:399)/30)
> > pred <- predict.gam(b,newd)
> > }
> > Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
> >        invalid type (list) for variable 'Mydata'
> > In addition: Warning message:
> > not all required variables have been supplied in  newdata!
> >  in: predict.gam(b, newd)
> >
> > #Defining the name for the variable as in the gam function doesn't solve the problem
> >  newd <- data.frame(x0=(0:399)/30,x1=(0:399)/30,x2=(0:399)/30,"Mydata[,i]"=(0:399)/30)
> >
> > Error in model.frame(formula, rownames, variables, varnames, extras, extranames,  :
> >        invalid type (list) for variable 'Mydata'
> > In addition: Warning message:
> > not all required variables have been supplied in  newdata!
> >  in: predict.gam(b, newd)
> >
> > How should I define my new data frame to get my predictions?
> >
> > Thanks in advance
> >
> >
> > O__ ---- Alain Le Tertre
> >  c/ /'_ --- Institut de Veille Sanitaire (InVS)/ Département Santé Environnement
> > (*) \(*) -- Head of the Information Systems & Statistics unit
> > ~~~~~~~~~~ - 12 rue du val d'Osne
> > 94415 Saint Maurice cedex FRANCE
> > Voice: 33 1 41 79 68 76 Fax: 33 1 41 79 67 68
> > email: a.letertre at invs.sante.fr
> >
>