[R] Format of a data frame

Thomas L Jones, PhD jones3745 at verizon.net
Mon Oct 22 05:02:31 CEST 2007


The goal is to smooth a scatterplot with a generalized additive model (gam) 
that uses a LOESS (locally weighted regression) smooth. There are 156 points, 
so x takes the integer values 1, 2, and so on, up to a maximum of x = 156. The 
y values are random counts with a Poisson distribution, or the next thing to it.

After reading in the data, I was able to generate a model, named mod, as 
follows:

mod <- gam(y~lo(x), family=poisson, x = TRUE)

Next, I want to look at some values of the fitted curve, specifically at 
x = 1, x = 2, and x = 3. Upon looking up predict.gam, I see the following:

Usage

predict.gam(object, newdata, type, dispersion, se.fit = FALSE, na.action, 
terms, ...)

One of the arguments of the function is named newdata. I see:

newdata A data frame containing the values at which predictions are
        requested. [snip] Only those predictors, referred to in the
        right side of the formula, need be present by name in newdata.
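
As a minimal illustration of what that description implies for the model above: the only predictor on the right side of the formula is x, so newdata would need a single column named x, with one row per point at which a prediction is wanted. The object name newdata below is just a convenient choice, not something the function requires.

```r
# One column, named exactly like the predictor in the formula (x),
# and one row for each value at which a prediction is requested.
newdata <- data.frame(x = c(1, 2, 3))

# Inspect the structure: 3 observations of 1 variable.
str(newdata)
```

The response y does not need to appear in newdata, since it is not a predictor.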

I am having difficulty figuring out the format of the data frame. For 
example, how many columns should it have? Should it have a column for the 
three values of x? Probably there is a rather standard format for data 
frames, but I am having trouble looking it up. Perhaps someone would point 
me to the place in the documentation where this is discussed.
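
For concreteness, here is a sketch of the whole sequence, from fit to prediction. It rests on several assumptions: the gam package is installed, the real data are replaced by simulated Poisson counts (the data frame dat and its rate are made up for illustration), and the model is fitted with a data argument rather than from free-standing vectors. The names nd, fit_link, and fit_resp are hypothetical.

```r
library(gam)  # assumption: the gam package is installed

# Simulated stand-in for the real data: 156 points, Poisson counts.
set.seed(1)
dat <- data.frame(x = 1:156)
dat$y <- rpois(156, lambda = exp(1 + 0.01 * dat$x))

# Same model as in the post, but with the variables taken from dat.
mod <- gam(y ~ lo(x), family = poisson, data = dat)

# newdata: one column named x, one row per requested prediction.
nd <- data.frame(x = 1:3)

# Predictions on the link (log) scale and on the response (count) scale.
fit_link <- predict(mod, newdata = nd, type = "link")
fit_resp <- predict(mod, newdata = nd, type = "response")

print(fit_resp)
```

With family = poisson the link is the log, so type = "response" is probably the scale wanted when comparing the fitted curve with the observed counts.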

Tom Jones


