[R] problem with predict()

ripley at stats.ox.ac.uk
Fri Jun 21 21:41:09 CEST 2002


On Fri, 21 Jun 2002, Czerminski, Ryszard wrote:

> --- first problem
>
> If I store 'simulated' data in data frames:
> # train.data <- data.frame(matrix(rnorm(164*119), nrow = 164))
> # test.data <- data.frame(matrix(rnorm(35*119), nrow = 35))
> it still works the same way, i.e. the code below works fine
> for simulated data and fails for 'real' data, the only
> difference being the actual numeric values stored in data
> structures of the same shape and type.
>
> Any suggestions why this happens?

Yes. You are *still* using a matrix in a data frame.  Please do read more
carefully.
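
A minimal sketch of a layout that avoids the problem, under stated assumptions: the data below are simulated stand-ins (5 predictors rather than 119, to keep it short), and the names p, train, test, fit and pred are illustrative, not taken from the original code. The point is that every predictor is an ordinary named column of the data frame, so predict() can find the same names in newdata.

```r
set.seed(1)

# Hypothetical stand-ins for the poster's data: 164 training rows, 35 test rows.
p <- 5
train <- data.frame(matrix(rnorm(164 * p), nrow = 164))
test  <- data.frame(matrix(rnorm(35 * p),  nrow = 35))
train$y <- rnorm(164)

# The predictors are separate columns X1..X5; there is no matrix-valued column.
print(names(train))

# Regress y on all remaining columns of train.
fit <- lm(y ~ ., data = train)

# newdata has the same column names as the predictors, so one prediction
# is returned per row of test.
pred <- predict(fit, newdata = test)
print(length(pred))
```

If the model is instead fitted with a matrix taken from the workspace as a term, newdata has no column of that name to substitute, and predict() cannot use the test rows as intended.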

> --- second problem
>
> > As Andy Liaw pointed out, xr is a matrix.  Take a look at the names of
> > train.  Hint: they do not contain `x'.
>
> Following your hint, I am guessing that the fact that the names do not
> contain 'x' explains why lm(y~., train) works and lm(y~x, train) fails,
> and that "lm(y~., train)" means roughly: correlate column "y" to all
> other columns

No, it means regress y on all the remaining columns in the data argument.
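
A small sketch of how the dot is expanded, assuming a toy data frame d built from a two-column matrix m (both names are hypothetical). It also shows why the names of such a data frame do not contain `x': the matrix is split into columns x.1 and x.2.

```r
set.seed(2)

# Hypothetical toy data: a response and a 2-column matrix of predictors.
m <- matrix(rnorm(20), nrow = 10)
d <- data.frame(y = rnorm(10), x = m)

# The matrix has been split into separate columns; none is called "x".
print(names(d))

# terms() expands "." against the data argument: every column except y.
dot_terms <- attr(terms(y ~ ., data = d), "term.labels")
print(dot_terms)

# The dot form and the spelled-out form fit the same regression.
fit_dot  <- lm(y ~ ., data = d)
fit_full <- lm(y ~ x.1 + x.2, data = d)
print(all.equal(coef(fit_dot), coef(fit_full)))
```

The help page for formula describes this interpretation of the dot when a data argument is supplied.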

>
> Where can I find a more detailed specification of this syntax?
> In help(lm) I find this paragraph:
>
>      Models for `lm' are specified symbolically.  A typical model has
>      the form `response ~ terms' where `response' is the (numeric)...
>
> which does not quite cover this case.

In any good book on the subject.
