[R] Basic question: why does a scatter plot of a variable against itself works like this?

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Wed Nov 6 18:38:22 CET 2013


Interestingly, fitting an LM with x on both sides gives two warnings and
drops x from the RHS, leaving you with just an intercept:

> lm(x~x,data=d)

Call:
lm(formula = x ~ x, data = d)

Coefficients:
(Intercept)
          4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

Yet there's no numerical problem fitting a line through the points when
the predictor is just a copy of x under another name:

> d$xx=d$x
> lm(x~xx,data=d)

Call:
lm(formula = x ~ xx, data = d)

Coefficients:
(Intercept)           xx
  5.128e-16    1.000e+00

It seems to be R saying "Ummm did you really mean to do this? It's kinda dumb".
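
For anyone reproducing this: d is never defined in this message. The
intercept of 4 happens to equal the mean of Tal's x <- c(1,2,9), so the
sketch below assumes d is a one-column data frame built from that vector.
It only checks that the self-formula collapses to an intercept-only fit
while the renamed copy recovers a slope of 1.

```r
# Assumption: d is not shown in the thread; this reconstruction uses
# Tal's vector from the original question.
d <- data.frame(x = c(1, 2, 9))

# x ~ x: the response is dropped from the right-hand side, so only the
# intercept is estimated (the two warnings are silenced here).
fit_same <- suppressWarnings(lm(x ~ x, data = d))

# Same numbers under a different name: an ordinary regression.
d$xx <- d$x
fit_copy <- lm(x ~ xx, data = d)

print(coef(fit_same))  # a single coefficient, the intercept
print(coef(fit_copy))  # intercept near 0, slope near 1
```

With only an intercept left, the fitted value is simply mean(d$x), which
is where the 4 comes from.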

I suppose this could occur if you had a nested loop over all columns
in a data frame, fitting an LM with every pair of columns, and didn't
skip the case i==j.

Except of course it doesn't:

 - fit with two index variables, both set to one:

> i=1;j=1
> lm(d[,i]~d[,j])

Call:
lm(formula = d[, i] ~ d[, j])

Coefficients:
(Intercept)       d[, j]
  5.128e-16    1.000e+00

 - fit with two literal ones:

> lm(d[,1]~d[,1])

Call:
lm(formula = d[, 1] ~ d[, 1])

Coefficients:
(Intercept)
          4

Warning messages:
1: In model.matrix.default(mt, mf, contrasts) :
  the response appeared on the right-hand side and was dropped
2: In model.matrix.default(mt, mf, contrasts) :
  problem with term 1 in model.matrix: no columns are assigned

Obviously this can all be explained in terms of R (or lm's, or
model.matrix's) evaluation schemes, but it seems far from intuitive.
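
A rough way to see the difference, as far as I can tell: terms() compares
the two sides of the formula as unevaluated expressions, so d[, i] and
d[, j] count as two variables even when i and j hold the same value, while
d[, 1] and d[, 1] count as one. The helper n_vars below is a made-up name
for this sketch, not something in R. Nothing here needs d, i or j to exist,
because terms() does not evaluate the variables.

```r
# Hypothetical helper (not part of R): the number of distinct variables
# terms() finds in a formula, counting the response.
# The "variables" attribute is a call like list(x, xx), hence the -1.
n_vars <- function(f) length(attr(terms(f), "variables")) - 1L

print(n_vars(x ~ xx))             # two variables: response and predictor
print(n_vars(x ~ x))              # one variable: the response only
print(n_vars(d[, i] ~ d[, j]))    # two: the expressions differ
print(n_vars(d[, 1] ~ d[, 1]))    # one: the expressions are identical
```

So the warning depends on how the formula is written, not on whether the
two columns hold the same data.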

Barry



On Wed, Nov 6, 2013 at 4:59 PM, William Dunlap <wdunlap at tibco.com> wrote:
> It probably happens because plot(formula) makes one call to terms(formula) to
> analyze the formula.  terms() says there is one variable in the formula,
> the response, so plot(x~x) is the same as plot(seq_along(x), x).
> If you give it plot(~x) , terms() also says there is one variable, but
> no response, so you get the same plot as plot(x, rep(1,length(x))).
> This is also the reason that plot(y1+y2 ~ x1+x2) makes one plot of the sum of y1 and y2
> for each term on the right side instead of 4 plots, plot(x1,y1), plot(x1,y2),plot(x2,y1),
> and plot(x2,y2).
>
> One could write a plot function that called terms separately on the left and
> right sides of the formula.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Tal Galili
>> Sent: Wednesday, November 06, 2013 8:40 AM
>> To: r-help at r-project.org
>> Subject: [R] Basic question: why does a scatter plot of a variable against itself works like
>> this?
>>
>> Hello all,
>>
>> I just noticed the following behavior of plot:
>> x <- c(1,2,9)
>> plot(x ~ x) # this is just like doing:
>> plot(x)
>> # when maybe we would like it to give this:
>> plot(x ~ c(x))
>> # the same as:
>> plot(x ~ I(x))
>>
>> I was wondering if there is some reason for this behavior.
>>
>>
>> Thanks,
>> Tal
>>
>>
>>
>> ----------------Contact
>> Details:-------------------------------------------------------
>> Contact me: Tal.Galili at gmail.com |
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>> ----------------------------------------------------------------------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>


