[R] Inconsistent linear model calculations

Charilaos Skiadas cskiadas at gmail.com
Thu May 15 23:44:24 CEST 2008


On May 15, 2008, at 5:37 PM, e-letter wrote:

> Below is direct copy from command terminals of both pcs (mandrake 92
> with r 171; mandriva 2008 with r 251, respectively).
>
> R : Copyright 2003, The R Development Core Team
> Version 1.7.1  (2003-06-16)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type `license()' or `licence()' for distribution details.
>
> R is a collaborative project with many contributors.
> Type `contributors()' for more information.
>
> Type `demo()' for some demos, `help()' for on-line help, or
> `help.start()' for a HTML browser interface to help.
> Type `q()' to quit R.
>
>> dubious<-read.table('/path/to/file/dodgy.csv')
>> dubious
>
>   V1  V2    V3    V4    V5       V6     V7
> 1  1 300 39.87 39.85 39.90 39.87333  90000
> 2  2 400 45.16 45.23 45.17 45.18667 160000
> 3  3 500 50.72 51.03 50.90 50.88333 250000
> 4  4 600 56.85 56.80 57.02 56.89000 360000
> 5  5 700 63.01 63.09 63.14 63.08000 490000
> 6  6 800 69.52 69.82 69.63 69.65667 640000

This is exactly why we asked you for a reproducible example and the
full code. The last entry in the V6 column is 69.65667 in this session,
but 66.27667 in the second session and in the file you posted. The last
V4 entry differs in the same way: 69.82 here, 59.68 there. So you are
clearly working with two slightly different dodgy files, and
consequently two slightly different dubious data sets, and lm rightly
produces two slightly different, but accurate, sets of coefficients.
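
A quick way to confirm this is to load both versions and ask R which
cells differ. The following is only a sketch: the names old_txt,
new_txt, dubious_old and dubious_new are made up for illustration, and
the data are pasted inline from the two sessions in your message rather
than read from your actual files. It also assumes an R recent enough
that read.table accepts a text= argument.

```r
# Data as printed in the first (R 1.7.1) session of the quoted message.
old_txt <- "
1 300 39.87 39.85 39.90 39.87333  90000
2 400 45.16 45.23 45.17 45.18667 160000
3 500 50.72 51.03 50.90 50.88333 250000
4 600 56.85 56.80 57.02 56.89000 360000
5 700 63.01 63.09 63.14 63.08000 490000
6 800 69.52 69.82 69.63 69.65667 640000
"

# Data as printed in the second (R 2.5.1) session and in the posted file.
new_txt <- "
1 300 39.87 39.85 39.90 39.87333  90000
2 400 45.16 45.23 45.17 45.18667 160000
3 500 50.72 51.03 50.90 50.88333 250000
4 600 56.85 56.80 57.02 56.89000 360000
5 700 63.01 63.09 63.14 63.08000 490000
6 800 69.52 59.68 69.63 66.27667 640000
"

dubious_old <- read.table(text = old_txt)
dubious_new <- read.table(text = new_txt)

# Row/column positions of every cell that differs between the two versions.
diff_cells <- which(dubious_old != dubious_new, arr.ind = TRUE)
print(diff_cells)

# Check whether V6 is simply the row mean of V3, V4 and V5
# (largest absolute deviation; small values are just rounding).
row_means <- rowMeans(dubious_new[, c("V3", "V4", "V5")])
print(max(abs(row_means - dubious_new$V6)))
```

On these data the only differences should be in row 6, columns V4 and
V6, and V6 looks like nothing more than the row mean of V3, V4 and V5,
so a single changed V4 reading would account for both.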

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

>> lm(V6~V2+V7,data=dubious)
>
> Call:
> lm(formula = V6 ~ V2 + V7, data = dubious)
>
> Coefficients:
> (Intercept)           V2           V7
>   2.553e+01    4.332e-02    1.480e-05
>
>>
>
> R version 2.5.1 (2007-06-27)
> Copyright (C) 2007 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>   Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
>> dubious<-read.table('/path/to/file/dodgy.csv')
>> dubious
>   V1  V2    V3    V4    V5       V6     V7
> 1  1 300 39.87 39.85 39.90 39.87333  90000
> 2  2 400 45.16 45.23 45.17 45.18667 160000
> 3  3 500 50.72 51.03 50.90 50.88333 250000
> 4  4 600 56.85 56.80 57.02 56.89000 360000
> 5  5 700 63.01 63.09 63.14 63.08000 490000
> 6  6 800 69.52 59.68 69.63 66.27667 640000
>> lm(V6~V2+V7,data=dubious)
>
> Call:
> lm(formula = V6 ~ V2 + V7, data = dubious)
>
> Coefficients:
> (Intercept)           V2           V7
>   1.937e+01    7.168e-02   -1.537e-05
>
>>
>
> Below is the csv file itself:
>
> 1 300 39.87 39.85 39.90 39.87333  90000
> 2 400 45.16 45.23 45.17 45.18667 160000
> 3 500 50.72 51.03 50.90 50.88333 250000
> 4 600 56.85 56.80 57.02 56.89000 360000
> 5 700 63.01 63.09 63.14 63.08000 490000
> 6 800 69.52 59.68 69.63 66.27667 640000
>
> Enjoy! :)
>
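
To see that the data, and not the R version, drives the change in
coefficients, both fits can be run in a single session. This is a
sketch under assumptions: new_txt, dubious_new, dubious_old, fit_old
and fit_new are illustrative names, the first data set is rebuilt by
overwriting the two row-6 values that differ in the first session's
printout, and read.table is assumed to accept a text= argument.

```r
# Data as posted in the file at the end of the quoted message.
new_txt <- "
1 300 39.87 39.85 39.90 39.87333  90000
2 400 45.16 45.23 45.17 45.18667 160000
3 500 50.72 51.03 50.90 50.88333 250000
4 600 56.85 56.80 57.02 56.89000 360000
5 700 63.01 63.09 63.14 63.08000 490000
6 800 69.52 59.68 69.63 66.27667 640000
"
dubious_new <- read.table(text = new_txt)

# Rebuild the first session's data: only V4 and V6 in row 6 differ.
dubious_old <- dubious_new
dubious_old[6, c("V4", "V6")] <- c(69.82, 69.65667)

# Same model, same R session, two data sets.
fit_old <- lm(V6 ~ V2 + V7, data = dubious_old)
fit_new <- lm(V6 ~ V2 + V7, data = dubious_new)

# Put the two coefficient vectors side by side for comparison.
print(rbind(first_session = coef(fit_old), second_session = coef(fit_new)))
```

The two rows should agree, to the precision printed, with the
coefficients quoted from the two sessions above, including the sign
change on V7.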


