[R] glm expand model to more values

Jarek Jasiewicz jarekj at amu.edu.pl
Sun Jan 13 18:27:06 CET 2008


Henric Nilsson (Public) wrote:
> Jarek Jasiewicz wrote:
>> Charles Annis, P.E. wrote:
>>> Jarek:
>>>
>>> Although it is not universally agreed on, I believe the first step in any
>>> data analysis is to PLOT YOUR DATA.
>>>
>>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3,  5,  6,  7,  9, 10))
>>> plot(b ~ a, data=dd)
>>> simple.model <- lm(b~a,data=dd)
>>> abline(simple.model)
>>>
>>> Why do you think you need a cubic model to describe 6 observations?
>>>
>>> Your model is overparameterized - it has two more parameters than the number
>>> of observations can reasonably justify, something that would be obvious from
>>> your plot.
>>>
>>> The summary of the simple linear model shows that both the intercept and the
>>> slope are statistically significant.  (That's what the asterisks mean.)
>>>
>>> Call:
>>> lm(formula = b ~ a, data = dd)
>>>
>>> Residuals:
>>>        1        2        3        4        5        6
>>> -0.23810  0.39048  0.01905 -0.35238  0.27619 -0.09524
>>>
>>> Coefficients:
>>>             Estimate Std. Error t value Pr(>|t|)
>>> (Intercept)  1.86667    0.30132   6.195  0.00345 **
>>> a            1.37143    0.07737  17.725 5.95e-05 ***
>>> ---
>>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>
>>> Residual standard error: 0.3237 on 4 degrees of freedom
>>> Multiple R-Squared: 0.9874,     Adjusted R-squared: 0.9843
>>> F-statistic: 314.2 on 1 and 4 DF,  p-value: 5.952e-05
>>>
>>> I think you should invest a small amount of your time, and an even smaller
>>> amount of your money to purchase and read - cover-to-cover - one of the
>>> several very good books on elementary statistics and R.  My recommendation
>>> is _Introductory Statistics with R_ by Peter Dalgaard (Paperback - Jan 9,
>>> 2004).  Amazon.com carries it.
>>>
>>> Best wishes.
>>>
>>>
>>>
>>> Charles Annis, P.E.
>>>
>>> Charles.Annis at StatisticalEngineering.com
>>> phone: 561-352-9699
>>> eFax:  614-455-3265
>>> http://www.StatisticalEngineering.com
>>>  
>>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Jarek Jasiewicz
>>> Sent: Saturday, January 12, 2008 2:06 PM
>>> To: Charles.Annis at statisticalengineering.com
>>> Cc: R-help at r-project.org
>>> Subject: Re: [R] glm expand model to more values
>>>
>>> Charles Annis, P.E. wrote:
>>>  
>>>> How many parameters are you trying to estimate?  How many observations do
>>>> you have?
>>>>
>>>> What is wrong is that half of your parameter estimates are statistically
>>>> meaningless:
>>>>
>>>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3,  5,  6,  7,  9, 10))
>>>>
>>>> overparameterized.model <- glm(b~poly(a,3),data=dd)
>>>>
>>>> summary(overparameterized.model)
>>>>
>>>>
>>>> Coefficients:
>>>>             Estimate Std. Error t value Pr(>|t|)
>>>> (Intercept)   6.6667     0.1725  38.644 0.000669 ***
>>>> poly(a, 3)1   5.7371     0.4226  13.576 0.005382 **
>>>> poly(a, 3)2  -0.1091     0.4226  -0.258 0.820395
>>>> poly(a, 3)3   0.2236     0.4226   0.529 0.649562
>>>>
>>>>
>>>>
>>>> Charles Annis, P.E.
>>>>
>>>> Charles.Annis at StatisticalEngineering.com
>>>> phone: 561-352-9699
>>>> eFax:  614-455-3265
>>>> http://www.StatisticalEngineering.com
>>>>  
>>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org
>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Jarek Jasiewicz
>>>> Sent: Saturday, January 12, 2008 11:50 AM
>>>> To: R-help at r-project.org
>>>> Subject: [R] glm expand model to more values
>>>>
>>>> Hi
>>>>
>>>> I have a problem fitting a curve to data with lm and glm. When I use a
>>>> polynomial dependency, the fitted values from the model are OK, but I
>>>> cannot get the proper values when I use the coefficients to calculate
>>>> them.  Let me present a simple example:
>>>>
>>>> I have a simple data.frame (dd):
>>>>  a: 1 2 3 4 5 6
>>>>  b:  3  5  6  7  9 10
>>>>
>>>> I try to fit a model to it:
>>>>
>>>> model=glm(b~poly(a,3),data=dd)
>>>>  I get the following fitted values (as I expected):
>>>>  > fitted(model)
>>>>         1         2         3         4         5         6
>>>>  3.095238  4.738095  6.095238  7.333333  8.619048 10.119048
>>>>
>>>> and coef(model)
>>>> (Intercept) poly(a, 3)1 poly(a, 3)2 poly(a, 3)3
>>>>   6.6666667   5.7370973  -0.1091089   0.2236068
>>>>
>>>> so when I try to extend the model to other data (simple
>>>> extrapolation), let's say: s=seq(1,10,by=1)
>>>>
>>>> I do:
>>>> extra=sapply(s,function(x) coef(model) %*% x^(0:3))
>>>> and here is the result:
>>>>  [1]  12.51826  19.49328  28.93336  42.18015  60.57528  85.46040 118.17714
>>>>  [8] 160.06715 212.47207 276.73354
>>>>
>>>> the values computed from the coefficients are completely different from
>>>> the fitted values
>>>>
>>>> What's going wrong?
>>>>
>>>> Jarek
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>       
>>> Sorry, but I cannot understand. What does it mean that the estimates are
>>> statistically meaningless?
>>>
>>> It is an example with very simple data, which I use following the simpleR
>>> manual example to check why I cannot get the expected result. I need a
>>> simple model y~x^3+x^2....+z to extrapolate data
>>> Jarek
>>>
>> I understand that the data are not a good example. But I am trying to find
>> a rather general solution.
>> The original data are a list of 98 data frames, calculated by an R script
>> of over 100 lines. I thought that was too much to attach, so I typed a few
>> digits to illustrate the problem.
>>
>> The question was badly asked. It should be:
>>
>> are the formulas:
>> pol3_model=lm(b~poly(a,3))
>> p3_model=lm(b~a+I(a^2)+I(a^3))
>>
>>  the same? According to the R documentation - yes
>
> They're not using the same parameterization. You need to read `?poly'.
>
>> both give the same fitted() values, but completely different coef()
>
> `poly' returns *orthogonal* polynomials unless told otherwise -- see 
> the `raw' argument. Try setting `raw = TRUE', but for numerical 
> reasons you may prefer the default.
>
>
> HTH,
> Henric
>
>
>
>>
Yes, that was the problem!
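
For the archive, here is a minimal sketch of the difference, using the toy
data from this thread. The object names (orth_model, raw_model, by_hand,
by_predict) are my own, and this is only an illustration under those
assumptions, not a general recipe:

```r
# Toy data from the thread
dd <- data.frame(a = 1:6, b = c(3, 5, 6, 7, 9, 10))

# Default poly(): orthogonal basis, so the coefficients do NOT
# multiply 1, a, a^2, a^3
orth_model <- lm(b ~ poly(a, 3), data = dd)

# raw = TRUE: ordinary powers of a, so the coefficients can be used by hand
raw_model <- lm(b ~ poly(a, 3, raw = TRUE), data = dd)

s <- 1:10

# Hand calculation from the raw coefficients: b0 + b1*x + b2*x^2 + b3*x^3
by_hand <- sapply(s, function(x) sum(coef(raw_model) * x^(0:3)))

# predict() works with either parameterization
by_predict <- predict(orth_model, newdata = data.frame(a = s))

# Both comparisons should report TRUE
print(all.equal(by_hand[1:6], unname(fitted(orth_model))))
print(all.equal(by_hand, unname(by_predict)))
```

With the default orthogonal basis, predict() with newdata is probably the
safer route for extrapolation, since it never needs the coefficients by hand.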
Thanks
Jarek



