[R] glm expand model to more values

Henric Nilsson (Public) nilsson.henric at gmail.com
Sun Jan 13 17:17:40 CET 2008


Jarek Jasiewicz wrote:
> Charles Annis, P.E. wrote:
>> Jarek:
>>
>> Although it is not universally agreed on, I believe the first step in any
>> data analysis is to PLOT YOUR DATA.
>>
>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3,  5,  6,  7,  9, 10))
>> plot(b ~ a, data=dd)
>> simple.model <- lm(b~a,data=dd)
>> abline(simple.model)
>>
>> Why to you think you need a cubic model to describe 6 observations?
>>
>> Your model is overparameterized - it has two more parameters than the number
>> of observations can reasonably justify, something that would be obvious from
>> your plot.
>>
>> The summary of the simple.linear model shows both the intercept and the
>> slope are statistically meaningful.  (That's what the asterisks mean.)
>>
>> Call:
>> lm(formula = b ~ a, data = dd)
>>
>> Residuals:
>>        1        2        3        4        5        6 
>> -0.23810  0.39048  0.01905 -0.35238  0.27619 -0.09524 
>>
>> Coefficients:
>>             Estimate Std. Error t value Pr(>|t|)    
>> (Intercept)  1.86667    0.30132   6.195  0.00345 ** 
>> a            1.37143    0.07737  17.725 5.95e-05 ***
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>>
>> Residual standard error: 0.3237 on 4 degrees of freedom
>> Multiple R-Squared: 0.9874,     Adjusted R-squared: 0.9843 
>> F-statistic: 314.2 on 1 and 4 DF,  p-value: 5.952e-05
>>
>> I think you should invest a small amount of your time, and an even smaller
>> amount of your money to purchase and read - cover-to-cover - one of the
>> several very good books on elementary statistics and R.  My recommendation
>> is _Introductory Statistics with R_ by Peter Dalgaard (Paperback - Jan 9,
>> 2004).  Amazon.com carries it.
>>
>> Best wishes.
>>
>>
>>
>> Charles Annis, P.E.
>>
>> Charles.Annis at StatisticalEngineering.com
>> phone: 561-352-9699
>> eFax:  614-455-3265
>> http://www.StatisticalEngineering.com
>>  
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
>> Behalf Of Jarek Jasiewicz
>> Sent: Saturday, January 12, 2008 2:06 PM
>> To: Charles.Annis at statisticalengineering.com
>> Cc: R-help at r-project.org
>> Subject: Re: [R] glm expand model to more values
>>
>> Charles Annis, P.E. wrote:
>>   
>>> How many parameters are you trying to estimate?  How many observations do
>>> you have?
>>>
>>> What is wrong is that half of your parameter estimates are statistically
>>> meaningless:
>>>
>>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3,  5,  6,  7,  9, 10))
>>>
>>> overparameterized.model <- glm(b~poly(a,3),data=dd)
>>>
>>> summary(overparameterized.model)
>>>
>>>
>>> Coefficients:
>>>             Estimate Std. Error t value Pr(>|t|)    
>>>
>>> (Intercept)   6.6667     0.1725  38.644 0.000669 ***
>>>
>>> poly(a, 3)1   5.7371     0.4226  13.576 0.005382 ** 
>>>
>>> poly(a, 3)2  -0.1091     0.4226  -0.258 0.820395    
>>>
>>> poly(a, 3)3   0.2236     0.4226   0.529 0.649562  
>>>
>>>
>>>
>>>
>>> Charles Annis, P.E.
>>>
>>> Charles.Annis at StatisticalEngineering.com
>>> phone: 561-352-9699
>>> eFax:  614-455-3265
>>> http://www.StatisticalEngineering.com
>>>  
>>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>     
>> On
>>   
>>> Behalf Of Jarek Jasiewicz
>>> Sent: Saturday, January 12, 2008 11:50 AM
>>> To: R-help at r-project.org
>>> Subject: [R] glm expand model to more values
>>>
>>> Hi
>>>
>>> I have the problem with fitting curve to data with lm and glm. When I 
>>> use polynominal dependiency, fitted values from model are OK, but I 
>>> cannot  recive proper values when I use coefficents to caltulate this.  
>>> Let me present simple example:
>>>
>>> I have simple data.frame: (dd)
>>>  a: 1 2 3 4 5 6
>>>  b:  3  5  6  7  9 10
>>>
>>> I try to fit it to model:
>>>
>>> model=glm(b~poly(a,3),data=dd)
>>>  I have following data fitted to model (as I expected)
>>>  > fitted(model)
>>>         1         2         3         4         5         6
>>>  3.095238  4.738095  6.095238  7.333333  8.619048 10.119048
>>>
>>> and coef(model)
>>> (Intercept) poly(a, 3)1 poly(a, 3)2 poly(a, 3)3
>>>   6.6666667   5.7370973  -0.1091089   0.2236068
>>>
>>> so when I try to expand the model to other data (simple extrapolation), 
>>> let say: s=seq(1:10,by=1)
>>>
>>> I do:
>>> extra=sapply(s,function(x) coef(model) %*% x^(0:3))
>>> and here is result:
>>> [1]  12.51826  19.49328  28.93336  42.18015  60.57528  85.46040 118.17714
>>>  [8] 160.06715 212.47207 276.73354
>>>
>>> the data form expanding coefs are completly differnd from fitted
>>>
>>> What's going wrong?
>>>
>>> Jarek
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>>     
>> http://www.R-project.org/posting-guide.html
>>   
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>   
>>>     
>> sorry but I cannot understand. What does it means data are statistically 
>> meanningless?
>>
>> It is examle with very simple data which I use according to simpleR 
>> manual example to check why I cannot recive expected result. I need 
>> simple model y~x^3+x^2....+z to extrapolate data
>> Jarek
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>   
> I understand that data are not well example. But I try to find rather 
> general solution.
> Original data are list 98 dataframes and are calculated by over 100 
> lines R script I thought that it is too much to attach them, so I typed 
> few digits to ilustrate problem.
> 
> The question was asked wrong. It shoud be:
> 
> if formulas:
> pol3_model=lm(b~poly(a,3))
> p3_model=lm(b~a+I(a^2)+I(a^3))
> 
>  are the same? according R documetation - Yes

They're not using the same parameterization. You need to read `?poly'.

> both gives the same fitted() values, but completly different coef()

`poly' returns *orthogonal* polynomials unless told otherwise -- see the 
`raw' argument. Try setting `raw = TRUE', but for numerical reasons you 
may prefer the default.


HTH,
Henric



> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




More information about the R-help mailing list