[R] [example code] RE: AIC() vs. mle.aic() vs. step()?

Alexandra Thorn alexandra.thorn at tufts.edu
Thu Jun 23 15:29:48 CEST 2011


Ok, here's some example code showing how I get different output for AIC
vs. mle.aic().  Now that I've taken another look at the independent
variables, I'm wondering whether missing values in one of the variables
might be what is messing me up.  I'm going to see if the behavior
changes when I remove those...

Alexandra

#Code with outputs
R> require(wle)
R> xA # 1st independent variable (categorical)
  [1] Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[10] Diffuse Diffuse Ring    Diffuse Diffuse Ring    Diffuse Diffuse
Diffuse
[19] Diffuse Diffuse Ring    Diffuse Diffuse Diffuse Diffuse Diffuse
Diffuse
[28] Diffuse Diffuse Ring    Ring    Diffuse Diffuse Ring    Ring
Ring   
[37] Diffuse Diffuse Ring    Ring    Ring    Ring    Ring    Ring
Diffuse
[46] Other   Ring    Ring    Ring    Ring    Ring    Other   Ring
Ring   
[55] Ring    Ring    Ring    Ring    Ring    Ring    Ring    Diffuse
Diffuse
[64] Diffuse Ring    Ring    Ring    Diffuse Diffuse Diffuse Diffuse
Diffuse
[73] Diffuse Diffuse Diffuse Other   Ring    Ring    Ring    Ring
Diffuse
[82] Diffuse Diffuse Diffuse Ring    Ring    Ring    Ring    Ring
Diffuse
[91] Other   Other   Ring    Ring    Ring    Other   Ring    Other
Diffuse
[100] Diffuse Diffuse Ring    Ring   
Levels: Diffuse Other Ring
R> x5 # 2nd independent variable
  [1]  35.1890163  22.8565556  15.2969944   9.6002241  25.0393843
21.1797882
  [7]   9.2677660  14.5228280   6.6982274   5.7889657  21.4854297
20.5942436
[13]  20.2180106   0.4442017   5.0414991  26.9849474  14.7613970
10.3045834
[19]  13.4192478  13.9074085   6.7219989  13.2569404  18.1492698
8.9814628
[25]  14.2575003  21.8982503   8.5661574  15.3434996   7.4060632
10.2824613
[31]  23.4777018  35.3389594  51.5448186   6.9571801  23.3166747
35.2280399
[37]  53.3812646  44.7933630  25.5658796   9.6980968   2.9003139
4.8073814
[43]   6.9274067   8.6178642  43.9578503   0.0000000  44.1995269
14.6878355
[49]   5.6385462   0.0000000  21.1687124  20.5669418   0.0000000
0.0000000
[55]  28.4924849   8.7184163  18.8744437  20.9748315  21.3849539
163.1436925
[61]  10.8565582   9.9297861   0.0000000   0.0000000  41.9369100
121.7625948
[67]  13.5709398  20.1040412  14.1449650   8.2172524  10.1649988
19.5981176
[73]  20.3028117  17.0104638  12.6129991   8.2051932   6.4293587
22.1598564
[79]  13.9703385  23.0206302  15.2590230  14.4778824   2.4819054
21.8293460
[85]  25.1515167  32.1050850  12.5154914  11.6927538   9.4048632
38.4559899
[91]  53.1959167  14.4917170  10.2548528   8.8227194  12.8573515
10.0589965
[97]  12.8868929   9.6626724   5.9826061   3.2581190  13.4467376
8.8065840
[103]  17.7734493
> x15 # 3rd independent variable
  [1]  1.69924629 -1.63414400  0.71415169  4.17480342  1.52512663
1.73541068
  [7] -5.47498002  0.95681283 -1.48092555  1.51101949 -2.25838766
2.12958863
[13]  1.43795703 -4.48003373 -3.65963009 -0.76346388 -2.44019863
1.32552847
[19]  1.89863804  1.80655970 -0.74175682  1.30112633 -1.06424643
-1.47852202
[25]  0.09035915          NA          NA  1.82385292 -0.15308708
1.04685322
[31]  2.45599032  1.36474093 -2.39863477 -0.21220447 -2.50255033
-1.92296430
[37] -0.24577578 -1.96756216  0.43349997  0.88459859 -0.12755905
2.31771322
[43] -1.21846731  1.75082992 -3.02346893 -4.15582445  1.09946460
4.30008522
[49]  4.37542383          NA -1.93641862 -0.01919492 -2.39609318
-3.12228102
[55]  0.48804606 -1.42886437 -3.52078266  3.22115286  0.87942540
-0.29385365
[61]  0.40030867  0.84382607 -0.14445408 -0.61903527          NA
1.53158894
[67] -1.01595045  0.18857375 -1.24703875 -0.53766035 -0.43305094
1.30035414
[73]  0.08256647 -0.01008154 -1.89151834  0.60161181  1.38339048
1.70782208
[79]  0.48995599          NA  0.71774340          NA  0.35578308
-1.30038021
[85]  0.18170942 -0.76999772 -0.52860127 -0.58713905  2.45770818
-3.79345760
[91] -0.73700348  1.85916858  0.48523489 -2.24404921 -3.71691741
-0.80525820
[97]  0.20768561 -0.05588210          NA -0.50332833  0.70407465
-0.57391160
[103] -1.11740646
> y1 #response variable
  [1] 0.11736407 0.12793015 0.06627390 0.03385292 0.05111586 0.12896867
  [7] 0.21030113 0.10661115 0.02321079 0.06035170 0.17966075 0.22120809
[13] 0.16367033 0.07062699 0.11563063 0.62809888 0.13571557 0.14366535
[19] 0.16453117 0.04030618 0.29904079 0.13865458 0.25814464 0.09636693
[25] 0.14262893 0.12619897 0.15919200 0.10713175 0.18137740 0.37961763
[31] 0.16831734 0.02425770 0.12793015 0.23174790 0.16384251 0.41976893
[37] 0.12498691 0.18960957 0.33873792 0.19594614 0.44510411 0.45554491
[43] 0.70821663 0.20739951 0.07828510 0.07393444 0.12290867 0.22614130
[49] 0.49742825 0.04013179 0.58127117 0.05216166 0.27597288 0.14090123
[55] 0.22120809 0.49090375 0.33216113 0.03437621 0.12031011 0.10261893
[61] 0.58141318 0.06244214 0.03594604 0.17966075 0.09345085 0.43887815
[67] 0.51929244 0.20501885 0.04663966 0.33104604 0.28841287 0.26924687
[73] 0.29495726 0.23675230 0.33385065 0.02814909 0.25281753 0.21240608
[79] 0.15204576 0.18288671 0.32867804 0.19813360 0.17379109 0.20755404
[85] 0.10898273 0.10303441 0.19145080 0.38541988 0.29372153 0.19337137
[91] 0.06810569 0.06357357 0.15778877 0.21364239 0.33999760 0.13670444
[97] 0.11900238 0.01315180 0.30599263 0.05201595 0.30131938 0.22017956
[103] 0.23811364
R> summary(mle.aic(lm(y1~xP+x5+x15)),max.num=30) # mle.aic output

Call:
mle.aic(formula = lm(y1 ~ xP + x5 + x15))


Akaike Information Criterion (AIC):
      (Intercept) xPNA xPRing x5 x15     aic
[1,]           1    1      1  0   0 -113.60
[2,]           1    1      1  0   1 -112.80
[3,]           1    1      1  1   0 -112.20
[4,]           1    0      1  0   0 -112.10
[5,]           1    1      1  1   1 -111.30
[6,]           1    0      1  0   1 -111.20
[7,]           1    0      1  1   0 -110.60
[8,]           1    0      1  1   1 -109.60
[9,]           1    1      0  0   0  -98.05
[10,]           1    1      0  0   1  -96.66
[11,]           1    1      0  1   0  -96.28
[12,]           1    1      0  1   1  -94.86
[13,]           1    0      0  0   0  -90.92
[14,]           1    0      0  0   1  -89.32
[15,]           1    0      0  1   0  -89.06
[16,]           1    0      0  1   1  -87.45
[17,]           0    0      1  1   1  -59.09
[18,]           0    0      1  1   0  -57.98
[19,]           0    1      1  1   1  -57.34
[20,]           0    1      1  1   0  -56.35

Printed the first  20  best models 
R> AIC(lm(y1~xA)) # Model 1 above
[1] -120.3801
R> AIC(lm(y1~xA+x15)) # Model 2 above
[1] -110.8642
R> AIC(lm(y1~xA+x5)) # Model 3 above
[1] -118.9906


On Thu, 2011-06-23 at 09:05 -0400, Alexandra Thorn wrote: 
> The packages is wle.
> 
> I'll put together some code that shows the behavior I'm talking about,
> and send it to the list.
> 
> Alexandra
> 
> On Thu, 2011-06-23 at 13:51 +0200, Rubén Roa wrote: 
> > I don't find the mle.aic function. Thus it does not ship with R and it's in some contributed package.
> > What package is that?
> > If you had asked for help providing minimal, self-contained, reproducible code, you'd have realized that you need to tell people what package you are using.
> > 
> > ___________________________________________________________________________________ 
> > 
> > Dr. Rubén Roa-Ureta
> > AZTI - Tecnalia / Marine Research Unit
> > Txatxarramendi Ugartea z/g
> > 48395 Sukarrieta (Bizkaia)
> > SPAIN
> > 
> >  
> > 
> > > -----Mensaje original-----
> > > De: r-help-bounces at r-project.org 
> > > [mailto:r-help-bounces at r-project.org] En nombre de Alexandra Thorn
> > > Enviado el: miércoles, 22 de junio de 2011 22:38
> > > Para: r-help at r-project.org
> > > Asunto: [R] AIC() vs. mle.aic() vs. step()?
> > > 
> > > I know this a newbie question, but I've only just started 
> > > using AIC for model comparison and after a bunch of different 
> > > keyword searches I've failed to find a page laying out what 
> > > the differences are between the AIC scores assigned by AIC() 
> > > and mle.aic() using default settings.  
> > > 
> > > I started by using mle.aic() to find the best submodels, but 
> > > then I wanted to also be able to make comparisons with a 
> > > couple of submodels that were nowhere near the top, so I 
> > > started calculating AIC values using AIC().  What I found was 
> > > that not only the scores, but also the ranking of the models 
> > > was different.  I'm not sure if this has to do with the fact 
> > > that mle.aic() scores are based on the full model, or some 
> > > sort of difference in penalties, or something else.
> > > 
> > > Could anybody enlighten me as to the differences between 
> > > these functions, or how I can use the same scoring system to 
> > > find the best models and also compare to far inferior models?
> > > 
> > > Failing that, could someone point me to an appropriate 
> > > resource that might help me understand?
> > > 
> > > Thanks in advance,
> > > Alexandra
> > > 
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide 
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > > 
> 



More information about the R-help mailing list