[R] Tip: I() can designate constants in a regression

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Sep 8 19:23:00 CEST 2005


David James <djames at frontierassoc.com> writes:

> Just thought I would share a tip I learned:
> The function I() is useful for specifying constants to formulas and  
> regressions.
> 
> It will prevent nls (for example) from trying to treat the variable  
> inside I() as something it needs to estimate.  An example is below.
> 
> -David
> 
> P.S.  This may be obvious to some, but it is not made clear to me by  
> the documentation or common books that I reviewed.  These books, of  
> course, do tend to mention other aspects of I(), which seems to be a  
> very diverse function.  For example:
> * ISwR by Dalgaard (p. 160, 177)
> * MASwS by Venables and Ripley (p.18)
> 
> However, the books I looked at do not mention the specific tip here:  
> Wrapping I() around a variable will make it a constant from the  
> perspective of a regression.
> 
> A humble suggestion to the many authors of the many great R and S  
> books out there: I would find it helpful if more R books had the word  
> "constants" in the index.  Perhaps there could be a brief section  
> that explained how to create constants in a regression.  These sorts  
> of problems, I would guess, occur more commonly with nls models than  
> lm models.

First check whether your claim is actually correct:

>      x = 1:10
>      y = x                                  # perfect fit
>      yeps = y + rnorm(length(y), sd = 0.01) # added noise
>      nls(yeps ~ a + b*x, start = list(a = 0.12345, b = 0.54321),
+           trace = TRUE)
74.2686 :  0.12345 0.54321
0.0006529895 :  -0.002666984  1.000334031
Nonlinear regression model
  model:  yeps ~ a + b * x
   data:  parent.frame()
           a            b
-0.002666984  1.000334031
 residual sum-of-squares:  0.0006529895
>      a <- 0
>      nls(yeps ~ a + b*x, start = list(b = 0.54321), trace = TRUE)
80.31713 :  0.54321
0.0006682311 :  0.999953
Nonlinear regression model
  model:  yeps ~ a + b * x
   data:  parent.frame()
       b
0.999953
 residual sum-of-squares:  0.0006682311

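I.e., turning a into a constant works quite happily without the I().
nls() estimates only the parameters named in start; every other
variable in the formula is looked up in data or in the calling
environment and used at its current value.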
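
A minimal sketch of a direct comparison, fitting the same model with and
without I() around the fixed term. The seed and the names fit_plain and
fit_I are assumptions made for illustration; they are not part of the
session above.

```r
set.seed(1)                                # arbitrary seed, reproducibility only
x    <- 1:10
yeps <- x + rnorm(length(x), sd = 0.01)    # near-perfect line plus small noise
a    <- 0                                  # held fixed because it is not in start

# Same model twice: once with a bare 'a', once with I(a)
fit_plain <- nls(yeps ~ a + b * x,    start = list(b = 0.54321))
fit_I     <- nls(yeps ~ I(a) + b * x, start = list(b = 0.54321))

names(coef(fit_plain))                     # the estimated parameters
all.equal(coef(fit_plain), coef(fit_I))    # do the two fits agree?
```

If both fits report b as the only coefficient and agree on its value, then
it is the start argument, not I(), that decides what gets estimated.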


> Here is the example that motivated my tip:
> 
> > weather.df : a data frame, where each row is one hour
> > weather.df$temp : the temperature
> > weather.df$annual : time offset, adjusted so that its period is one  
> > year
> > weather.df$daily : time offset, adjusted so that its period is one day
> >
> > # I want a1,a2 to be constants from the point of view of nls
> > a1 <- 66
> > a2 <- -18
> > nls.example <- nls(temp ~ I(a1) + I(a2)*sin(ts.annual) + a3*sin(ts.daily),
> >                    data = weather.df, start = c(a3 = 1))
> > # leaving out the I() will cause nls to estimate values for a1 and a2
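
The quoted example cannot be run as posted, since weather.df is only
described. Below is a hedged, self-contained sketch on simulated data: the
data frame contents, the noise level, and the "true" daily amplitude of 5
are all made up for illustration, and the columns annual and daily are
assumed to hold phases in radians. It leaves out I() altogether.

```r
set.seed(42)                                  # arbitrary seed
hours <- 0:(24 * 365 - 1)                     # one non-leap year, one row per hour

# Hypothetical stand-in for the poster's data frame
weather.df <- data.frame(
  annual = 2 * pi * hours / (24 * 365),       # phase with a one-year period
  daily  = 2 * pi * (hours %% 24) / 24        # phase with a one-day period
)
# Simulated temperatures (made-up amplitude 5 for the daily cycle)
weather.df$temp <- 66 - 18 * sin(weather.df$annual) +
  5 * sin(weather.df$daily) + rnorm(nrow(weather.df), sd = 2)

a1 <- 66     # fixed: not named in start
a2 <- -18    # fixed: not named in start

# No I() around a1 and a2; only a3 appears in start
fit <- nls(temp ~ a1 + a2 * sin(annual) + a3 * sin(daily),
           data = weather.df, start = c(a3 = 1))
coef(fit)
```

Here a1 and a2 are not columns of weather.df, so they are taken from the
workspace at their assigned values, and coef(fit) contains a3 alone.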

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907



