[R] Am failing on making lagged residual after regression

Ajay Shah ajayshah at mayin.org
Mon Mar 8 17:37:15 CET 2004


Folks,

I'm most confused in trying to do something that (I thought) ought to be
mainstream and straightforward R. :-) Could you please help?

I am doing an ordinary linear regression. My goal is: After a
regression, to make residuals, and make a new variable which is the
lagged residuals (lagged by 1). I will use this variable in a 2nd
stage regression (for an error-correcting model).

This sounds simple and reasonable, and should be right up R's alley,
but I am just not able to do this. Can I please show you the steps
which I'm trying and failing in?

I start with:

> m = lm(NNDA ~ NFA + NFA.x.d1 + NFA.x.d2 + IIP.n + CRR, D.f)
> e = residuals(m)
> print(e)
          34           35           36           37           38           39 
 -5073.24843  -4210.27886  -8218.01782  -1489.10583  -4426.11738 -11332.56052 
  (lines deleted)
          64           65           66           67           68           69 
  8362.93776   7564.14324   2311.41208   7660.00638  -1271.04645 -10917.29418 
  (lines deleted)
         160          161          162          163          164          165 
  3858.94591 -11783.04370 -21438.33646   1859.49628  -4988.82853 -25172.43241 

Here, the residuals only started at the 34th observation owing to
missing data in my data frame. This is correct and sensible. The
dataset is 167 observations, but 166 and 167 are also missing data and
dropped.

I tried to use lag(e,1) to make a new vector and failed. I think I am
just not understanding the R concept of lag(). In my notion of a
lagged vector, I want a vector f where f[35] is e[34], i.e. is the
first residual above of -5073.24843. This is just not what I get by
saying lag(e,1) - I am just not understanding lag(). I would be very
happy if someone could educate me on how to utilise lag().
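
A minimal sketch of what lag() seems to do to a plain vector, as far as
the stats documentation describes it: it shifts the time base (the tsp
attribute) and leaves the values in the same positions, so indexing by
position sees no change. The vector x and the name x_lag1 are made up
for illustration; they are not from my data.

```r
# Toy vector, invented for illustration only
x <- c(10, 20, 30, 40)

# stats::lag() attaches/shifts the "tsp" time attribute;
# the values themselves stay in the same positions
lx <- stats::lag(x, 1)
print(as.numeric(lx))   # same values, same order as x
print(tsp(lx))          # the time base, moved back by one step

# A positional lag, which is what I actually want:
# put NA first, then drop the last element
x_lag1 <- c(NA, head(x, -1))
print(x_lag1)
```

If this reading is right, lag() only matters when the object is later
aligned by time (e.g. with ts functions), not when it is indexed like an
ordinary vector.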

Okay, I try a different approach:

> print(T)
[1] 167
> f = numeric(T)
> f[1] = NA
> f[2:T] = e[1:(T-1)]

This looks reasonable? I thought this should do the trick. I am
hand-initialising a T-length vector with NA in the 1st elem, and I
copy out the values of e[] from 1 till 166 into f[2:T]. I thought this
should give me a lagged e. It doesn't --

> print(f)
  [1]           NA  -5073.24843  -4210.27886  -8218.01782  -1489.10583
  (lines deleted)
[131]   1859.49628  -4988.82853 -25172.43241           NA           NA
  (lines deleted)
[166]           NA           NA

I thought "Okay, what seems to be happening is that the e[1] that I
have is `actually' the e[34] of my thoughts". So I try:

> f=rep(NA, T)               # zap out f
> f[35:T] = e[34:(T-1)]      # copy out useful stuff into 35..T
> print(f)
  [1]           NA           NA           NA           NA           NA
  (lines deleted)
 [31]           NA           NA           NA           NA   7660.00638
 [36]  -1271.04645 -10917.29418 -11111.60144  -1597.98355  -1066.01901
  (lines deleted)
[131]   1859.49628  -4988.82853 -25172.43241           NA           NA
  (lines deleted)
[166]           NA           NA

This is wrong!!

Recall (from upstairs) that e[34] was -5073.24843. That value seems to
have mysteriously vanished. Instead, the first non-NA in f - which is
f[35] - is 7660.00638, which (incidentally) was e[67]. I just don't
know how that value got here. And, the values in f[] seem to peter out
at 133!  After 133, they are all NA until the end.
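
One possible explanation, sketched on a toy vector: if e is a named
vector holding only the rows that survived the regression, then e[34]
means "the 34th element", not "the element labelled 34". With 132
residuals labelled 34 to 165, the 34th element would be the one labelled
67, and anything past position 132 would be NA. The values, the labels
and the name T_total below are hypothetical stand-ins, not my data.

```r
# Toy residual vector: three values labelled with their row numbers
e <- c("34" = -1.5, "35" = 2.0, "36" = 0.5)

print(length(e))   # 3 elements, even though the labels run to 36
print(e[1])        # by position: the element labelled "34"
print(e["34"])     # by name: the same element
print(e[34])       # by position, past the end: NA

# Lag by one using the labels as row numbers (T_total is hypothetical)
T_total <- 40
f <- rep(NA_real_, T_total)
idx <- as.integer(names(e))   # 34, 35, 36
f[idx + 1] <- e               # residual of row i goes to slot i + 1
print(f[34:38])
```

This relies on the names being the original row numbers, which is an
assumption about how the data frame was built.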

I guess I'm _just_ not understanding what is the animal that is
returned by residuals(lm()). I know I am missing something basic,
because lots of people must be doing what I am trying: I.e. to run a
regression, extract a residual, lag it, and use it for a 2nd stage
regression.

I know that the vector e (returned by residuals(lm())) is different
from a simple vector, for when I say:

> print(f[35])
[1] 7660.006
> print(e[35])
       68 
-1271.046 

the two animals seem to be different. I don't understand e[35] - why
is it not just a number - there seems to be some index tagging along?
How do I get at the pure numbers of the residuals?
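
A sketch of one way this might be handled, on simulated data rather than
my D.f: unname() (or as.numeric()) should strip the labels, and fitting
with na.action = na.exclude should make residuals() return one value per
row of the data frame, with NA for the dropped rows, so that positions
and row numbers agree. The data frame d and its columns are invented for
illustration.

```r
set.seed(1)

# Simulated data with missing values at the start and the end
d <- data.frame(x = rnorm(10))
d$y <- 2 * d$x + rnorm(10)
d$y[c(1, 2, 10)] <- NA

# na.exclude pads the residuals with NA for the rows that were dropped
m <- lm(y ~ x, data = d, na.action = na.exclude)
e <- residuals(m)
print(length(e))             # one entry per row of d

# Strip the row labels to get plain numbers
e_plain <- unname(e)

# Positional lag by one: e_lag1[i] holds the residual of row i - 1
e_lag1 <- c(NA, head(e_plain, -1))
d$e_lag1 <- e_lag1           # ready for a second-stage regression
print(d)
```

With the residuals padded this way, the lagged column lines up with the
original rows and can go straight into the second regression.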

Thanks much,

       -ans.

-- 
Ajay Shah                                                   Consultant
ajayshah at mayin.org                      Department of Economic Affairs
http://www.mayin.org/ajayshah           Ministry of Finance, New Delhi



