[R] Newey West HAC for pooled cross-section data

Wed Mar 27 11:49:58 CET 2013

On Tue, 26 Mar 2013, SHISHIR MATHUR wrote:

> Thanks for the reply Achim. The reason I suspect autocorrelation is 
> because I think that within the same neighborhood, homes sold a few 
> months back are likely to impact the price of homes sold subsequently.

This may well be spatial (auto)correlation rather than temporal 
autocorrelation.

> In fact the Durbin-Watson (DW) test and the Breusch-Pagan test both come 
> out significant. So even though the data is not a time series (that is, 
> I do not have repeated observations for the same house), houses sold 
> close in time to each other are in the data set.

If all observations can be ordered uniquely by time, then you could in 
principle apply an autocorrelation correction to the standard errors, 
e.g., via Newey-West.
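
A minimal sketch of what such a correction could look like, using simulated 
data and base R only. The data frame "homes", its columns, and the helper 
nw_vcov() are hypothetical, and the lag of 4 is an arbitrary choice. The 
helper computes a plain Bartlett-kernel Newey-West covariance without 
prewhitening or finite-sample adjustment; NeweyWest() in "sandwich" is the 
more complete tool and can also select the lag automatically.

```r
## Simulated stand-in for the housing data (all names are hypothetical).
set.seed(1)
n <- 500
all_days <- seq(as.Date("2003-01-01"), as.Date("2012-12-31"), by = "day")
homes <- data.frame(
  sale_date = sample(all_days, n),   # distinct dates, so the ordering is unique
  sqft      = runif(n, 800, 3500)
)

## HAC corrections depend on the row order: sort by time before fitting.
homes <- homes[order(homes$sale_date), ]

## Errors that are serially correlated along the time ordering (AR(1)).
e <- as.numeric(arima.sim(list(ar = 0.5), n = n)) * 20000
homes$price <- 50000 + 100 * homes$sqft + e

fit <- lm(price ~ sqft, data = homes)

## Newey-West sandwich covariance with Bartlett weights.
## lag = 0 reduces to the heteroskedasticity-only (HC0) estimator.
nw_vcov <- function(fit, lag) {
  X  <- model.matrix(fit)
  u  <- residuals(fit)
  n  <- nrow(X)
  Xu <- X * u                        # row i of X scaled by residual i
  meat <- crossprod(Xu)              # lag-0 term
  for (j in seq_len(lag)) {
    w <- 1 - j / (lag + 1)           # Bartlett weight for lag j
    G <- crossprod(Xu[(j + 1):n, , drop = FALSE],
                   Xu[1:(n - j), , drop = FALSE])
    meat <- meat + w * (G + t(G))
  }
  bread <- solve(crossprod(X))
  bread %*% meat %*% bread
}

se_ols <- sqrt(diag(vcov(fit)))
se_hc0 <- sqrt(diag(nw_vcov(fit, lag = 0)))
se_nw  <- sqrt(diag(nw_vcov(fit, lag = 4)))
print(cbind(OLS = se_ols, HC0 = se_hc0, NW = se_nw))
```

Note that with sale dates recorded only by quarter, many houses share the 
same time stamp, so the ordering within a quarter is arbitrary and the 
result would depend on how ties happen to be sorted.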

But from what you describe above, it seems to be more important to capture 
spatial effects in the data, e.g., by using a spatial lag model (see 
lagsarlm in "spdep") or by using an additive spatial effect (see e.g. gam 
in "mgcv").
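
A rough sketch of the second option, again on simulated data. It assumes 
that coordinates are available for each house; the columns "lon" and "lat" 
and the shape of the simulated location effect are made up for 
illustration. The smooth term s(lon, lat) lets the price level vary 
smoothly over space in addition to the usual linear covariates.

```r
library(mgcv)

## Simulated houses with hypothetical coordinates on the unit square.
set.seed(2)
n <- 400
homes <- data.frame(
  lon  = runif(n),
  lat  = runif(n),
  sqft = runif(n, 800, 3500)
)

## A smooth location effect that a purely linear model cannot pick up.
loc_effect <- 40000 * sin(2 * pi * homes$lon) * cos(2 * pi * homes$lat)
homes$price <- 50000 + 100 * homes$sqft + loc_effect + rnorm(n, sd = 15000)

## Linear model without any spatial term, for comparison.
fit_lm <- lm(price ~ sqft, data = homes)

## Additive model: linear effect of sqft plus a bivariate smooth of location.
fit_gam <- gam(price ~ sqft + s(lon, lat), data = homes, method = "REML")

print(summary(fit_gam))
print(AIC(fit_lm, fit_gam))
```

If the spatial smooth absorbs most of the dependence between neighboring 
houses, the residuals of fit_gam should show much less structure than 
those of fit_lm.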

> Thanks,
> Shish
> 
> On Tue, Mar 26, 2013 at 3:51 PM, Achim Zeileis <Achim.Zeileis at uibk.ac.at>
> wrote:
>       On Tue, 26 Mar 2013, SHISHIR MATHUR wrote:
>
>             Hello:
>             My dataset contains several thousand rows of data, with
>             each row containing information for a house. The
>             variables include the sale price of the house, the
>             quarter and year of sale, the attributes of the house,
>             and the attributes of the neighborhood and the city in
>             which the house is located. The data is for a 10-year
>             period. No house is repeated in the dataset. In summary,
>             the dataset can be termed pooled cross-section data.
>
>             My question: Can I estimate Newey-West HAC standard
>             errors for a model that estimates the effect of various
>             independent variables on the sale price of the house?
>             My understanding is that Newey-West can be used for time
>             series and panel data. However, I am not sure whether it
>             can be used for pooled cross-section data. If yes, can
>             you refer me to a specific source, such as a paper or a
>             book?
> 
>
>       The result of your aggregation is a cross-section data set.
>       Thus, there should be no correlation between the different
>       observations - or in other terms, the ordering of your
>       observations is completely arbitrary.
>
>       Consequently, there may be heteroskedasticity but not
>       autocorrelation. So you may use HC standard errors but HAC
>       should not be necessary. (Using HAC standard errors will still
>       be consistent but less efficient.)
> 
>
>             --
>             Best,
>             Shish
>
> --
> Best,
> Shishir
> 
>