[R] model syntax processed --- probably common

David Winsemius dwinsemius at comcast.net
Mon Aug 19 21:48:01 CEST 2013


On Aug 19, 2013, at 9:45 AM, ivo welch wrote:

> dear R experts---I was programming a fama-macbeth panel regression (a
> fama-macbeth regression is essentially T cross-sectional regressions, with
> statistics then obtained from the time-series of coefficients), partly
> because I wanted faster speed than plm, partly because I wanted some
> additional features.
> 
> my function starts as
> 
> fama.macbeth <- function( formula, din ) {
>   names <- terms( formula )
>  ## omitted : I want an immediate check that the formula refers to
> existing variables in the data frame with English error messages
>    

Look at the structure of a terms result from a formula argument with str():

 fama.macbeth <- function( formula, din ) {
   fnames <- terms( formula ) ; str(fnames)
 }

> fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
Classes 'terms', 'formula' length 3 x ~ y
  ..- attr(*, "variables")= language list(x, y)
  ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "x" "y"
  .. .. ..$ : chr "y"
  ..- attr(*, "term.labels")= chr "y"
  ..- attr(*, "order")= int 1
  ..- attr(*, "intercept")= int 1
  ..- attr(*, "response")= int 1
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 

Then extract the row dimnames from the "factors" attribute to compare to the names in the data-object:

> fama.macbeth <- function( formula, din ) {
  fnames <- terms( formula ) ;  dnames <- names( din)
  dimnames(attr(fnames, "factors"))[[1]] %in%  dnames
}
> fama.macbeth( x ~ y, data.frame(x=rnorm(10), y=rnorm(10) ) )
[1] TRUE TRUE
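
If the goal is an immediate check with an English error message, the comparison above could be wrapped roughly as follows. This is only a sketch: check_formula_vars() is a hypothetical helper name, and it swaps in all.vars() for the "factors" dimnames, on the assumption that you want bare variable names (so that a term like log(x) is checked as "x"). The "." shorthand is skipped because it is not a column name.

```r
# Hypothetical helper (not from the original post): stop early, with a
# readable message, if the formula mentions variables absent from `din`.
check_formula_vars <- function(formula, din) {
  needed <- all.vars(formula)                    # e.g. "x" for a term log(x)
  absent <- setdiff(needed, c(names(din), "."))  # "." means "all other columns"
  if (length(absent) > 0) {
    stop("formula refers to variables not found in the data frame: ",
         paste(absent, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

d <- data.frame(r = rnorm(10), laggedx = rnorm(10))
check_formula_vars(r ~ log(abs(laggedx)), d)     # passes silently

# Capture the error text instead of aborting, to show what a user would see
res <- tryCatch(check_formula_vars(r ~ myx, d),
                error = function(e) conditionMessage(e))
print(res)
```

Calling this as the first statement of fama.macbeth() would report the problem before any monthly regression is attempted.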


I couldn't tell if this was the main thrust of your question. It seems to meander a bit.

-- 
David.

> monthly.regressions <- by( din, as.factor(din$month), function(dd)
> coef(lm(model.frame( formula, data=dd ))))
>   as.m <- do.call("rbind", monthly.regressions)
>   colMeans(as.m)  ## or something like this.
> }
> say my data frame mydata has columns named month, r, laggedx and ... .  I
> can call this function
> 
>   fama.macbeth( r ~ laggedx, din=mydata )
> 
> but it fails

What fails?


> if I want to compute my x variables.  for example,
> 
>   myx <- d[,"laggedx"]
>   fama.macbeth( r ~ myx)
> 
> I also wish that the computed myx still remembered that it was really
> laggedx.  it's almost as if I should not create a vector myx but a data
> frame myx to avoid losing the column name.

I wouldn't say "almost"... rather that is exactly what you should do. R regression methods almost always work better when formulas are interpreted in the environment of the data argument.
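
A minimal sketch of the difference, assuming a made-up data frame with the columns named in the question (month, r, laggedx); the object names myx_vec, myx_df and dd are only for illustration:

```r
set.seed(1)
# Made-up data standing in for `mydata`
mydata <- data.frame(month   = rep(1:3, each = 10),
                     r       = rnorm(30),
                     laggedx = rnorm(30))

myx_vec <- mydata[, "laggedx"]                # plain numeric vector: name is lost
myx_df  <- mydata[, "laggedx", drop = FALSE]  # one-column data frame: name is kept
names(myx_df)

# Assemble the regression data explicitly, then let the formula find its
# variables in the `data` argument rather than in the calling environment.
dd  <- cbind(mydata[, c("month", "r")], myx_df)
fit <- lm(r ~ laggedx, data = dd)
names(coef(fit))
```

With drop = FALSE the computed column still carries the name laggedx, so the coefficient is labelled accordingly and the formula never depends on a free-floating vector.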

>  I wonder why such vectors don't
> keep a name attribute of some sort.
> 
> there is probably an "R way" of doing this.  is there?
> 
> /iaw
> 
> ----
> Ivo Welch (ivo.welch at gmail.com)
> 
> 	[[alternative HTML version deleted]]

Still posting HTML?

> 
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

And do explain what the goal is.

-- 

David Winsemius
Alameda, CA, USA


