[Rd] (PR#8877) predict.lm does not have a weights argument for

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed May 24 11:36:11 CEST 2006


On Wed, 24 May 2006, Peter Dalgaard wrote:

> ripley at stats.ox.ac.uk writes:
>
>> (a) case weights:  w_i = 3 means `I have three observations like (y, x)'
>>
>> (b) inverse-variance weights, most often an indication that w_i = 3
>> means that y_i is actually the average of 3 observations at x_i.
>>
>> (c) multiple imputation, where a case with missing values in x is split
>> into say 5 parts, with case weights less than and summing to one.
>>
>> (d) Heteroscedasticity, where the model is rather
>>
>>          y = x\beta + \epsilon, \epsilon \sim N(0, \sigma^2(x))
>>
>> And there may well be other scenarios, but those are the most common (in
>> decreasing order) in my experience.
>
> I'd have (d) higher on the list, but never mind. There's also

I find that if you detect heteroscedasticity, then one of the following 
applies:

- a transformation of y would be beneficial

- a non-normal model, e.g. a Poisson regression, is more appropriate

- the error variance really depends on y or Ey, not x, as in most
   measurement-error scenarios (for instance the example in ?nls and the
   one in the addendum to the bug report).

- in analytical chemistry, as in the example in the addendum to the bug
   report, there are errors in both y and x to consider, and a functional
   relationship model is better.

So I very rarely actually get as far as predicting from a heteroscedastic 
regression model.
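
As a sketch of why the first and third cases are linked (a standard
delta-method argument; the variance function \sigma^2 \mu^2 is assumed
purely for illustration and is not taken from the bug report): if the
standard deviation of y is proportional to its mean, then taking logs
roughly stabilizes the variance.

```latex
\begin{align*}
\mathrm{Var}\{g(y)\} &\approx g'(\mu)^2 \, \mathrm{Var}(y), \qquad \mu = E y \\
\mathrm{Var}(y) = \sigma^2 \mu^2,\ g = \log
  \quad\Longrightarrow\quad
\mathrm{Var}(\log y) &\approx \mu^{-2} \, \sigma^2 \mu^2 = \sigma^2
\end{align*}
```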

> (e) Inverse probability weights: Knowing that part of the population
> is undersampled and wanting results that are compatible with what you
> would have gotten in a balanced sample. Prototypically: You sample X,
> taking only a third of those with X > c; find the population mean of X
> (or a univariate regression on some other variable, which is only
> recorded in the subsample).
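
A small sketch of that prototype, in Python rather than R so that it is
self-contained. The population, the threshold c and the choice of which
third to keep are all made up, and are picked so that the arithmetic
comes out exactly; with a random subsample the weighted mean is only
right on average.

```python
# Hypothetical population and threshold, chosen so the arithmetic is exact.
population = list(range(1, 13))          # X = 1, ..., 12
c = 6

low = [x for x in population if x <= c]  # all of these are retained
high = [x for x in population if x > c]  # only a third of these are retained
high_sampled = high[1::3]                # the middle value of each block of three

sample = low + high_sampled
# Weight each retained case by 1/p, where p is its sampling probability.
weights = [1.0] * len(low) + [3.0] * len(high_sampled)

naive_mean = sum(sample) / len(sample)
ipw_mean = sum(w * x for w, x in zip(weights, sample)) / sum(weights)
true_mean = sum(population) / len(population)

print(naive_mean, ipw_mean, true_mean)
```

The unweighted mean of the subsample is pulled towards the oversampled
low values; weighting by 1/p undoes that.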

I would call this an example of case weights: you are just weighting cases
and saying `I have 1/p like this'. In rlm there is a difference between
(a) and (b), and you would want to use wt.method="case" for (e).
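
To illustrate where (a) and (b) part company even for ordinary least
squares, here is a minimal sketch in Python (not R, and not what lm or
rlm do internally). The data and the helper wls_fit are hypothetical.
With a weight of 3 on the first case, the fitted line is the same whether
that case is read as three identical observations or as one observation
with variance sigma^2/3; what changes is the residual degrees of freedom,
and hence the estimate of sigma^2 and any standard errors for prediction.

```python
def wls_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (a, b, weighted RSS)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return a, b, rss

# Made-up data: four distinct rows, the first carrying weight 3.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
w = [3, 1, 1, 1]

# Weighted fit on the four rows.
a, b, rss = wls_fit(x, y, w)

# Case-weight reading: physically replicate each row w_i times, fit unweighted.
x_rep = [xi for xi, wi in zip(x, w) for _ in range(wi)]
y_rep = [yi for yi, wi in zip(y, w) for _ in range(wi)]
a_rep, b_rep, rss_rep = wls_fit(x_rep, y_rep, [1] * len(x_rep))

# Same coefficients and weighted RSS, but different residual degrees of freedom.
df_inv_var = len(x) - 2   # (b): four observations with unequal variances
df_case = sum(w) - 2      # (a): six observations in all
sigma2_inv_var = rss / df_inv_var
sigma2_case = rss / df_case

print(a, b, sigma2_inv_var, sigma2_case)
```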

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


