[R] Looking for transformation to overcome heterogeneity of variances
p.dalgaard at biostat.ku.dk
Thu Aug 3 20:43:58 CEST 2006
[Resending -- recipient list length issue]
"John Sorkin" <jsorkin at grecc.umaryland.edu> writes:
Erm, that was Paul's question, not mine! If you want to help, please
look at the pattern of residuals which he put up on the web on my
> Your question is difficult to answer without more information about the
> distribution of your residuals. Different residual patterns call for
> different transformations to stabilize the variance. One very common
> form of heteroscedasticity is increasing variance with increasing values
> of an independent predictor, i.e. the variance of the residuals of y=x
> increases as x increases. In this case a log transformation of some, or
> all, of the independent variables helps. Please also note the
> comment by Bert Gunter (included below) in which some important points
> are raised, particularly about extreme values.
> If you want more help, please describe the pattern of your residuals.
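The pattern described above (residual spread growing with x) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated with a multiplicative error, the helper names fit_line and spread_ratio are hypothetical, and the logarithm is taken of both the response and the predictor, which goes slightly beyond the predictors-only transformation mentioned in the post. It uses only the Python standard library.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def sample_sd(vals):
    """Sample standard deviation (n - 1 denominator)."""
    m = sum(vals) / len(vals)
    return math.sqrt(sum((v - m) ** 2 for v in vals) / (len(vals) - 1))


def spread_ratio(xs, ys):
    """Residual SD in the upper half of x divided by residual SD in the lower half."""
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    pairs = sorted(zip(xs, residuals))  # order residuals by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return sample_sd(high) / sample_sd(low)


rng = random.Random(1)  # fixed seed so the run is repeatable
x = [rng.uniform(1.0, 100.0) for _ in range(2000)]
# Assumed data-generating model: multiplicative error, so the noise
# scales with the mean and the residual spread grows with x.
y = [2.0 * xi * math.exp(rng.gauss(0.0, 0.3)) for xi in x]

raw_ratio = spread_ratio(x, y)
log_ratio = spread_ratio([math.log(v) for v in x], [math.log(v) for v in y])
print(f"raw scale: {raw_ratio:.2f}, log-log scale: {log_ratio:.2f}")
```

A ratio well above 1 on the raw scale and close to 1 after taking logs is what a variance-stabilizing transformation should produce for this kind of data. Other residual patterns may call for a different transformation, or for none.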
> John Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> Baltimore VA Medical Center GRECC,
> University of Maryland School of Medicine Claude D. Pepper OAIC,
> University of Maryland Clinical Nutrition Research Unit, and
> Baltimore VA Center Stroke of Excellence
> University of Maryland School of Medicine
> Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> jsorkin at grecc.umaryland.edu
> >>> Berton Gunter <gunter.berton at gene.com> 8/3/2006 11:56:28 AM >>>
> I know I'm coming late to this, but ...
> > > Is someone able to suggest to me a transformation to overcome the
> > > problem of heteroscedasticity?
> It is not usually useful to worry about this. In my experience, the
> gain in efficiency from using an essentially ideal weighted analysis
> vs. an approximate unweighted one is usually small and unimportant
> (transforming to simplify a model is another issue ...). Of far greater
> importance is the loss in efficiency due to the presence of a few
> "unusual" values; have you carefully checked to make sure that none of
> the large sample variances you have are due merely to the presence of a
> small number of highly discrepant values?
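The check suggested above can be sketched as follows: compare the ordinary sample standard deviation of a group with a robust scale estimate that a few discrepant values cannot dominate. The data are simulated, the helper name mad_scale is hypothetical, and the choice of the median absolute deviation as the robust estimate is an assumption made for illustration, not something proposed in the post.

```python
import random
import statistics


def mad_scale(values):
    """Median absolute deviation, scaled by 1.4826 so that it estimates
    the standard deviation when the data are normally distributed."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])


rng = random.Random(7)  # fixed seed so the run is repeatable
clean = [rng.gauss(10.0, 1.0) for _ in range(50)]
# The same group, with its first three values replaced by grossly
# discrepant ones (made-up numbers for illustration).
contaminated = [25.0, 28.0, -8.0] + clean[3:]

for name, group in (("clean", clean), ("contaminated", contaminated)):
    sd = statistics.stdev(group)
    print(f"{name}: sd = {sd:.2f}, MAD scale = {mad_scale(group):.2f}")
```

If a group's sample standard deviation is much larger than its robust scale, the apparent heterogeneity of variances may come from a handful of observations rather than from the group as a whole, and those observations deserve a closer look before any transformation is chosen.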
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
> "The business of the statistician is to catalyze the scientific
> process." - George E. P. Box
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907