[R] Looking for transformation to overcome heterogeneity of variances

John Sorkin jsorkin at grecc.umaryland.edu
Thu Aug 3 19:51:18 CEST 2006

Your question is difficult to answer without more information about the
distribution of your residuals. Different residual patterns call for
different transformations to stabilize the variance. One very common
form of heteroscedasticity is variance that increases with the value
of an independent predictor, i.e. in a regression of y on x the variance
of the residuals increases as x increases. In this case a log
transformation of some, or all, of the independent variables often
helps. Please also note the comment by Bert Gunter (included below),
which raises some important points, particularly about extreme values.
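
As a rough illustration of the pattern described above, here is a minimal
sketch in Python (standard library only). Everything in it is an assumption
made for illustration: the simulated data, the multiplicative error model,
and the helper names ols_residuals and spread_ratio. Note also that it logs
the response as well as the predictor, which is a common variant and not
exactly the transformation named in this post. It compares the spread of the
residuals in the upper half of x with the spread in the lower half, before
and after taking logs.

```python
import math
import random
import statistics


def ols_residuals(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return the residuals."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def spread_ratio(xs, residuals):
    """SD of residuals in the upper half of x divided by SD in the lower half."""
    pairs = sorted(zip(xs, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Made-up data (an assumption): multiplicative noise, so the SD of y grows with x.
xs = [rng.uniform(1.0, 100.0) for _ in range(2000)]
ys = [2.0 * x * math.exp(rng.gauss(0.0, 0.2)) for x in xs]

# Raw scale: residual spread is clearly larger for large x.
raw_ratio = spread_ratio(xs, ols_residuals(xs, ys))

# Log scale (both variables logged): residual spread is roughly constant.
log_xs = [math.log(x) for x in xs]
log_ys = [math.log(y) for y in ys]
log_ratio = spread_ratio(log_xs, ols_residuals(log_xs, log_ys))

print(f"upper/lower residual SD ratio, raw scale: {raw_ratio:.2f}")
print(f"upper/lower residual SD ratio, log scale: {log_ratio:.2f}")
```

A ratio well above 1 on the raw scale and close to 1 on the log scale is what
this kind of fan-shaped residual pattern would suggest; real data may call for
a different transformation.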

If you want more help, please describe the pattern of your residuals. 

John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu

>>> Berton Gunter <gunter.berton at gene.com> 8/3/2006 11:56:28 AM >>>
I know I'm coming late to this, but ...

> > Is someone able to suggest to me a transformation to overcome the
> > problem of heteroscedasticity?

It is not usually useful to worry about this. In my experience, the
gain in efficiency from using an essentially ideal weighted analysis vs.
an approximate unweighted one is usually small and unimportant
(transforming to simplify a model is another issue ...). Of far greater
importance is the loss in efficiency due to the presence of a few
"unusual" values; have you carefully checked to make sure that none of
the large sample variances you have are due merely to the presence of a
small number of highly discrepant values?
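
One way to run the check suggested above is sketched here in Python
(standard library only). It is an illustration under stated assumptions, not
a method taken from this thread: the data are simulated, the helper name
mad_scale is hypothetical, and the median absolute deviation (MAD) is used
as one common outlier-resistant measure of spread. A group whose sample SD
is much larger than its MAD-based scale may owe its large variance to a few
discrepant values.

```python
import random
import statistics


def mad_scale(values):
    """Median absolute deviation, scaled by 1.4826 so that it estimates
    the standard deviation when the data are normally distributed."""
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    return 1.4826 * statistics.median(abs_dev)


rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Made-up group of 30 observations (an assumption), then the same group
# with its last two values replaced by highly discrepant ones.
clean = [rng.gauss(10.0, 1.0) for _ in range(30)]
contaminated = clean[:-2] + [25.0, 30.0]

sd_clean = statistics.stdev(clean)
sd_contaminated = statistics.stdev(contaminated)
mad_contaminated = mad_scale(contaminated)

print(f"SD, clean group:            {sd_clean:.2f}")
print(f"SD, with two odd values:    {sd_contaminated:.2f}")
print(f"MAD scale, two odd values:  {mad_contaminated:.2f}")
```

Here two values out of thirty are enough to inflate the sample SD several
times over, while the MAD-based scale barely moves.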

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
"The business of the statistician is to catalyze the scientific
process."  - George E. P. Box
