[R] Problem about Box-Cox transformation (topic in html form)

John Fox jfox at mcmaster.ca
Mon Jan 11 14:56:50 CET 2010


Dear Saji Ren,

Dieter Menne has already pointed out that you lost the negative values in
the transformation: in R, raising a negative number to a non-integer power
gives NaN. Another point is that since you selected the transformation
based on the "started" data c888.dl.ma080 + 1200, you should transform
c888.dl.ma080 + 1200 and not c888.dl.ma080. But as Dieter also pointed out,
the 0.95 power isn't going to change the distribution of the data much. As
well, the problem here is that the distribution is more heavy-tailed than
asymmetric, and a Box-Cox transformation isn't going to help.
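
A minimal sketch of these points in base R, under stated assumptions: the
real c888.dl.ma080 series is not available here, so x is a simulated
symmetric, heavy-tailed stand-in, and the helper names bc and kurt are
hypothetical, introduced only for illustration.

```r
set.seed(1)

# Simulated stand-in for c888.dl.ma080 (an assumption, not the real data):
# symmetric around 0, heavy-tailed, with many negative values.
x <- 200 * rt(2000, df = 3)
x <- x[x > -1200]            # keep x + 1200 strictly positive

start  <- 1200               # the "start" added in the original post
lambda <- 0.95               # roughly the estimated power reported there

# Box-Cox transformation; defined only for strictly positive y.
bc <- function(y, lambda) {
  stopifnot(all(y > 0))
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# 1. Applying the power to the unshifted data loses every negative value.
lost <- suppressWarnings(x^lambda)
sum(is.nan(lost))            # equals the number of negative observations

# 2. Transform the started data instead; the ordering is preserved.
z <- bc(x + start, lambda)
all(diff(z[order(x)]) >= 0)

# 3. A power this close to 1 is nearly a linear rescaling.
cor(x, z)

# 4. Tail weight before and after (sample excess kurtosis; 0 for a normal).
kurt <- function(v) mean((v - mean(v))^4) / mean((v - mean(v))^2)^2 - 3
c(raw = kurt(x), transformed = kurt(z))
```

If the correlation is very close to 1 and the excess kurtosis stays well
above 0 after transforming, the power transformation has only rescaled the
data and the heavy tails remain.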

Regards,
 John

--------------------------------
John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Saji Ren
> Sent: January-11-10 12:09 AM
> To: r-help at r-project.org
> Subject: [R] Problem about Box-Cox transformation (topic in html form)
> 
> 
> Hi:
> 
> I want to transform my data to make it more normal while preserving the
> ordering of the observations, so I decided to use a Box-Cox
> transformation.
> Below is the QQ plot of the original data:
> http://n4.nabble.com/file/n1011015/start%2Bvalue%2Bproblem%2B02.jpeg
> Note that the minimum of my data is -1099, so I added a fixed value of
> 1200 to the original sample.
> 
> I chose the "box.cox.powers" function in package 'car'. Here is the result:
> > box.cox.powers(na.exclude(c888.dl.ma080+1200))
> Box-Cox Transformation to Normality
> 
>  Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
>     0.9526   0.0237       40.2638       -2.0036
> 
> L.R. test, power = 0:  2014.192   df = 1   p = 0
> L.R. test, power = 1:  3.9807   df = 1   p = 0.046
> 
> Then I compared the result with the original data, and it really confused me:
> http://n4.nabble.com/file/n1011015/start%2Bvalue%2Bproblem.jpeg
> On the left is my original data sample; you can see that it is symmetric
> and the mean is close to 0. It is just that the spread is large (there are
> outliers). On the right is the transformed data, and the distribution is
> obviously not normal.
> 
> Can anyone explain that to me?
> 
> Thank you in advance.
> 
> 
> -----
> ------------------------------------------------------------------
> Saji Ren
> from Shanghai China
> GoldenHeart Investment Group
> ------------------------------------------------------------------


