[R] Box-Cox Transformation: Drastic differences when varying added constants

Greg Snow Greg.Snow at imail.org
Wed May 19 05:41:56 CEST 2010


Have you read the Box-Cox paper?  It contains the theory for dealing with an offset parameter (though I don't know of any existing functions that help estimate both lambdas at the same time).  Another important point, also made in the paper, is that the lambda values used should be based on sound science, not just on what fits best.
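
For reference, the offset (two-parameter) family can be written roughly as below. This is only a sketch: lambda_1 denotes the power and lambda_2 the shift that keeps y + lambda_2 positive, and these symbols are notation chosen for this note, not the argument names of any R function.

```latex
\[
y^{(\lambda_1,\lambda_2)} =
\begin{cases}
\dfrac{(y+\lambda_2)^{\lambda_1}-1}{\lambda_1}, & \lambda_1 \neq 0,\\[1.5ex]
\log(y+\lambda_2), & \lambda_1 = 0,
\end{cases}
\qquad y+\lambda_2 > 0 .
\]
\[
\frac{d}{dy}\,y^{(\lambda_1,\lambda_2)} = (y+\lambda_2)^{\lambda_1-1} > 0
\quad\text{for every } \lambda_1 .
\]
```

Because the derivative is positive for every lambda_1, the scaled form always preserves the ordering of the data. A bare power (y + lambda_2)^lambda_1 with a negative lambda_1 is decreasing instead, which would account for a negative correlation with the original variable. Adding a constant by hand amounts to fixing lambda_2 in advance, so the estimate of lambda_1 is conditional on that choice.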

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Holger Steinmetz
> Sent: Sunday, May 16, 2010 6:22 AM
> To: r-help at r-project.org
> Subject: [R] Box-Cox Transformation: Drastic differences when varying added constants
> 
> 
> Dear experts,
> 
> I tried to learn about the Box-Cox transformation but found the following:
> 
> When I had to add a constant to make all values of the original variable
> positive, I found that the lambda estimates (box.cox.powers function)
> differed dramatically depending on the specific constant chosen.
> 
> In addition, the correlation between the transformed variable and the
> original was not 1 (as I think it should be to use the transformed
> variable meaningfully) but much lower.
> 
> With higher added values (and a right-skewed variable) the lambda estimate
> was even negative, and the correlation between the transformed variable and
> the original variable was -.91!
> 
> I guess there is something fundamental missing in my current thinking about
> Box-Cox...
> 
> Best,
> Holger
> 
> 
> P.S. Here is what I did:
> 
> # Create a skewed variable X (mixture of two normals)
> x1 = rnorm(120,0,.5)
> x2 = rnorm(40,2.5,2)
> X = c(x1,x2)
> 
> # Adding a small constant
> Xnew1 = X +abs(min(X))+ .1
> box.cox.powers(Xnew1)
> Xtrans1 = Xnew1^.2682 #(the value of the lambda estimate)
> 
> # Adding a larger constant
> Xnew2 = X +abs(min(X)) + 1
> box.cox.powers(Xnew2)
> Xtrans2 = Xnew2^-.2543 #(the value of the lambda estimate)
> 
> #Plotting it all
> par(mfrow=c(3,2))
> hist(X)
> qqnorm(X)
> qqline(X,lty=2)
> hist(Xtrans1)
> qqnorm(Xtrans1)
> qqline(Xtrans1,lty=2)
> hist(Xtrans2)
> qqnorm(Xtrans2)
> qqline(Xtrans2,lty=2)
> 
> #correlation among original and transformed variables
> round(cor(cbind(X,Xtrans1,Xtrans2)),2)
> --
> View this message in context: http://r.789695.n4.nabble.com/Box-Cox-Transformation-Drastic-differences-when-varying-added-constants-tp2218490p2218490.html
> Sent from the R help mailing list archive at Nabble.com.