[R] Box-Cox Transformation: Drastic differences when varying added constants
billpikounis at gmail.com
Mon May 17 18:23:07 CEST 2010
I would also highly recommend you look at the ?boxcox and ?logtrans
functions in the MASS package. There is also a very illuminating,
concise discussion about their use on Pages 170 - 172 of
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics
with S. Fourth edition.
Hope that helps,
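
For readers following along, here is a minimal sketch of how the two MASS functions mentioned above are typically called. The data are simulated, and the names dat, lambda_hat and alpha_hat are illustrative choices of mine, not anything from the thread. It assumes a response that is roughly linear on the log scale, so the Box-Cox profile should peak somewhere near lambda = 0.

```r
library(MASS)  # recommended package that ships with R

set.seed(1)    # arbitrary seed, only for reproducibility
dat <- data.frame(x = runif(100, 1, 10))              # hypothetical predictor
dat$y <- exp(1 + 0.3 * dat$x + rnorm(100, sd = 0.4))  # strictly positive response

# Profile log-likelihood of the Box-Cox parameter lambda for y ~ x
bc <- boxcox(y ~ x, data = dat, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]   # grid value with the highest likelihood

# Profile log-likelihood of the shift alpha in log(y + alpha)
lt <- logtrans(y ~ x, data = dat, alpha = seq(0.1, 20, length.out = 50),
               plotit = FALSE)
alpha_hat <- lt$x[which.max(lt$y)]    # grid value with the highest likelihood
```

With plotit = TRUE (the default) both functions draw the profile with an approximate confidence interval, which is usually more informative than the point estimate alone.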
On Sun, May 16, 2010 at 13:01, Peter Ehlers <ehlers at ucalgary.ca> wrote:
> On 2010-05-16 6:22, Holger Steinmetz wrote:
>> Dear experts,
>> I tried to learn about the Box-Cox transformation but found the following:
>> when I had to add a constant to make all values of the original variable
>> positive, the lambda estimates (box.cox.powers function) differed
>> dramatically depending on the specific constant chosen.
> Let's say that x is such that 1/x has a Normal distribution,
> i.e. lambda = -1.
> Then y = (1/x) + b also has a Normal distribution.
> But you're expecting 1/(x+b) to also have a Normal distribution.
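
The point above can be checked numerically. The sketch below regenerates a mixture like the one in the original post, shifts it by several constants, and re-estimates lambda each time with MASS::boxcox on an intercept-only model. The helper lambda_for_shift, the seed, the grid and the third shift value (10) are assumptions of mine for illustration. The thread used box.cox.powers, so the exact numbers will not match the ones quoted there.

```r
library(MASS)

set.seed(42)  # arbitrary seed, only for reproducibility
X <- c(rnorm(120, 0, 0.5), rnorm(40, 2.5, 2))  # mixture as in the original post

# Hypothetical helper: shift x so its minimum equals `shift`, then return the
# lambda on `grid` that maximises the Box-Cox profile log-likelihood.
lambda_for_shift <- function(x, shift, grid = seq(-3, 3, by = 0.01)) {
  d <- data.frame(y = x - min(x) + shift)
  bc <- boxcox(y ~ 1, data = d, lambda = grid, plotit = FALSE)
  bc$x[which.max(bc$y)]
}

shifts <- c(0.1, 1, 10)
est <- sapply(shifts, function(s) lambda_for_shift(X, s))
print(data.frame(shift = shifts, lambda_hat = est))
```

The estimates should generally change with the shift, because each added constant defines a different family of transformations of the original variable.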
>> In addition, the correlation between the transformed variable and the
>> original was not 1 (as I think it should be for the transformed variable
>> to be used meaningfully) but much lower.
> Again, your expectation is faulty. The relationship between the
> original and transformed variables is not linear (otherwise,
> why do the transformation?), but cor() computes the Pearson
> correlation coefficient by default. Try method='spearman'.
> Better yet, plot the transformed variables vs the original
> variable for further enlightenment.
> -Peter Ehlers
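
A small sketch of the Pearson versus Spearman point, using simulated data. The variable names and the exponent 0.25 are assumptions of mine. It also shows one thing worth keeping in mind: a raw power x^lambda with negative lambda is a decreasing function of positive x, so it reverses the ordering, whereas the scaled Box-Cox form (x^lambda - 1)/lambda stays increasing.

```r
set.seed(7)                       # arbitrary seed, only for reproducibility
x <- rexp(200) + 0.1              # hypothetical positive, right-skewed variable

inc <- x^0.25                     # increasing power transform
dec <- x^(-0.25)                  # raw negative power: decreasing in x
lambda <- -0.25
bc  <- (x^lambda - 1) / lambda    # Box-Cox form: increasing even for lambda < 0

pearson_inc  <- cor(x, inc)                       # below 1: relation is nonlinear
spearman_inc <- cor(x, inc, method = "spearman")  # rank order is preserved
spearman_dec <- cor(x, dec, method = "spearman")  # rank order is reversed
spearman_bc  <- cor(x, bc,  method = "spearman")  # rank order is preserved

# Plotting the transformed values against the original shows the curvature
plot(x, inc, xlab = "original", ylab = "transformed")
```

A strongly negative correlation after transforming with a negative exponent is therefore what one would expect from the raw power, and not by itself a sign that something went wrong.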
>> With larger added constants (and a right-skewed variable) the lambda estimate
>> was even negative and the correlation between the transformed variable and
>> the original variable was -.91!!?
>> I guess there is something fundamental missing in my current thinking about
>> P.S. Here is what I did:
>> # Create a skewed variable X (mixture of two normals)
>> x1 = rnorm(120, 0, .5)
>> x2 = rnorm(40, 2.5, 2)
>> X = c(x1, x2)
>> # Add a small constant so that all values are positive
>> Xnew1 = X + abs(min(X)) + .1
>> Xtrans1 = Xnew1^.2682     # (the value of the lambda estimate)
>> # Add a larger constant
>> Xnew2 = X + abs(min(X)) + 1
>> Xtrans2 = Xnew2^(-.2543)  # (the value of the lambda estimate)
>> #Plotting it all
>> #correlation among original and transformed variables