[R] Using MASS::boxcox for a single variable gives different results than the original paper

Tal Galili tal.galili at gmail.com
Mon Oct 12 15:32:12 CEST 2015


After trying this with the function "estimateTransform" from {car}, it
returns values similar to my solution rather than the one from MASS::boxcox:


# Toy data
################
set.seed(13241089)
x <- rnorm(1000, 10)
x2 <- x**2 # we want to transform x2 to something more normal



# using MASS::boxcox
################

mle <- function(BC) {
with(BC, x[which.max(y)])
}

ONES <- rep(1, length(x2))
a <- MASS::boxcox(lm(x2 ~ ONES))
mle(a)
# lambda:
# 0.42424



# using estimateTransform from car
################

# Same result as the paper: !
library(car)
ONES <- rep(1, length(x2))
estimateTransform(X=data.frame(x = ONES), Y = x2)
# lambda:
# 0.40782

(just as with my own function in the previous email)



What am I missing?









----------------Contact
Details:-------------------------------------------------------
Contact me: Tal.Galili at gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------


On Mon, Oct 12, 2015 at 12:17 PM, Tal Galili <tal.galili at gmail.com> wrote:

> Hello all,
>
> Given a set of observations, I would like to find lambda for a boxcox
> transformation so to get a symmetric/normal result as possible.
>
> I try to use MASS::boxcox, but get different results than when using the
> formula from the original box-cox paper (link
> <http://www.jstor.org/stable/2984418?seq=1#page_scan_tab_contents>).
>
> I probably have made an error somewhere, but I can't figure out where.
>
> Here is an example in which the lambda by MASS::boxcox is 0.42424, while
> by the formula from the paper I get 0.40782.
>
>
>
>
>
>
>
>
> # Toy data
> ################
> set.seed(13241089)
> x <- rnorm(1000, 10)
> x2 <- x**2 # we want to transform x2 to something more normal
> plot(density(x2))
>
> # using MASS::boxcox
> ################
>
> zpoints <- function(y) {
> n <- length(y)
> qnorm(ppoints(n))[order(order(y))]
> }
> mle <- function(BC) {
> with(BC, x[which.max(y)])
> }
>
> a <- MASS::boxcox(x2 ~ zpoints(x2))
> mle(a)
> # lambda:
> # 0.42424
>
>
>
> # using formula from the paper
> ################
>
> loglik_lambda <- function(l, y) {
> GM <- exp(mean(log(y)))
> if(l==0) x <- log(y)*GM else x <- (y^l-1)/ (l * GM^(l-1) )
> #  if(l==0) x <- log(y) else x <- (y^l-1)/ (l )
> sd(x)
> }
> fo <- function(l) loglik_lambda(l, y = x2)
> V_fo <- Vectorize(fo)
> V_fo(2)
> curve(V_fo, -.5,1.5)
> optimize(V_fo, c(-3,3))
> # lambda:
> # 0.40782
>
>
>
>
>
>
>
>
>
>
>
>
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list