[R] powerTransform warning message?

Brittany Demmitt demmitba at gmail.com
Fri Jul 17 18:08:45 CEST 2015


Thank you so much for the explanation.  That was very helpful! :-)  

Thanks!

Brittany


> On Jul 16, 2015, at 6:16 PM, John Fox <jfox at mcmaster.ca> wrote:
> 
> Dear Brittany,
> 
> On Thu, 16 Jul 2015 17:35:38 -0600
> Brittany Demmitt <demmitba at gmail.com> wrote:
>> Hello,
>> 
>> I have a series of 40 variables that I am trying to transform via the Box-Cox method using the powerTransform function in R.  I have no zero values in any of my variables.  When I run the powerTransform function on the full data set I get the following warning. 
>> 
>> Warning message:
>> In sqrt(diag(solve(res$hessian))) : NaNs produced
>> 
>> However, when I analyze the variables in groups, rather than all 40 at a time, I do not get this warning message.  Why would this be? And does this mean the warning is safe to ignore?
>> 
> 
> No, it is not safe to ignore the warning, and the problem has nothing to do with non-positive values in the data -- when you say that there are no 0s in the data, I assume that you mean that the data values are all positive. The square roots of the diagonal entries of the inverse of the Hessian at the (pseudo-) ML estimates are the SEs of the estimated transformation parameters. If the Hessian can't be inverted, or if its inverse has negative diagonal entries (which is what produces the NaNs), that usually implies that the maximum of the (pseudo-) likelihood isn't well defined. 
> 
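The warning comes from the expression sqrt(diag(solve(res$hessian))): solve() inverts the Hessian, and sqrt() returns NaN for any negative diagonal entry of the inverse. The following is a minimal base-R sketch of that mechanism. The 2 x 2 matrix H is made up for illustration and is not computed by powerTransform; it is invertible but not positive definite, which is what one would expect if the optimizer stopped somewhere other than a well-defined maximum.

```r
# Hypothetical "Hessian": symmetric and invertible, but not positive definite.
H <- matrix(c(2, 3,
              3, 2), nrow = 2, byrow = TRUE)

# A positive-definite matrix has only positive eigenvalues; this one does not.
ev <- eigen(H, symmetric = TRUE)$values
is_pos_def <- all(ev > 0)

# The inverse exists, but its diagonal entries are negative ...
H_inv <- solve(H)
d <- diag(H_inv)

# ... so the "standard errors" are NaN, with the warning "NaNs produced".
se <- suppressWarnings(sqrt(d))

print(ev)
print(d)
print(se)
```

Checking the eigenvalues of the Hessian in this way is one possible diagnostic; whether it is the most convenient one for a fitted powerTransform object is an assumption.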
> This isn't surprising when you're trying to transform as many as 40 variables at a time to multivariate normality. It's my general experience that people often throw their data into the Box-Cox black box and hope for the best without first examining the data, and, e.g., ensuring a reasonable ratio of maximum/minimum values for each variable, checking for extreme outliers, etc. Of course, I don't know that you did that, and it's perfectly possible that you were careful.
> 
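The checks mentioned above (positivity, the ratio of maximum to minimum, extreme outliers) can be sketched in base R before any transformation is attempted. Everything here is an assumption made for illustration: the data frame dat is simulated, the column names are hypothetical, and flagging outliers with boxplot.stats() on the log scale is just one convenient rule, not a method prescribed in this thread.

```r
set.seed(1)

# Simulated stand-in for the real data (hypothetical variables).
dat <- data.frame(
  x1 = rlnorm(100),              # positive, right-skewed
  x2 = runif(100, 100, 105),     # max/min close to 1
  x3 = c(rlnorm(99), 1e4)        # one extreme value
)

# One row of screening statistics per variable.
screen <- data.frame(
  variable      = names(dat),
  all_positive  = sapply(dat, function(v) all(v > 0)),
  max_min_ratio = sapply(dat, function(v) max(v) / min(v)),
  n_outliers    = sapply(dat, function(v) length(boxplot.stats(log(v))$out)),
  row.names     = NULL
)

print(screen)
```

A max/min ratio close to 1 (as for x2) leaves little information for estimating a power, while a single extreme value (as in x3) can dominate the estimate, so variables like these are worth inspecting individually first.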
>> I would like to add that all of my lambda values are in the -5 to 5 range.  I also get different lambda values when I analyze the variables together versus in groups.  Is this to be expected?
>> 
> 
> Yes. It's very unlikely that both are right. If, e.g., the variables are multivariate normal within groups then their marginal distribution is a mixture of multivariate normals, which almost surely isn't itself normal.
> 
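The point about mixtures can be illustrated with a small simulation. This sketch is univariate for simplicity (the statement above concerns the multivariate case), and the group means, sample sizes, and use of shapiro.test() are all assumptions chosen only to make the effect visible.

```r
set.seed(42)

# Two hypothetical groups, each normal, with well-separated means.
g1 <- rnorm(200, mean = 0, sd = 1)
g2 <- rnorm(200, mean = 6, sd = 1)

# Pooling the groups gives a two-component mixture of normals.
pooled <- c(g1, g2)

# Shapiro-Wilk p-values: within each group and for the pooled sample.
p_g1     <- shapiro.test(g1)$p.value
p_g2     <- shapiro.test(g2)$p.value
p_pooled <- shapiro.test(pooled)$p.value

print(c(g1 = p_g1, g2 = p_g2, pooled = p_pooled))
```

With means this far apart the pooled sample is clearly bimodal, so the test rejects normality for it even though each group was generated from a normal distribution.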
> I hope this helps,
> John
> 
> ------------------------------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
> 	
> 	
>> Thank you so much!
>> 
>> Brittany


