[R] deviance vs entropy
remoteapl at obninsk.com
Thu Feb 15 14:19:08 CET 2001
The question may look simple. It's probably even stupid. But I spent several hours
searching the Internet and downloaded tons of papers where deviance is mentioned...
and haven't found an answer.
Well, the use of entropy when I split some node of a classification tree is clear to me.
The sense is clear, because entropy is a good old measure of how uniform a distribution is.
And we want, for sure, the class distribution in a node to be far from uniform, ideally
representing one class only.
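To make that concrete, here is a small sketch (in Python rather than R, purely for illustration; the function name is mine) of node entropy as used for splitting: it is 0 for a pure node and maximal for a uniform one.

```python
import math

def node_entropy(counts):
    """Entropy of a node's class distribution: -sum_k p_k * log(p_k),
    where p_k = n_k / n and classes with zero count contribute nothing."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# A pure node (one class only) has entropy 0.
pure = node_entropy([10, 0, 0])

# A two-class node split 50/50 has the maximum entropy, log(2).
uniform = node_entropy([5, 5])
```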
Where does deviance come from at all? I look at the formula and see that the only difference
from entropy is the use of the *number* of points in each class, instead of the *probability*,
as the multiplier of log(Pik). So it looks like deviance and entropy differ only by a factor
of N (or 2N), where N is the number of cases in the node. Then WHY say "deviance"? Is there
a historical reason? Or, most likely, I do not understand something very basic. Please help.
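The factor in question can be checked numerically. The sketch below (Python for illustration; function names are mine) assumes the node-deviance definition D = -2 * sum_k n_k * log(p_k) with p_k = n_k / n, as used in tree-based model texts. Substituting n_k = n * p_k gives D = 2 * n * H, i.e. deviance is entropy scaled by twice the node size:

```python
import math

def entropy(counts):
    """Node entropy: -sum_k p_k * log(p_k), p_k = n_k / n."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def deviance(counts):
    """Node deviance: -2 * sum_k n_k * log(p_k) (definition assumed above)."""
    n = sum(counts)
    return -2 * sum(c * math.log(c / n) for c in counts if c > 0)

counts = [30, 15, 5]
n = sum(counts)

# The two criteria agree up to the positive factor 2n, so for a single
# node they rank candidate splits' children identically.
d = deviance(counts)
h = entropy(counts)
```

Because the factor 2n is positive, minimizing deviance and minimizing n-weighted entropy over the children of a split select the same split.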