> Indeed, this topic has got me wondering how many times I may have
> blindly used sd(x) in the past, as if it were going to give me the
> standard (sum(x - mean(x))^2)/length(x) result!
At the risk of flogging a horse that has been dead for the better part of a century, I don't think there is anything "standard" about an SD with a divisor of N, and whether the version with the N-1 divisor is biased is not really the crucial issue. Rather, the distinction is between
- one sample from a known finite distribution
- multiple samples from an unknown distribution
and, in particular, whether the mean is known or has to be estimated.
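
For concreteness, here is a small sketch of the two divisors in R. The data vector is made up for illustration, and the divisor-N line is my reading of what the quoted message had in mind (with the square inside the sum and a square root taken); sd() itself uses N-1.

```r
# Made-up data, purely for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
N <- length(x)

# R's sd() divides the sum of squared deviations by N - 1
sd_n1 <- sd(x)

# The divisor-N version has to be written out by hand
sd_n <- sqrt(sum((x - mean(x))^2) / N)

# The two differ only by the factor sqrt((N - 1) / N)
c(divisor_N_minus_1 = sd_n1, divisor_N = sd_n, ratio = sd_n / sd_n1)
```

The ratio shrinks towards 1 as N grows, which is why the choice rarely matters for a plain sample SD.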
One argument for the N-1 divisor in the normal case is that you can transform the data orthogonally to one observation with unknown mean and N-1 independent observations with mean known to be 0. The variance estimate is then a function of those N-1 variables only, and thus there is no reason to let the mere existence of the uninformative Nth variable change the estimator.
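
The transformation argument can be checked numerically. The sketch below is one way to do it, assuming an orthonormal basis built from Helmert contrasts; any orthogonal matrix whose first column is constant would do, and the simulated data and variable names are mine.

```r
set.seed(1)
N <- 8
x <- rnorm(N, mean = 5, sd = 2)   # simulated normal data (illustrative values)

# Orthonormal N x N matrix: the first column is proportional to the
# constant vector, the remaining N - 1 columns are orthogonal to it.
H <- qr.Q(qr(cbind(1, contr.helmert(N))))

# Rotate the data: z = H'x
z <- drop(crossprod(H, x))

# z[1] is +/- sqrt(N) * mean(x) and carries the unknown mean;
# z[-1] are N - 1 coordinates with mean known to be 0.
mean_part <- abs(z[1])

# The average of the N - 1 squared zero-mean coordinates is exactly var(x)
var_from_rest <- sum(z[-1]^2) / (N - 1)
all.equal(var_from_rest, var(x))
```

Seen this way, the N-1 divisor is simply the "known mean" estimator applied to the N-1 coordinates that actually carry information about the variance.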
Of course, few people really care about N vs. N-1, but in larger linear models the divisor becomes N-p, where p is the number of estimated coefficients, and p can be a sizeable fraction of N.
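
To see how much the divisor can matter in a regression setting, here is a small simulated sketch. The sizes N = 30 and p = 10 and the pure-noise response are arbitrary choices of mine, picked only so that p is a third of N.

```r
set.seed(2)
N <- 30
p <- 10                                   # intercept plus 9 slopes
X <- matrix(rnorm(N * (p - 1)), N, p - 1) # arbitrary simulated predictors
y <- rnorm(N)                             # pure noise, true error variance 1

fit <- lm(y ~ X)
rss <- sum(resid(fit)^2)

# Residual variance with the two candidate divisors
s2_N  <- rss / N         # divisor N
s2_Np <- rss / (N - p)   # divisor N - p, which is what summary.lm() reports

c(divisor_N = s2_N, divisor_N_minus_p = s2_Np,
  summary_lm = summary(fit)$sigma^2)
```

Here the divisor-N figure is smaller than the N-p one by the factor (N-p)/N = 2/3, which is no longer a negligible difference.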
