[Rd] [R] computing the variance

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Dec 5 20:33:16 CET 2005


Martin Maechler <maechler at stat.math.ethz.ch> writes:

> It seems Insightful at some point in time have given in to
> this user request, and S-plus nowadays has
> an argument  "unbiased = TRUE"
> where the user can choose {to shoot (him/her)self in the leg and}
> require 'unbiased = FALSE'.
> {and there's also 'SumSquares = FALSE' which allows one to
>  skip the division (by N or N-1) altogether}
> 
> Since in some ``schools of statistics'' people are really still
> taught to use a 1/N variance, we could envisage providing such an
> argument to var() {and cov()} as well.  Otherwise, people define
> their own variance function such as  
>       VAR <- function(x, ...) .. (N-1)/N * var(x, ...)
> Should we?
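
For concreteness, a minimal sketch of what such a user-defined function
might look like. The name var_n and its 'unbiased' argument are
hypothetical; nothing like this exists in base R, and this is only one
possible way to do the rescaling:

```r
# Hypothetical helper, not part of base R: variance with a choice of divisor.
var_n <- function(x, unbiased = TRUE, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  v <- var(x)                            # base R var() divides by n - 1
  if (unbiased) v else v * (n - 1) / n   # rescale to the 1/n divisor
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
var_n(x)                    # same as var(x)
var_n(x, unbiased = FALSE)  # same as sum((x - mean(x))^2) / length(x)
```

The factor is (n-1)/n because var() has already divided the sum of
squares by n-1.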

Using the biased variance just because it is the MLE (if that is the
argument) seems confused to me. However, there's another point:

> var(sample(1:3, 100000, replace=TRUE))
[1] 0.6680556

i.e. if we are considering x as the entire population, then the
variance when sampling from it is indeed 1/N*sum((x-mean(x))^2), here
2/3 rather than var(1:3) = 1, which is why some presentations
distinguish between the "population" and "sample" variances. We might
want to support this distinction somehow.
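
The distinction can be checked directly with base R calls only. This is
an illustrative sketch; the seed is arbitrary and the simulated value
will vary slightly with it:

```r
pop <- 1:3

var(pop)                     # divides by N - 1 = 2
mean((pop - mean(pop))^2)    # divides by N = 3 ("population" variance)

set.seed(1)                  # arbitrary seed, for reproducibility only
s <- sample(pop, 100000, replace = TRUE)
var(s)                       # close to the 1/N value, not to var(pop)
```

With 100000 draws the n-1 versus n divisor is immaterial for s itself;
what matters is that the quantity being estimated is the 1/N variance of
the three population values.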

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


