[R] A vector of normal distributed values with a sum-to-zero constraint

Tue Apr 1 16:58:17 CEST 2014

It seems so simple to me, that I must be missing something.

Subject to Jeff Newmiller's reminder of FAQ 7.31, if the sum is zero 
then the mean is zero, and vice versa.
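
For anyone who has not met FAQ 7.31, here is a small illustration of why 
"exactly zero" is the wrong test for a floating-point sum. It is only a 
sketch: the seed, the vector length and the tolerance `tol` are arbitrary 
choices for illustration, not anything from the thread.

```r
# FAQ 7.31 in miniature: most decimal fractions have no exact binary form
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE

# sum(x) and mean(x) differ only by the factor length(x), so one is
# zero (up to rounding) exactly when the other is
set.seed(1)
x <- rnorm(1000, 0, 0.5)
x <- x - mean(x)
tol <- 1e-10                         # assumed tolerance, adjust to taste
abs(sum(x)) < tol
abs(mean(x)) < tol
```

Comparing against a tolerance, or using all.equal(), is the usual way to 
test such a sum.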

The OP's original attempt of:
-------------
l <- 1000000
aux <- rnorm(l,0,0.5)
s <- sum(aux)/l
aux2 <- aux-s
sum(aux2)
-------------
is equivalent to

   aux2 <- rnorm(l,0,0.5)
   aux2 <- aux2-mean(aux2)

If calculations were exact then aux2 would have mean, and thus sum, 
equal to zero - any difference from zero is attributable entirely to 
machine precision.
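
The centering step can be wrapped in a small function, sketched below. The 
name `rnorm_sum0`, the seed and the tolerance are made up for illustration. 
Centering treats every element alike, so each one stays Normal with mean 
zero; the only change is that the standard deviation shrinks by the factor 
sqrt(1 - 1/n), which is negligible for large n.

```r
# Hypothetical helper: n Normal draws, centred so that they sum to zero
# (up to floating-point rounding, see FAQ 7.31)
rnorm_sum0 <- function(n, sd = 1) {
  x <- rnorm(n, mean = 0, sd = sd)
  x - mean(x)
}

set.seed(112358)
n <- 1000000
aux2 <- rnorm_sum0(n, sd = 0.5)

sum(aux2)                  # tiny, but not necessarily exactly 0
abs(sum(aux2)) < 1e-8      # assumed tolerance
sd(aux2)                   # sample sd, close to 0.5
0.5 * sqrt(1 - 1/n)        # theoretical sd of each centred element
```

By contrast, when a single element is set to minus the sum of the other 
n - 1, that one element has standard deviation sd * sqrt(n - 1) while the 
rest keep sd.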

On 01/04/2014 15:25, Boris Steipe wrote:
> But the result is not Normal. Consider:
>
> set.seed(112358)
> N <- 100
> x <- rnorm(N-1)
> sum(x)
>
> [1] 1.759446   !!!
>
> i.e. you have an outlier at 1.7 sigma, and for larger N...
>
> set.seed(112358)
> N <- 10000
> x <- rnorm(N-1)
> sum(x)
> [1] -91.19731
>
> B.
>
>
> On 2014-04-01, at 10:14 AM, JLucke at ria.buffalo.edu wrote:
>
>> The sum-to-zero constraint imposes a loss of one degree of freedom. Of N samples, only (N-1) can be random. Thus the solution is:
>>> N <- 100
>>> x <- rnorm(N-1)
>>> x <- c(x, -sum(x))
>>> sum(x)
>> [1] -7.199102e-17
>>
>> Boris Steipe <boris.steipe at utoronto.ca>
>> Sent by: r-help-bounces at r-project.org
>> 04/01/2014 09:29 AM
>>
>> To
>> Marc Marí Dell'Olmo <marceivissa at gmail.com>,
>> cc
>> "r-help at r-project.org" <r-help at r-project.org>
>> Subject
>> Re: [R] A vector of normal distributed values with a sum-to-zero constraint
>>
>> Make a copy with opposite sign. This is Normal, symmetric, but no longer random.
>>
>>   set.seed(112358)
>>   x <- rnorm(5000, 0, 0.5)
>>   x <- c(x, -x)
>>   sum(x)
>>   hist(x)
>>
>> B.
>>
>> On 2014-04-01, at 8:56 AM, Marc Marí Dell'Olmo wrote:
>>
>>> Dear all,
>>>
>>> Does anyone know how to generate a vector of Normally distributed values
>>> (for example N(0,0.5)), but with a sum-to-zero constraint?
>>>
>>> The sum should be exactly zero, without decimals.
>>>
>>> I made some attempts:
>>>
>>>> l <- 1000000
>>>> aux <- rnorm(l,0,0.5)
>>>> s <- sum(aux)/l
>>>> aux2 <- aux-s
>>>> sum(aux2)
>>> [1] -0.000000000006131392
>>>>
>>>> aux[1]<- -sum(aux[2:l])
>>>> sum(aux)
>>> [1] -0.00000000000003530422
>>>
>>>
>>> but the sum is not exactly zero and not all parameters are N(0,0.5)
>>> distributed...
>>>
>>> Perhaps it is obvious, but I can't find the way to do it.
>>>
>>> Thank you very much!
>>>
>>> Marc
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>