[R] Generating uniformly distributed correlated data.

Mike Marchywka marchywka at hotmail.com
Mon Feb 21 13:51:18 CET 2011

----------------------------------------
> Date: Mon, 21 Feb 2011 13:03:53 +0100
> From: erich.neuwirth at univie.ac.at
> To: soren.faurby at biology.au.dk; r-help at r-project.org
> Subject: Re: [R] Generating uniformly distributed correlated data.
>
> hw<-function(r){
> (3-sqrt(1+8*r))/4
> }
>
>
> x<-runif(1000)
> y<-(x+runif(1000,-hw(0.5),hw(0.5))) %% 1
>
>
> x and y will have correlation 0.5 and will be uniformly distributed
> on the unit interval.
> Replacing 0.5 by any nonnegative number r between 0 and 1 will
> create correlated uniformly distributed random numbers with correlation r.
>
> plot(x,y) will show the construction of the joint distribution of these
> random numbers.
> The rest is simple algebra.

Let me see if I can explain this, since I puzzled over it for a while.
To decorrelate x and y, the general strategy is to add noise. This of
course makes the resulting distribution the convolution of the two
source distributions, which may not be simple algebra to some :) To get
back the uniform, you could consider things like warping the output values.

This is rather clever, at least I hadn't seen it, as you are convolving
a large rectangle with a smaller one. The result is already uniform in
most places except near the ends, where the %% 1 translates the
out-of-range part back into place, exactly restoring the uniform distribution.
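
A quick way to convince yourself of the wrap-around argument is to look at
y directly. The sketch below assumes the hw() from the quoted message; the
seed, the sample size, and the ten-bin tabulation are my own arbitrary
choices, not part of the original construction.

```r
# hw() copied from the quoted message: half-width of the added noise
hw <- function(r) (3 - sqrt(1 + 8 * r)) / 4

set.seed(1)                            # arbitrary seed, for reproducibility only
n <- 100000
x <- runif(n)
s <- x + runif(n, -hw(0.5), hw(0.5))   # sum before wrapping, spills past [0,1)
y <- s %% 1                            # wrap the out-of-range tails back in

# Fraction of samples in each of ten equal bins; each should be near 0.1
bins <- table(cut(y, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)) / n
print(round(bins, 3))
print(range(s))                        # the unwrapped sum does leave [0,1)
```

If the end bins came out noticeably lighter than the middle ones, the
wrapping step would not be doing its job.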

I guess it still isn't obvious to me what the mod
does to cor. Do you have an exact relation between your innovation
amplitude and the resulting cor? I'd try it myself but can see
I need more coffee first. It is probably easy to show by suitable
calculation of E(x) and E(x^2)?
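
For what it's worth, the quoted hw() can be inverted by hand: solving
h = (3 - sqrt(1 + 8r))/4 for r gives r = 1 - 3h + 2h^2 = (1 - h)(1 - 2h),
so h = 0 gives r = 1 and h = 1/2 gives r = 0. That step is only algebra on
hw(), so the sketch below compares it against a simulated cor. The name
r_from_h, the seed, the sample size, and the chosen h values are my own
assumptions.

```r
hw <- function(r) (3 - sqrt(1 + 8 * r)) / 4   # from the quoted message

# Hypothetical helper: algebraic inverse of hw(),
# mapping noise half-width h to the target correlation r
r_from_h <- function(h) 1 - 3 * h + 2 * h^2

set.seed(42)                                  # arbitrary seed
n <- 200000
x <- runif(n)
for (h in c(0.05, 0.1, 0.25, 0.4)) {
  y <- (x + runif(n, -h, h)) %% 1             # same construction, h given directly
  cat(sprintf("h = %.2f  predicted = %.4f  simulated = %.4f\n",
              h, r_from_h(h), cor(x, y)))
}
```

The predicted and simulated columns should agree to within sampling noise,
which at this sample size is a few parts in a thousand.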


> xaa<-runif(100000)
> for ( a in (1:10)/10 ) {
+ yaa<-(xaa+runif(100000,-hw(a),hw(a))) %% 1
+ print(a); print (cor(xaa,yaa));
+ }
[1] 0.1
[1] 0.09678492
[1] 0.2
[1] 0.2013333
[1] 0.3
[1] 0.3017615
[1] 0.4
[1] 0.3948470
[1] 0.5
[1] 0.4995593
[1] 0.6
[1] 0.6023712
[1] 0.7
[1] 0.7005193
[1] 0.8
[1] 0.8010425
[1] 0.9
[1] 0.8992287
[1] 1
[1] 1
>

> On 2/20/2011 3:17 AM, Søren Faurby wrote:
> > I wish to generate a vector of uniformly distributed data with a defined
> > correlation to another vector
> >
> > The only function I have been able to find doing something similar is
> > corgen from the library ecodist.
> >
> > The following code generates data with the desired correlation to the
> > vector x but the resulting vector y is normal and not uniform distributed
> >
> > library(ecodist)
> > x <- runif(10^5)
> > y <- corgen(x=x, r=.5)$y
> >
> > Does anyone know a similar function generating uniformly distributed data or
> > a way of transforming y to the desired distribution while keeping the
> > correlation between x and y
> >
> > Kind regards, Soren
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.