[R] simulate correlated binary, categorical and continuous variable

Greg Snow 538280 at gmail.com
Wed Apr 4 18:23:41 CEST 2012


How are you calculating the correlations?  That may be part of the
problem: when you categorize a continuous variable you get a factor
whose internal representation is a set of integer codes.  If you compute
an ordinary (Pearson) correlation on those codes, the result will not be
the polychoric correlation.
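
To illustrate the point, here is a minimal sketch in base R.  The latent
correlation of 0.6 and the object names are assumptions made up for the
example; only the Ca2 probabilities come from the quoted message.  It
categorizes one of two correlated normals and compares the Pearson
correlation computed on the integer codes with the correlation of the
underlying normals.

```r
set.seed(1)
n   <- 10000
rho <- 0.6   # ASSUMPTION: illustrative latent correlation, not from the thread

# Two standard normals with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Categorize z1 using the Ca2 probabilities from the quoted message
p      <- c(.06, .18, .76)
breaks <- c(-Inf, qnorm(cumsum(p)[-length(p)]), Inf)
ca     <- cut(z1, breaks = breaks, labels = c("low", "mid", "high"))

r_latent <- cor(z1, z2)              # correlation of the latent normals
r_codes  <- cor(as.integer(ca), z2)  # Pearson correlation on the factor codes

c(latent = r_latent, codes = r_codes)
```

The correlation on the codes comes out noticeably smaller than the latent
one.  Recovering the latent value needs a polyserial or polychoric
estimator, for example polyserial() or polychor() in the polycor package.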

Also, do you need your data to have exactly the proportions and means
that you show below, or should they represent random samples from those
populations, so that the actual proportions and means will vary a
bit from what is specified?

If you are interested in tetrachoric and polychoric correlations, then
generating the latent normals and categorizing seems the most
straightforward method.
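
A rough sketch of that approach for the six variables described in the
quoted message, using base R only.  The thread does not give the
correlation values, so the common latent correlation of 0.3 is an
assumption for illustration, as are the helper name cut_latent and the
reading of N(100,15) as mean 100 and standard deviation 15.

```r
set.seed(42)
n <- 5000

# ASSUMPTION: no correlation values appear in the thread; a common latent
# correlation of 0.3 is used purely for illustration.
vars <- c("Co1", "Co2", "Ca1", "Ca2", "Bi1", "Bi2")
R <- matrix(0.3, 6, 6, dimnames = list(vars, vars))
diag(R) <- 1

# Latent standard normals with correlation R.
# chol(R) returns upper-triangular U with t(U) %*% U = R.
Z <- matrix(rnorm(n * 6), n, 6) %*% chol(R)
colnames(Z) <- vars

# Hypothetical helper: cut a standard normal into integer categories
# whose marginal probabilities are 'probs'.
cut_latent <- function(z, probs) {
  breaks <- c(-Inf, qnorm(cumsum(probs)[-length(probs)]), Inf)
  cut(z, breaks = breaks, labels = FALSE)
}

sim <- data.frame(
  Co1 = Z[, "Co1"],                               # N(0, 1)
  Co2 = 100 + 15 * Z[, "Co2"],                    # mean 100, SD 15 (assumed)
  Ca1 = cut_latent(Z[, "Ca1"], c(.02, .18, .28, .22, .30)),
  Ca2 = cut_latent(Z[, "Ca2"], c(.06, .18, .76)),
  Bi1 = as.integer(Z[, "Bi1"] > qnorm(1 - 0.4)),  # P(Bi1 = 1) = 0.4
  Bi2 = as.integer(Z[, "Bi2"] > qnorm(1 - 0.5))   # P(Bi2 = 1) = 0.5
)

colMeans(sim[, c("Co1", "Co2", "Bi1", "Bi2")])
prop.table(table(sim$Ca1))
```

In this construction R holds the latent (tetrachoric/polychoric)
correlations, not the Pearson correlations of the observed 0/1 and
category codes, which will generally be smaller.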

Also, which function (from which package) are you using to generate
your normal variables?  That may have some effect.
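
As one example of why the generator matters: if you happen to be using
mvrnorm() from the MASS package, its empirical argument decides whether
mu and Sigma describe the population (the default) or are reproduced
exactly in the sample.  The sketch below assumes a correlation of 0.5
between Co1 and Co2 and treats 15 as the standard deviation of Co2;
neither is stated in the thread.

```r
library(MASS)   # recommended package, included with standard R installs

set.seed(7)
mu <- c(Co1 = 0, Co2 = 100)
# ASSUMPTION: correlation 0.5 between Co1 and Co2, SD of Co2 = 15
Sigma <- matrix(c(1,        0.5 * 15,
                  0.5 * 15, 15^2), 2, 2)

x_random <- mvrnorm(200, mu, Sigma)                    # random sample
x_exact  <- mvrnorm(200, mu, Sigma, empirical = TRUE)  # sample moments forced

# Means and correlation vary around the targets in the random sample ...
colMeans(x_random); cor(x_random)[1, 2]
# ... and match them (up to rounding) when empirical = TRUE
colMeans(x_exact);  cor(x_exact)[1, 2]
```

Note that empirical = TRUE fixes only the moments of the normal
variables.  Proportions obtained by cutting those normals at fixed
qnorm() thresholds will still vary somewhat from sample to sample.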

On Sun, Apr 1, 2012 at 7:00 PM, Burak Aydin <burak235813 at hotmail.com> wrote:
> Hello Greg,
> Sorry for the confusion.
> Let's say I have a population with 6 variables that are correlated with
> each other. I can give you Pearson, tetrachoric, or polychoric
> correlation coefficients.
> 2 of them are continuous, 2 binary, 2 categorical.
> Let's assume the following conditions:
> Co1 and Co2 are normally distributed continuous random variables:
> Co1 ~ N(0,1), Co2 ~ N(100,15).
> Ca1 and Ca2 are categorical variables: Ca1 probabilities
> = c(.02,.18,.28,.22,.30), Ca2 probabilities = c(.06,.18,.76).
> Bi1 and Bi2 are binary, with marginal probabilities Bi1 p = 0.4, Bi2 p = 0.5.
> And, again, I have the correlations.
>
> When I try to simulate this population I fail. If I keep the means and
> probabilities the same, I lose the correct correlations. When I keep the
> correlations, I lose precision on the means and frequencies/probabilities.
> See these links please
> http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/copulademo.html
> http://stats.stackexchange.com/questions/22856/how-to-generate-correlated-test-data-that-has-bernoulli-categorical-and-contin
> http://www.springerlink.com/content/011x633m554u843g/



-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com


