[R] Help with simulation of unbalanced clustered data

Wed Dec 16 14:50:40 CET 2020

This is R-help, not R-do-my-work-for-me. It is also not a homework help line. The Posting Guide is required reading. Assuming this is not homework: each step in your problem definition maps to a fairly basic operation in R (the sample function and indexing being the key tools). You should therefore be showing your work with a reproducible example that illustrates where you are stuck, or why the result you are getting does not have the desired properties.
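
For orientation only, here is a minimal sketch of how the steps in the question could map onto sample() and indexing. It is not a vetted solution. The seed, the variable names, the placeholder outcome y, and above all the exponential removal weights w are assumptions of this sketch; the question does not say how the exclusion probabilities should differ between clusters.

```r
set.seed(2020)                 # arbitrary seed, for reproducibility only

n_clusters <- 20
n_target   <- 30               # desired average cluster size
n_start    <- 33               # 10% more than the target

# Start balanced: 20 clusters x 33 observations = 660 rows.
dat <- data.frame(
  cluster = rep(seq_len(n_clusters), each = n_start),
  y       = rnorm(n_clusters * n_start)    # placeholder outcome (assumption)
)

# Hypothetical cluster-level removal weights (assumption): unequal
# weights make some clusters lose more rows than others.
w <- rexp(n_clusters)

# Number of rows to drop so that the average returns to 30 per cluster.
n_drop <- nrow(dat) - n_clusters * n_target

# Sample row indices without replacement, weighting each row by the
# weight of the cluster it belongs to.
drop <- sample(nrow(dat), size = n_drop, prob = w[dat$cluster])

# Negative indexing removes the sampled rows.
unbalanced <- dat[-drop, ]

table(unbalanced$cluster)      # cluster sizes now vary
nrow(unbalanced)               # total number of remaining rows
```

Random weights alone do not guarantee that any cluster is left untouched. Setting some entries of w to 0 would keep those clusters at 33 observations, provided enough rows with positive weight remain to supply the 60 removals.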

On December 15, 2020 6:48:12 PM PST, Chao Liu <psychaoliu using gmail.com> wrote:
>Dear R experts,
>
>I want to simulate some unbalanced clustered data. The number of clusters
>is 20 and the average number of observations is 30. However, I would like
>to create an unbalanced clustered data per cluster where there are 10% more
>observations than specified (i.e., 33 rather than 30). I then want to
>randomly exclude an appropriate number of observations (i.e., 60) to arrive
>at the specified average number of observations per cluster (i.e., 30). The
>probability of excluding an observation within each cluster was not uniform
>(i.e., some clusters had no cases removed and others had more excluded).
>Therefore in the end I still have 600 observations in total. How to realize
>that in R? Thank you for your help!
>
>Best,
>
>Liu
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.