[R] dataframe, simulating data

David Winsemius dwinsemius at comcast.net
Fri Dec 31 14:32:29 CET 2010


On Dec 31, 2010, at 8:05 AM, Sarah wrote:

>
> I'm sorry, I don't think I've made myself clear enough.
>
> Cases have been randomly assigned to one of the two groups, with certain
> probabilities (based on other variables).
> So, if there are too many people (i.e., more than 34) assigned to group 0, I
> would like to sample 34 cases from group 0, and give the rest of the cases a
> value 1. My dataframe would contain 40 cases; 34 with mar.y==0 and the rest
> given (or some already had) a value mar.y==1.
> If, however, too few cases have been assigned to group 0, I need to randomly
> select cases from group 1 and put them in group 0 (i.e., give them a value
> 0). My dataframe would contain the previous selected cases (mar.y==0), PLUS
> cases from group 1 who are now assigned to group 0 (mar.y==0), PLUS the
> remaining cases who stayed in group 1 (mar.y==1).
> (In other words, how can I change the value for df$mar.y from 1 to 0 or vice
> versa for some cases)?
>
> With the script I've designed, only 34 cases would remain in the dataframe
> (the cases assigned to group 0)...
>
> if (length(which(df$mar.y==0))>34) {
> df <- df[sample(which(df$mar.y==0),34), ]
> } else {
> df <- df[c(which(df$mar.y==0),
> sample(which(df$mar.y==1),34-length(which(df$mar.y==0)))), ]
> }
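
The script above leaves only 34 rows because df[i, ] returns just the rows named in i, whereas assigning into a column with df$col[i] <- value changes those cases and keeps every row. A small sketch on a made-up six-row data frame (the name `toy` and its values are illustrative assumptions, not the original data):

```r
toy <- data.frame(id = 1:6, mar.y = c(0, 0, 0, 0, 1, 1))

# Row subsetting: every row not listed in the index is dropped
sub <- toy[which(toy$mar.y == 0)[1:2], ]
nrow(sub)          # 2

# In-place assignment: rows 3 and 4 change group, all six rows are kept
toy$mar.y[which(toy$mar.y == 0)[3:4]] <- 1
nrow(toy)          # 6
toy$mar.y          # 0 0 1 1 1 1
```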

Just work on indices.
set.seed(321)
df$newgrp <- df$mar.y        # start from the original assignment
idx0 <- which(df$mar.y == 0)
idx1 <- which(df$mar.y == 1)
if (length(idx0) > 34) {  ## too many in group 0
  keep <- idx0[sample.int(length(idx0), 34)]
  df$newgrp[setdiff(idx0, keep)] <- 1   # random excess to group 1
}
# leave alone the situation with 34 in group 0

if (length(idx0) < 34) {  ## too few in group 0
  move <- idx1[sample.int(length(idx1), 34 - length(idx0))]
  df$newgrp[move] <- 0   # random cases from group 1 fill group 0 up to 34
}

#  df$newgrp is now your corrected group variable.
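
As a self-contained check, here is a minimal sketch of the same index-based idea wrapped in a function and run on simulated data. The names `rebalance` and `n0`, and the simulated `df`, are illustrative assumptions and not part of the original question. It draws positions with sample.int() because sample(x, n) treats a length-one numeric x as the range 1:x, which would pick the wrong rows if only one case were eligible.

```r
# Hypothetical helper: force exactly n0 zeros in a 0/1 grouping vector
# by randomly moving cases between groups; no rows are dropped.
# Assumes n0 is no larger than length(grp) and grp has no NAs.
rebalance <- function(grp, n0) {
  idx0 <- which(grp == 0)
  idx1 <- which(grp == 1)
  if (length(idx0) > n0) {
    # too many zeros: pick the surplus at random and move it to group 1
    move <- idx0[sample.int(length(idx0), length(idx0) - n0)]
    grp[move] <- 1
  } else if (length(idx0) < n0) {
    # too few zeros: pick the shortfall at random from group 1
    move <- idx1[sample.int(length(idx1), n0 - length(idx0))]
    grp[move] <- 0
  }
  grp
}

set.seed(321)
# Simulated stand-in for the real data: 40 cases, random 0/1 assignment
df <- data.frame(id = 1:40, mar.y = rbinom(40, 1, 0.3))
df$newgrp <- rebalance(df$mar.y, 34)
table(df$newgrp)   # 34 zeros and 6 ones, with all 40 rows kept
```

Keeping the original mar.y column next to newgrp makes it easy to see which cases were moved, e.g. with table(df$mar.y, df$newgrp).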

-- 
David.

>
> ....while 40 cases are needed.
> Thanks for your replies.
> Sarah.

David Winsemius, MD
West Hartford, CT


