[R] dataframe, simulating data

David Winsemius dwinsemius at comcast.net
Fri Dec 31 14:52:15 CET 2010


On Dec 31, 2010, at 8:32 AM, David Winsemius wrote:

>
> On Dec 31, 2010, at 8:05 AM, Sarah wrote:
>
>>
>> I'm sorry, I don't think I've made myself clear enough.
>>
>> Cases have been randomly assigned to one of the two groups, with certain
>> probabilities (based on other variables).
>> So, if there are too many people (i.e., more than 34) assigned to group 0,
>> I would like to sample 34 cases from group 0, and give the rest of the
>> cases a value 1. My dataframe would contain 40 cases; 34 with mar.y==0 and
>> the rest given (or some already had) a value mar.y==1.
>> If, however, too few cases have been assigned to group 0, I need to
>> randomly select cases from group 1 and put them in group 0 (i.e., give
>> them a value 0). My dataframe would contain the previously selected cases
>> (mar.y==0), PLUS cases from group 1 who are now assigned to group 0
>> (mar.y==0), PLUS the remaining cases who stayed in group 1 (mar.y==1).
>> (In other words, how can I change the value for df$mar.y from 1 to 0 or
>> vice versa for some cases?)
>>
>> With the script I've designed, only 34 cases would remain in the
>> dataframe (the cases assigned to group 0)...
>>
>> if (length(which(df$mar.y==0))>34) {
>> df <- df[sample(which(df$mar.y==0),34), ]
>> } else {
>> df <- df[c(which(df$mar.y==0),
>> sample(which(df$mar.y==1),34-length(which(df$mar.y==0)))), ]
>> }
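
To illustrate why the script above shrinks the data frame, here is a small sketch of the difference between subsetting rows and assigning into a column by index. The data frame `toy` and its values are made up for illustration and are not from the original post.

```r
# Hypothetical toy data: 6 cases, 4 of them in group 0
toy <- data.frame(id = 1:6, mar.y = c(0, 0, 0, 0, 1, 1))

# Subsetting with row indices returns only the selected rows
kept <- toy[which(toy$mar.y == 0)[1:2], ]
nrow(kept)         # 2 rows remain, the other 4 cases are dropped

# Assigning into the column by index changes values but keeps every row
toy$mar.y[which(toy$mar.y == 0)[3:4]] <- 1
nrow(toy)          # still 6 rows
table(toy$mar.y)   # two 0s and four 1s
```

So `df <- df[idx, ]` discards the unselected cases, whereas `df$mar.y[idx] <- value` only recodes them.
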
>
> Just work on indices.
> set.seed(321)
> df$newgrp <- df$mar.y   # start from the original assignment
> idx0 <- which(df$mar.y==0); idx1 <- which(df$mar.y==1)
> if (length(idx0) > 34) {  ## too many in group 0
> # random excess to group 1
> df$newgrp[ idx0[sample.int(length(idx0), length(idx0)-34)] ] <- 1
>                                    }
> # leave alone the situation with 34 in group 0
>
> if (length(idx0) < 34) { ## too few in group 0
> # random shortfall from group 1 to group 0
> df$newgrp[ idx1[sample.int(length(idx1), 34-length(idx0))] ] <- 0
                                    } # forgot a closing curly brace
>
> #  df$newgrp is now your corrected group variable.
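
For anyone wanting to try the index approach end to end, here is a minimal self-contained sketch. The helper name `rebalance`, the simulated `df`, and the probability 0.3 are assumptions made for illustration; it also assumes mar.y has no missing values and that there are at least `target` cases in total.

```r
# Hypothetical helper: force exactly `target` zeros in a 0/1 vector
rebalance <- function(y, target = 34) {
  idx0 <- which(y == 0)
  idx1 <- which(y == 1)
  n0 <- length(idx0)
  if (n0 > target) {
    # too many in group 0: move a random surplus to group 1
    y[idx0[sample.int(n0, n0 - target)]] <- 1
  } else if (n0 < target) {
    # too few in group 0: move a random shortfall over from group 1
    y[idx1[sample.int(length(idx1), target - n0)]] <- 0
  }
  y
}

set.seed(321)
# Simulated data (an assumption): 40 cases, each in group 1 with prob 0.3
df <- data.frame(mar.y = rbinom(40, 1, prob = 0.3))
df$newgrp <- rebalance(df$mar.y, target = 34)

nrow(df)            # all 40 cases are kept
table(df$newgrp)    # 34 zeros and 6 ones
```

Indexing with `sample.int()` on positions, rather than calling `sample()` on the index vector itself, avoids the case where `sample(x, k)` with a single number `x` draws from `1:x` instead of from `x`.
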
>
> -- 
> David.
>
>>
>> ....while 40 cases are needed.
>> Thanks for your replies.
>> Sarah.
>> -- 
>> View this message in context: http://r.789695.n4.nabble.com/dataframe-simulating-data-tp3169246p3169354.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


