[R] How to do multi-factor stratified sampling in R

David Winsemius dwinsemius at comcast.net
Sat Mar 8 21:54:17 CET 2008


"Robert A. LaBudde" <ral at lcfltd.com> wrote in
news:0JXF00LSO864ATE0 at vms040.mailsrvcs.net: 

> Given a set of data with a number of variables plus a response, I'd 
> like to obtain a randomized subset of the rows such that the
> marginal proportions of each variable in the subset stay close to
> those in the full dataset, and possibly the two-factor interaction
> proportions as well for some pairs.
> 
> This must be a common problem in data mining, but I don't seem to be
> able to locate the proper library or function for doing this in R.
> 
> Thanks for any help.

Have you looked at the "sampling" package? I have never used it, but its
strata() function appears to support stratifying on several variables
at once.
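
For reference, the underlying idea can be sketched in base R without any 
package. This is only a rough illustration, not how the "sampling" package 
does it: the data frame dat, its columns a, b and resp, and the helper 
stratified_subset() are all made up for the example. It draws about the 
same fraction of rows from every cell of the a-by-b cross-classification, 
which should keep both the two-factor and the one-factor proportions close 
to those of the full data.

```r
# Hypothetical example data: two factors and a response.
set.seed(1)
dat <- data.frame(
  a = sample(c("x", "y"), 200, replace = TRUE),
  b = sample(c("p", "q", "r"), 200, replace = TRUE),
  resp = rnorm(200)
)

# Hypothetical helper: sample roughly `frac` of the rows within each
# combination of the variables named in `vars`.
stratified_subset <- function(data, vars, frac) {
  # One stratum per observed combination of the stratifying variables.
  cells <- interaction(data[vars], drop = TRUE)
  rows_by_cell <- split(seq_len(nrow(data)), cells)
  idx <- unlist(lapply(rows_by_cell, function(rows) {
    # Keep at least one row per cell so no combination disappears.
    n <- max(1, round(frac * length(rows)))
    # sample.int() on the positions avoids sample()'s behaviour when
    # `rows` happens to have length one.
    rows[sample.int(length(rows), n)]
  }), use.names = FALSE)
  data[sort(idx), , drop = FALSE]
}

sub <- stratified_subset(dat, c("a", "b"), frac = 0.25)

# Compare marginal proportions in the full data and in the subset.
round(prop.table(table(dat$a)), 2)
round(prop.table(table(sub$a)), 2)
round(prop.table(table(dat$b)), 2)
round(prop.table(table(sub$b)), 2)
```

With many variables the full cross-classification gets sparse quickly, so 
one would probably stratify only on the pairs whose interaction matters 
and check the remaining margins afterwards.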

-- 
David Winsemius


