[R] sampling from data frame

Maria Wolters maria at rhetorical.com
Thu Jun 6 09:24:04 CEST 2002


After searching through the archives and
not finding a thread that answers this question,
I thought I'd pass it on to the list.

Given a data frame and given a factor variable
that assigns a class to each case in the data frame,
what is the most efficient way to sample
a given number of cases from each class?

I've found a roundabout solution that works as follows:
for each class:
    assign a unique index to each class member
    chosen_cases <- sample(indexvariable, n)
    extract chosen_cases from the data frame
    (i.e. chosen <- subset(data, indexvariable %in% chosen_cases))

This solution relies on the Hmisc package and is
horribly inefficient. Any ideas on how to make it better
would be greatly appreciated.
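
To make the goal concrete, here is a minimal sketch of the kind of
per-class sampling I am after, written in base R only. The data frame
'dat', its factor column 'cls', and the helper 'sample_per_class' are
all made up for illustration; I am not claiming this is the most
efficient approach, only that it shows the intended result.

```r
# Hypothetical example data: 'dat' and its factor column 'cls' are made up.
set.seed(1)
dat <- data.frame(x = rnorm(30),
                  cls = factor(rep(c("a", "b", "c"), times = c(15, 10, 5))))

# Draw n rows from each level of a grouping factor, without replacement.
# 'sample_per_class' is an illustrative helper, not an existing R function.
sample_per_class <- function(data, class, n) {
  # Row numbers of 'data', grouped by class level
  rows_by_class <- split(seq_len(nrow(data)), class)
  chosen <- lapply(rows_by_class, function(idx) {
    if (length(idx) < n)
      stop("a class has fewer than n cases")
    # Index into idx via sample.int(): sample(idx, n) would sample from
    # 1:idx instead when idx happens to be a single number
    idx[sample.int(length(idx), n)]
  })
  data[unlist(chosen, use.names = FALSE), , drop = FALSE]
}

picked <- sample_per_class(dat, dat$cls, 3)
table(picked$cls)
```

Sampling row numbers rather than a separately assigned index variable
avoids both the extra bookkeeping and the %in% lookup in the version
above.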

Best from Edinburgh,


Maria Wolters		maria.wolters
Development Engineer    AT
Rhetorical Systems Ltd. rhetorical.com

