[R] sampling from data frame

Maria Wolters maria at rhetorical.com
Thu Jun 6 09:24:04 CEST 2002


Hello,

after searching through the archives and
not finding a thread that answers this question,
I thought I'd pass it on to the list.

Given a data frame and given a factor variable
that assigns a class to each case in the data frame,
what is the most efficient way to sample
a given number of cases from each class?

I've found a roundabout solution that works as follows:
for each class:
    assign a unique index to each class member
    chosen_cases <- sample(indexvariable, n)
    extract chosen_cases from the data frame
    (i.e. chosen <- subset(data, indexvariable %in% chosen_cases))

This solution relies on the Hmisc package and is
horribly inefficient. Any ideas on how to make it better
would be greatly appreciated.
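
For concreteness, the per-class procedure above could be sketched in
base R (without Hmisc) roughly as follows. This is only a minimal
sketch, not necessarily the most efficient approach: the data frame
`dat`, its columns `class` and `value`, and the per-class sample size
`n` are hypothetical names made up for illustration. It also splits the
row numbers by class rather than adding an explicit index variable.

```r
# Hypothetical example data; 'dat', 'class' and 'value' are assumptions.
set.seed(1)
dat <- data.frame(
  class = factor(rep(c("a", "b", "c"), times = c(10, 20, 30))),
  value = rnorm(60)
)

n <- 5  # number of cases to draw from each class (assumed)

# Group the row numbers of the data frame by class.
rows_by_class <- split(seq_len(nrow(dat)), dat$class)

# Within each class, draw n row numbers without replacement.
# Sampling positions (sample(length(idx), n)) and then indexing idx
# avoids sample()'s special case where a single number k is treated
# as the range 1:k.
chosen_rows <- unlist(lapply(rows_by_class, function(idx) {
  idx[sample(length(idx), n)]
}))

# Extract the chosen cases from the data frame.
chosen <- dat[chosen_rows, ]
table(chosen$class)
```

Note that sample() will stop with an error if a class has fewer than n
members, so small classes would need separate handling.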

Best from Edinburgh,

Maria

-- 
Maria Wolters		maria.wolters
Development Engineer    AT
Rhetorical Systems Ltd. rhetorical.com
		   Edinburgh

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._


