[R] strata -- really slow performance

Jonathan Greenberg greenberg at ucdavis.edu
Sun Jul 12 03:42:59 CEST 2009


I'm a bit confused why the following command is taking an extraordinarily 
long time (> 2-3 hours) to run on a brand-new 3.06 GHz iMac.  I'm 
trying to do a stratified random sample, drawing only a single value per 
UniqueID from the patch_summary data frame:

uniqueids <- unique(patch_summary$UniqueID)
uniqueids_stratasize <- rep(1, length(uniqueids))
temp_species_patch_random_strata <- 
strata(patch_summary,c("UniqueID"),size=uniqueids_stratasize)

The patch_summary data frame is too big to include in this email, but 
I'll give some summary results for the inputs.  patch_summary has 48253 
rows, UniqueID has 661 levels, and uniqueids_stratasize is a vector of 
661 "1"s (corresponding to the 661 levels from UniqueID -- as I said I 
only want one per UniqueID).

Am I doing something silly here?  Am I using the wrong stratified 
sampling algorithm?  Any help would be appreciated.  I don't see why 
stratified sampling would take hours to run.  I am about to recode 
this as a series of subset statements, which I'm sure will take only 
minutes...
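
For reference, a rough sketch of what that subset-style recode might 
look like in base R.  The data frame built here is a made-up stand-in 
(the real patch_summary is too big to post), and the "value" column and 
the names rows_by_id, picked and one_per_id are just illustrative; only 
UniqueID and the row/level counts come from the description above.

```r
# Hypothetical stand-in for patch_summary: 48253 rows, up to 661 UniqueID levels.
set.seed(1)
patch_summary <- data.frame(
  UniqueID = factor(sample(sprintf("id%03d", 1:661), 48253, replace = TRUE)),
  value    = rnorm(48253)  # placeholder column, not from the real data
)

# Group the row indices by UniqueID (dropping any unused levels).
rows_by_id <- split(seq_len(nrow(patch_summary)), patch_summary$UniqueID,
                    drop = TRUE)

# Pick one row index per group.  Indexing with sample.int() avoids the
# sample(x, 1) pitfall when a group has a single row, where sample(5, 1)
# would draw from 1:5 instead of returning 5.
picked <- vapply(rows_by_id,
                 function(x) x[sample.int(length(x), 1)],
                 integer(1))

# One randomly chosen row for each UniqueID.
one_per_id <- patch_summary[picked, ]
```

This only covers the one-row-per-stratum case asked about here; it is 
not a replacement for strata() when unequal stratum sizes or inclusion 
probabilities are needed.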

--j

-- 

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307



