[R] Questions about permutation tests and error when using clusterCall (snow package)

Adam Zeilinger arz at berkeley.edu
Wed May 13 06:30:26 CEST 2015


Hello R help,

I am trying to run a permutation (or randomization) test on a mixed 
effects model.  The randomization is somewhat complex because I want to 
maintain the nested and unbalanced structure of the data.  I am trying 
two different methods.  The first is the PermTest() function in the 
pgirmess package.  The function works fine but the documentation is 
sparse, so I don't know whether it is maintaining the structure of the 
data.  Can anyone help elucidate what is happening in this function?

Second, I wrote my own function for the permutation test.  I would like 
to run it in parallel using clusterCall() from the snow package.  But I 
am having trouble understanding how to specify clusterCall().  The 
attempt described below returns an error:
"Error in checkForRemoteErrors(lapply(cl, recvResult)) : 3 nodes 
produced errors; first error: invalid 'length' argument".

I am working in R 3.2.0 in Windows 8.1 with following versions of the 
required packages: nlme 3.1-120; snow 0.3-13; pgirmess 1.6.0.
Any help would be much appreciated.

#### Permutation test on cluster

require(nlme); require(snow); require(pgirmess)

# Data frame
data <- structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("2009", "2010",
"2011"), class = "factor"), region = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1",
"2"), class = "factor"), site = structure(c(1L, 1L, 1L, 2L, 2L,
2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 1L,
1L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L,
2L, 2L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 3L,
3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("1",
"2", "3"), class = "factor"), crop = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("1",
"2", "3", "4"), class = "factor"), field = structure(c(1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 2L, 3L,
1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 3L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1",
"2", "3"), class = "factor"), lnlambda = c(-0.211969959, 0.119059554,
0.248738781, 0.390197821, 0.719815401, -0.643549723, 1.098612289,
0.416209183, 0.208254863, 0.358886297, 0.182792279, 0.981156325,
0.336352229, -1.098745298, 0.693147181, 0.69311718, -0.887324338,
-0.054062388, 0.250599173, -0.693177181, 0.223128351, -0.539010787,
-0.080041541, 0.328540866, 0.048785402, -0.287756742, -0.213609149,
0.470018629, -2.385966702, -0.731888009, -2.137070655, 0.035367144,
-1.027222293, 0.287432041, 0.462475363, 0.347129531, 0.762206716,
0.721248611, -1.789761467, 0, 1.098612289, 0.78845736, -0.336872317,
0.182321557, 0.287432041, 0.470003629, 0, -0.693147181, -0.693147181,
-1.386294361, 0.405465108, -0.095410185, -0.423120043, -0.655851396,
1.148671489, -0.759286983, 0.442760893, 0.783901544, 2.257169229,
0.63180355, -2.253794929, -0.570929548, -1.111697528, 0.251536626,
-1.099612789, 0.693147181, -0.811930717, -0.510825624, -0.287682072,
0.133656385, -0.251028755, -0.693147181, -1.789761467, -1.251763468,
0.336472237, -1.448169765, -1.703748592, 1.386294361, 0.251536626,
-0.916290732)), .Names = c("year", "region", "site", "crop",
"field", "lnlambda"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 11L, 12L, 13L, 14L, 16L, 17L, 19L, 20L, 21L, 22L,
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 49L, 51L, 52L, 53L, 54L,
55L, 56L, 57L, 58L, 59L, 60L, 61L, 65L, 66L, 67L, 68L, 69L, 70L,
71L, 72L, 74L, 75L, 76L, 77L, 97L, 98L, 99L, 100L, 101L, 102L,
103L, 104L, 106L, 107L, 108L, 109L, 110L, 111L, 113L, 114L, 115L,
116L, 117L, 118L, 119L, 120L, 121L, 122L, 123L, 124L, 125L, 126L
), class = "data.frame")

# random effects model
mod <- lme(lnlambda ~ year*region*crop, data = data, random = list(site 
= ~1|year:region))

# Randomization test using PermTest()
# Not sure what it's doing
permtest1 <- PermTest(mod, B = 100)

# DIY permutation test with cluster
ti <- proc.time() # Initial time
cl <- makeCluster(3, type = "SOCK") # Make cluster
   require(nlme)
   clusterExport(cl, "data") # Send "data" to each node
   # Set up function to randomize response
   randTest <- function(data){
     # Randomize response variable, thereby preserving nested and 
unbalanced structure
     data$randomLambda <- sample(data$lnlambda)
     # Fit LME model
     lmeResults <- anova(lme(randomLambda ~ year*region*crop, data = 
data, random = list(site = ~1|year:region)))
     return(lmeResults)
   }
   # Call to cluster with replicate() function
   rtoutCluster <- clusterCall(cl, function(n.iter = 100) 
replicate(n.iter, randTest(data),
simplify = FALSE),
                               data)
stopCluster(cl)
tf <- proc.time() # End time
# Time it took
print(tf - ti)

Thanks in advance for your help,
Adam


-- 
Adam Zeilinger
Postdoctoral scholar
Berkeley Initiative for Global Change Biology
University of California Berkeley



More information about the R-help mailing list