[R] Latin hyper cube sampling from expand.grid()

Wed Jan 24 04:49:06 CET 2007

Prasanna <prasannaprakash <at> gmail.com> writes:

> 
> Dear R experts
> 
> I am looking for a package which gives me latin hyper cube samples
> from the grid of values produced from the command "expand.grid". Any
> pointers to this issue might be very useful. Basically, I am doing the
> following:
> 
> > a<-(1:10)
> > b<-(20:30)
> > dataGrid<-expand.grid(a,b)
> 
> Now, is there a way to use this "dataGrid" in the package "lhs" to get
> latin hyper cube samples.
> 
> Thanking you
> Prasanna

Prasanna,

I think I understand your question; please let me know if this explanation is 
not what you need.  Since lhs is a contributed package, you can also contact 
me directly first.

a <- 1:10
b <- 20:30
dataGrid <- expand.grid(a, b)

I believe that you want a Latin hypercube sample from the integers 1-10 in the 
first variable and 20-30 in the second.  I will offer a way to do something 
similar with the lhs package, and then offer an alternative which may meet 
your needs better.

The lhs package returns a uniformly distributed stratified sample from the 
unit hypercube.  The marginal distributions can then be transformed to your 
distribution of choice.  If you wanted a uniform Latin hypercube on [1,10] and 
[20,30] with 22 samples, you could do:

require(lhs)
X <- randomLHS(22, 2)
X[,1] <- 1+9*X[,1]
X[,2] <- 20+10*X[,2]
X

OR

X <- randomLHS(22, 2)
X[,1] <- qunif(X[,1], 1, 10)
X[,2] <- qunif(X[,2], 20, 30)
X
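
To make "stratified sample from the unit hypercube" concrete, here is a rough 
base R sketch of the idea.  simpleLHS is a hypothetical helper written only 
for illustration; it is not how the lhs package is implemented.  Each column 
is a random permutation of the n strata with a uniform jitter inside each 
stratum, and the margins are then rescaled as above.

```r
# Hypothetical helper for illustration only (not the lhs implementation).
# Column j holds one point in each of the n strata ((i-1)/n, i/n), in random order.
simpleLHS <- function(n, k) {
  sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)
}

set.seed(42)
U <- simpleLHS(22, 2)

# Rescale the margins to [1,10] and [20,30]
X <- cbind(1 + 9 * U[, 1], 20 + 10 * U[, 2])

# Stratum index of every point in each column; each of 1..22 should appear once
strataHit <- apply(U, 2, function(u) sort(ceiling(u * 22)))
all(strataHit == 1:22)
```

The final check should be TRUE for any seed, since every column is built from 
a permutation of the 22 strata.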

Since I think you want integers (which I haven't thought about before now), 
we must be careful about what we mean by a Latin hypercube sample.  If you 
wanted exactly 3 points, then you could divide the range [1,10] into three 
almost equal parts and sample one value each from 1:3, 4:6, and 7:10.  The 
problem is that this would not be a uniform sample across the range (7 would 
be sampled less often than 2, for example).
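
As a small check of that claim, the selection probabilities can be written 
down directly: one value is drawn from each part, so a value's chance of being 
picked is 1 over the size of its part.  The names strata and p below are just 
for this illustration.

```r
strata <- list(1:3, 4:6, 7:10)

# One value is drawn per stratum, so each value is selected with
# probability 1 / (number of values in its stratum)
p <- unlist(lapply(strata, function(s) rep(1 / length(s), length(s))))
names(p) <- unlist(strata)
round(p, 3)
```

The values 1 to 6 each come out at 1/3 while 7 to 10 each come out at 1/4, so 
the margin is not uniform.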

I think that to do a Latin hypercube sample on the integers, the number of 
integers on each margin should be a multiple of the number of points sampled, 
so that every stratum holds the same number of integers.  For example, if you 
sample 3 points from 1:9 and 21:32, then you could sample as follows:

a <- c(sample(1:3,1), sample(4:6, 1), sample(7:9, 1))
b <- c(sample(21:24,1), sample(25:28, 1), sample(29:32,1))

and then randomly permute the entries of a and b.
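
Put together, the 3-point example might look like the sketch below.  The name 
design is only for illustration, and set.seed is there just to make the run 
repeatable.

```r
set.seed(1)

# One integer from each stratum of each margin
a <- c(sample(1:3, 1), sample(4:6, 1), sample(7:9, 1))
b <- c(sample(21:24, 1), sample(25:28, 1), sample(29:32, 1))

# Permute each margin independently so the strata are paired at random
design <- cbind(a = sample(a), b = sample(b))
design
```

Each row is one sample point, and every stratum of each margin appears in 
exactly one row.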

Or more generally, take n samples from the list of integer groups:

integerLHS <- function(n, intGroups)
{
  # intGroups: a list of equally spaced, increasing integer vectors, each
  # with a length that is a multiple of n
  stopifnot(is.list(intGroups))
  stopifnot(all(sapply(intGroups, length) %% n == 0))
  ranges <- lapply(intGroups, function(X) max(X)-min(X))
  A <- matrix(nrow=n, ncol=length(intGroups))
  for(j in seq_along(intGroups))
  {
    # random order in which the n strata are assigned to the rows
    sequ <- order(runif(n))
    if(length(intGroups[[j]]) > 1)
    {
      spacing <- intGroups[[j]][2]-intGroups[[j]][1]
    } else stop("each intGroup must contain more than 1 value")
    for(k in 1:n)
    {
      i <- sequ[k]
      # a and b are the first and last values of stratum i
      a <- min(intGroups[[j]])+(i-1)*(ranges[[j]]+spacing)/n
      b <- min(intGroups[[j]])+i*(ranges[[j]]+spacing)/n-spacing
      if(a < b)
      {
        A[k,j] <- sample(seq(a,b,spacing), 1)
      } else if(a==b)
      {
        # only one value in the stratum; sample(a, 1) would draw from 1:a
        A[k,j] <- a
      } else stop("error")
    }
  }
  return(A)
}

integerLHS(10, list(1:10, 31:40))
integerLHS(5, list(1:10, 31:40))
integerLHS(2, list(1:10, 31:40))
integerLHS(5, list(1:20, 31:60, 101:115))
integerLHS(5, list(seq(2,20,2), 31:60, 101:115))

The function above is neither efficient nor tested, but it is a place for you 
to start.
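
If the double loop turns out to be too slow, something along the following 
lines might work instead.  This is only a sketch: integerLHSSketch is a 
hypothetical name, and it assumes each group is sorted in increasing order and 
has a length that is a multiple of n.  It picks strata and within-stratum 
positions by index, so it does not need to know the spacing.

```r
# Hypothetical vectorised variant, for illustration only.
# Assumes each group is sorted ascending and length(group) %% n == 0.
integerLHSSketch <- function(n, intGroups) {
  stopifnot(is.list(intGroups),
            all(sapply(intGroups, length) %% n == 0))
  cols <- lapply(intGroups, function(g) {
    m <- length(g) / n                          # values per stratum
    strata <- sample(n)                         # random order of the n strata
    offset <- sample.int(m, n, replace = TRUE)  # position inside each stratum
    g[(strata - 1) * m + offset]
  })
  matrix(unlist(cols), nrow = n)
}

set.seed(1)
A <- integerLHSSketch(5, list(1:20, seq(2, 20, 2)))
A

# Each column should hit every one of the 5 strata exactly once
all(sort(ceiling(A[, 1] / 4)) == 1:5)
all(sort(ceiling(A[, 2] / 4)) == 1:5)
```

Both checks should be TRUE regardless of the seed, because each column is 
built from a permutation of the strata.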

Rob