[R] Generating groupings of ordered observations

Charles C. Berry cberry at tajo.ucsd.edu
Sat Jun 21 20:30:33 CEST 2008


On Sat, 21 Jun 2008, Gavin Simpson wrote:

> Dear List,
>
> I have a problem I'm finding it difficult to make headway with.
>
> Say I have 6 ordered observations, and I want to find all combinations
> of splitting these 6 ordered observations in g groups, where g = 1, ...,
> 6. Groups can only be formed by adjacent observations, so observations 1
> and 4 can't be in a group on their own, only if 1,2,3&4 are all in the
> group.
>

Right. And in the example below there are 32 distinct patterns.

That count arises as sum( choose( 5, 0:5 ) ): the number of ways to place 
0, 1, ..., 5 splits in the 5 gaps between adjacent observations.

You can represent the splits as a binary number with n-1 bits, one bit per 
gap between adjacent observations: 00000 implies no splits, 10000 implies a 
split between 1 and 2, 10100 implies splits between 1 and 2 and between 3 
and 4, et cetera.
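
To make the encoding concrete, here is a small sketch that decodes the single pattern 10100 into group labels. The names pattern, bits, splits and groups are illustrative only, and it assumes the bits are read left to right, so that bit i is the gap between observations i and i+1.

```r
n <- 6
pattern <- "10100"                      # bit i = gap between observations i and i+1

# Split the string into individual 0/1 digits
bits <- as.integer(strsplit(pattern, "")[[1]])

# Positions of the 1 bits are the split points
splits <- which(bits == 1)
splits
# 1 3

# Group sizes are the distances between consecutive split points
groups <- rep(seq_len(length(splits) + 1), diff(c(0, splits, n)))
groups
# 1 2 2 3 3 3
```

So 10100 puts observation 1 in group 1, observations 2 and 3 in group 2, and observations 4 to 6 in group 3.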

So, 32 arises as 2^5, too.
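
A quick check of the two counts, as a sketch for n = 6 (n_gaps and by_splits are names made up for this illustration):

```r
n <- 6
n_gaps <- n - 1                         # gaps between adjacent observations

# Number of patterns with 0, 1, ..., n-1 splits, i.e. with 1, 2, ..., n groups
by_splits <- choose(n_gaps, 0:n_gaps)
by_splits
# 1 5 10 10 5 1

sum(by_splits)
# 32
2^n_gaps
# 32
```

The per-group counts 1, 5, 10, 10, 5, 1 agree with the column counts of the trivial partitions and of mat2g to mat5g in the example below.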

Something like this:

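  n <- 6  # number of ordered observations, as in the example below
  base10 <- seq(0, length.out = 2^(n-1))
  # row i of base2.bits is the 2^(i-1) bit; 1 means a split between i and i+1
  base2.bits <- outer(0:(n-2), base10, function(y,x) ( x %/% (2^y)) %%2 )
  splits <- apply( base2.bits==1, 2, which )  # list of split positions
  sapply(splits, function(x) rep(1:(1+length(x)), diff(c(0,x,n))))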

Getting this in the same column order as your example is left as an 
exercise for the reader.
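
One possible way to do that exercise, sketched with a different technique from the binary counting above: enumerate the split positions with combn(), which yields them grouped by number of splits and in lexicographic order within each group. The helper labels_from_splits and the names blocks and ordered are hypothetical, introduced only for this illustration.

```r
n <- 6

# Turn a vector of split positions (a split at i falls between i and i+1)
# into group labels for the n observations.
labels_from_splits <- function(x, n) {
  rep(seq_len(length(x) + 1), diff(c(0, x, n)))
}

# For k = 0, ..., n-1 splits, build an n x choose(n-1, k) matrix of labels.
blocks <- lapply(0:(n - 1), function(k) {
  # combn() does not call FUN when k is 0, so handle the one-group case here
  if (k == 0) return(matrix(1L, nrow = n, ncol = 1))
  combn(n - 1, k, FUN = function(x) labels_from_splits(x, n))
})

# Columns ordered by number of groups, then by split positions
ordered <- do.call(cbind, blocks)
dim(ordered)
# 6 32
```

For n = 6 this appears to give the same column order as cbind(rep(1,6), mat2g, mat3g, mat4g, mat5g, 1:6) in the example below. The number of columns is still 2^(n-1), so for n = 20 the result has 524288 columns.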

HTH,

Chuck

> For example, with 6 observations, the columns of the matrices below
> represent the groups that can be formed by placing the 6 ordered
> observations into 2-5 groups. Think of the columns of these matrices as
> being an indicator of group membership. We then cbind these matrices
> with the trivial partitions into 1 and 6 groups:
>
> mat2g <- matrix(c(1,1,1,1,1,
>                  2,1,1,1,1,
>                  2,2,1,1,1,
>                  2,2,2,1,1,
>                  2,2,2,2,1,
>                  2,2,2,2,2),
>                nrow = 6, ncol = 5, byrow = TRUE)
>
> mat3g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
>                  2,2,2,2,1,1,1,1,1,1,
>                  3,2,2,2,2,2,2,1,1,1,
>                  3,3,2,2,3,2,2,2,2,1,
>                  3,3,3,2,3,3,2,3,2,2,
>                  3,3,3,3,3,3,3,3,3,3),
>                nrow = 6, ncol = 10, byrow = TRUE)
>
> mat4g <- matrix(c(1,1,1,1,1,1,1,1,1,1,
>                  2,2,2,2,2,2,1,1,1,1,
>                  3,3,3,2,2,2,2,2,2,1,
>                  4,3,3,3,3,2,3,3,2,2,
>                  4,4,3,4,3,3,4,3,3,3,
>                  4,4,4,4,4,4,4,4,4,4),
>                nrow = 6, ncol = 10, byrow = TRUE)
>
> mat5g <- matrix(c(1,1,1,1,1,
>                  2,2,2,2,1,
>                  3,3,3,2,2,
>                  4,4,3,3,3,
>                  5,4,4,4,4,
>                  5,5,5,5,5),
>                nrow = 6, ncol = 5, byrow = TRUE)
>
> cbind(rep(1,6), mat2g, mat3g, mat4g, mat5g, 1:6)
>
> I'd like to be able to do this automagically, for any (reasonable,
> small, say n = 10-20) number of observations, n, and for g = 1, ..., n
> groups.
>
> I can't see the pattern here or a way forward. Can anyone suggest an
> approach?
>
> Thanks in advance,
>
> Gavin
>
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


