[R] data manipulation/subsetting and relation matrix

Juliet Hannah juliet.hannah at gmail.com
Tue Dec 8 02:38:11 CET 2009


Hi List,

Here is some example data.

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()

corr_mat <- read.table(textConnection("1 1   .5  0   0   0   0   0   0   0
2 .5  1   0   0   0   0   0   0   0
3 0   0   1.0 0   0   0   0   0   0
4 0   0   0   1   .5  .5  0   0   0
5 0   0   0   .5  1   .5  0   0   0
6 0   0   0   .5  .5  1   0   0   0
7 0   0   0   0   0   0   1   0   0
8 0   0   0   0   0   0   0   1   .5
9 0   0   0   0   0   0   0   .5  1"),header=FALSE)
closeAllConnections()

corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id

I need to subset this data so that no two observations within a group are
related. Two observations are unrelated when their entry in corr_mat is 0.

For example, within group 1, 101 and 201 are related, so only one of
these can be selected, say 101. 301 is not related to 101 or 201, so the
final set for group 1 consists of 101 and 301. There will always be at
least 2 members in each group. I need to carry out this task for all groups.

One possible final data set looks like:

  group  id
1     1 101
3     1 301
4     2 401
7     3 701
8     3 801
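
To make the selection rule concrete, here is a minimal sketch of one way it
could be done, assuming a simple greedy pass: within each group, walk through
the ids in order and keep an id only if it is unrelated (0) to every id
already kept. The function name pick_unrelated and the helper related_pairs
are hypothetical, introduced only for this illustration. A greedy pass is not
guaranteed to find the largest possible unrelated subset; it just returns one
valid subset. The data are rebuilt inline so the sketch runs on its own.

```r
# Rebuild the example data (same values as above) so this runs standalone.
myDat <- data.frame(group = rep(1:3, each = 3),
                    id    = c(101, 201, 301, 401, 501, 601, 701, 801, 901))

# Relation matrix: 1 on the diagonal, 0.5 for related pairs, 0 otherwise.
ids <- as.character(myDat$id)
corr_mat <- diag(length(ids))
dimnames(corr_mat) <- list(ids, ids)
related_pairs <- rbind(c("101", "201"),
                       c("401", "501"), c("401", "601"), c("501", "601"),
                       c("801", "901"))
corr_mat[related_pairs] <- 0.5
corr_mat[related_pairs[, 2:1, drop = FALSE]] <- 0.5  # keep it symmetric

# Hypothetical helper: greedy selection of mutually unrelated ids.
# 'rel' must be a matrix with ids as row and column names.
pick_unrelated <- function(ids, rel) {
  ids  <- as.character(ids)
  kept <- character(0)
  for (id in ids) {
    # Keep this id only if it is unrelated to everything kept so far.
    if (all(rel[id, kept] == 0)) kept <- c(kept, id)
  }
  kept
}

# Apply the rule group by group, then subset the original data frame.
kept_ids <- unlist(lapply(split(myDat$id, myDat$group),
                          pick_unrelated, rel = corr_mat),
                   use.names = FALSE)
result <- myDat[as.character(myDat$id) %in% kept_ids, ]
print(result)
```

If corr_mat comes from read.table as above, it is a data frame, so it would
need as.matrix(corr_mat) before being passed as rel.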

Any suggestions? Thanks!

Juliet



