[R] Measure the frequencies of pairs in a matrix

William Dunlap wdunlap at tibco.com
Wed Oct 7 18:39:11 CEST 2015


You could also call table() on the columns of the input matrix, first
converting them to factors with levels 1:max.  Then add together the
upper and lower triangles of the table if order is not important.  E.g.,

f2 <- function (mat)
{
    maxMat <- max(mat)
    stopifnot(is.matrix(mat), all(mat %in% seq_len(maxMat)))
    L <- split(factor(mat, levels = seq_len(maxMat)), col(mat))
    Table <- do.call(table, unname(L))
    ignoreOrder <- function(M) {
        stopifnot(length(dim(M)) == 2)
        lower <- lower.tri(M, diag = FALSE)
        upper <- upper.tri(M, diag = FALSE)
        M[lower] <- M[lower] + t(M)[lower]
        M[upper] <- t(M)[upper]
        M
    }
    ignoreOrder(Table)
}

> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
> f2(mat)

     1  2  3  4  5  6  7
  1  0  0  0  0  0  0  0
  2  0  0  0  0  0  0  0
  3  0  0  0  2  0  0  2
  4  0  0  2  0  4  0  0
  5  0  0  0  4  2 10  4
  6  0  0  0  0 10  0  2
  7  0  0  2  0  4  2  0
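
The triangle-folding step in ignoreOrder() can also be written as a sum
with the transpose, restoring the diagonal afterwards. The sketch below is
only an illustration on a made-up 3 x 3 count matrix; the name
foldTriangles and the toy data are assumptions, not part of f2. It also
shows one way to turn the folded counts into proportions, which the
original question asked about.

```r
# Hypothetical helper: fold a square count matrix so that cell [i, j]
# holds the count of the unordered pair {i, j}.
foldTriangles <- function(M) {
    stopifnot(is.matrix(M), nrow(M) == ncol(M))
    S <- M + t(M)        # off-diagonal cells become [i, j] + [j, i]
    diag(S) <- diag(M)   # diagonal cells must not be doubled
    S
}

# Toy ordered-pair counts: rows index the first element, columns the second.
M <- matrix(c(1, 0, 2,
              3, 4, 0,
              5, 6, 7), nrow = 3, byrow = TRUE)
S <- foldTriangles(M)

# Each off-diagonal pair appears twice in S, so divide by the number of
# observed pairs, sum(M), rather than by sum(S), to get proportions.
props <- S / sum(M)
```

Applied to the result of f2(mat), the corresponding divisor would be
nrow(mat).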

Bill Dunlap
TIBCO Software
wdunlap at tibco.com


On Wed, Oct 7, 2015 at 6:09 AM, Boris Steipe <boris.steipe at utoronto.ca> wrote:
> Still not sure I understand. But here is what I think you might mean:
>
> # Your data
> mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
> 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
> 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
>
> # Create a square matrix with enough space to have an element for each pair. Since
> # order is not important, only the upper triangle is used. If the matrix is
> # large and sparse, a different approach might be needed.
> freq <- matrix(numeric(max(mat) * max(mat)),  nrow = max(mat), ncol = max(mat))
>
> # Loop over your input
> for (i in seq_len(nrow(mat))) {
>     # Sort the elements of a row by size.
>     x <- sort(mat[i,])
>     # Increment the corresponding element of the frequency matrix
>     freq[x[1], x[2]] <- freq[x[1], x[2]] + 1
> }
>
> freq
>
>
> Cheers,
> B.
>
>
>
>
>
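
The loop above can probably be replaced by a vectorized count. The sketch
below is one possible way, not the method posted in the thread: it sorts
each row with pmin()/pmax(), converts each (smaller, larger) pair to a
column-major linear index, and counts the indices with tabulate(). The
function name pairFreq is made up for illustration, and the input is
assumed to be a two-column matrix of positive whole numbers.

```r
# Hypothetical helper: loop-free upper-triangle counts of unordered pairs.
pairFreq <- function(mat) {
    stopifnot(is.matrix(mat), ncol(mat) == 2,
              all(mat >= 1), all(mat == round(mat)))
    n  <- max(mat)
    lo <- pmin(mat[, 1], mat[, 2])   # smaller element of each row
    hi <- pmax(mat[, 1], mat[, 2])   # larger element of each row
    # Column-major linear index of cell [lo, hi] in an n x n matrix.
    idx <- lo + (hi - 1) * n
    matrix(tabulate(idx, nbins = n * n), nrow = n, ncol = n)
}

# The data from this thread.
mat <- structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
freq <- pairFreq(mat)
```

As in the loop version, only the upper triangle and the diagonal are
filled, so every unordered pair is counted exactly once.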
> On Oct 7, 2015, at 1:17 AM, Hermann Norpois <hnorpois at gmail.com> wrote:
>
>> Ok, this was misleading. And was not that important. My result matrix should look like this:
>>
>>   1    2   3   4   5   6   7 ...
>> 1 p1 p2
>> 2 p
>> 3
>> 4
>>
>> p1 etc are the frequencies of the combinations
>>
>> 1 and 2 for instance do not appear in my example. So the values would be zero. Actually, this part is not too important. I would be happy enough to solve the challenge with the frequencies of the pairs.
>> Thanks Hermann
>>
>> 2015-10-07 2:40 GMT+02:00 Boris Steipe <boris.steipe at utoronto.ca>:
>> Since order is not important to you, you can order your pairs (e.g. decreasing) before compiling the frequencies.
>> But I don't understand the second part about values "that do not appear in the matrix". Do you mean you want to assess all combinations? If that's the case I would think about a hash table or other indexed data structure, rather than iterating through a matrix.
>>
>>
>> B.
>>
>>
>>
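
The hash-table idea mentioned above could look roughly like the sketch
below, which uses an R environment as the hash table and stores only the
pairs that actually occur. The function name countPairsHashed, the "-"
key separator, and the four-row toy matrix are assumptions made for
illustration.

```r
# Hypothetical helper: count unordered pairs with an environment used
# as a hash table, keyed by the sorted pair written as "a-b".
countPairsHashed <- function(mat) {
    stopifnot(is.matrix(mat), ncol(mat) == 2)
    counts <- new.env(hash = TRUE)
    for (i in seq_len(nrow(mat))) {
        key <- paste(sort(mat[i, ]), collapse = "-")
        old <- counts[[key]]          # NULL if the key is not yet stored
        counts[[key]] <- if (is.null(old)) 1L else old + 1L
    }
    keys <- ls(counts)                # ls() returns the keys sorted
    vapply(keys, function(k) counts[[k]], integer(1))
}

# Toy input: rows 1 and 2 are the same unordered pair.
toy <- rbind(c(5, 6), c(6, 5), c(5, 5), c(3, 7))
res <- countPairsHashed(toy)
```

The result is a named integer vector, so pairs that never occur are simply
absent rather than stored as zeros.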
>> On Oct 6, 2015, at 4:59 PM, Hermann Norpois <hnorpois at gmail.com> wrote:
>>
>> > Hello,
>> >
>> > I have a matrix mat (see dput(mat))
>> >
>> >> mat
>> >      [,1] [,2]
>> > [1,]    5    6
>> > [2,]    6    5
>> > [3,]    5    4
>> > [4,]    5    5
>> > ....
>> >
>> > I want the frequencies of the pairs in a new matrix, where the
>> > combination 5 and 6 counts as the same as 6 and 5 (see the first two rows
>> > of mat). In other words: what is the probability of each combination (each
>> > row), ignoring the order within the combination? The result should be a
>> > matrix whose rows and columns cover 0, 1, 2 ... max(mat), including values
>> > that do not appear in my matrix.
>> >
>> > dput (mat)
>> > structure(c(5, 6, 5, 5, 4, 3, 6, 7, 4, 7, 5, 5, 5, 5, 6, 5, 5,
>> > 4, 3, 6, 7, 4, 7, 5, 5, 5, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7,
>> > 6, 6, 5, 4, 5, 5, 7, 5, 6, 3, 5, 6, 7, 6), .Dim = c(26L, 2L))
>> >
>> > Thanks
>> > Hermann
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.