[R] error messages in matrix multiplication

Bob Green bgreen at dyson.brisnet.org.au
Mon Oct 24 22:51:51 CEST 2005


Hello,

I am hoping for some advice on using R - my experience with statistical 
programs has been limited to SPSS.

I have been using a textual analysis program and wanted to add some rigour 
to the choice between two models of self-reported cannabis effects. To 
do this, I need to compare the two resulting word co-occurrence matrices. 
The program itself doesn't offer this as an option, but the person who wrote 
the program told me where the numerical data are located and offered this 
advice:

"For two vectors a and b, the cosine similarity is: therefore cos theta = 
a . b / magn(a)*magn(b) & that the formula is really identical for 
matrices. The dot product (or inner product) is calculated by multiplying 
each pair of corresponding elements from the two matrices, and summing 
these products. Calculating the magnitude of a matrix is really the same 
as a vector: square each element of the matrix, sum the squares, then 
take the square root of the sum."
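
As a minimal sketch of the formula quoted above, assuming both matrices 
have the same dimensions and hold numeric counts, the calculation might 
look like this in R. The function name cosine_similarity and the small 
example matrices are hypothetical and only illustrate the arithmetic; 
they are not part of the textual analysis program.

```r
# Hypothetical helper; not part of base R or of the textual analysis program.
cosine_similarity <- function(A, B) {
  # The element-wise product only makes sense for equal dimensions.
  stopifnot(all(dim(A) == dim(B)))
  # as.numeric() flattens each matrix to a double vector, which also
  # avoids integer overflow when the counts are large.
  a <- as.numeric(A)
  b <- as.numeric(B)
  # Dot product divided by the product of the two magnitudes.
  sum(a * b) / (sqrt(sum(a * a)) * sqrt(sum(b * b)))
}

# Made-up example data: B is exactly twice A.
A <- matrix(c(1L, 2L, 3L, 4L), nrow = 2)
B <- matrix(c(2L, 4L, 6L, 8L), nrow = 2)
cosine_similarity(A, B)   # approximately 1, since B is a multiple of A
```

Note that the denominator is the product of the two magnitudes, so it 
needs its own parentheses when written on one line.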

I have been advised that to multiply matrices I should use %*%, whereas 
for an element-wise (pointwise) product I omit the % signs and use *.
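
To make the difference between the two operators concrete, here is a 
small sketch with made-up 2 x 3 matrices (the values are illustrative 
only). It assumes nothing beyond base R.

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(6:1, nrow = 2)   # 2 x 3

# Element-wise product: both matrices must have the same dimensions,
# and the result has those dimensions too.
elementwise <- A * B
dim(elementwise)             # 2 3

# Matrix product: ncol of the left matrix must equal nrow of the right one,
# so a 2 x 3 matrix can be multiplied by the 3 x 2 transpose of B.
product <- A %*% t(B)
dim(product)                 # 2 2

# A %*% B stops with "non-conformable arguments",
# because ncol(A) is 3 but nrow(B) is 2.
result <- try(A %*% B, silent = TRUE)
inherits(result, "try-error")  # TRUE
```

The sum of the element-wise product equals the sum of the diagonal of 
A %*% t(B), which is why either operator can in principle be used to get 
the dot product described in the advice.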


I have tried running the syntax both with and without the %, but neither 
syntax (a) nor syntax (b) below works.

With (a) I obtain the message - Warning message: Error in A %*% B : 
non-conformable arguments

With (b) I obtain the message - Warning message: NAs produced by integer 
overflow in: sum(A * A) * sum(B * B) :
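
For reference, the second message can be reproduced in isolation. This 
is only a sketch with a made-up value, on the assumption that the 
matrices were read in as integers; it is not taken from the actual data 
files.

```r
x <- 100000L                       # an integer value, e.g. a sum of counts

# The product exceeds .Machine$integer.max, so R returns NA with a warning.
y <- suppressWarnings(x * x)
is.na(y)                           # TRUE

# Converting to double first keeps the full result.
z <- as.numeric(x) * as.numeric(x)
z                                  # 1e+10
```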


(a) Matrix

testA <-read.table("c:\\matrixA.txt",header=T)
testB <-read.table("c:\\matrixB.txt",header=T)

A<-as.matrix(testA)
B<-as.matrix(testB)

cosineDissimilarity <- sum(A%*%B)/sqrt(sum(A%*%A)*sum(B%*%B))



(b) pointwise

testA <-read.table("c:\\matrixA.txt",header=T)
testB <-read.table("c:\\matrixB.txt",header=T)

A<-as.matrix(testA)
B<-as.matrix(testB)

cosineDissimilarity <- sum(A*B)/sqrt(sum(A*A)*sum(B*B))



Any suggestions would be appreciated, whether about the logic of this 
choice of analysis or about the necessary syntax.

regards

Bob



