[R] Comparing two matrices

Duncan Murdoch murdoch at stats.uwo.ca
Thu Jul 6 14:46:51 CEST 2006


On 7/6/2006 8:18 AM, Srinivas Iyyer wrote:
> hi:
> 
> I have a matrix with dimensions (200 x 20,000). I have
> another file, a tab-delimited file, where the first column
> variables are row names and the second column variables
> are column names.
> 
> 
> For instance:
> 
>> tmat
>   Apple Orange Mango Grape Star
> A     0      0     0     0    0
> O     0      0     0     0    0
> M     0      0     0     0    0
> G     0      0     0     0    0
> S     0      0     0     0    0
> 
> 
> 
>> tb # tab-delimited file
>       V1 V2
> 1  Apple  S
> 2  Apple  A
> 3  Apple  O
> 4 Orange  A
> 5 Orange  O
> 6 Orange  S
> 7  Mango  M
> 8  Mango  A
> 9  Mango  S
> 
> 
> I have to read each line of 'tb' (the tab-delimited file),
> take the first variable, and check if it matches any row name
> of the matrix. Then take the second variable of the same line
> and check if it matches any column name.  If so, put
> 1, else leave it.
> 
> 
> The following is a small piece of code that I felt is
> a solution. However, since my original matrix and
> tab-delimited file are very large, I am not sure if it
> is really doing the correct thing. Could anyone
> please tell me if I am doing this correctly?
> 
> 
> 
>> for(i in 1:length(tb[,1])){
> +  r = tb[i,1]
> +  c = as.character(tb[i,2])
> +  tmat[rownames(tmat)==c,colnames(tmat)==r] <-1
> + }

I think that works, but it's not as fast as some other ways of doing the 
same thing.  For example, table(tb) will give you a table of the counts 
of each pair of entries in tb.  pmin(table(tb), 1) will set the maximum 
count to 1.
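
Since the question was whether the loop is doing the right thing, one way to check is to run both versions on the small sample and compare them. The sketch below rebuilds tb and tmat by hand from the listing in the question; the names loop_mat, tab, same_block and same_total are made up for illustration, and I have assumed character columns rather than factors.

```r
# Sample data copied from the question (rebuilt by hand, character columns assumed)
tb <- data.frame(
  V1 = c("Apple", "Apple", "Apple", "Orange", "Orange", "Orange",
         "Mango", "Mango", "Mango"),
  V2 = c("S", "A", "O", "A", "O", "S", "M", "A", "S"),
  stringsAsFactors = FALSE
)
tmat <- matrix(0, nrow = 5, ncol = 5,
               dimnames = list(c("A", "O", "M", "G", "S"),
                               c("Apple", "Orange", "Mango", "Grape", "Star")))

# The loop from the question, with the variables renamed for readability
loop_mat <- tmat
for (i in seq_len(nrow(tb))) {
  col_name <- as.character(tb[i, 1])   # V1 holds the column names of tmat
  row_name <- as.character(tb[i, 2])   # V2 holds the row names of tmat
  loop_mat[rownames(loop_mat) == row_name,
           colnames(loop_mat) == col_name] <- 1
}

# The table() approach, with V2 as rows so the orientation matches tmat
tab <- pmin(table(tb[, 2:1]), 1)

# Compare on the categories that table() actually produced ...
same_block <- all(loop_mat[rownames(tab), colnames(tab)] == tab)
# ... and confirm the loop set nothing outside that block
same_total <- sum(loop_mat) == sum(tab)
print(c(same_block = same_block, same_total = same_total))
```

If both values are TRUE on a subset of the real data, that is reasonable evidence the loop and table() agree.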

An advantage of this approach is that it will show you if there are any 
entries in tb that aren't in your tmat (typos, etc.).  A disadvantage is 
that if there are any missing categories (e.g. G, Grape, Star in your 
sample) they won't show up at all, and you may need some manipulations 
to get things to look exactly the way you asked.  For example,

 > pmin(table(tb), 1)
         V2
 V1       A M O S
   Apple  1 0 1 1
   Mango  1 1 0 1
   Orange 1 0 1 1
 > pmin(table(tb[,2:1]), 1)
     V1
 V2  Apple Mango Orange
   A     1     1      1
   M     0     1      0
   O     1     0      1
   S     1     1      1
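
One possible manipulation to get the full layout, missing categories included, is to turn both columns into factors whose levels are the row and column names of tmat. This is only a sketch: row_ids and col_ids stand in for rownames(tmat) and colnames(tmat), tb is rebuilt by hand from the sample, and the other names are made up for illustration. Note that once the levels are fixed, entries that are not in tmat become NA and are dropped by table(), so the typo check has to be done separately (setdiff() here).

```r
# Sample tb from the question (rebuilt by hand, character columns assumed)
tb <- data.frame(
  V1 = c("Apple", "Apple", "Apple", "Orange", "Orange", "Orange",
         "Mango", "Mango", "Mango"),
  V2 = c("S", "A", "O", "A", "O", "S", "M", "A", "S"),
  stringsAsFactors = FALSE
)

# Stand-ins for rownames(tmat) and colnames(tmat)
row_ids <- c("A", "O", "M", "G", "S")
col_ids <- c("Apple", "Orange", "Mango", "Grape", "Star")

# Entries in tb that tmat does not know about (typos etc.)
unknown_rows <- setdiff(tb$V2, row_ids)
unknown_cols <- setdiff(tb$V1, col_ids)

# Fixed factor levels make table() keep empty categories, in tmat's order
full <- table(V2 = factor(tb$V2, levels = row_ids),
              V1 = factor(tb$V1, levels = col_ids))
full <- pmin(full, 1)   # cap the counts at 1
print(full)
```

The result is a 5 x 5 table laid out like tmat, with all-zero rows and columns for G, Grape and Star.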


Duncan Murdoch



> 
> 
> 
>> tmat
>   Apple Orange Mango Grape Star
> A     1      1     1     0    0
> O     1      1     0     0    0
> M     0      0     1     0    0
> G     0      0     0     0    0
> S     1      1     1     0    0
> 
> 
> 
> Thanks.
> 


