[R] How to assign scores to rows based on column values

David Winsemius dwinsemius at comcast.net
Sun Apr 25 16:27:53 CEST 2010


On Apr 25, 2010, at 1:08 AM, burgundy wrote:

>
> Hi,
>
> I'm trying to assign a score to each row which allows me to identify
> which rows differ. In the example file below, I've used "," to indicate
> column separators. In this example, I'd like to identify that row 1 and
> row 5 are the same, and row 2 and row 4 are the same.
> Any help much appreciated. Also, any comments on what the command lines
> do would be fantastic.
> Thanks!!
>
> example file:
> 0,0,1,0,1,0,0
> 0,1,0,0,0,0,1
> 0,0,0,0,0,0,0
> 0,1,0,0,0,0,1
> 0,0,1,0,1,0,0
> 0,0,0,1,0,0,0
>
> example request output:
> 1
> 2
> 3
> 2
> 1
> 4

If you use apply() by rows with paste() and a collapse argument, you
get one text string per row, which can be stored as a text column.
Calling factor() on that text column, and then calling factor() again
with levels=unique(fac), lets you extract codes numbered in order of
first appearance with as.numeric().
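
A minimal sketch of that first step, using the example rows from the
question. Reading the data with read.csv(text = ...) and the default
V1..V7 column names are assumptions made only for illustration; rrr and
fac are the names used in this message.

```r
# Example rows from the question; header = FALSE because there is no header line
rrr <- read.csv(text = "0,0,1,0,1,0,0
0,1,0,0,0,0,1
0,0,0,0,0,0,0
0,1,0,0,0,0,1
0,0,1,0,1,0,0
0,0,0,1,0,0,0", header = FALSE)

# apply(..., 1, ...) walks over rows; paste(..., collapse = ",") glues each
# row's values into one string, so identical rows give identical strings.
# factor() then stores those strings as a factor column.
rrr$fac <- factor(apply(rrr, 1, paste, collapse = ","))
```

Rows 1 and 5, and rows 2 and 4, now carry the same value of fac.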

On a data frame, rrr, holding those rows plus such a factor column, fac:

 > as.numeric(factor(rrr$fac, levels=unique(rrr$fac)))
[1] 1 2 3 2 1 4

factor() is needed a second time because the first call sets the
levels to an alpha-sorted version of the values in fac, so the codes
would follow sort order rather than order of first appearance.
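
To illustrate the difference, here is a small self-contained sketch. The
vector name key and the result names are hypothetical, and match() is
shown only as an alternative I would expect to give the same numbering,
not as the method used above.

```r
# One string per row of the example data (rows pasted with collapse = ",")
key <- c("0,0,1,0,1,0,0", "0,1,0,0,0,0,1", "0,0,0,0,0,0,0",
         "0,1,0,0,0,0,1", "0,0,1,0,1,0,0", "0,0,0,1,0,0,0")

# Default factor(): levels are the sorted unique values,
# so the codes follow sort order
sorted_ids <- as.numeric(factor(key))

# levels = unique(key): levels are kept in order of first appearance,
# which gives the numbering requested in the question
first_seen_ids <- as.numeric(factor(key, levels = unique(key)))

# Alternative: position of each string among the unique strings
match_ids <- match(key, unique(key))
```

Here sorted_ids comes out as 3 4 1 4 3 2, while first_seen_ids and
match_ids both come out as 1 2 3 2 1 4.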

-- 

David Winsemius, MD
West Hartford, CT


