[R] Trouble retrieving the second largest value from each row of a data.frame

David Winsemius dwinsemius at comcast.net
Sat Jul 24 14:40:05 CEST 2010


On Jul 23, 2010, at 9:20 PM, <mpward at illinois.edu> wrote:

> I have a data frame with a couple million lines and want to retrieve  
> the largest and second largest values in each row, along with the  
> label of the column these values are in. For example
>
> row 1
> strongest=-11072
> secondstrongest=-11707
> strongestantenna=value120
> secondstrongantenna=value60
>
> Below is the code I am using and a truncated data.frame.  Retrieving  
> the largest value was easy, but I have been getting errors every way  
> I have tried to retrieve the second largest value.  I have not even  
> tried to retrieve the labels for the value yet.
>
> Any help would be appreciated
> Mike
Using Holtman's extract of your data, with x as its name, sort each row  
in decreasing order to get the three largest values; the order function  
can then generate an index into the names of the data frame:
 > t(apply(x, 1, sort, decreasing=TRUE)[1:3, ])
      [,1]   [,2]   [,3]
1  -11072 -11707 -12471
2  -11176 -11799 -12838
3  -11113 -11778 -12439
4  -11071 -11561 -11653
5  -11067 -11638 -12834
6  -11068 -11698 -12430
7  -11092 -11607 -11709
8  -11061 -11426 -11665
9  -11137 -11736 -12570
10 -11146 -11779 -12537
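
The order function returns column positions rather than values, which is what makes the name lookup possible. A minimal sketch on the first row of the posted data, assuming it is held as a named vector (the names r1, top_vals, top_pos and top_names are only for illustration and are not in the original post):

```r
# First row of the posted data as a named numeric vector (illustrative name r1).
r1 <- c(value0 = -13007, value60 = -11707, value120 = -11072,
        value180 = -12471, value240 = -12838, value300 = -13357)

# sort() gives the values themselves, largest first.
top_vals <- sort(r1, decreasing = TRUE)[1:3]

# order() gives the positions of those values in the original vector.
top_pos <- order(r1, decreasing = TRUE)[1:3]

# Positions can be used to look up the matching column labels.
top_names <- names(r1)[top_pos]
```

Indexing names() with the result of order() is the same lookup that the combined expression applies to every row.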

Putting it all together:

  matrix( paste( t(apply(x, 1, sort, decreasing=TRUE)[1:3, ]),
                 names(x)[ t(apply(x, 1, order, decreasing=TRUE)[1:3, ]) ]),
          ncol=3)

       [,1]              [,2]              [,3]
  [1,] "-11072 value120" "-11707 value60"  "-12471 value180"
  [2,] "-11176 value120" "-11799 value180" "-12838 value0"
  [3,] "-11113 value120" "-11778 value60"  "-12439 value180"
  [4,] "-11071 value120" "-11561 value240" "-11653 value60"
  [5,] "-11067 value120" "-11638 value180" "-12834 value0"
  [6,] "-11068 value0"   "-11698 value60"  "-12430 value120"
  [7,] "-11092 value120" "-11607 value240" "-11709 value180"
  [8,] "-11061 value120" "-11426 value240" "-11665 value60"
  [9,] "-11137 value120" "-11736 value60"  "-12570 value180"
[10,] "-11146 value300" "-11779 value0"   "-12537 value180"
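
If separate columns are preferred over pasted strings, the same order-based index can fill a data frame directly. This is only a sketch under some assumptions: x is rebuilt here from the values in the posted data so the example is self-contained, the output column names are taken from the example in the question, and m, idx, rows and result are illustrative names. Ties are resolved in favour of the leftmost column, as order does by default.

```r
# The posted data, rebuilt so the example runs on its own.
x <- data.frame(
  value0   = c(-13007, -12838, -12880, -12805, -12834, -11068, -12807, -12770, -12988, -11779),
  value60  = c(-11707, -13210, -11778, -11653, -13527, -11698, -14068, -11665, -11736, -12873),
  value120 = c(-11072, -11176, -11113, -11071, -11067, -12430, -11092, -11061, -11137, -12973),
  value180 = c(-12471, -11799, -12439, -12385, -11638, -12430, -11709, -12373, -12570, -12537),
  value240 = c(-12838, -13210, -13089, -11561, -13527, -12430, -11607, -11426, -13467, -12973),
  value300 = c(-13357, -13845, -13880, -13317, -13873, -12814, -13025, -12805, -13739, -11146)
)

m    <- as.matrix(x)
rows <- seq_len(nrow(m))

# 2 x nrow matrix: column positions of the largest and second largest value per row.
idx <- apply(m, 1, order, decreasing = TRUE)[1:2, ]

# Matrix indexing with (row, column) pairs pulls out the values themselves.
result <- data.frame(
  strongest           = m[cbind(rows, idx[1, ])],
  secondstrongest     = m[cbind(rows, idx[2, ])],
  strongestantenna    = colnames(m)[idx[1, ]],
  secondstrongantenna = colnames(m)[idx[2, ]]
)
```

Note that apply hands each row to the function as a plain vector, so inside the function the row is indexed with a single subscript; a two-subscript form such as data[1, ] fails there with "incorrect number of dimensions".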

-- 
David.

>
>
>> data<-data.frame(value0,value60,value120,value180,value240,value300)
>> data
>   value0 value60 value120 value180 value240 value300
> 1  -13007  -11707   -11072   -12471   -12838   -13357
> 2  -12838  -13210   -11176   -11799   -13210   -13845
> 3  -12880  -11778   -11113   -12439   -13089   -13880
> 4  -12805  -11653   -11071   -12385   -11561   -13317
> 5  -12834  -13527   -11067   -11638   -13527   -13873
> 6  -11068  -11698   -12430   -12430   -12430   -12814
> 7  -12807  -14068   -11092   -11709   -11607   -13025
> 8  -12770  -11665   -11061   -12373   -11426   -12805
> 9  -12988  -11736   -11137   -12570   -13467   -13739
> 10 -11779  -12873   -12973   -12537   -12973   -11146
>> #largest value in the row
>> strongest<-apply(data,1,max)
>>
>>
>> #second largest value in the row
>> n<-function(data)(1/(min(1/(data[1,]-max(data[1,]))))+  
>> (max(data[1,])))
>> secondstrongest<-apply(data,1,n)
> Error in data[1, ] : incorrect number of dimensions
>>
>


