[R] sum on column using apply

David Winsemius dwinsemius at comcast.net
Mon Apr 26 05:07:46 CEST 2010


On Apr 25, 2010, at 10:15 PM, robert lee wrote:

> I have two data frames (x and y; sample values below).  The row
> names are HH:MM:SS timestamps and the column names are device names.
>
> I am trying to find the 5 least used devices during the recorded
> time period.  When the apply function is used to sum over the columns, I
> get the correct answer for the data frame called x, but not for y.  The
> data type of the returned answer is different and I cannot figure out
> why.  Any insight into what is happening, or possibly different
> simpler ways to do this, would be appreciated.
>
>> colnames(t(sort(apply(x,2,sum))[1:5]))
> [1] "5x"   "6x"   "7x"   "4x"   "103x"

Why are you getting colnames on the transpose of the sorted column sums?
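
A minimal sketch of a route that skips the transpose, assuming the object
really is a data.frame (or matrix) with column names. The small x built here
is a hypothetical stand-in that reuses a few values from the sample rows, and
colSums() is swapped in for apply(x, 2, sum):

```r
# Hypothetical stand-in for the poster's x: rows are times, columns are devices.
x <- data.frame(`0x`  = c(21, 16, 3),
                `4x`  = c(17, 4, 1),
                `5x`  = c(16, 1, 1),
                `32x` = c(29, 40, 33),
                check.names = FALSE)   # keep names such as "0x" unchanged
rownames(x) <- c("13:55:24", "13:55:54", "13:56:24")

totals <- colSums(x)                       # named numeric vector, one sum per column
least_used <- names(head(sort(totals), 2)) # names of the two smallest totals
least_used
```

names() works directly on the sorted vector, so t() and colnames() are not
needed. With the full data, head(..., 5) would give the five least used.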
>
> # the above returns the answer I am trying to get
>
>> colnames(t(sort(apply(y,2,sum))[1:5]))
> NULL

Again. Why get colnames on transpose?

>
> # the above does not
>
> below is sample data and output of the apply for each
>
>> x
>          0x 1x 2x 3x 4x 5x 6x 7x 32x 33x 34x 35x 36x 37x 38x 39x 64x 65x 66x 67x 68x
> 13:55:24 21 18 18 18 17 16 16 17  29  29  25  23  19  18  18  21  24  22  22  21  20
> 13:55:54 16  3  3  4  4  1  1  6  40  29  16   9   6  23  19  21  27  19  29  15  19
> 13:56:24  3  2  6  2  1  1  4  1  33  40  28  13  10   2   4  15  25  17   8  14  11
> [ truncated .... ]

You have not told us how you constructed "x", and that is undoubtedly
important. At the very least we probably need the results of str(x)
and str(y). This looks like a zoo object, which is not a data.frame.
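
For what it's worth, a minimal sketch of the kind of diagnostics that would
help here. The objects df and mat are made up for illustration and are not
the poster's x and y:

```r
# Hypothetical objects: one data.frame, one plain matrix without dimnames.
df  <- data.frame(a = 1:3, b = 4:6)
mat <- matrix(1:6, nrow = 3)

str(df)                                # compact structure, including column types
str(mat)

df_is_frame  <- is.data.frame(df)      # is it really a data.frame?
mat_is_frame <- is.data.frame(mat)
mat_has_cols <- !is.null(colnames(mat)) # does it carry column names at all?

class(df)
class(mat)
```

Running the same calls on x and y would show whether they are data.frames,
matrices or something else, and whether y has column names.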


>
>> apply(x,2,sum)
>    0x    1x    2x    3x    4x    5x    6x    7x   32x   33x   34x   35x   36x   37x
> 12042 11411 11343 11237 10937 10811 10909 10911 18341 16055 14406 13770 12252 12003
>   38x   39x   64x   65x   66x   67x   68x
> 12266 13450 15426 14163 13913 13615 12972

So where did these extra columns come from?

>   69x   70x   71x   96x   97x   98x   99x
> 12656 13089 13329 12671 12562 12336 12045
>  100x  101x  102x  103x
> 11476 11212 11066 10997
>
>> y
>          nfs6 sd0 sd1 sd30 sd31 sd36 sd6 ssd100 ssd101 ssd102 ssd103 ssd104 ssd105
> 13:55:54    0   2   0    0    0    0   0      0      0      0      0      0      0
> 13:56:54    0   3   0    0    0    0   0      0      0      0      0      0      0
> 13:57:54    0   1   0    0    0    0   0      0      0      0      0      0      0
> 13:58:54    0   1   0    0    0    0   0      0      0      0      0      0      0
> [ truncated .... ]

This might or might not be a data.frame. Note that t() is applied here
to the vector returned by sort(), not to y itself, so the colnames of
the result come from the names of that vector. If it has none,
colnames() returns NULL.
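
A minimal sketch, using made-up values rather than the poster's data, of how
colnames() behaves on a transposed vector and of how apply() can come back
without names:

```r
named   <- c(a = 3, b = 1, c = 2)   # vector with names
unnamed <- c(3, 1, 2)               # same values, no names

# t() turns a vector into a 1-row matrix; its colnames are the vector's names.
cn_named   <- colnames(t(sort(named)))
cn_unnamed <- colnames(t(sort(unnamed)))

# apply() over the columns of a matrix that has no colnames
# returns an unnamed vector.
m <- matrix(1:6, nrow = 2)
sums_names <- names(apply(m, 2, sum))

cn_named     # "b" "c" "a"
cn_unnamed   # NULL
sums_names   # NULL
```

The unnamed [1] ... [113] output of apply(y,2,sum) in the original post is
consistent with the second case.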
>
>> apply(y,2,sum)
>   [1]    0  515    0    0    0    0    0   96    0    0    0   90    0    0    0    0
>  [17]    0    1   13   96    0   31    0    0    0    0    0    0    0    0   11    0
>  [33]    0    0   12    0    0    0    0    0    0    0   16    0    0    0    0    0
>  [49]   31    0    0    0    0    0    0    0    0   10    0    0    0    0    0    0
>  [65]    0    0    7    0    0   84  337  642 1005    0  605  518    0    0    5    0
>  [81]    0   86  335  646 1014    0  606  513    0  737  418  306  277  607  301  410
>  [97] 1690  445  432    0  738  424  315  283  608  302  411 1688  446  431    0    0
> [113]    0    0    0   93    0    0    0    0    0    1   12    0    0

David Winsemius, MD
West Hartford, CT


