[R] Use of the "by" command (clarification)

Chuck Cleland ccleland at optonline.net
Sun Jun 17 02:37:03 CEST 2007


Economics Guy wrote:
> Well apparently this has nothing to do with the gini() command.
> 
> I cannot get it to work for something as simple as sum()
> 
> Here is the little example I am playing with, maybe someone can help me find
> my error:
> 
> a<-c("A","B","C","A","B","C","A","A","C","B")
> 
> b<-c(23,6534,456,234,7,567,345,9,565,345)
> 
> c<-cbind(a,b)
> 
> by(c, a, function(x) sum(b))
> 
> and I get the output
> 
> INDICES: A
> [1] 9085
> ------------
> INDICES: B
> [1] 9085
> --------------
> INDICES: C
> [1] 9085
> 
> 
> Same problem as before. It is summing over the whole b vector rather than by
> the groups.
> 
> Anybody have any ideas on what I am doing wrong?

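  Your function ignores its argument x and sums the global b, so every
group gets the same total. Build a data frame and sum x$b instead: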

a <- c("A","B","C","A","B","C","A","A","C","B")

b <- c(23,6534,456,234,7,567,345,9,565,345)

c <- data.frame(a,b)

by(c, a, function(x) sum(x$b))

a: A
[1] 611
---------------------------------------------------------------------------------------------------------------------

a: B
[1] 6886
---------------------------------------------------------------------------------------------------------------------

a: C
[1] 1588
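
  To see both problems with the original call side by side, here is a
small sketch. The names m, d, wrong and right are only for illustration.
Besides the function not using x, note that cbind() on a character vector
and a numeric vector gives a character matrix, so b is no longer numeric
there.

```r
a <- c("A","B","C","A","B","C","A","A","C","B")
b <- c(23,6534,456,234,7,567,345,9,565,345)

m <- cbind(a, b)        # character matrix: b has been converted to strings
d <- data.frame(a, b)   # data frame: b stays numeric

is.character(m)         # TRUE
is.numeric(d$b)         # TRUE

# Ignores x and sums the global b, so every group gets the same value
wrong <- by(d, d$a, function(x) sum(b))

# Uses the subset x that by() passes in, so each group gets its own sum
right <- by(d, d$a, function(x) sum(x$b))
```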

  Also, consider this:

with(c, tapply(b, list(a), sum))
   A    B    C
 611 6886 1588
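
  The same fix should carry over to your original question about a Gini
coefficient per group: the function handed to by() or tapply() has to work
on its argument, not on the attached income column. A minimal sketch,
assuming a data frame DATA with columns Group and Income as in your sample
rows. gini_simple() is a hypothetical stand-in written for this example,
since I do not know which package your gini() comes from.

```r
# Hypothetical stand-in for gini(): Gini coefficient from sorted values,
# G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
gini_simple <- function(x) {
  x <- sort(x)
  n <- length(x)
  2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n
}

# The sample rows quoted from the original post
DATA <- data.frame(
  Group  = c("A", "B", "A", "A", "B", "C", "A"),
  Income = c(2300, 6776, 6668, 6768, 9879, 5577, 7867)
)

# tapply() passes each group's incomes as the argument of the function
g_tapply <- with(DATA, tapply(Income, Group, gini_simple))

# Equivalent with by(): use the subset x, not the whole column
g_by <- by(DATA, DATA$Group, function(x) gini_simple(x$Income))

# split() + vapply() gives the same numbers as a plain named vector
g_split <- vapply(split(DATA$Income, DATA$Group), gini_simple, numeric(1))
```

  For the Monte Carlo run it may be worth timing these three against each
other on data of realistic size; I have not benchmarked them here.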

> Thanks,
> 
> EG
> 
> On 6/16/07, Economics Guy <economics.guy at gmail.com> wrote:
>> I have a data set that contains income data and a group identifier. Sort
>> of like:
>>
>>
>>        DATA
>>
>> Group,Income
>> A,2300
>> B,6776
>> A,6668
>> A,6768
>> B,9879
>> C,5577
>> A,7867
>> (etc),(etc)
>>
>> I am trying to compute the gini coefficient for each group.
>>
>> I have tried the following and none seem to do the trick:
>>
>> 1)
>>
>> attach(DATA)
>>
>> by(DATA, group, function(x) gini(income))
>>
>>
>> 2)
>>
>> attach(data)
>>
>> tapply(income, group, function(x) gini(income))
>>
>> Both of these return the same value for all groups. Like:
>>
>> group: A
>> [1] 0.2422496
>> ------------------------------------------------------------
>> group: B
>> [1] 0.2422496
>> ------------------------------------------------------------
>> group: C
>> [1] 0.2422496
>> ------------------------------------------------------------
>> group: D
>> [1] 0.2422496
>>
>> Any ideas on how I can make this work? I need the fastest way since I am
>> gonna run a monte carlo based on this routine once I get the basics working.
>>
>>
>> Thanks,
>>
>> EG

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


